How Will Visual And Voice Search Evolve?

"We could go from heat mapping clicks on a page to now 'emotion mapping,'" says Bing Ads' Purna Virji.

The recent update of Bing’s visual search comes just as voice-activated search through digital assistants like Amazon’s Alexa, Apple’s Siri, Google Assistant, and Microsoft’s own Cortana is starting to become more mainstream among consumers and marketers.

At the moment, the marketing applications of these two search methods are fairly nascent. We checked in with Purna Virji, Senior PPC Training Manager at Bing Ads, for a glimpse at the potential of visual and voice search for brands.

GeoMarketing: We recently wrote about the enhancements to Bing Visual Search. What do they mean from a marketing perspective?

Purna Virji: Let’s be clear first: Visual search is not really being monetized just yet. What Google is doing and Bing is piloting is showing shopping ads in the image results page. Could this continue to evolve? If the demand comes from the client, then yes.

Visual search isn’t anything new; I believe Google started doing it back in 2003. Now, we find it across search engines and in apps like Amazon and CamFind, since it’s gotten more powerful. The advances in artificial intelligence have made a great difference in what visual search can accomplish.

The ability of a machine to look at an object in a photo and recognize it the way a human can, understanding context and emotion, for example, is a fairly recent development. In the past, visual search was limited. But now, the potential is practically limitless.

What are some examples of that potential?

Discovery is a clear example. Getting your product in front of people through visual search is unlike anything we’ve done before. If I’m walking down the street and I see a pair of shoes that look great, I might be too uncomfortable to ask a stranger where they got them. But a quick photo of the shoes will tell me what the brand is, where to buy them, and how much they cost. The world becomes your showroom and every person becomes your model.

In addition to changing the nature of discovery, visual search helps you look for something online when you don’t have the exact words for it. You can take a photo of any plant, for example, and find out what it is.

Basically, it cuts out a lot of steps in the search process, and that’s what we’re all trying to do. We’re trying to make finding what you want and discovering new things as easy and convenient as possible.


Last year, we talked to Bing about the growing use of voice search. As visual search rises as well, what is the impact likely to be on text search?

I think all these channels work well together and will actually complement each other. A prediction that Andrew Ng made when he was still with Baidu was that “by 2020, 50 percent of all search will be image or voice.” Typing will likely never go away. But now, we have more options.

Just as mobile didn’t kill the desktop and apps didn’t kill the browser, the mix of visual, voice, and text will combine in ways that are natural extensions of user behavior. We’ll use those tools depending on the specific need and situation at the moment.

For example, you could “show” Cortana a picture of a dress in a magazine via your phone camera and say “Hey Cortana, I’d love to buy a dress like this,” and she can go find where to buy it online. In this way, you used voice and images to find what you were looking for.

Voice search tends to reflect a need for immediacy, and is therefore seen as particularly important for local businesses, as people want to fulfill a task or interest nearby. What impact might visual search have on local marketing?

Our data scientists at Bing surveyed 2,002 people and asked them what they used voice commands for. The top things tended to be related to questions seeking “quick facts.” But they are also, to a smaller extent, using it for more complicated tasks such as ordering food.

When it comes to commerce, though, it is clear that voice and visual search really can be great for local as well as impulse buying.

What other trends do you think will influence visual search as a marketing vehicle?

In reading the news, it seems likely that social media will benefit from this technology.

Snapchat, for example, has talked about serving ads based on what people take photos of. If I’m taking a Snap of a fancy pair of shoes, I could start to see ads about similar shoes. If someone shows an interest in something that specific, it’s helpful to deliver ads based on that experience. And Pinterest has its “Shop the Look” product offering as well.

Visual intelligence can also allow for deeper insights. As the artificial intelligence is trained to recognize and understand human emotion, the uses for marketers can be tremendous.

Combine it with a search functionality and you could do searches for something as specific as “Show me photos of Justin Trudeau looking very happy.”

Imagine how an ad agency could use that.

Say they wanted to assemble a virtual focus group to look at a commercial. You could layer on this emotion recognition ability to track and measure how people are reacting to the content. We could go from heat mapping clicks on a page to now “emotion mapping.”

About The Author
David Kaplan @davidakaplan

A New York City-based journalist for over 20 years, David Kaplan is managing editor of A former editor and reporter at AdExchanger, paidContent, Adweek and MediaPost.