Amazon’s Dan Quigley: Voice Is Now Many Consumers’ Preferred Interface
It's still 'day one' for the connected device space, Quigley insisted — but Alexa is getting smarter all the time.
Since the launch of Amazon Echo in 2015, Amazon has asserted early dominance over the voice-activated connected device space. Now, the company has gotten visual, bringing the Echo Show into the mix — all while continuing to make Alexa “smarter” when it comes to recognizing what people mean, not just what they say.
At Parks Associates’ CONNECTIONS conference last week, GeoMarketing caught up with Dan Quigley, senior manager for Alexa Smart Home, about why it’s still “day one” for the industry — and how intelligent assistants are helping to drive IoT device adoption across the board.
GeoMarketing: You’ve launched the Echo Show, and we’re also looking at the release of new Alexa-powered devices that will let users make voice calls. It’s a busy time! Broadly speaking, where are you today in the journey towards refining this technology and reaching mass market adoption?
Dan Quigley: If you talk to anyone at Amazon, you’re going to hear this a lot: We’re still at day one.
Just recently — less than two years ago — voice recognition passed the threshold that makes it a viable user interface. It’s all about accuracy, and accuracy along two vectors: The first vector is automated speech recognition — taking what people say and turning it into text. The other is the improvement of the voice signal, which comes from a couple of different sources, but the predominant one for Amazon is what we call far-field microphone technology. That far-field microphone technology and its associated audio processing remove noise and perform echo cancellation.
Additionally, before we even sold the first Echo, there was an effort to go out and collect millions of utterances in order to improve the statistical models. The idea is that it’s something that gets smarter over time — the AI gets smarter over time — but the other thing that improves is our understanding of what people mean when they say things. A great little example is that in some parts of the country they’ll say, ‘close the light,’ and that means ‘turn off the light.’ We have to learn how to recognize that. [That’s a big part] of the progress that we’ve made, and it’s what we’re continuing to improve and make ‘smarter.’
Voice-activated connected device usage jumped 130 percent over the past year, so we are seeing some momentum. It may still be “day one,” but where are we in terms of this being technology that consumers are really adopting en masse?
Well, I believe we’ve crossed the chasm and we’re now in the early majority. When you think of voice as an interface, think of even five years ago: If you bought a car and it had a voice interface to it, the experience that you had was not as good. You’d say, ‘Call Mom.’ Then it would respond, ‘There are three Toms in here. Which one did you want?’ Right? And that caused friction, because the [interface] did not do what I said or even what I meant in that particular case.
I think… the accuracy has moved to a threshold that makes it viable. Consumers are very sensitive when things don’t work. If you go buy something and spend your hard-earned money on it, and it doesn’t work, you’re not a happy customer. And so as that continues to improve, and customers [know] that it works, we’ll continue to see exponential growth. It’s happening now.
Do you see the growing adoption of intelligent personal assistants like Alexa driving adoption of other connected IoT products that the assistant can interact with — like smart lights, smart appliances, etc.? Are those influencing each other? Or not?
I think about it like the solar system: Every second of every day, the moon is orbiting the planet, the planet is orbiting the sun, which is in the galaxy, and they’re all interconnected, right?
So, to say that one influences another — it’s absolutely the case, right? I also think that because you have a central service that can speak to many different devices or data streams, that also helps adoption, because now you have a single interface, and the customer doesn’t have to do much of anything, really, to connect to these things. It’s ease of use.
But I wouldn’t say that voice is necessarily the primary influencer of home technology. I would just say it is one influencer. Although, currently, voice is preferred. Consumers are reporting that it’s their preferred interface as of now, because it’s easier now that speech recognition and related technologies have improved.
As consumer adoption continues, what do marketers or business enterprises need to be thinking about here? Voice is already impacting search. How do businesses need to be thinking about how to interact with these intelligent assistants? Where are we going?
Voice is a great interface for some interactions. But if you walk into a room and say, “What can I do in this room?” that kind of response may be better served by putting something up on a screen, for example, because remembering a long spoken list of things is difficult for anybody, especially someone my age.
I think businesses need to be thinking about multiple modalities. I also think that voice provides an interface to the underserved community; think of blind people, for example. Voice is a brand new portal for them to use. Now we’re able to serve those people.
Essentially, I think that as these systems grow and improve their capabilities, businesses need to include all modalities in the way that they develop products. It’s no longer just about what goes on a screen, or what you can put on a page. It’s theoretically limitless; I mean, imagine if [the conference hall] we’re in now had microphones that would be able to react to my voice. What could I do? I could say ‘I’m too cold,’ I could ask for directions. I think we’ll see a lot of possibilities there in the near future.