VUI: How to design for an invisible interface
Editor's note: The following is a guest post from Danielle Reubenstein, executive creative director at Possible Mobile, an app marketing and consulting firm.
To those in the know, VUI is pronounced “vooey,” and it stands for voice user interface. It even has its own bad joke. Because voice control lacks a screen, there’s nothing to touch or look at. It’s an interface without a “face.”
If you think it’s a gimmick, however, think again. The promise of VUI is huge. If we create intelligent auditory interfaces that understand speech and context, we’ll be able to deliver what people ask for, hands-free, and with almost no effort.
Of course, we aren’t there yet, but we also aren’t as far away as you might think. Wolfram, Siri, Watson, Alexa and Google Assistant all have VUIs that are starting to make conversational computing possible. But much like people, they are not the same, and we should know the differences between them if we want to understand the best practices for them as a whole.
For a quick breakdown, Siri and Wolfram are step-siblings with a shared lineage. Both are glorified search bars that serve users by supplying answers to queries. Siri can take on extra tasks by accessing other apps and software development kits (SDKs), but everything else becomes a web search.
Next, we have the conversationalists — Alexa and Google Home — both of which power numerous devices and applications. These two can, to a limited degree, talk to you and provide functionality. They both adeptly handle search, play audio and answer questions in a conversational way. That said, there are several differences between them. Alexa currently has an SDK and is open to third-party developers, while Google plans to do the same soon. Google Home can use its understanding of data to be personalized and predictive. Alexa is not as user-centric and doesn't rely on that type of personalization.
Moving forward, we should expect VUIs to become both more conversational and more widespread. This brings up two inevitable questions: How can brands get into the act? And how do we make experiences that are useful and wanted, rather than annoying?
Here we have an emerging set of guidelines requiring that any assistant be conversational, natural, simple and habitual.
Human beings do not usually dictate; they converse. As a result, brands should work toward making the tone of their interactions informal and conversational while not straying too far from the personality of the assistant itself. The more you make users deviate from their normal conversational patterns, the clunkier the interaction will be.
In addition, you have to remember the limits of conversational interaction. “Seven, plus or minus two” is how renowned psychologist George Miller put it in his classic research paper. We only have a limited number of things we can keep in mind, so individual answers to queries or interactions shouldn't be too long.
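One way to respect that limit in practice is to cap how many items a spoken answer reads out and offer the rest on request. The Python sketch below is only an illustration: the function name `spoken_list`, the cap of five items and the response wording are all assumptions made for this example, not part of any assistant's SDK.

```python
# Hypothetical cap on items read aloud, chosen from the low end of
# "seven, plus or minus two". Tune it for your own users.
MAX_SPOKEN_ITEMS = 5


def spoken_list(items, limit=MAX_SPOKEN_ITEMS):
    """Build a short spoken answer from a list of result names."""
    if not items:
        return "I didn't find anything."

    head = items[:limit]                # only the items we will read aloud
    remaining = len(items) - len(head)  # how many we are holding back

    # Join the items the way a person would say them out loud.
    if len(head) == 1:
        spoken = head[0]
    else:
        spoken = ", ".join(head[:-1]) + " and " + head[-1]

    if remaining > 0:
        return (f"Here are the first {len(head)}: {spoken}. "
                f"There are {remaining} more. Want to hear them?")
    return f"I found {spoken}."
```

With seven results, the answer names five and asks whether the user wants the other two, rather than reciting everything in one breath.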
Most VUIs require some unnatural communication, like a wake word, so that they know they’re being spoken to. However, you should make sure that your application allows users to make requests that are as natural as possible. You shouldn't only account for the most common way people ask for something, but also as many variations and word combinations as you can find. If you ask an app to show you a recipe for chicken à la king, it should show you one. But if you say, “Tell me how to make chicken à la king,” or “Find me Chef John’s chicken à la king,” both of those should work too.
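To make the idea of accepting many phrasings concrete, here is a minimal Python sketch that maps several wordings onto a single "find a recipe" request. Real platforms typically handle this through their own intent and sample-utterance configuration, so treat everything here as an assumption for illustration: the pattern list, the function names `normalize` and `match_recipe_intent`, and the phrasings themselves are hypothetical.

```python
import re

# Hypothetical phrasings that should all resolve to one "find recipe" intent.
RECIPE_PATTERNS = [
    re.compile(r"^(?:show me|find me|get me) (?:a recipe for )?(?P<dish>.+)$"),
    re.compile(r"^tell me how to (?:make|cook) (?P<dish>.+)$"),
    re.compile(r"^how do i (?:make|cook) (?P<dish>.+)$"),
]


def normalize(utterance):
    """Lowercase the text, drop basic punctuation and collapse whitespace."""
    text = utterance.lower().strip()
    text = re.sub(r"[?.!,]", "", text)
    return re.sub(r"\s+", " ", text)


def match_recipe_intent(utterance):
    """Return the requested dish, or None if no known phrasing matches."""
    text = normalize(utterance)
    for pattern in RECIPE_PATTERNS:
        match = pattern.match(text)
        if match:
            return match.group("dish")
    return None
```

Each new phrasing you discover in testing becomes one more pattern in the list, while the rest of the app keeps dealing with a single request type. Anything that returns `None` is a candidate for a polite reprompt and a note in your logs of what people actually said.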
We all take a mental leap when we talk to a computer, undergoing a training phase in which we learn how to use an app. That’s why great voice apps always start out simple, doing one thing well. Then, with every iteration or upgrade, you can add a new function. It’s a "crawl, walk, run" mentality. And remember: Even as an app grows increasingly complex, information hierarchies should remain shallow, because users can't see where they are and can easily get lost.
Because your app is "invisible," it can easily be forgotten. To counter that, make it something people want to use every day. The list skill on Alexa works well because it is something you might use constantly.
Creating an interface without a face may seem a complex undertaking, but with good experience design and a willingness to take baby steps, we’ll be well on our way to reaching the full potential of VUI. We'll one day be able to give orders, receive information and get things done effortlessly. For now, let’s work together and work smart, one app at a time.