HCDE 451 E5: Voice Interaction

IoT Google Home

Kay Waller
10 min read · Nov 10, 2020

As part of a prototyping class at the University of Washington, we were tasked with creating voice interaction scripts and flowcharts for an IoT device.

The Design

Since the focus was on prototyping voice interactions and less so on the device itself, I picked an already popular device: the Google Home. The purpose of this device is to aid users in various tasks by using voice and conversational UI to communicate.

The Google Home.

Google Home users can essentially be anyone, but smart in-home devices are especially popular among younger generations. I wanted my interactions to be better defined and tailored to real experiences, so I narrowed my user group down to college students with household roommates. Because I did not have secondary research to draw on, I made my best guess as to my user’s main goal: ask for and receive relevant information, and use the device’s capabilities to enhance social conversations. I believe this encompasses both “real,” informative questions and the silly ones that arise in a household of college students.

Based on my narrowed-down user group of college roommates, the device would be used in the kitchen and living room, where the Google Home can be activated from almost anywhere in the house. It would sit near the kitchen table, where lots of conversations take place. However, this accessibility comes at the cost of increased noise levels and multiple people talking at the same time. Frequent sources of error include background noise, difficulty distinguishing Google Home-directed speech from roommate-directed speech, wifi problems, and various outside sounds (construction, etc.).

The purpose of my investigation into voice UI is to determine usability. Are the exchanges conversational? Is the error handling helpful and used at the appropriate time? Does the user know how to interact with the device to its full potential?

The Prototype

To test the usability of the device, I made five sample dialogues with key tasks. For the key tasks, I tried to create realistic scenarios that encapsulated both serious inquiries and funny ones to see if the device was usable in both contexts.

Dialog 1- Weather

Roommate: Hey Google, what’s the weather?

VUI: The weather in Seattle is currently 51 degrees Fahrenheit and partly cloudy, with a high of 57 degrees and a low of 38 degrees. (waits 1 second) Would you like to know the weather for a specific time?

Roommate: Mmmm, specific time. Tell me the weather for 1.

VUI: The weather for Seattle at 1pm today is 56 degrees Fahrenheit.

Roommate: Tell me the chance of rain for the whole afternoon.

VUI: The chance of rain this afternoon is an average of 70%. (waits 1 second) Is there anything else you would like to know? I can tell you the forecast or the weather in another city.

Roommate: Uhhh, no. Unless you can tell me what shirt to wear hahaha.

VUI: Would you like me to search that for you?

Roommate: Nah, I’m good.

VUI: Have a good morning, bye!

_________________________________________________________________

Functionalities- Tell current weather or forecast for cities.

Constraints- The device assumes the user wants to know the weather for their location (because this is most often the case). If this assumption is incorrect, the user will have to specify their city. More constraints are shown as errors in the red boxes below.

Rationale- The script increases discoverability by listing options after 1 second of no user response, and it can carry on a conversation about the nuances of the weather fairly well thanks to keyword recognition for hours of the day, types of weather (rain, sun, etc.), and locations. Keywords are shown as transition states (arrows) between the boxes below.
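As a rough illustration of the keyword recognition described above, here is a minimal Python sketch that pulls an hour, a part of the day, a weather type, and a city out of one utterance. The keyword lists, the function name `extract_weather_slots`, and the default city are all assumptions made for this example; this is not how the Google Home actually parses speech.

```python
import re

# Hypothetical keyword lists, chosen only to cover the sample dialog.
WEATHER_TYPES = {"rain", "sun", "snow", "wind"}
DAY_PARTS = {"morning", "afternoon", "evening", "tonight"}
KNOWN_CITIES = {"seattle", "portland"}


def extract_weather_slots(utterance, default_city="seattle"):
    """Return the weather-related keywords found in one utterance."""
    # Split into lowercase words and digit runs, so "1pm" becomes "1" and "pm".
    words = re.findall(r"[a-z]+|\d+", utterance.lower())
    slots = {
        "city": default_city,  # assume the user's own location unless told otherwise
        "hour": None,
        "day_part": None,
        "weather_type": None,
    }
    for word in words:
        if word in KNOWN_CITIES:
            slots["city"] = word
        elif word in DAY_PARTS:
            slots["day_part"] = word
        elif word in WEATHER_TYPES:
            slots["weather_type"] = word
        elif word.isdigit() and 1 <= int(word) <= 12:
            slots["hour"] = int(word)
    return slots


print(extract_weather_slots("Tell me the weather for 1."))
print(extract_weather_slots("Tell me the chance of rain for the whole afternoon."))
```

Each recognized keyword corresponds to one of the arrows in the flow chart: finding an hour leads to the hourly forecast, finding a weather type leads to the chance-of-rain answer, and so on.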

Dialog flow chart for a weather conversation.

Dialog 2- Meal Prep Inspiration

Roommate: Hey Google, I need ideas for my meal prep next week.

VUI: Do you want me to search the web for meal prep ideas?

Roommate: Yes, but I want them to be easy.

VUI: I will search for easy meal prep ideas. (waits 1 second before completing search)

Roommate: Wait, also, I need them to be cheap ideas.

VUI: I will search Google for easy cheap meal prep ideas. Would you like to hear the top 3 recipes I found?

Roommate: Yes please.

VUI: The first recipe is chicken salad bowls with lime chipotle seasoning. The second recipe is chicken tikka masala. The third recipe is chorizo stuffed bell peppers. Would you like one of these?

Roommate: Not really.

VUI: I will show you more options. Are you ready for the next 3 options I found?

Roommate: Yeah.

VUI: The first recipe is arugula and pesto pasta. The second recipe is mole enchiladas. The third recipe is crockpot Asian sesame... (interrupted by Roommate)

Roommate: I like the mole enchiladas. Can you send the recipe to my phone?

VUI: Yes. Is the phone number ending in two-zero-two-zero good for you?

Roommate: Yes.

VUI: Sending the recipe for mole enchiladas. Is there anything else I can do for you? I can read the ingredient list, find recipes just like it, or find other recipes.

Roommate: No thanks.

VUI: Enjoy, bye!

_________________________________________________________________

Functionalities- Looks up recipes on a user-tailored search term for inspiration, ingredients, or steps.

Constraints- Auditory processing is more constrained than visual processing, so the device only gives three options at a time. A simple Google search on a phone or laptop might be quicker. More constraints are shown as errors in the red boxes below.
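The three-at-a-time constraint can be sketched as simple paging over a result list. The function names `pages_of` and `speak_page` are hypothetical, and the recipe list is just the named results from the sample dialog; this is an illustrative sketch, not the device's real behavior.

```python
ORDINALS = ["first", "second", "third"]


def pages_of(options, size=3):
    """Yield the options in groups of `size` so each spoken turn stays short."""
    for start in range(0, len(options), size):
        yield options[start:start + size]


def speak_page(page):
    """Phrase one page the way the script does: 'The first recipe is ...'."""
    return " ".join(
        f"The {ORDINALS[i]} recipe is {name}." for i, name in enumerate(page)
    )


recipes = [
    "chicken salad bowls with lime chipotle seasoning",
    "chicken tikka masala",
    "chorizo stuffed bell peppers",
    "arugula and pesto pasta",
    "mole enchiladas",
]

for page in pages_of(recipes):
    print(speak_page(page))
```

Keeping the page size at three means `ORDINALS` never runs out; a larger page size would need a longer list of ordinals.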

Rationale- The script increases discoverability by listing options after 1 second of no user response, and it can dynamically add adjectives to the search query that describe what the user wants from their meal prep recipes. This feature makes the device more conversational. The user’s privacy is protected by only using the last four digits of their phone number. Keywords are shown as transition states (arrows) between the boxes below.
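One way to picture the dynamically growing search query is as a base phrase plus a list of modifiers that the user adds turn by turn. The class below is a minimal sketch under that assumption; `SearchQuery`, `refine`, and `text` are made-up names for illustration.

```python
class SearchQuery:
    """A base search phrase plus adjectives collected during the conversation."""

    def __init__(self, base):
        self.base = base
        self.modifiers = []

    def refine(self, modifier):
        """Add one adjective, ignoring blanks and repeats."""
        modifier = modifier.strip().lower()
        if modifier and modifier not in self.modifiers:
            self.modifiers.append(modifier)
        return self

    def text(self):
        """Return the full query, with modifiers in the order they were given."""
        return " ".join(self.modifiers + [self.base])


query = SearchQuery("meal prep ideas")
query.refine("easy")   # "Yes, but I want them to be easy."
query.refine("cheap")  # "Wait, also, I need them to be cheap ideas."
print(query.text())    # easy cheap meal prep ideas
```

Because the query is only sent after the short pause in the script, the user has a window to add another modifier before the search runs.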

Dialog flow chart for a meal prep conversation.

Dialog 3- Fun Fact of the Day

Roommate: Hey Google, what’s the fun fact of the day?

VUI: The fun fact of November 7th is that otters have the densest fur of any mammal.

Roommate: Ahhhh, that’s why they’re so soft! Are there any otters in Washington?

VUI: There are otters in Washington. (waits 1 second) Would you like to know more? I can tell you what otters eat or what their habitat is.

Roommate: I want to know if the zoo has otters.

VUI: Would you like me to check if the Woodland Park Zoo has otters?

Roommate: Yes.

VUI: Here’s what I found. The Woodland Park Zoo does have an otter exhibit. (waits 1 second) Would you like to purchase tickets to go see them?

Roommate: No, thanks! I will just watch a live otter cam instead.

VUI: Sounds good. Can I do anything else for you?

Roommate: Nope.

VUI: See you tomorrow for the next fact. Bye!

_________________________________________________________________

Functionalities- Relays the fun fact of the day and answers any related questions the user asks after that.

Constraints- This conversation is constrained by the API itself, and also by follow-up questions the user might ask that have no definite answer on Google (Ex: What type of algae do the otters eat in California?). More constraints are shown as errors in the red boxes below.

Rationale- The script increases discoverability by listing options after 1 second of no user response and is able to carry on a brief conversation about the topic of that day’s fun fact. Keywords are shown as transition states (arrows) between the boxes below.
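The "list options after 1 second of silence" pattern used throughout these scripts can be sketched as follows. The `listen` callable is a stand-in for whatever speech input the device provides; it is assumed to return the user's reply, or `None` if nothing was said within the timeout. All names here are hypothetical.

```python
def respond_with_reprompt(answer, suggestions, listen, timeout_s=1.0):
    """Speak the answer, then list options only if the user stays silent.

    Returns the lines the device would speak and the user's eventual reply.
    """
    spoken = [answer]
    reply = listen(timeout_s)  # assumed to return None on silence
    if reply is None:
        options = " or ".join(suggestions)
        spoken.append(f"Would you like to know more? I can tell you {options}.")
        reply = listen(timeout_s)
    return spoken, reply


# Simulated user: silent at first, then asks a follow-up question.
replies = iter([None, "I want to know if the zoo has otters."])
spoken, reply = respond_with_reprompt(
    "There are otters in Washington.",
    ["what otters eat", "what their habitat is"],
    listen=lambda timeout_s: next(replies),
)
for line in spoken:
    print("VUI:", line)
print("Roommate:", reply)
```

If the user answers right away, the suggestion list is skipped entirely, which keeps the reprompt from interrupting people who already know what they want.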

Dialog flow chart for querying the fun fact of the day.

Dialog 4- Daydreaming about buying plane tickets

Roommate: Hey Google, I want to go to Maui. What are the prices of tickets right now?

VUI: The average price of a ticket to Maui from SeaTac is $560. (waits 1 second) Would you like me to look for prices on a certain day?

Roommate: Can you look for January 20th?

VUI: Here’s what I found. On January 20th, plane tickets from SeaTac to Maui are projected to be $490.

Roommate: That’s so expensive. What about from Portland?

VUI: Plane tickets from Portland to Maui on January 20th are projected to be $504. (waits 1 second) Would you like the details of this flight?

Roommate: No, I can’t afford it anyways.

VUI: Would you like me to notify you if these tickets drop in price?

Roommate: Yes, please notify me if the flights from SeaTac drop in price. I don’t need the Portland ones.

VUI: Would you like to be notified by text?

Roommate: Yeah, sure.

VUI: Is the phone number ending in 2020 good for you?

Roommate: Yeah that will work.

VUI: Great! I will notify you if the flight from SeaTac to Maui on January 20th drops in price. Is there anything else I can do?

Roommate: Nope, that’s it! Thanks!

VUI: Bye!

_________________________________________________________________

Functionalities- Searches flights based on a variety of parameters, gives flight details, and can book tickets.

Constraints- This device assumes the major airport nearest to the user’s location is their default one, but this could cause problems in cities that have many major airports (Ex: San Francisco). More constraints are shown as errors in the red boxes below.
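The default-airport assumption, and the way it breaks down when a city has several airports, can be sketched like this. The airport table is illustrative data for the example only (a real system would query a location or flight service), and `resolve_departure_airport` is a hypothetical name.

```python
# Illustrative data: cities mapped to the codes of nearby major airports.
AIRPORTS_BY_CITY = {
    "seattle": ["SEA"],
    "san francisco": ["SFO", "OAK", "SJC"],
}


def resolve_departure_airport(city, stated_airport=None):
    """Return (airport_code, clarifying_question); one of the two is None."""
    if stated_airport:
        # The user named an airport, so no assumption is needed.
        return stated_airport.upper(), None
    candidates = AIRPORTS_BY_CITY.get(city.lower(), [])
    if len(candidates) == 1:
        # Only one major airport nearby: safe to use it as the default.
        return candidates[0], None
    if not candidates:
        return None, "Which airport would you like to fly from?"
    options = " or ".join(candidates)
    return None, f"Which airport would you like to fly from: {options}?"


print(resolve_departure_airport("Seattle"))
print(resolve_departure_airport("San Francisco"))
print(resolve_departure_airport("Seattle", stated_airport="pdx"))
```

Asking a clarifying question only when there is more than one candidate keeps the common case (one obvious airport) as short as it is in the sample dialog.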

Rationale- The script increases discoverability by listing options after 1 second of no user response and is able to satisfy the user’s daydream of going to Maui. Again, the system protects privacy by only repeating the last four digits of the user’s phone number. Keywords are shown as transition states (arrows) between the boxes below.
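The last-four-digits confirmation can be sketched in a few lines. The function names are hypothetical and the full phone number in the usage example is made up; only the ending "2020" and the prompt wording come from the scripts.

```python
DIGIT_WORDS = [
    "zero", "one", "two", "three", "four",
    "five", "six", "seven", "eight", "nine",
]


def last_four(phone_number):
    """Return only the last four digits, ignoring spaces and punctuation."""
    digits = [ch for ch in phone_number if ch in "0123456789"]
    if len(digits) < 4:
        raise ValueError("phone number needs at least four digits")
    return "".join(digits[-4:])


def confirmation_prompt(phone_number):
    """Build the spoken confirmation without revealing the full number."""
    spoken = "-".join(DIGIT_WORDS[int(d)] for d in last_four(phone_number))
    return f"Is the phone number ending in {spoken} good for you?"


print(confirmation_prompt("(206) 555-2020"))
```

Spelling the digits out one by one, as in the meal prep dialog, avoids the device reading "2020" aloud as a year.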

Dialog flow chart for having a conversation about plane tickets.

Dialog 5- Playing music

Roommate 1: Hey, Google! Can you play my Spotify playlist “halloweeeeeeenies”?

VUI: I will need to use your Spotify credentials on your device to access your account. Is this okay?

Roommate 1: Yeah that’s fine.

VUI: Thank you. Playing “halloweeeeeeenies” now. Let me know if you would like me to skip a song, change the volume, or play another playlist or song.

Roommate 1: OMG I love this song!!!

Roommate 2: I’m not a huge fan of it…… Can we change it?

VUI: Would you like me to change the song?

Roommate 2: Yes!

VUI: *changes song*

_________________________________________________________________

Functionalities- Plays playlists and songs from the user’s music app, and can skip songs, change the volume, or switch to another playlist or song.

Constraints- This device depends on playlist and song recognition, but playlist and song names aren’t always full or real words. Additionally, five or so different songs by different artists can share one song name, so knowing which one the user wants is a constraint. More constraints are shown as errors in the red boxes below.

Rationale- The script protects user privacy by asking for permission to connect to their music app of choice. The device’s ability to understand “turn it up” or “change the song” in the context of playing music increases the conversational feel of the device. Keywords are shown as transition states (arrows) between the boxes below.
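Understanding a phrase like “turn it up” only makes sense when the device knows music is playing. Below is a minimal sketch of that idea, assuming the device tracks its current activity in a simple dictionary; the phrase list, action names, and `interpret` function are all invented for illustration.

```python
# Hypothetical phrases and action names for the music context.
MUSIC_COMMANDS = {
    "turn it up": "volume_up",
    "turn it down": "volume_down",
    "change the song": "next_track",
    "skip": "next_track",
}


def interpret(utterance, context):
    """Map a short phrase to a music action, but only while music is playing."""
    if context.get("activity") != "music":
        return None  # "it" has nothing to refer to outside the music context
    text = utterance.lower()
    for phrase, action in MUSIC_COMMANDS.items():
        if phrase in text:
            return action
    return None


print(interpret("Can you turn it up?", {"activity": "music"}))
print(interpret("Can you turn it up?", {"activity": "weather"}))
```

When no phrase matches, as with Roommate 2’s vaguer “Can we change it?”, the script falls back to a clarifying question (“Would you like me to change the song?”) rather than guessing.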

Dialog flow chart for having the device play music.

The full prototype flows can be seen here.

Analysis

To refine my dialogs, I read them through with Kevin, one of my classmates. He thought discoverability was well embedded in the scripts and that each scenario seemed like a relevant task for my user group. He thought asking about only the last four digits of the phone number was effective privacy design, and he liked the error handling I already had. Still, some of the error handling needed improving and some conversational elements were missing. Alternative questions to ask and awkward wording came up frequently, and I was able to fix those with his help. Overall, his feedback made my scripts and voice interactions more resistant to derailing.

To evaluate whether I reached my goal of usability, I would need further user testing to see how my design holds up. However, based on Kevin’s positive feedback (and help) in the four areas of discoverability, error handling, privacy, and conversational tone, I think that I am on the right track.

There are some things I want to consider for the next iteration to improve my interactions:

  1. Do a deeper dive into word choice and punctuation, potentially using the program shown by Damien in lecture.
  2. Investigate device personalization to increase usage for key tasks (Ex: can the device give reminders for the fact of the day?)
  3. Investigate whether users want a device that prompts with questions a couple of seconds after the VUI’s initial response. Waiting a couple of seconds to ask the user another question could be good for discoverability, but it could be annoying for those wishing to end the conversation.

There are also some things I wish I did differently in my process of making the scripts:

  1. I wish I had read each line out loud after writing it and before testing with Kevin, because I probably would have caught some of the more obvious speaking errors.
  2. I wish I had taken time to build a more basic, solid understanding of voice UI before I went into script writing.

Overall, it was a fascinating introduction to voice UI! It’s definitely more difficult and nuanced than I imagined, but it was nice to explore the basics in a prototyping environment.

Kay Waller

An aspiring UX practitioner studying Human Centered Design & Engineering at the University of Washington.