Moe's Voice Interaction

Exploring how voice interaction technology can be incorporated into Moe’s Southwest Grill’s ecosystem


Client: Moe’s Southwest Grill

Timeframe: August 2018 - December 2018

Teammates: Kelsie Belan, Rachel Feinberg, Bang Tran, myself

My roles: UX researcher and designer

Platform: Mobile/Smart speaker (Alexa)



A snap from one of Moe’s restaurants

We worked with Moe’s Southwest Grill, a fast-casual Mexican restaurant chain based in Atlanta, Georgia. Our client wanted to understand the contexts in which people use, or will use, voice interaction technology: how people would use voice if it were an option, and how it should work. In other words, they did not yet know how they could use voice.


How do we ascertain how a fast-casual restaurant chain can incorporate a nascent technology like voice interaction into their brand?

My contributions

During the research phase, I conducted secondary research, helped formulate questions for the initial and follow-up surveys, and created the survey data visualizations and the empathy maps. I helped perform the competitive analysis and task analysis. I came up with one of the three concepts and sketched it. For testing, I formulated questions for the second user-based test and alternated between notetaker and facilitator/interviewer across sessions.


What I learned

In an exploratory project, finding the right direction can be disorienting at times. If that means we have to pivot, then so be it, as long as our decision is backed by the research data. Collaborating with a client can also be time-consuming, and we should keep that in mind while scheduling research activities. No amount of research is ever enough; it’s about choosing the right methods within our constraints and making the most of them. Finally, designing and testing for a nascent technology such as voice interaction can be hard.


Secondary research

No one on the team had worked on voice tech before, and the project was exploratory in nature. So we examined the current landscape in the industry for voice recognition technologies to gain background information on the space and reviewed Moe’s to understand its brand culture.

Synthesis of findings - We documented the evolution of voice interaction, its current state, the leaders in the space, how brands can find their own “voice,” and the things customers love about Moe’s.







Areas explored


Visualizations of some of the findings from survey data

We conducted surveys around three areas: experiences with the Moe’s ecosystem, with voice interaction technologies, and with online food ordering in general. We first formulated our research questions and then devised the survey questions from them. We then piloted the survey with fellow researchers and refined it based on their feedback before publishing it. We used Qualtrics for both the pilot and final surveys.

Justification -

  • Quickly collect a significant amount of qualitative and quantitative data

  • Identify interested customers for further research

  • Find out at a high level about customers’ experiences (attitudes, preferences, motivations, perceptions) with different domains

  • Gather data on a broad range of topics to establish a foundation to build on and to guide more targeted, in-depth research methods later on

Analysis -

Given that our data was categorical, we expressed our results in terms of the numbers or percentages of people who chose those options. Using Qualtrics’s report visualizations, we created visualizations of specific findings to illustrate and help us better understand them.
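The counts-and-percentages summary described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how we produced our charts (those came from Qualtrics’s report visualizations). The function name `summarize_responses`, the question, and the sample answers are all hypothetical.

```python
from collections import Counter


def summarize_responses(responses):
    """Return {option: (count, percent)} for one categorical survey question."""
    counts = Counter(responses)
    total = sum(counts.values())
    return {
        option: (count, round(100 * count / total, 1))
        for option, count in counts.most_common()
    }


# Hypothetical answers to a made-up question; not data from our survey.
sample = ["In store", "App", "In store", "Website", "In store", "App"]
summary = summarize_responses(sample)

for option, (count, percent) in summary.items():
    print(f"{option}: {count} ({percent}%)")
```

Reporting both the count and the percentage keeps small samples honest: a percentage alone can hide the fact that only a handful of people chose an option.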

Synthesis of findings -

To step into the mindset of our users and sum up our findings from survey data, we created empathy maps capturing their perspectives in the Moe’s restaurant setting and when using voice interaction. Empathy maps helped us identify critical user needs for our sample.

Empathy map capturing the in-store experiences of Moe’s customers

Empathy map capturing the experiences of Moe’s customers with voice interaction in general


Critical user needs


Competitive Analysis with Task Analysis

We performed a competitive analysis and task analyses on the voice interaction applications of a few restaurants (Domino’s, Starbucks, and Wingstop) to understand the nuanced differences between their food ordering flows. To explore the vast world of Alexa skills and determine whether Moe’s could leverage their functions, features, or design choices in its own voice interaction technology, we also analyzed a few popular skills (The Magic Door, Thunderstorm sounds, Jeopardy). These techniques helped us navigate the uncertainty of our project by anchoring our work to real-world, existing applications.

A peek into the competitive analysis - Dom, Domino’s ordering assistant bot, allows you to use both text and voice interaction to place orders

Justification -

  • Competitive analysis provides an excellent foundation of existing information, exposes us to what is popular within the realm of voice interaction, and serves as a platform to inform further design and research

  • Task analysis helps us understand the steps involved in carrying out various tasks, as well as features, functions, and pros/cons


Synthesis of findings


But wait…

We realized that ordering food through voice interaction isn’t something people use or want. This was especially apparent from the poor reviews of the other voice ordering applications we looked at. We concluded that our approach was wrong: working top-down, we were trying to fit the Moe’s experience into an application of voice that people aren’t using. We also needed to consider how voice could fit into Moe’s. We therefore decided to combine our top-down approach with a bottom-up one and explore the possibilities in the middle, by understanding which features people already find useful and enjoyable about voice and then tailoring those to the Moe’s experience.


Follow-up survey



We conducted a follow-up survey to understand what people use voice interaction for, the contexts in which they use it, and why.

Synthesis of findings -

  • They used voice for entertainment-related tasks like playing music and for other simple tasks like checking the weather, asking questions, and texting

  • They used it when it was faster, easier, and more convenient, and because it affords hands-free interaction

  • Car and home were the two main contexts in which people used voice

  • No one used voice to order food



Based on the user needs and implications identified by our research, we brainstormed three concepts.

  1. Multi-modal food ordering - A voice ordering application with a visual component that guides the user, integrated into the Moe’s app. Users can interact through voice alone or refer to the visual element as well, should they wish to.

  2. Trivia game - A trivia game themed around Moe’s brand (an Alexa skill) that users interact with through voice. Users can compete against others to win coupons.

  3. Burrito Quest - An interactive voice story themed around Moe’s brand where the user takes an active role in the story, to be “played with” a cowboy named Moe who is looking for a golden burrito in the southwest US. Users could win a burrito and earn Moe’s rewards points.

A sketch for the multi-modal food ordering concept

A script for the Trivia concept

A script for the Burrito Quest concept


Conceptual feedback

We conducted feedback sessions with four potential end users, in which they ranked the concepts and talked about what they generally liked, which features were good, and what could be improved. We randomized the order in which ideas were presented to prevent order effects. Based on the analysis of the feedback received, we decided to go ahead with Burrito Quest.
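One simple way to randomize presentation order per participant is to shuffle the concept list independently for each session. The sketch below is an assumption about how this could be done, not a record of the procedure we followed. The participant labels, the `presentation_orders` helper, and the seed are hypothetical.

```python
import random

CONCEPTS = ["Multi-modal food ordering", "Trivia game", "Burrito Quest"]


def presentation_orders(participants, concepts, seed=None):
    """Give each participant an independently shuffled order of the concepts."""
    rng = random.Random(seed)  # a fixed seed makes the schedule reproducible
    return {p: rng.sample(concepts, k=len(concepts)) for p in participants}


# Hypothetical participant labels for four sessions.
orders = presentation_orders(["P1", "P2", "P3", "P4"], CONCEPTS, seed=2018)

for participant, order in orders.items():
    print(participant, "->", ", ".join(order))
```

With only a few participants, independent shuffles can still show one concept first more often than the others. A counterbalanced scheme that rotates through fixed orders is a common alternative when that matters.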

Takeaways -

Me conducting a feedback interview

  1. Multi-modal food ordering - Three participants ranked multi-modal food ordering as their least favorite concept. They liked that it’s a simulation of the Moe’s in-store experience. They hated that it makes ordering online harder, as more time is spent listening and responding.

  2. Trivia - Overall, participants liked the experience, particularly the musical components, and liked that they could earn reward points. However, they found the percentile ranking system confusing.

  3. Burrito Quest - Two participants ranked it as their favorite concept, while the others ranked it second. They enjoyed the experience and liked that they could define their path in the story. However, they were concerned about the length of the script between interactions.


Wizard of Oz prototype


What is it?

An Alexa skill (once developed) that is based on an interactive “choose your own adventure” story


Who is it for?

Moe’s customers who like stories and rewards points and want to try something novel


How does it help Moe’s?

A great marketing stunt and brand reinforcement for Moe’s as it fits well with the brand idea of being fun


We created a Wizard of Oz voice prototype for the Burrito Quest concept by recording voice clips and incorporating sound effects, a mini-game, and music. We used Apple Keynote to put the prototype together. To feasibly evaluate the interactivity of the narrative, we created a script of interactions that would provide an illusion of choice. In a fully functional prototype, i.e., an Alexa skill, each decision made would carry weight and impact the storyline.

A part of the final script

Flow chart depicting the illusion of choice
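The illusion of choice can be sketched as a small data structure: every choice in a scene plays its own response clip, but all choices lead to the same next scene. The scene names, choices, and lines below are hypothetical placeholders rather than our actual script, and the prototype itself was assembled in Keynote, not in code.

```python
# Each scene offers choices. Every choice has its own response line,
# but all choices in a scene lead to the same next scene.
# Scene names and lines are made-up placeholders, not the real script.
STORY = {
    "start": {
        "choices": {
            "left": "You take the canyon trail.",
            "right": "You take the river trail.",
        },
        "next": "duel",
    },
    "duel": {
        "choices": {
            "accept": "You step into the duel.",
            "decline": "You get pulled into the duel anyway.",
        },
        "next": "end",
    },
    "end": {"choices": {}, "next": None},
}


def play(story, answers, start="start"):
    """Walk the story with a list of spoken answers.

    Returns (scenes visited, response clips played).
    """
    scene, visited, clips = start, [], []
    remaining = iter(answers)
    while scene is not None:
        visited.append(scene)
        node = story[scene]
        if node["choices"]:
            answer = next(remaining)
            clips.append(node["choices"][answer])
        scene = node["next"]
    return visited, clips


path_a, clips_a = play(STORY, ["left", "accept"])
path_b, clips_b = play(STORY, ["right", "decline"])

print("Same scenes visited:", path_a == path_b)
print("Different clips heard:", clips_a != clips_b)
```

In a real skill, where each decision carries weight, the single `next` field would become a per-choice mapping so that different answers reach different scenes.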

To try the prototype, please click the button below and download a copy.


All of our tests were moderated and conducted in-person.

User-based test: Phase 1

A clip from one of the sessions showing a participant laughing while going through the interactive story

We conducted user-based tests with four potential end users and had them engage with our prototype through the Wizard of Oz technique. One of our team members (playing the Alexa speaker) controlled the prototype while the participants interacted with the “smart speaker,” a speaker puck (similar in design to an Amazon Echo) plugged into a laptop holding the audio files. After the participants interacted with the prototype, we asked them a series of questions probing engagement, enjoyment, and understandability.

Takeaways -

Speaker puck used to simulate a smart speaker

  1. None of our users seemed sure what the rewards points were or how they worked.

  2. Some users felt that the narrative parts were too long and that shorter narrative sections with more interactions could be more engaging.

  3. Some users found it difficult to imagine the story with voice alone, and felt that a visual aid could help ground the story and make it more robust.

  4. A few users felt that the song was too long in the duel mini-game.


To address one of the concerns raised during the tests, we created a visual handout that would serve as a companion to the interactive story.


User-based test: Phase 2

To better understand the concerns raised by our participants during the first phase of testing, and to check their validity, we devised a new set of questions for the second phase of user-based testing and conducted it with five potential end users. We supplemented the questions with several self-report metrics, such as level of engagement, perceived duration of different interaction parts, and perceived usefulness of the visual handout. To help guide their interaction, we provided the participants with a visual handout. To help them answer the questions, we also provided them with the story flow outline and the entire script.

Takeaways -

  1. Four out of five participants reported being very engaged during the interaction. This is supported by the fact that all the participants said they enjoyed the interaction in general, and we observed them laughing and smiling at various points throughout the demo.

  2. All participants found the duration of the song in the mini-game to be too long.

  3. Other Likert-scale items, such as the duration of the narrative parts and the usefulness of the handout, drew mixed, fairly evenly distributed responses, which suggests that reactions reflected personal preference rather than any particular usability problem with the prototype.

  4. The rewards system was unclear for all of our participants. All of them suggested providing a more explicit description of how much each reward point is worth and in general, making it more apparent how the rewards system works.

  5. Participants also shared a few suggestions: allowing users to backtrack on their actions to encourage further exploration of the storyline, making it clear that their progress can be saved, and presenting a recap of their progress to date whenever they return to play again.

A snapshot of one of the participants during the user testing sessions

Story flow outline shared to help participants answer the questions

Heuristic evaluation

An expert conducting heuristic evaluation

We wanted to ensure that our prototype followed the standards set forth by the industry for voice interaction technologies. We referred to Amazon’s Alexa design guide and chose a set of heuristics that were relevant and applicable to the evaluation of our prototype. We conducted a heuristic evaluation with three experts, all of whom had some experience designing voice interaction technologies. We provided the experts with the same materials as in the second user-based test, along with the set of heuristics, their descriptions, and a spreadsheet for listing feedback.

Specific takeaways -

  1. The experts found the interactions to be straightforward overall, which makes it easy to progress through the story.

  2. Earcons were generally well received and improved the experience.

  3. While some of them didn’t find the visual handout to be necessary, all of them thought that it enhanced their experience.

  4. The two most violated heuristics were “Questions” and “Be Brief.”

  5. The narration could be improved by eliminating redundant information or reducing its length.

Next steps

We can collaborate with experts in voice interaction and storytelling to further refine our prototype, and consult with our client, Moe’s Southwest Grill, to ensure that the prototype’s style still resonates with the brand (given that they are now rebranding) if and when they launch it.