By: Joseph Saperstein On April 30, 2017

Transforming the Karsh Hagan website into an Alexa Voice Bot

Our quest to build a Karsh Hagan Alexa voice bot

The future of the Internet is about making human connections. Technology is beginning to allow us to do that in an increasingly authentic way. Whether on Slack, Facebook Messenger, Amazon Echo, or another platform, chatbots enable users to communicate with computers via voice or text commands. In the spirit of innovation and exploration, the creative technology team at Karsh Hagan embarked on a quest to create a voice-driven bot to learn more about how to apply the technology in meaningful ways.

 

What kind of bot should we build?

That answer should be obvious, right? It’s just like building an app or a website. Well, not quite. We started out by brainstorming what might be useful to our clients. Today's leading companies are using bots to deliver a wide range of information: from news, education and games, to lifestyle content and the weather. We decided we wanted to pull data from our website that visitors could interact with via voice. It would essentially be a Q&A bot that quickly answers questions, enhancing or replicating the website experience. We thought it was a pretty good idea.

 

Choosing the Alexa platform as our foundation

Once we had decided on an educational customer service bot, we had to consider which platform to use. We discussed Facebook Messenger and Slack. Google Home was just coming out at the time, but ultimately we decided to go with the Amazon Echo as an initial implementation. It doesn't require a login like Facebook, it doesn't require users to download a program like Slack, and it speaks, which is more in line with where we think bots are headed than text on a screen. Amazon also provides a cloud-based hosting solution with its Lambda serverless computing platform, Lex bot-building software, and Alexa Skill tooling, making it easy to get started. And, luckily, our Karsh Hagan website is built on a Node.js-based CMS with a Mongo database. It's all written in JavaScript, which is one of the languages supported by Amazon's bot-building tools. This made it easy to write API endpoints that return data from the site as JSON.

Now we have a bot concept, the data, and a platform. The big question? How are we going to get this thing to act like a human? 

 

The challenge of creating a conversational experience

A Conversational User Experience is an experience crafted around our own natural language, which is pretty complicated. Instead of tapping, swiping, or clicking through a user interface, this technology lets you interact with an invisible interface using voice commands made up of intents, utterances, and slots. The act of designing and developing these intents, utterances, and slots is what we call Conversational UX. An intent is the purpose of the conversation, e.g. "Book a Hotel." An utterance is a phrase a user might say to express an intent. For hotel booking (the intent), an utterance might be "I want to book a hotel" or "I'd like to make a hotel reservation." Lastly, a slot represents a data point our bot needs in order to complete the request. For hotel booking, the slots would be the destination, check-in date, check-out date, etc.
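To make the hotel-booking example concrete, here's a rough sketch in JavaScript of how an intent, its utterances, and its slots might be modeled. The shape loosely mirrors an Alexa Skill interaction model, but the names and slot types here are illustrative, not taken from our actual skill.

```javascript
// Illustrative model of one intent: its purpose, sample utterances,
// and the slots (data points) needed to fulfill it.
const bookHotelIntent = {
  intent: "BookHotel", // the purpose of the conversation
  utterances: [
    // ways a user might phrase the request; {braces} mark slot values
    "I want to book a hotel in {destination}",
    "I'd like to make a hotel reservation for {checkInDate}"
  ],
  slots: [
    { name: "destination", type: "AMAZON.US_CITY" },
    { name: "checkInDate", type: "AMAZON.DATE" },
    { name: "checkOutDate", type: "AMAZON.DATE" }
  ]
};

// Tiny helper: list the slot names an utterance references.
function slotsInUtterance(utterance) {
  return (utterance.match(/\{(\w+)\}/g) || []).map(s => s.slice(1, -1));
}

console.log(slotsInUtterance(bookHotelIntent.utterances[0])); // ["destination"]
```

The key idea is that many utterances funnel into one intent, and the slots are the blanks the bot must fill before it can act.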

Now that we knew what makes up a Conversational Experience and we had our lingo down, we started to craft a conversational UI. 

When building our Karsh Hagan chatbot, we first needed to identify what our intents would be. We decided to establish employee info as our first intent. Then we wrote out all the utterances we could think of for it: "Who are the account managers?", "What does {first name, last name} do at Karsh Hagan?", "Who makes up the management team?", and so on. Each utterance hits an API endpoint such as karshhagan.com/api/title/. This works because Alexa hears the intent "title," and then hits the "title" endpoint with the "account manager" parameter. Alexa grabs the corresponding data, returning the team members' names. The data is then processed and plugged into a template that allows Alexa to speak. Of course, we had to inject some KH personality to make it a little more engaging, 'cause that's what we do here.
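The slot-to-endpoint-to-template flow described above can be sketched like this. Note this is a simplified stand-in: `fetchTeamByTitle` is stubbed with sample data where the real skill would make an HTTP GET to the karshhagan.com/api/title/ endpoint, and the function and template names are our own for illustration.

```javascript
// Stand-in for the real API call to karshhagan.com/api/title/ —
// stubbed here with sample data so the sketch is self-contained.
async function fetchTeamByTitle(title) {
  const sample = { "account manager": ["Jane Doe", "John Smith"] };
  return sample[title] || [];
}

// Resolve the "title" slot to data, then plug the names into a
// speech template that Alexa can read aloud.
async function handleTitleIntent(slots) {
  const names = await fetchTeamByTitle(slots.title);
  if (names.length === 0) {
    return "Hmm, I couldn't find anyone with that title.";
  }
  return `Our ${slots.title}s are ${names.join(" and ")}.`;
}

handleTitleIntent({ title: "account manager" }).then(console.log);
// "Our account managers are Jane Doe and John Smith."
```

The template step is also where the personality lives: swapping in a playful phrasing only means changing the returned string, not the data flow.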

 

Applying quality assurance to our bot

In the voice design of the Alexa Skill, the real challenge is accounting for how various people are going to interact with the Echo. Some people will say "Who is the Account Manager at Karsh Hagan?" while someone else will say, "Who has the job title Account Manager?" To build for multiple inputs that are seeking the same end result, we first had to see how people were using the Skill. We then took that info and added multiple combinations of slot types. These slot types picked out common words in the request phrase to invoke the appropriate intent and return the desired information. 

Practicing user-centered design, we asked some employees to interact with our Alexa Skill. We observed how they invoked the skill and their reactions to receiving a response. We then used this to influence how we wrote utterances and responses.

 

So, what’s next?

As with any Internet application experience, you are never really done. We barely scratched the surface of what our bot can do and we are prioritizing new features for the bot. We are currently building out the whole website as an Alexa Skill which will be available on Amazon soon. We’ll also be exploring third-party tools such as Google's api.ai and Facebook's wit.ai to see if the addition of machine learning will combine to form a super bot. And now we are embarking on client projects to build some more bots. That’s why we love our jobs - there is always something new to make.

If you have any questions on how we approached this project, let us know. Always happy to share and learn.  Happy bot building!

 
Joe Saperstein - Senior Creative Technologist 

Trevor Glassman - Senior UX Planner

David Stewart - VP, Director of Creative Technology @leuzstewart

 
