Recently, I got my hands on Amazon’s Echo Dot. It was a parting gift from my old team. My kid and I took it for a spin and, to my surprise, it worked like a charm. “Alexa, play ‘Hey Jude’ by The Beatles.” “Alexa, how is the weather today?” “Alexa, tell me a joke.” We had fun!! I found the experience highly engaging; it made chatbots look like a relic. Now is the time for talkbots to rise. Amazon calls Alexa a virtual assistant AI technology, and the applications built for it are called Alexa skills.
You will be amazed by the simplicity of the technology!! Even though there is some serious computing involved, it is not the developer’s concern; Amazon takes care of it. If you have some coding experience and go through 3 or 4 YouTube videos, you should be able to build a simple skill on your own. I gave it a shot over the weekend and had a working Alexa skill by Sunday evening.
The technology
The Alexa device (the voice user interface) is just an input/output device that takes your voice and sends it to Alexa Service. Alexa Service uses speech recognition, ML & NLP to parse the input and tries to identify the intent. Intents are like a set of regular expressions, except that they are matched against speech instead of text. A match leads to an API call that carries the intent. The API endpoints are provisioned through AWS Lambda, but you can provision your own endpoints too. The logic is coded in these web services, which use the ASK (Alexa Skills Kit) SDK. Today these SDKs are available for Node.js, Java & Python. Responses from these API calls are simple text, which is converted back to speech and sent back to the device. Alexa supports multiple languages and dialects like English-IN, English-UK, English-US, Hindi, etc.
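To make this flow concrete, here is a minimal sketch of what an endpoint receives and returns. It deliberately skips the ASK SDK and handles the raw request JSON with plain Python, so it is not how the SDK-based templates look. The intent name `TellJokeIntent` and the spoken texts are made-up examples.

```python
def build_response(text, end_session=True):
    """Wrap plain text in the response envelope that Alexa turns back into speech."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": end_session,
        },
    }


def lambda_handler(event, context=None):
    """Entry point: Alexa Service has already identified the intent for us."""
    request = event.get("request", {})
    request_type = request.get("type")

    # Sent when the user opens the skill without asking for anything specific.
    if request_type == "LaunchRequest":
        return build_response("Welcome! Ask me for a joke.", end_session=False)

    if request_type == "IntentRequest":
        intent_name = request.get("intent", {}).get("name")
        # "TellJokeIntent" is a hypothetical intent defined in the skill's model.
        if intent_name == "TellJokeIntent":
            return build_response(
                "Why do programmers prefer dark mode? Because light attracts bugs."
            )

    # Other request types and intents are not handled in this sketch.
    return build_response("Sorry, I did not understand that.")
```

The handler never sees audio. It only gets a JSON document that names the matched intent, and it answers with text that Alexa reads out.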
The development workflow is very simple. The complete development can be done on the web; there is no need for any special hardware or software. Log on to https://developer.amazon.com/, where you will find the developer console in which you can build the skill end to end. The APIs can be hosted by Amazon for free as long as usage doesn’t exceed a pre-defined quota.
With an Alexa-hosted skill, you can build, edit, and publish a skill without leaving the developer console. You use the online code editor to edit the code for your skill and deploy it directly to your AWS Lambda endpoints. When you create an Alexa-hosted skill, Alexa provisions the following:
- AWS Lambda endpoints in all three Alexa service regions
- An Amazon S3 bucket for media storage
- An Amazon DynamoDB table for persisting data
- An AWS CodeCommit repository
Your Alexa-hosted skill uses AWS resources. The following limits apply to each Alexa account.
- AWS Lambda: 1 million free AWS Lambda requests and 3.2 million seconds of compute time per month.
- Amazon S3: 5 GB of Amazon S3 storage, 20,000 get requests, 2,000 put requests, and 15 GB of data transfer out per month.
- Amazon DynamoDB: 25 GB of Amazon DynamoDB storage, and 1 GB of data transfer out per month.
- AWS CodeCommit: 50 GB-month of storage and 10,000 Git requests per month.
I am guessing this should be enough to hook someone for a weekend hackathon.
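A quick back-of-the-envelope check of the AWS Lambda numbers, as a rough sketch. The 30-day month is an assumption; the other figures are just the limits quoted in the list above.

```python
# Limits quoted above for Alexa-hosted skills (per month).
FREE_REQUESTS = 1_000_000
FREE_COMPUTE_SECONDS = 3_200_000
DAYS_PER_MONTH = 30  # assumption, only used for a rough daily figure

requests_per_day = FREE_REQUESTS // DAYS_PER_MONTH
seconds_per_request = FREE_COMPUTE_SECONDS / FREE_REQUESTS

print(f"About {requests_per_day:,} requests per day")
print(f"About {seconds_per_request:.1f} seconds of compute per request on average")
```

That works out to roughly 33,333 requests a day with an average budget of 3.2 seconds each.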
Challenges faced
I was using Alexa-hosted skills. Today, Node.js and Python are the only programming options if you opt for Alexa-hosted skills. I opted for Python. However, all the video tutorials I found were in Node.js. This is a slight discomfort, but if you understand the code in these tutorials you will also be able to work with Python. Additionally, when you start, Amazon gives you three code templates to choose from. They are a good starting point for understanding some additional capabilities that can come in handy.
Small syntax errors in Python wasted a lot of my time. You only come to know about them when you start testing the code, and it is hard to find out what failed. The details must be in some logs, but I could not find them. When such an error occurs, Alexa responds with a different statement than the one you expect. So try to update the output of all_exception_handler(), unhandled_intent_handler() & fallback_handler() so that each one says something distinct and you can tell which one fired.
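Two habits can help here; both are sketched below using only the standard library. The first catches syntax errors before deployment by compiling the source without running it. The second wraps a handler so that any runtime error is logged with its traceback and answered with a recognizable sentence. The names `find_syntax_error` and `with_error_speech` are hypothetical helpers, not part of the ASK SDK; the file name `lambda_function.py` is an assumption; and the response dictionary is the raw JSON shape rather than what the SDK's response builder produces.

```python
import functools
import logging

logger = logging.getLogger(__name__)


def find_syntax_error(source, filename="lambda_function.py"):
    """Compile the source without executing it; return a message or None."""
    try:
        compile(source, filename, "exec")
    except SyntaxError as err:
        return f"{filename}:{err.lineno}: {err.msg}"
    return None


def with_error_speech(handler):
    """Log any exception and answer with a sentence that is easy to recognize."""
    @functools.wraps(handler)
    def wrapper(event, context=None):
        try:
            return handler(event, context)
        except Exception:
            # logger.exception records the full traceback alongside the message.
            logger.exception("Skill handler failed")
            return {
                "version": "1.0",
                "response": {
                    "outputSpeech": {
                        "type": "PlainText",
                        "text": "The skill code raised an error. Check the logs.",
                    },
                    "shouldEndSession": True,
                },
            }
    return wrapper


@with_error_speech
def lambda_handler(event, context=None):
    # Deliberately fragile: raises KeyError when the request carries no intent.
    intent_name = event["request"]["intent"]["name"]
    return {
        "version": "1.0",
        "response": {"outputSpeech": {"type": "PlainText", "text": intent_name}},
    }


print(find_syntax_error("def handle(:\n    pass\n"))
print(lambda_handler({"request": {"type": "LaunchRequest"}})["response"]["outputSpeech"]["text"])
```

When the code runs on AWS Lambda, output from Python's `logging` module ends up in Amazon CloudWatch Logs, so the traceback should be there.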
Lastly, I struggled a bit with distribution. This is the last phase, where you are done testing the skill and want to get it published on Amazon. You will be asked to fill out a form. In one place you need to provide 3 example phrases that your skill can understand. There, try to use phrases that don’t contain any input that would be captured in a variable (a slot).
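One way to shortlist candidates for those example phrases is to scan the skill's interaction model for sample utterances that contain no slot placeholder in curly braces. The sketch below assumes the model JSON has been copied out of the developer console. The invocation name, intent names, and utterances are invented for illustration, and `slot_free_samples` is a hypothetical helper.

```python
import json
import re

# A trimmed, made-up interaction model in the JSON layout used by the console.
MODEL_JSON = """
{
  "interactionModel": {
    "languageModel": {
      "invocationName": "table master",
      "intents": [
        {"name": "QuizIntent", "slots": [], "samples": ["quiz me", "test my tables"]},
        {"name": "TableIntent",
         "slots": [{"name": "number", "type": "AMAZON.NUMBER"}],
         "samples": ["tell me the table of {number}", "say the {number} times table"]}
      ]
    }
  }
}
"""

# Matches a slot placeholder such as {number} inside a sample utterance.
SLOT_PLACEHOLDER = re.compile(r"\{[^{}]+\}")


def slot_free_samples(model):
    """Return (intent name, sample) pairs whose sample contains no {slot}."""
    intents = model["interactionModel"]["languageModel"]["intents"]
    return [
        (intent["name"], sample)
        for intent in intents
        for sample in intent.get("samples", [])
        if not SLOT_PLACEHOLDER.search(sample)
    ]


for name, sample in slot_free_samples(json.loads(MODEL_JSON)):
    print(f"{name}: {sample}")
```

With this made-up model, only the two `QuizIntent` utterances are printed, since both `TableIntent` samples depend on a spoken number.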
Also, if you create a skill for your kid, it will be hard to get it published. Amazon imposes additional and stricter restrictions on such skills. To date, I have not been able to officially publish my skill. It is a skill for kids that can recite any multiplication table when asked, or test one’s table skills by asking questions at random.
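Here is a minimal sketch of the core logic such a skill needs, independent of Alexa: one function builds the spoken text for a table, and two others pose and check a random question. The function names, the range of 1 to 10, and the wording are assumptions for illustration.

```python
import random


def table_speech(number, upto=10):
    """Build the sentence Alexa would read out for one multiplication table."""
    lines = [f"{number} times {i} is {number * i}" for i in range(1, upto + 1)]
    return ". ".join(lines) + "."


def make_question(rng=random, max_factor=10):
    """Pick two random factors; return the question text and the expected answer."""
    a = rng.randint(1, max_factor)
    b = rng.randint(1, max_factor)
    return f"What is {a} times {b}?", a * b


def check_answer(expected, spoken_value):
    """Compare the expected product with the value heard from the user."""
    try:
        # The spoken value is assumed to arrive as a string, or None if missing.
        correct = int(spoken_value) == expected
    except (TypeError, ValueError):
        return "Sorry, I did not catch a number."
    return "Correct!" if correct else f"Not quite. The answer is {expected}."


print(table_speech(7, upto=3))  # 7 times 1 is 7. 7 times 2 is 14. 7 times 3 is 21.
question, answer = make_question(random.Random(42))
print(question)
print(check_answer(answer, str(answer)))
```

Passing a seeded `random.Random` instance makes the quiz repeatable while testing; in the real skill the default module-level generator would do.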
Conclusion
This is an amazing technology!! It can be used in many domains to create a more engaging customer experience.