Solving Alexa’s Accent Understanding Challenge, using Support Vector Machines

Posted by Chirash Rupela on Feb 1, 2019 11:48:00 AM

Alexa is great, providing amazing features to control apps and services with just your voice. But its understanding of non-American accents leaves much to be desired. Case in point: using Alexa with my Indian accent brings out some serious problems. No matter how many times I say “sprint”, Alexa only understands it as “spend”.

This is terrifying for Alexa developers like me who want to use the NLP power of Alexa to build solutions that cater primarily to the Indian population. Amazon does let you develop an Alexa skill in ‘EN-IN’, but that does not solve the problem. This major flaw in transcribing Indian accents results in failures in the skill flow and broken conversations.

But should it be a roadblock for you to develop an Alexa skill?

No, because we found a way to solve this problem.

Devising a Solution

The solution is to use the ability to add synonyms for slot values (in custom slot types).

In any Alexa skill, you can add intents, and each intent has different slots. You can choose pre-defined AMAZON slot types for your slots, or you can create custom slot types. The difference is that a custom slot type allows you to add synonyms for its slot values.

Using an example from our Alexa skill -

If we added “spend” as a synonym for the “sprint” slot value, it would solve our problem. The next time Alexa hears “spend”, it would send the slot value as “sprint”, and that can be passed to the Lambda function, which gives back an appropriate response.

Quick aside: Our skill is now available for beta testing, so do try it out.
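As a rough illustration of what the Lambda function sees once a synonym matches: the slot in the request JSON carries both the word Alexa heard and the resolved slot value under `resolutions.resolutionsPerAuthority`. This is a minimal sketch, assuming a hand-written, trimmed-down slot payload; the slot name `choice`, the authority string, and the `id` are placeholders, not values from our skill.

```python
# Trimmed, hand-written example of one slot from an Alexa IntentRequest.
# The slot name "choice", the authority string, and the id are placeholders.
slot = {
    "name": "choice",
    "value": "spend",  # what Alexa heard
    "resolutions": {
        "resolutionsPerAuthority": [
            {
                "authority": "PLACEHOLDER_AUTHORITY",
                "status": {"code": "ER_SUCCESS_MATCH"},
                "values": [{"value": {"name": "sprint", "id": "SPRINT"}}],
            }
        ]
    },
}


def resolved_value(slot):
    """Return the canonical slot value if a synonym matched, else the raw value."""
    authorities = slot.get("resolutions", {}).get("resolutionsPerAuthority", [])
    for authority in authorities:
        if authority["status"]["code"] == "ER_SUCCESS_MATCH":
            return authority["values"][0]["value"]["name"]
    return slot.get("value")


print(resolved_value(slot))
```

When no synonym matches, the status code is `ER_SUCCESS_NO_MATCH` instead, and the function falls back to the raw word.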


This was the exact solution we were looking for.

Now we had the solution, and two ways to make it happen:

  • Manually add synonyms for each slot value based on user data and custom reviews.

  • Predict synonyms for each slot value and automatically add them once or twice a week.

    The manual additions are quite easy to do, but not a scalable option. Consider a case where you have more than 50 slot values and you want to add slot synonyms to each one or most of them. Doing it manually would be tedious.

    This is the reason we went with the Predictive approach and automated the addition of slot synonyms in our skill.

Implementing the Solution

To automate the prediction and addition of slot synonyms, we used the following AWS resources:

  • Lambda function

  • EC2 Instance

  • S3 bucket

  • Alexa developers account


Now that all the resources are ready, there are three main steps in the predictive approach:

       1. Capturing words like “spend” which are poorly transcribed by Alexa 

       2. Predicting the slot value the word “spend” belongs to. 

       3. Adding the word “spend” as a synonym to the predicted slot values.

I will explain steps 1 and 3 in a while, but let’s understand step 2 as it’s the most crucial step.

Prediction requires a machine learning algorithm. In our case, we have used Support Vector Machines (SVM) to predict the slot value. It’s one of the simplest yet quite accurate ML algorithms used for text classification.

SVM is a supervised ML algorithm which finds the line or hyperplane with the maximum distance from the support vectors, the training points that lie closest to it. Say you have two classes -

a. Words similar to “sprint”

b. Words similar to “release”

Using SVM, we can find the line which clearly separates these two classes based on the available training dataset. This line is at the maximum distance from the words on the outermost part of each cluster, the so-called support vectors.
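To make the idea of "maximum distance" concrete, here is a small sketch that does not train an SVM at all. It only compares two hand-picked separating lines on made-up 2-D points and measures the margin of each, which is the quantity an SVM maximizes. The points and both candidate lines are assumptions chosen for illustration.

```python
import math

# Toy 2-D points standing in for word features (made up for illustration).
sprint_like = [(1.0, 1.0), (2.0, 1.5), (1.5, 0.5)]
release_like = [(4.0, 4.0), (5.0, 3.5), (4.5, 5.0)]


def margin(w, b, positives, negatives):
    """Smallest signed distance from any point to the line w.x + b = 0.

    A negative result means the line misclassifies at least one point.
    """
    norm = math.hypot(w[0], w[1])
    distances = []
    for x in positives:
        distances.append((w[0] * x[0] + w[1] * x[1] + b) / norm)
    for x in negatives:
        distances.append(-(w[0] * x[0] + w[1] * x[1] + b) / norm)
    return min(distances)


# Two candidate lines; both classify every point correctly.
candidates = {
    "x + y = 5.75": ((1.0, 1.0), -5.75),
    "x = 3.9": ((1.0, 0.0), -3.9),
}

# Release-like points are the positive side (w.x + b > 0).
margins = {
    name: margin(w, b, release_like, sprint_like)
    for name, (w, b) in candidates.items()
}
best = max(margins, key=margins.get)
print(margins, best)
```

Both lines separate the classes, but the first keeps every point much further away, so it is the one an SVM would prefer out of these two. The points that sit exactly at the margin, here (2.0, 1.5) and (4.0, 4.0), play the role of support vectors.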


You can learn more about SVM here

The Architecture


Step 1

To capture poorly transcribed words such as “spend”, we use our Lambda function to read the request JSON from Alexa, store each word along with its slot name in a CSV file, and save that file in an S3 bucket.

import json
import boto3


def checkutterance(data):
   # Collect every slot whose spoken value matched no slot value or synonym.
   result = []
   for k, v in data.items():
       if "resolutions" in v:
           for i in v["resolutions"]["resolutionsPerAuthority"]:
               if i["status"]["code"] == "ER_SUCCESS_NO_MATCH":
                   result.append({"slot": v["name"], "utterance": v["value"]})
   # Read the existing file from S3 so the new entries can be appended to it.
   s3 = boto3.client('s3')
   response = s3.get_object(Bucket="BUCKET_NAME", Key="FILE_NAME")
   data = response['Body'].read().decode('utf-8')
   # Serialize each missed utterance as one JSON object per line.
   string = ""
   for j in result:
       string = string + json.dumps(j) + "\n"
   data = data + string
   encoded = data.encode('utf-8')
   # Write the combined content back to the same key.
   s3.put_object(Body=encoded, Bucket='BUCKET_NAME', Key='FILE_NAME')

Step 2 

Once the missed values are stored in an S3 bucket, we use our EC2 instance to read the file.

In our case, we have scheduled a cron job to do it every day.

The script deployed on the EC2 instance is responsible for training and predicting classes using SVM. The script reads the missed values from the file and predicts the class for each value. In our case, it predicts “spend” as a synonym for the slot value “sprint”.

Here, we have also set a threshold value for cases where a word matches both classes only weakly. Such values are stored in a separate CSV file and mailed to us so that we can add them to the Alexa skill manually if required.
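The thresholding step can be sketched roughly as follows. This is a minimal illustration, assuming the classifier yields a probability per class for each missed word (the pipeline is created with `probability=True`, which is what makes that possible in scikit-learn). The function name `route_predictions`, the example probabilities, and the 0.6 cutoff are all placeholders for illustration.

```python
def route_predictions(predictions, threshold=0.6):
    """Split predictions into auto-add and manual-review lists.

    `predictions` maps each missed word to a dict of class -> probability.
    The default threshold of 0.6 is an arbitrary placeholder.
    """
    auto_add, needs_review = [], []
    for word, probabilities in predictions.items():
        best_class = max(probabilities, key=probabilities.get)
        confidence = probabilities[best_class]
        if confidence >= threshold:
            auto_add.append({"utterance": word, "slot_value": best_class})
        else:
            needs_review.append(
                {"utterance": word, "best_guess": best_class, "confidence": confidence}
            )
    return auto_add, needs_review


# Made-up probabilities for two missed words.
example = {
    "spend": {"sprint": 0.91, "release": 0.09},
    "really": {"sprint": 0.48, "release": 0.52},
}
auto_add, needs_review = route_predictions(example)
print(auto_add)
print(needs_review)
```

Words in `auto_add` would go on to step 3, while `needs_review` is what ends up in the file that gets mailed out.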

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import boto3
from sklearn.pipeline import Pipeline
from sklearn import svm
from sklearn.utils import shuffle
from sklearn.svm import LinearSVC
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.feature_extraction.text import CountVectorizer

# Bag-of-words counts -> TF-IDF weights -> SVM classifier with probability estimates.
text_clf = Pipeline([('vect', CountVectorizer()),
                    ('tfidf', TfidfTransformer()),
                    ('clf-svm', svm.SVC(C=1, class_weight=None, coef0=0.0,
                                        decision_function_shape='ovr', degree=2, gamma='auto', kernel='rbf',
                                        max_iter=-1, probability=True,
                                        tol=0.001, verbose=False)),
                    ])

Step 3

Once the slot value is predicted for each word, we use the Alexa CLI to add the word as a synonym for the respective slot value in the Interaction Model JSON of our Alexa skill.

import json
import os

# `sprint` and `release` are assumed to be the lists of {"slot", "utterance"}
# entries that step 2 predicted for each slot value.

# Download the current interaction model of the skill.
os.system('ask api get-model -s ALEXA_SKILL_ID -l en-IN > alexamodel.json')
with open('alexamodel.json', 'r') as f:
   data_alexa = json.load(f)

for i in data_alexa["interactionModel"]["languageModel"]["types"]:
   if i["name"] == "choose":
       for j in i["values"]:
           if j["name"]["value"] == "sprint":
               synonyms = j["name"].get("synonyms", [])
               # Add each predicted word that is not already a synonym.
               for s in sprint:
                   if s["utterance"] not in synonyms:
                       synonyms.append(s["utterance"])
               print("new list of synonyms ", synonyms)
               j["name"]["synonyms"] = synonyms
           if j["name"]["value"] == "release":
               synonyms = j["name"].get("synonyms", [])
               for r in release:
                   if r["utterance"] not in synonyms:
                       synonyms.append(r["utterance"])
               print("new list of synonyms ", synonyms)
               j["name"]["synonyms"] = synonyms

# Save the updated model and push it back to the skill.
with open('alexa.json', 'w') as fp:
   json.dump(data_alexa, fp, ensure_ascii=False)
os.system("ask api update-model -s ALEXA_SKILL_ID -f alexa.json -l en-IN")

The Alexa skill is then rebuilt with the updated model, which automates the process of updating synonyms in our Alexa skill.


With this, the problem of transcribing Indian accents in an Alexa skill has been solved to some extent. We are continuously updating our training dataset to improve the accuracy of our model.

If you have any suggestions on how to improve an Alexa skill for this particular problem, do let us know in the comments section below. 

Topics: AWS, Machine Learning & AI, Architecture

5 key technology pre-requisites to deliver advanced personalization

Posted by Gaurav Mishra on Jan 15, 2019 4:34:00 PM

According to a survey by Evergage and Researchscape International, 96% of the marketers surveyed agreed that personalization helps advance customer relations. However, 55% also feel that businesses are currently not getting personalization right.

What's getting in the way?

Challenges with overall technology and data handling are two of the top five challenges enterprises face, when it comes to delivering effective personalization, as per a 2017 study by Sailthru.

So we decided to take a look at five key technological aspects to consider, when it comes to delivering advanced personalization.

Data Structuring

Effective personalization depends on the right data, and structuring it in a manner that can serve personalization requests.

The existing content on your site, or on any of your digital channels, has to be broken down and saved at a granular level. Each concise content snippet/granule should correspond to a particular personalization parameter. This is primarily because personalization for each individual is a unique combination of different parameters, and without granular content you won’t be able to serve the right mix of content that they want.

Data Storage and Retrieval

With a huge amount of data to be parsed and processed, how fast can you deliver a personalized experience to the user? You might have created the most carefully curated experience for your customers, but it never reaches them if your systems, especially the frontend rendering of the data, are not fast enough.

Given these challenges, businesses need to build technology competencies around enterprise search engines like Elasticsearch and Solr. These will be critical in terms of making high-performance digital applications with quick response times. These search engines work well with large volumes of text and can quickly pull all necessary data required for personalization from the server side, while keeping the client side extremely lightweight.

Setting Up Personalization

The core tenet for personalization is “giving your user their next step, based on their previous steps”. While that sounds simple enough, how a personalization workflow is created could range from simple and straightforward, to quite complex. Essentially, there are two ways to set up a personalized experience for any user:

Rules based

This is a simple “if this - then that” logic applied to user actions in order to create personalization workflows. The rules are explicitly stated, and are simple for your systems to execute. Rules-based personalization is an ideal first step for businesses. It can be based on major user-behavior parameters like location, age, previous pages visited on the site etc.

However, the challenge with this approach is that you have to manually cover every single personalization scenario possible. When you are just starting off with personalization, and probably have access to only a limited number of data points, mapping out all personalization opportunities is not a huge task. But as your user base scales, and the amount of data increases, rules-based personalization will begin to fall behind your requirements.
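The “if this - then that” logic can be sketched in a few lines. This is only an illustration: the profile fields, the rules, and the content names are invented, and a real setup would load them from wherever the business maintains its rules.

```python
# Each rule pairs a condition on the user profile with the content to serve.
# Rule contents and profile fields are invented for illustration.
RULES = [
    (lambda user: user.get("location") == "India", "regional-offers-banner"),
    (lambda user: "pricing" in user.get("pages_visited", []), "pricing-faq-widget"),
    (lambda user: user.get("age", 0) < 25, "student-discount-card"),
]
DEFAULT_CONTENT = "generic-homepage-hero"


def personalize(user):
    """Return the content for the first matching rule, else a default."""
    for condition, content in RULES:
        if condition(user):
            return content
    return DEFAULT_CONTENT


print(personalize({"location": "India", "age": 30}))
print(personalize({"location": "France", "pages_visited": ["pricing"], "age": 40}))
print(personalize({"location": "France", "age": 40}))
```

The scaling problem shows up directly in `RULES`: every new scenario means another hand-written entry, and the order of the entries starts to matter.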

AI Based

This is where complex machine learning (ML) models come into play, parsing through volumes of collected heuristic data to find relevant user patterns and connections. Text and image classification, neural networks, and natural language processing combine to create highly contextual and personalized experiences. These models continuously learn, and hence can modify the personalization as user behaviors change.

However, the hurdle that most enterprises face with AI-based personalization is the lack of adequate and accurate training data for the machine learning models.


Adding the right tags against users and content pieces is a way to make them easily identifiable, and hence incorporated into the correct personalization workflow.

Content Tagging

This is one of the simplest, and also most comprehensive, methods to prepare your existing content for personalization. Each piece of content, when tagged correctly with all the relevant parameters, gets served across the right personalization workflows. The objective with tagging is to let your systems know which content pieces can be served for which particular personalization parameters.

Progressive Tagging

Businesses should also progressively tag users to create a comprehensive profile. Every time a user visits your site and performs certain actions, it gives you an opportunity to gain more insights into their behavior. So elements like the content filters they apply, the number of times they visit a certain page, and the amount of time they spend on a particular section of a page could all lead to effective tags for their profile.
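A rough sketch of how the two kinds of tagging can work together: user actions add to a running count of tags, tags seen often enough become part of the profile, and content is picked by tag overlap. The event shape, the `min_count` cutoff, and the sample content are assumptions made up for this example.

```python
from collections import Counter


def record_event(tag_counts, event):
    """Add the tags attached to one user action to the running counts."""
    tag_counts.update(event["tags"])
    return tag_counts


def profile_tags(tag_counts, min_count=2):
    """Tags seen at least `min_count` times are treated as part of the profile."""
    return {tag for tag, count in tag_counts.items() if count >= min_count}


def matching_content(content_items, tags):
    """Return titles of content whose tags overlap the profile, best match first."""
    scored = [(len(tags & set(item["tags"])), item["title"]) for item in content_items]
    ranked = sorted(scored, key=lambda pair: -pair[0])
    return [title for score, title in ranked if score > 0]


# Made-up user actions, each carrying the tags of the content involved.
events = [
    {"action": "filter", "tags": ["drupal"]},
    {"action": "page_view", "tags": ["drupal", "aws"]},
    {"action": "page_view", "tags": ["aws", "chatbots"]},
]
counts = Counter()
for event in events:
    record_event(counts, event)

content = [
    {"title": "Chatbot basics", "tags": ["chatbots"]},
    {"title": "Drupal on AWS", "tags": ["drupal", "aws"]},
    {"title": "AWS Lambda tips", "tags": ["aws"]},
]
tags = profile_tags(counts)
recommended = matching_content(content, tags)
print(tags, recommended)
```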

Conversational Interfaces

Chatbots are one of the most widely adopted methods to deliver personalized experiences today. But a key question to answer before diving into development is the kind of chatbot you wish to create. The choice is between restrictive and language-based chatbots.

Restrictive Chatbots

A simple conversation interface is where you type in a query and receive an answer but the interaction is restrictive in terms of what you can ask the bot. In this case, complete interaction trees are defined for the chatbot, and the user cannot deviate from that. These chatbots get the job done efficiently, and are focused on task completion.
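An interaction tree of this kind can be sketched as a plain dictionary, where each node has a prompt and a fixed set of accepted replies. The node names and prompts below are invented for illustration.

```python
# Every node lists the only replies the bot accepts at that point.
# Node names and prompts are invented for illustration.
TREE = {
    "start": {
        "prompt": "Do you need help with an order or a product?",
        "options": {"order": "order", "product": "product"},
    },
    "order": {
        "prompt": "Do you want to track or cancel the order?",
        "options": {"track": "track_done", "cancel": "cancel_done"},
    },
    "product": {"prompt": "Please visit the product catalogue.", "options": {}},
    "track_done": {"prompt": "Your tracking link has been sent.", "options": {}},
    "cancel_done": {"prompt": "Your order has been cancelled.", "options": {}},
}


def next_node(current, user_input):
    """Move to the child matching the input; stay put if the input is off-script."""
    options = TREE[current]["options"]
    return options.get(user_input.strip().lower(), current)


node = next_node("start", "Order")
print(TREE[node]["prompt"])
node = next_node(node, "track")
print(TREE[node]["prompt"])
```

Anything outside the listed options leaves the user on the same node, which is exactly the restriction described above.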

Language-based Chatbots

These interfaces are powered by Natural Language Processing, and hence can handle a wide range of conversational diversity. The responses are determined by the bots’ understanding of the users’ question and context, rather than a pre-defined interaction tree. So you don’t have to manually define every conversation scenario for the chatbot to effectively assist your customers.

With the amount of data enterprises collect today, they can power tremendous growth, but only if they hone their ability to deliver valuable personalized experiences. And while I have illustrated a few points here on how enterprises can approach this, our recent ebook takes a more detailed look at how exactly these five technology aspects work to enable effective personalization.

How about you take a look?

Personalization pre-requisites - ebook

Topics: Personalization, Machine Learning & AI

How are enterprises leveraging Chatbots for their business

Posted by Gaurav Mishra on Oct 25, 2018 2:19:00 PM

Chatbots are set to dominate enterprise-customer communications. But a less flaunted, equally important fact is that chatbots are also being leveraged to streamline several intra-enterprise processes.

Srijan is currently enabling clients from different industries to deploy chatbots for strategic business use cases. Here’s a look at the multiple ways in which these bots are aiding enterprise operations:

For Estée Lauder

Estée Lauder Company (ELC), with close to 30 brands, is revamping its enterprise learning system. Srijan is working to develop a digital learning ecosystem that would give the brand’s beauty advisors and over-the-counter sales teams anytime, anywhere access to extensive learning resources. Part of this will be delivered via targeted micro-learning videos and training modules powered by xAPIs.

The other key part of the ecosystem will be simplified access to ELC’s entire product information documentation, via chatbots.

ELC has an extensive database including information on different kinds of beauty products, their USP, and their suitability according to skin tone or type, ingredients, allergies and more. But memorizing this huge amount of information and answering accurately when dealing with a customer is almost impossible for the sales people. And having to search for this on a site or manual is not a great experience for the customer.

All the information is stored on a decoupled Drupal database, and the chatbot is designed to pull the necessary information from this repository to accurately answer questions. So the next time a customer enquires about whether a beauty cream is suitable for oily skin type, the salesperson could simply ask the chatbot, “What are the suitable skin types for using Product A?” and get the answer within seconds.

While this undoubtedly solves the challenges of the counter staff, it has the added benefit of enhancing the experience of customers, who are now able to get their queries handled accurately and quickly.

For a Global Cleaning Solutions Company

As a leading provider of intelligent cleaning solutions - both chemicals and cleaning equipment - the company wanted to be able to analyze and optimize the performance of their products. They leveraged IoT sensors to collect performance data from their equipment installed at various client sites. From the soap level in the dispensers, to equipment temperature, to resource consumption, the sensors tracked all sorts of data that could be viewed on interactive dashboards, and used to drive operational efficiency for their clients.

However, these dashboards had a few drawbacks:

  • They required a login and were not always remotely accessible
  • Clients took time to find the right data

And that is when chatbots came into the picture.

Make data accessible

To get around these challenges, Srijan built a chatbot that could make the data collected on the dashboard easily accessible to client stakeholders.

The chatbot worked on an “asked-and-answered” approach where the client simply had to ask a query and the bot would analyze all necessary data to give a clear answer.

For example, by simply asking the chatbot, “Which machine is underutilized in France?”, the client exec could get an idea of which equipment, in which region and which factory, is underutilized, and why.

Other similar questions around equipment performance, profits, resource usage etc can be answered by the bot in real time. Because the bot is easy to use and does not require people to look at and analyze a lot of data, it has seen increased adoption by the company’s clients.

Besides real-time reporting, Srijan teams also created chatbot PoCs for two other use cases for the company:

Automate internal processes

Enterprise operations such as ticketing, travel resourcing, customer care, customer onboarding etc. could all be streamlined to save the company’s time and resources.

For example:

  • The chatbots could access all company documentation to quickly answer employee queries on salary, leave policy and other relevant information.
  • An employee intending to book a flight could simply use the chatbot to access the travel interface, where he could provide details of his destination and date of journey and plan his travel requirements. He could also send automated leave applications to his department head, and inform others in his department of his absence, all via the chatbot.

Real-time assistance for field teams

Field teams out for equipment repair and servicing could use chatbots to quickly access necessary information like technical specification and manuals. Any challenges they face during repairs could be directly addressed to the chatbots, and correct answers received. Also, with the help of AWS DeepLens a field team member can directly communicate with an offsite expert when stuck, as well as run machine learning models to allow the chatbot to master the process.

Srijan worked with Python as a backend, PostgreSQL database, and AWS solutions like Lambda, S3, and Lex to build these chatbots.

Building Branded Alexa Skills

For an airline brand

Srijan recently built a PoC for a prospect in the aviation industry: a branded Alexa skill to interact with their customers. The skill can be activated when customers make a named invocation similar to “Uber, book me a cab”. It is designed to address customer queries like the status of a flight, flight details, and the latest deals and offers being provided by the company.

JIRA Assist

Srijan is also currently beta-testing JIRA Assist, an Alexa skill built to interface with JIRA and keep you updated on the status of your JIRA board, without actually having to open the board. This serves as a simple project management tool where people can ask for the status of their JIRA board, track stories and deliverables, create tasks and sub-tasks and more.

Srijan teams are developing chatbots for diverse use cases across industries. As a Standard Consulting Partner in the Amazon Web Services Partner Network, Srijan has machine learning expertise and certified AWS professionals who can help you build Alexa skills specific to your business area, as well as high-performance conversational interfaces.

So if you are looking to build chatbots for specific business requirements, let’s get the discussion started on how Srijan can help.

Topics: Machine Learning & AI, MarTech

Showcasing AI-Based Learning Systems at DemoFest, DevLearn 2018

Posted by Nilanjana on Oct 24, 2018 1:43:00 PM

DevLearn 2018 is all set to happen from 24-26 October, in Las Vegas as always. And one of the best parts of the event is the DemoFest. It’s an exciting showcase of the range of eLearning solutions developed by the participants, and a chance to witness first-hand the innovations sweeping the industry.

This year, Srijan will be a part of the DemoFest at DevLearn. It’s our chance to showcase the ambitious digital learning ecosystem we are developing for Estee Lauder, and everything that it’s capable of.

The Srijan team - Shashank Merothiya and Arunima Shekhar - will be talking about the AI-based Learning and Assessment Platform being built for Estee Lauder.

This learning platform takes apart their existing one-size-fits-all learning system, and creates an intelligent solution that can help personalize the enterprise learning process. AI-based evaluation helps map out what each employee has learnt, and what they are yet to learn, based on a host of cues:

  • Length of the training content consumed
  • Number of post-training questions answered
  • How much time was taken to answer them, and more.

All of this analysis presents a detailed picture of employee learning, and helps Estee Lauder ensure that the right information and training is pushed to the people, based on exactly what they need to learn.

If all of this sounds interesting, do join the session at DemoFest, on 25 Oct, 4-6 PM.

To explore Srijan's work with enterprise learning systems, you could also check out:

Enterprise Learning Management Systems - The Possibilities

xAPI: Towards ROI for Enterprise Learning & Development Programs

Topics: Machine Learning & AI

AWS Lex - Cognito Auth and Translation

Posted by Sanjay Rohila on Oct 18, 2018 11:33:00 AM

Lex is a powerful deep learning service by AWS, offering automatic speech recognition and natural language understanding features. Lex must be used with some frontend interface - mobile app, web app etc.

AWS provides SDKs to talk to Lex which can be used in applications. We can have a login form in the application and authenticate with Cognito (via SDK or API) and then get back an accessToken, which can be added to sessionAttributes in Lex calls. Lex will pass on sessionAttributes to Lambda in the request.

Authentication: We can use the accessToken to authenticate now. I have created a small Cognito helper class. Feel free to use it and tweak it to your requirements. 

Authorization: We can also decide authorization with this. Simply fetch the user group from Cognito and control access to content.

Translation: We have the user info which also has locale information. We can use this to translate content to user-specific language. I am going to stick with AWS Translate service here, but feel free to use any translation because AWS has limited language support. We can create one file under src/aws_api/.  

While this'll work fine, it can be made more efficient by adding locale and user groups to sessionAttributes while sending the response back to Lex. This way Lambda doesn't have to make a request to Cognito for this info, as we can directly get it from sessionAttributes within the same session.
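The caching idea can be sketched like this. It is a minimal illustration, assuming the Lex (V1) Lambda event shape where `sessionAttributes` is a flat map of strings; `fetch_user` is a hypothetical helper that would look the user up in Cognito from the access token, replaced here by a fake so the flow can be followed without AWS.

```python
def user_context(event, fetch_user):
    """Return (locale, groups, session_attributes) for a Lex request.

    `fetch_user` is a hypothetical helper: it takes the access token and
    returns {"locale": ..., "groups": [...]}, e.g. by querying Cognito.
    It is only called when the session does not already hold the values.
    """
    session = dict(event.get("sessionAttributes") or {})
    if "locale" not in session or "groups" not in session:
        user = fetch_user(session["accessToken"])
        session["locale"] = user["locale"]
        # sessionAttributes values are strings, so the group list is joined.
        session["groups"] = ",".join(user["groups"])
    groups = [g for g in session["groups"].split(",") if g]
    return session["locale"], groups, session


# Fake lookup that records how often it is called (stands in for Cognito).
lookups = []


def fake_fetch_user(token):
    lookups.append(token)
    return {"locale": "hi", "groups": ["admin", "editor"]}


first_event = {"sessionAttributes": {"accessToken": "TOKEN"}}
locale, groups, session = user_context(first_event, fake_fetch_user)

# The next request in the same session carries the cached values back.
second_event = {"sessionAttributes": session}
locale2, groups2, _ = user_context(second_event, fake_fetch_user)
print(locale2, groups2, len(lookups))
```

The returned `session` dict is what would go into the `sessionAttributes` field of the Lambda response, so that Lex sends it back on the next turn.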

Lex communication via API: Lex doesn't understand any language other than English. So our application needs to translate user input into English then send to Lex. To overcome this and have more control over what goes to Lex as input, instead of using SDK we can have API gateway between application and Lex. This intermediate API (Lambda behind API) can do translation and other manipulation before sending to Lex.
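The flow inside that intermediate Lambda might look roughly like the sketch below. The translation and Lex calls are passed in as plain functions (`translate` and `send_to_lex` are hypothetical names), so the example runs without AWS credentials; in a deployment they would wrap the translation service and the Lex runtime call. The phrase table and the fake bot reply are made up.

```python
def handle_message(text, user_locale, translate, send_to_lex):
    """Translate user input to English, call Lex, translate the reply back.

    `translate(text, source, target)` and `send_to_lex(text)` are injected
    placeholders for the real translation and Lex runtime calls.
    """
    english = text if user_locale == "en" else translate(text, user_locale, "en")
    reply = send_to_lex(english)
    return reply if user_locale == "en" else translate(reply, "en", user_locale)


# Stand-in translator with a tiny hand-written phrase table (illustration only).
PHRASES = {
    ("hi", "en", "मुझे मदद चाहिए"): "I need help",
    ("en", "hi", "How can I help you?"): "मैं आपकी कैसे मदद कर सकता हूँ?",
}


def fake_translate(text, source, target):
    return PHRASES[(source, target, text)]


def fake_lex(text):
    # Stand-in bot that only knows one English phrase.
    return "How can I help you?" if text == "I need help" else "Sorry, I did not get that."


hindi_reply = handle_message("मुझे मदद चाहिए", "hi", fake_translate, fake_lex)
english_reply = handle_message("I need help", "en", fake_translate, fake_lex)
print(hindi_reply)
print(english_reply)
```

Keeping the translation step in this layer is also where other manipulation of the input, such as cleaning or logging, could be added before anything reaches Lex.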

So those were some of the basics of managing authentication and translation on Amazon Lex.

Topics: AWS, Machine Learning & AI, Architecture

Amazon Lex - Handling Complex Workflows and Dynamic Slots

Posted by Sanjay Rohila on Oct 5, 2018 11:42:00 AM

Amazon Lex is a chatbot framework which allows us to create conversational bots over voice and text. There is a lot of documentation around for AWS Lex and intent fulfillment. But there is not much if we have to change the flow of the bot based on previous slot values, or create complex workflows in Lex. So in this post, we are going to go through some of the things I have done in Lex while creating workflows:

Initialization and Validation Code Hook

Amazon Lex provides a code hook configuration which allows us to run a Lambda function on every user input (basically every slot). We can add a Lambda function and return the dialogState from there. We can also add our own custom message with updated options for the user, based on business logic or previous slot values. dialogState is very important: it decides whether we elicit a particular slot next, delegate to Lex, or fulfil the intent and provide the slot values ourselves, all based on the use case. We get the intent and current slot values from Lex in the request, and then we can decide what we should ask next. On the next request, we repeat the process with the updated slot values. The code example is given below:
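A minimal sketch of such a code hook, assuming the Lex (V1) Lambda event and response format, where the next step is expressed through the `dialogAction` field of the response. The slot names `TripType` and `ReturnDate`, the intent name, and the business rule are hypothetical and only serve to show the branching.

```python
def elicit_slot(event, slot_name, message):
    """Ask Lex to prompt the user for one specific slot with a custom message."""
    intent = event["currentIntent"]
    return {
        "sessionAttributes": event.get("sessionAttributes") or {},
        "dialogAction": {
            "type": "ElicitSlot",
            "intentName": intent["name"],
            "slots": intent["slots"],
            "slotToElicit": slot_name,
            "message": {"contentType": "PlainText", "content": message},
        },
    }


def delegate(event):
    """Hand control back to Lex so it picks the next step from the bot config."""
    return {
        "sessionAttributes": event.get("sessionAttributes") or {},
        "dialogAction": {"type": "Delegate", "slots": event["currentIntent"]["slots"]},
    }


def lambda_handler(event, context):
    slots = event["currentIntent"]["slots"]
    # Hypothetical business rule: only ask for a return date on round trips.
    if slots.get("TripType") == "round" and not slots.get("ReturnDate"):
        return elicit_slot(event, "ReturnDate", "When would you like to return?")
    return delegate(event)


# Hand-written sample events (trimmed to the fields used above).
round_trip = {
    "currentIntent": {"name": "BookFlight", "slots": {"TripType": "round", "ReturnDate": None}},
    "sessionAttributes": None,
}
one_way = {
    "currentIntent": {"name": "BookFlight", "slots": {"TripType": "oneway", "ReturnDate": None}},
    "sessionAttributes": {},
}
round_trip_response = lambda_handler(round_trip, None)
one_way_response = lambda_handler(one_way, None)
print(round_trip_response["dialogAction"]["type"], one_way_response["dialogAction"]["type"])
```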

Core Controller

To make it more robust and scalable, the following is the structure I use (which bit of the above code goes in which file is left for you to figure out):

So each of the intent files can have a JSON structure with slots and other configs, and then some interface functions (below). All the controller has to do is play with the business logic and call these functions for data whenever required. This data/content can be stored in a workflow management tool or a database.

If this makes sense, let's jump to our next post, where we are going to talk about using AWS Cognito for Auth and Translation.

Topics: AWS, Machine Learning & AI, Architecture

Sriram Sitaraman is among speakers at Nasscom Annual Technology Conference 2018

Posted by Kimi Mahajan on Sep 24, 2018 11:46:00 AM

Technology has seen transformations since the late 1600s, starting with mechanically powered steam engines, followed by the assembly line transformation brought in by Henry Ford and Edison, and then the world of microprocessors. Now we are in Technology 4.0, the most profound one, and Nasscom’s 5th Annual Technology Conference (NATC 2018) is all about it. NATC 2018 is being held in Gurugram on 26-27th September.

The conference will focus on progressive topics including Cyber Physical systems, IoT, and all forms of computing including Cloud, Fog, Edge, Mist and Cognitive Computing. Every year, NATC brings together speakers from all over the country to share their knowledge and experiences with upcoming talent looking to drive transformation for themselves and for their organizations.

Sriram Sitaraman, Practice Head - Analytics and Data Science at Srijan, has been invited to speak at NATC 2018. Sriram will share his extensive experience on advanced NLG and how it is gaining traction and relatively more adoption in recent years. Advanced NLG holds the maximum business value as it produces meaningful interpretation of structured and unstructured data and distinguishes its most important and interesting part.

The most well-known subsets of Artificial Intelligence (AI) are Robotic Process Automation (RPA), which automates repetitive tasks by giving systems the ability to learn and improve work processes, and natural language processing (NLP), which enables machines to understand humans’ way of writing or talking. A lesser-known subset is Natural Language Generation (NLG). Ever wondered how a virtual assistant responds when asked a question? The answer is NLG. It’s a process that can generate natural language text and speech from predefined data.

Through this session, Sriram plans to demystify NLG by putting it in the context of NLP, Computational Linguistics and the basic semantic architecture. He will explain a few use cases across contextual narratives, summaries and long text generation to help you dive deeper into implementing Artificial Neural Networks for NLG. The high level topics he would cover in his session are:

  • Advanced NLG: Deriving facts from data to interpret what is most important and interesting
  • Template Driven NLG: Fitting data into existing templates
  • Basic NLG: Transforming data to text



Speaker: Sriram Sitaraman

Schedule: 12:00pm, 27th Sep 2018

Session Abstract: Click here

Registrations: NATC website

Other speakers at the conference include Cynthia Stoddard, Global CIO at Adobe, Deep Kalra, founder and CEO of MakeMyTrip and Shailesh Kumar, Chief Data Scientist at Reliance Jio, to name a few.

About Sriram Sitaraman

Sriram has over 20 years of experience in designing and delivering innovative business solutions. He leverages his expertise in machine learning, statistical modelling, and business intelligence to enable digital transformation in industries as diverse as healthcare, manufacturing, retail, banking and more. Sriram is Practice Head for Analytics and Data Science at Srijan Technologies. He is currently working on a couple of projects for Fortune 500 companies where he is playing an instrumental role in AI & ML adoption on a large scale.

About Nasscom

The National Association of Software and Services Companies (NASSCOM) is a non-profit trade association of the Indian Information Technology and Business Process Outsourcing industry. NASSCOM has over 1500 IT services companies as members, of which over 250 are companies from the United States, UK, EU, Japan and China, and are in the business of software development, software services, software products, IT-enabled services and e-commerce. The NATC is a conference for technologists and technocrats, and its aim is to provide a common platform for top decision makers to discuss current trends and encourage innovation.

Topics: Community, Machine Learning & AI, Event

The Basics of Voice Search Optimization

Posted by Gaurav Mishra on Sep 21, 2018 4:24:00 PM

Search is changing, and so is the way consumers choose to engage with businesses local or global. There is a distinct move away from screens and keyboards, and into voice-based interactions. Voice-search is becoming a fast growing habit across consumer segments, and fundamentally transforming how people and businesses transact on the internet.

Voice Search is Picking Pace

Consider this:

All this points to the fact that the future of search is voice-first. This is primarily owing to the fact that voice search is user-friendly.

Which is the easier of these two options?

  • Unlocking your phone, opening a browser, typing a search query, scrolling through the options, and selecting the best one, OR
  • Just saying “OK Google” or “Alexa”, followed by a query, and getting your answer


In an era of short attention spans and multi-tasking lifestyles, the easier and faster option is always the better option, whether it’s pizza or search results.

Voice assistants fetch you one answer, the best answer, to your query. And you can get your answers without distracting yourself from anything else that you are doing simultaneously.



Basically, voice search brings in a degree of convenience that customers are rapidly growing accustomed to.  

What Does That Mean for Businesses?

Given the increasing popularity of voice-search, it is fast becoming an important channel for prospective customers to find and interact with your brand. In the absence of a visual interface, or a list of options, customers increasingly rely on the voice assistants’ first answer.

This interaction design severely limits your chances of getting in front of your customers. Businesses should expect a steep fall in site traffic, brand awareness and engagement metrics if they are not prepared to serve voice-based search queries.

So, how do you serve voice search?

The answer is to get your enterprise content ready to respond to voice-based queries. While that might seem like an obvious next step, it does require significant planning and forward thinking. Also, it’s not as simple as feeding your existing content on web pages and blogs and ebooks into a voicebot database.

Let’s take a look at the key considerations to get your content ready for a voice-first world:

Voice-based content needs to be created differently

The way we speak and understand verbal communication is markedly different from the way we understand written information. So the content you create to serve voice search has to conform to certain standards:

Concise: People have lower attention span for verbal information, especially in the absence of a face to anchor the conversation. While responding to voice-based queries, your content/ information snippet has to be one short answer, to ensure attention and retention.

Clarity: While voice-based answers have to be short, they also have to give enough information to make the answer clear and understandable.

Provide Context: Since voice-based content is only heard and not seen, all contextual interaction clues are missing. The context has to be communicated verbally, for the user to know the next steps, or what to expect.

For example: On a webpage for industrial cleaning machinery, you can have a hyperlink that says “find more information here”. The customer can see it and knows what to expect when she clicks the link. She has visual context.

However, when the information is voice-based, simply translating the hyperlink text to voice will not suffice. Your content will have to be modified into a question that says, “Would you like more information on industrial cleaning machinery?”, to which the customer can respond by clicking or typing yes or no.

Voice-based content has to be optimized differently

Voice-search optimization is heavily focused on usability and creating an intuitive search experience. Some of the key tenets for voice optimization would be:

Leveraging long-tail keywords: Because voice-search is easier, the search queries are more conversational and longer than typed queries. 

For instance, a user might type “Buy Ferrero Rochers” on Google, but if he’s searching through a voicebot, it’s more likely that he’ll say “Hey Google, I’d like to buy some Ferrero Rochers”.

Thus, long tail keywords, that perfectly align with user intent, should be targeted while preparing voice-based content.

Featured snippets: Appearing at the top of the search pages, even before rank one results, these snippets are very often used as the answer for voice searches. With only a few tweaks, it is possible to create segments in your content that can be picked by Google as the featured snippet, thus increasing your chances of being the answer to a voice-based query.

Answer questions: Voice-searches are often articulated as specific questions. So creating short content pieces that answer expected consumer questions is the best way to leverage voice-search. Even after offering an answer, voice assistants should prompt relevant expected questions, to lead users to other results.

Use microdata markup: Markups like and Dublin Core can be used to mark-up your HTML code, to structure your metadata. Just like it works for text-based search, this markup can also optimize portions of your content to be picked to serve voice-based queries.
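To make the markup idea a little more concrete, here is a minimal Python sketch that renders Dublin Core metadata as HTML `<meta>` tags for a page's `<head>`. It assumes the common `DC.`-prefixed naming convention for Dublin Core in HTML; the helper name `dublin_core_meta` and the sample values are hypothetical, and whether a given voice assistant actually reads these tags is not something this post confirms.

```python
from html import escape


def dublin_core_meta(fields):
    """Render Dublin Core elements (title, description, ...) as <meta> tags.

    `fields` maps a Dublin Core element name to its value. The helper name
    and the sample values below are illustrative assumptions.
    """
    # Declare the Dublin Core element set that the "DC." prefix refers to.
    tags = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">']
    for name, value in fields.items():
        # Escape both parts so quotes or angle brackets cannot break the HTML.
        tags.append(
            f'<meta name="DC.{escape(name, quote=True)}" '
            f'content="{escape(value, quote=True)}">'
        )
    return "\n".join(tags)


if __name__ == "__main__":
    print(dublin_core_meta({
        "title": "Industrial cleaning machinery FAQ",
        "description": "Short answers to common questions about our machines.",
        "language": "en",
    }))
```

The output can be pasted into the `<head>` of the page whose content you want to describe.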

Voice-search is easy, convenient, and reliable - everything that customers want when they are searching for something. They are fast moving to this mode of interaction and businesses have to keep up, if they wish to stay relevant. So whether it’s optimizing your content to serve voice-search, or creating voicebots to make your content more accessible, you need a voice strategy in place.

You could also check out:

Webinar: Voice-First World - Are You Ready For the Bots?

Blog: The Alexa Skills Race: Is Your Enterprise Keeping Up?

Blog: Why Voice Technology is the Next Big Investment for Media Enterprises

Srijan is helping enterprises explore the new opportunities in a voice-first world, and capitalize on them with the right technology solutions. Book a call with our expert team, and let’s see how your enterprise can be prepared to leverage voice-search.

Topics: Machine Learning & AI

Owning Amazon Alexa skills: Is your Enterprise keeping up?

Posted by Gaurav Mishra on Aug 20, 2018 2:23:00 PM

By 2020, Amazon’s Alexa is expected to become a $10 billion industry. A current leader among the voice assistant platforms, Alexa allows third party developers to build new Amazon Alexa skills using its extended API. Many media enterprises like the BBC, ESPN, and the Daily Show have already introduced their own ‘skills’ for the Alexa platform, aimed at providing unique content to their customers.

What are Alexa Skills?

Alexa Skills are like voice-powered equivalents of the applications used on mobile phones or PCs. Once installed on devices, they can be used by Alexa to provide immediate and intimate responses to customers, by understanding and building upon their intent and needs.

While branded skills like, “Alexa, what’s the Forbes quote of the day?” are specific only to a brand, it is the generic skills that are more challenging to capture. For example, if someone says, “Alexa, give me the latest news headlines”, there are millions of websites that provide the information but Alexa will read out only from the first search result, or curate it from a select bunch of sites.

However, if you are a media company that captures this generic skill, you become the sole owner of it. This means that next time someone asks “Alexa, give me the latest news headlines”, Alexa finds your skill to be the most suitable to answer this query. So your brand is the one that’s providing the answer, increasing your engagement with the audience. And even when people who are not currently part of your audience use this generic command for Alexa, they are exposed to your branded content, allowing you to rapidly expand your reach.

This single-owner model has created a rush among enterprises to capture the skills before they are gone. The race, as they say, is on.

Why Alexa Skills

Andrew Ng of Baidu estimates that 50% of all searches will be completed either via speech or image search by 2020. Besides, the Alexa Skills marketplace has also surpassed 30,000 skills in the US in just two years.

If enterprises do not see these figures as an opportunity now, they will face it as a threat in the near future as competitors gain the first-mover advantage, and capture critical generic skills. With enterprises like Uber, CNN, Starbucks and Techcrunch already in the picture, it only makes sense for your business to tap into the market before it is too late.

Besides, Alexa skills can help your enterprise achieve the following:

Increased engagement with customers

The use of Alexa skills will help in creating a unique content experience for the customers, where they will be able to gain all the required information in one place and complete everyday tasks like adding things to their shopping list or booking a ride. For example, the Uber Skill can help them order an Uber by simply downloading the skill on their device and saying, “Alexa, ask Uber to request a ride”.

Similarly, they can use other travel skills for their commute, Domino’s skill to order food and Audible’s skill to read books. These types of easy-to-use skills make your brand very accessible to customers, lead to a significant increase in engagement, and attract new customers.

Catering to new content consumption needs

With the help of new innovations in voice tech, brands can help users by providing them a skill that could organize their lives better.

For example, a fashion brand can come up with a skill that can take customer requests about the kind of clothes they require for an occasion, and suggest relevant outfit options from the brand’s collection. And these services can be generic skills, activated with simple commands like, “Alexa, find me the right outfit.”

How to approach Alexa Skills

There are currently over 25,000 skills on Amazon Alexa in the US and with companies in the race to launch their skills, it can be quite difficult to stand out. Enterprises can think about their Alexa Skills strategy in a few different ways:

Use the skills that already exist

You can leverage existing generic Alexa Skills that syndicate content to answer specific queries. For example, the Alexa Flash Briefing skill curates headlines from various publications. If you are a media company, you can get your content listed for this particular skill.

While this strategy is a good starting point, in terms of getting your brand before the right audience, it has a drawback. Getting listed on a skill like this does not allow you to access any data on user interaction with your particular content.

Create a branded skill

Branded Alexa Skills are one of the best ways to engage your audience, especially for brands that offer a particular service, like ordering food, booking a cab, or opening a bank account.

For example,

  • BFSI enterprises already have their apps to ensure easy access for customers. These can also be transformed into branded skills that can allow customers to achieve tasks and receive information with single commands.
  • Branded skills could also be explored by B2B enterprises, to streamline internal operations, and save time and money. For instance, a retail company’s branded skill could help it track inventory and receive crucial updates, all through a voice activated feature.

Branded skills allow you to extract and analyze a range of user data as well, which can be funnelled into creating more personalized responses and improved experiences. The only hiccup is skill discoverability - the fact that your users have to first know that your brand has an Alexa skill, and then download it on their devices.

Capture a generic skill

Generic skills are where the race heats up. “Alexa, give me the top financial news”, or “Alexa, suggest a good mystery novel” are simple commands that do not require the user to add any brand name to the query. These name-free interactions are easily discoverable, and naturally integrated into users’ daily lives, which is what makes them prime property.

Amazon’s current single-owner model means capturing a generic skill gives your brand sole access to answer a particular query. And once you own a generic skill, you can access complete usage data and leverage that to improve customer experience.

As enterprises realise the potential of generic skills, they are rapidly capturing these skills. So if you spot an opportunity here, you need to move fast or someone else will.

As of now, industry specific news feeds are one of the high-value generic skills that are still open. And your brand can capitalize on this irrespective of whether you are a media enterprise or not. Being the sole provider of top news feed for your specific industry can bring your brand in contact with a significant section of your target audience. And that is too valuable an opportunity to miss out on.

How to build an Amazon Alexa Skill

Now that we’ve established the value of Alexa Skills, the next question is how you can build these for your enterprise, or rather who would do that for you?

While Amazon has built a toolkit to help you develop Alexa skills, you can also get them built through third party solutions. One benefit of using a third party developer is that it saves time and helps you build your skill once and submit it to multiple devices. Using a third party solution can definitely accelerate your launch process, especially if you are just venturing into the voice arena.

The market, as they say, is ripe. Developing even a basic skill now can give you a first-mover advantage, and make your skill a habit for your customers.

Srijan teams are already building Amazon Alexa Skills for diverse use cases. As a Standard Consulting Partner in the Amazon Web Services Partner Network, Srijan has certified AWS professionals who can help you build Alexa skills specific to your business area. Let’s get the discussion started on building your very own Alexa Skills.

Topics: Machine Learning & AI, Enterprises

4 Things to get right with your Travel Chatbot

Posted by Gaurav Mishra on Jun 8, 2018 4:41:00 PM

Whether it’s Oscar from Air New Zealand or Mildred of Lufthansa, chatbots are a key peg of the digital experience strategy of travel enterprises. And they are driving higher revenues too - Rose, The Cosmopolitan’s virtual concierge, has increased customer spending on the hotel property by 39%. On the other side of this transaction, a TravelZoo survey pointed out that 80% of over 6000 travellers surveyed across Asia, Europe, North and South America, think that bots will be an accepted and helpful component of their lives by 2020.

Offering superior digital customer experiences is the way forward for travel enterprises. And between your customers expecting simplified interactions, and competitors already launching bots with quirky personalities, the decision of whether or not to invest in a chatbot is out of your hands. The only question that remains is what kind of a chatbot you should design, and how soon you can launch it.

So while you get down to the drawing board, here are four things to absolutely get right with your travel chatbot design:

Set the Right Expectations for Your Chatbot

The chatbot experience can go two ways: either users get exactly what they want from the bot and are very satisfied, or they keep asking questions that the bot cannot answer, and get frustrated. More often than not, the bot fails to perform well because the user is asking questions, or trying to perform tasks, that the bot wasn’t designed for.

So it’s extremely important that you clarify in your chatbot design what the bot can and cannot do. This goes for both the team that’s developing the bot, and for the users of the bot. 

For the development team, it’s important to realize that a chatbot is not a total replacement of your website or customer service executives. It cannot do everything for everyone. So it’s better to have a bot that’s limited in what it can do, but does its one job really well.

To begin with, you need to choose a set of key tasks that your bot should be able to do. This can usually be extracted from understanding some of the most common tasks that people perform on your app or website - searching for tickets, enquiring about offers, queries about documents required, etc. Once you know these, you know what your bot should be able to solve for, and design it to do those tasks well. 

For the users, you can clarify the abilities of the bot in a couple of ways:

  • Have a welcome message that clearly states what the bot is capable of. For example: An airline bot’s opening message could be something along the lines of, “Hello Boris, where do you want to book a flight to?”. Or a booking website chatbot could state, “I can book a room for you at some of the best hotels across the world. Which city are you travelling to?”.

Oscar’s welcome message here tells the users exactly what to expect from the chatbot, and what to ask next.


  • Have pre-programmed answers as option buttons. Option buttons are a great way to guide a conversation and keep it on track, giving users a clear understanding of which answers are acceptable and how to respond next. They also set the right expectation of what types of tasks your bot can handle, and what would be out of bounds.

  • Have clear apology messages. When the bot is unable to adequately respond to a query, the error message should explain exactly what the issue is. Or it should offer up a new suggestion that users might find helpful in a given context. A generic reply like “Sorry, I can’t help you there” causes frustration with the bot. But a message like, “I don’t think I can help you with that. But would you like to try exploring one of these options instead?”, is great, because the bot still attempts to help the user.
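The three ideas above (a welcome message that states the bot's scope, option buttons, and a helpful apology) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `BotReply` class, the `SUPPORTED_TASKS` table and the exact-match lookup are hypothetical stand-ins for whatever chatbot framework and intent matching you actually use.

```python
from dataclasses import dataclass, field


@dataclass
class BotReply:
    text: str
    # Labels for option buttons shown under the message (may be empty).
    options: list = field(default_factory=list)


# Hypothetical list of the few tasks this bot is designed to do well.
SUPPORTED_TASKS = {
    "book a flight": "Where do you want to book a flight to?",
    "check offers": "Here are this week's offers.",
}


def welcome(name):
    """Opening message: say what the bot can do and offer it as buttons."""
    return BotReply(
        f"Hello {name}, I can help you with the following:",
        list(SUPPORTED_TASKS),
    )


def respond(message):
    """Answer a supported task, otherwise apologize and suggest options."""
    task = message.strip().lower()
    if task in SUPPORTED_TASKS:
        return BotReply(SUPPORTED_TASKS[task])
    # Apology that still tries to help, instead of a dead-end "Sorry".
    return BotReply(
        "I don't think I can help you with that. But would you like to "
        "try exploring one of these options instead?",
        list(SUPPORTED_TASKS),
    )
```

A real bot would replace the exact-match lookup with proper intent detection, but the shape of the replies stays the same: every message either completes a supported task or points the user back to one.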

Cater to the Context

One of the major reasons that chatbots are an improvement over human interactions is their ability to access past interactions and context, to personalize the conversation, while upselling new services. 

As an airline company, if you have data that a particular traveller flies business-class, your chatbot can use that information to be proactive in anticipating the traveller’s needs. A quick question by the bot, “Flying business class as always?”, can make the conversation more personal, and the traveller more inclined to fly business class. 

Similarly, an airline reservation bot can expand services to prompt cab bookings as soon as you land, or suggest great bars and restaurants at the airport if you have a layover. 

For chatbots designed to be tour guides, the ability to leverage past behaviour and weave that into the conversation becomes crucial. Creating a complete map of user preferences, based on past interactions and transactions, can go a long way in personalizing travel experiences for individual users. For example: Foursquare’s Marsbot does a great job of offering travel and entertainment recommendations based on a user’s previous Foursquare check-ins.

However, it’s also worth noting that making personalized recommendations alone cannot make a great chatbot. You have to back it up with actual problem-solving capabilities, like the chatbot being able to book the recommended places, make advance payments etc. to truly be valuable for users.

Handover to a Human

While chatbots are meant to automate a lot of the customer conversations happening at travel enterprises, it’s not advisable to leave them out in the wild, on their own. There are customer queries that could be too complicated for the bot, and inability to answer them could lead to a bad user experience. 

Usually bots can give out a customer care number to call and end the conversation. But your travel bot could go a step further and ensure that customers do not have to change channels to get answers to their questions. There should always be a human who can take over automated conversations and help answer queries that lie outside the bot’s capabilities. What’s important is that the switch from bot to human is a smooth one, and clearly communicated to the user.

For example, you could have a message like “I am sorry I cannot answer that. Would you like to speak to our customer happiness ambassador?”. A simple yes/no answer option gives users the choice to carry on their conversation with a human, and makes it very clear that a switch has taken place.
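Here is a small Python sketch of that handover flow, assuming a simple in-memory conversation object. The `Conversation` class, its method names and the confirmation wording are hypothetical; a production bot would route the chat to a live-agent queue rather than just flipping a flag.

```python
class Conversation:
    """Tracks who is handling the chat: the bot or a human agent."""

    def __init__(self):
        self.handler = "bot"
        self.awaiting_handover_answer = False

    def on_unanswerable(self):
        """Called when the bot cannot answer; offer a human instead."""
        self.awaiting_handover_answer = True
        return (
            "I am sorry I cannot answer that. Would you like to speak "
            "to our customer happiness ambassador?"
        )

    def on_user_choice(self, choice):
        """Handle the user's yes/no answer to the handover offer."""
        if not self.awaiting_handover_answer:
            raise ValueError("no handover offer is pending")
        self.awaiting_handover_answer = False
        if choice.strip().lower() == "yes":
            self.handler = "human"
            # Tell the user explicitly that the switch has happened.
            return "You are now chatting with a member of our team."
        return "No problem. What else can I help you with?"
```

Keeping the handler as explicit state makes it easy to log when, and how often, conversations leave the bot.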

Measure Your Bot’s Performance

Chatbots are not a make-it-and-forget-it activity. As with all strategic tasks, proper measurement and analysis of the bot’s conversations is key to optimizing its performance. Here’s a look at some of the key metrics that you should be keeping an eye on:

Activation Rate: The number of people actively choosing to use your chatbot to complete their tasks is definitely an important metric to track. This shows whether your users are willing to try a different route, and how many of them believe that a bot will actually be able to help them get the job done. 

However, you will need to define the activation as more nuanced than just how many people click on the chat window. The focus should be on measuring how many people see a message from the bot and actually respond to the questions asked by the bot. In other words, measure actual interactions over clicks or likes.

Session Duration: How long a user interacts with the bot is another critical performance metric. The depth of the interaction indicates whether the bot has been able to initiate a meaningful conversation and lead the user through a helpful buyer or customer’s journey. 

However, what you can consider a good session duration would depend upon the purpose of your bot. If your bot is a simple reservation agent, then the entire conversation can be wrapped up successfully in four or five questions and answers, and that’s a good interaction. On the other hand, if your bot is more of a tour guide, then longer interactions are desirable, where the bot is recommending and exploring different travel experiences with the user.

Goal Completion: We started off with deciding what a bot can and cannot do. So each bot you build has one, or more, goals to achieve: ticket booking, driving more customers to a hotel’s in-house entertainment properties, answering specific questions people have about travelling in a particular city, etc. The number of times the chatbot is able to lead users to completing those goals is probably the most important metric. This shows whether the bot is able to do what it was designed for.

Retention: How often do users come back to your bot within a short period of time? Do they engage with the bot for all their requirements, every time they visit your site? This helps you figure out if users actually find the bot a helpful addition or just another piece of tech. If users who have already interacted with the bot once do not use it on subsequent visits, then:

  • the bot wasn’t able to solve their problem, or
  • they found it too confusing to operate the bot

In both these cases you will need to make some quick changes to your chatbot.

Source: Google's Chatbase is a great analytics tool for understanding chatbot performance.

Confusion Rates: How many times is your bot unable to satisfactorily reply to a user query? How many times is it confused or gives the wrong answer because it didn’t understand the question entirely? Measuring the confusion rate is crucial to understanding where your bot is failing, and if there might have been common interaction scenarios that you did not account for. Of course, bots learn better with increased usage, so the confusion rates will definitely be high to begin with and then slowly taper off. But you should definitely keep an eye out for consistently high confusion rates.
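Most of the metrics above can be computed from plain session logs. The Python sketch below shows one possible way to do it; the `Session` fields and the exact definitions (for example, counting a session as activated once the user has replied at least once, and a user as retained once they have more than one activated session) are assumptions made for illustration, not standard formulas. Session duration is left out because it depends on how your logs record time.

```python
from dataclasses import dataclass


@dataclass
class Session:
    user_id: str
    user_replies: int      # messages the user actually sent to the bot
    bot_turns: int         # replies the bot produced
    confused_turns: int    # bot replies flagged as fallback or wrong
    goal_completed: bool   # e.g. a ticket was booked


def ratio(numerator, denominator):
    """Safe division so empty logs give 0.0 instead of an error."""
    return numerator / denominator if denominator else 0.0


def bot_metrics(sessions):
    # Activation: the user responded to the bot, not just opened the window.
    activated = [s for s in sessions if s.user_replies > 0]

    # Retention: share of active users with more than one activated session.
    visits = {}
    for s in activated:
        visits[s.user_id] = visits.get(s.user_id, 0) + 1
    returning = sum(1 for count in visits.values() if count > 1)

    return {
        "activation_rate": ratio(len(activated), len(sessions)),
        "goal_completion_rate": ratio(
            sum(s.goal_completed for s in activated), len(activated)
        ),
        "retention_rate": ratio(returning, len(visits)),
        "confusion_rate": ratio(
            sum(s.confused_turns for s in sessions),
            sum(s.bot_turns for s in sessions),
        ),
    }
```

Tracking these numbers week over week matters more than any single value, since the right targets depend on what your bot was designed to do.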

The travel industry is at the cusp of a tremendous growth wave. And the ability to leverage emerging technologies to deliver great customer experiences is going to be the key to getting your share of the growth pie. Chatbots are one of the most prominent of these emerging technologies, and you need to get started with them sooner rather than later.

Srijan’s chatbot development teams can help you design and develop a chatbot strategy tailored to your business requirements. Tell us a bit about your business goals, and our chatbot experts will be in touch.

Topics: Machine Learning & AI, MarTech, Travel & Hospitality

