In part one of this blog post, we discussed some characteristics of how a bot behaves and what a user can expect when interacting with one. In this post, we will discuss the technical elements of building a bot and learn to distinguish what is behind it, so you can adjust it to the needs of your end user or customer. All of this is based on what we learned while working on our Stand.bot.
The development team will face a key challenge in recognizing a User Space / Context rather than just the user’s environment. Technology alone does not solve this problem, and each platform presents its own challenges. So let’s get started and review the tech aspect of bot development.
Voice First bots.
One of the most outstanding platforms for bots is Voice, which is our natural form of expression and makes for a very straightforward user experience. Voice bots tend to be the most complex service to build, which is why we are reviewing them first.
Facebook’s Messenger is the de facto chat bot platform for the masses. For Voice it is Amazon’s Alexa, with Google Home a distant second. Just this past December, Alexa sales grew 9x and Google Home launched with a blast during the holiday season.
Building on top of Alexa is a no-brainer, but many competitors are sprouting up in the Voice landscape. Microsoft, for example, offers developers a speech service in its Bot Framework if you want to build a Voice Bot on Skype or a Slack bot.
The Amazon platform is very well documented: you can find a lot of information on their main developer site, and there is a wide range of skills (apps for Alexa) in their marketplace. For a bot to work inside the platform, you need to understand that this is not a normal app and that it is limited to the device you are working with and your profile, so be patient.
The most important terminology to keep top of mind is:
- Invocation Name
I recommend that you read the 3 part series (one, two & three) on building a Custom Skill by Liz Rice, a very well-written tutorial that takes you step by step through the process of developing an Alexa Skill.
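To make the idea of a Custom Skill a bit more concrete, here is a minimal sketch of the code that sits behind one. It assumes the skill receives the standard Alexa request JSON and answers with plain-text speech. The intent name `StatusIntent`, the slot name `Member` and the spoken phrases are hypothetical and only there for illustration; they are not taken from Stand.bot or from the tutorial above.

```python
def build_response(speech_text, end_session=True):
    """Wrap plain text in the JSON envelope an Alexa custom skill returns."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech_text},
            "shouldEndSession": end_session,
        },
    }


def handle_request(event):
    """Pick a reply based on the request type and, for intents, the intent name."""
    request = event["request"]

    if request["type"] == "LaunchRequest":
        # The user opened the skill using only its invocation name.
        return build_response("Welcome. What would you like to know?", end_session=False)

    if request["type"] == "IntentRequest":
        intent = request["intent"]
        if intent["name"] == "StatusIntent":  # hypothetical intent name
            slots = intent.get("slots", {})
            # "Member" is a hypothetical slot; fall back to a generic phrase.
            member = slots.get("Member", {}).get("value", "the team")
            return build_response(f"Here is the latest status for {member}.")

    # Anything we do not recognize gets a polite fallback.
    return build_response("Sorry, I did not understand that.")
```

A function like `handle_request` would typically be called from whatever hosts the skill, for example a serverless function, which passes it the parsed request and sends the returned dictionary back as JSON.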
We have to give credit to Amazon for making voice commands mainstream. Everyone has interacted with Siri or heard someone talk to it on an iPhone, but Alexa has brought voice into everyone’s home. More people have become familiar with Voice User Interfaces (VUI), so they are no longer a strange thing to interact with.
Google Home is right behind Alexa, and some of the Google Assistant demos are impressive. Since Google can leverage all the information stored in your Google Account, it can be much more “intelligent” in understanding the space/user context.
You can check out Google Assistant’s API.AI documentation, and here is a video that explains how to get started building Actions for Assistant. I also recommend this tutorial on Google Actions by Romin Irani on ProgrammableWeb and this conversation thread on Stack Overflow.
The main advantages voice has over other types of communication in a remote or distributed team are:
- Efficiency; voice is a transparent and direct medium for collaboration.
- Fewer distractions; a user can focus on direct commands and limit any other type of multitasking.
- Data access; you can connect as many datasets as the platform can handle and have all the information in one place.
- Unified communication; a virtual agent can become the main communication channel, avoiding the need for multiple channels.
- Insights; the bot could learn to anticipate scenarios based on past use.
As you can see, each platform changes how you build a Voice Bot. As with any new technology, the terminology should consolidate soon, but for now your development team will need to navigate the different taxonomies to get things rolling.
Neural Networks & bots.
The magic behind some of these platforms is the algorithms they use to interpret text and voice. But like any technology, they have shortcomings in performance and accuracy.
Typically, artificial neural networks (ANN) are composed of multiple layers of “neurons” or nodes, and there are many variations and combinations, like Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN). That is outside the scope of this post; if you would like more in-depth knowledge on the matter, please visit Machine Learning a Simple Neural Network and How To Build a Simple Neural Network to get a glimpse of how it works and how to code it.
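As a rough illustration of what “layers of neurons” means, here is a minimal sketch of a forward pass through a tiny network in plain Python. The weights are hand-picked for the example rather than learned, and the sigmoid activation is just one common choice; a real bot platform would use a trained model and a proper library.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """Compute one layer: each neuron is a weighted sum of the inputs plus a bias."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(sigmoid(total))
    return outputs


def network_forward(inputs, layers):
    """Feed the inputs through every layer in order and return the final outputs."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


# Illustrative, hand-picked values: 2 inputs -> 2 hidden neurons -> 1 output.
LAYERS = [
    ([[0.5, -0.4], [0.3, 0.8]], [0.0, -0.1]),
    ([[1.2, -0.7]], [0.05]),
]
```

Calling `network_forward([1.0, 0.0], LAYERS)` returns a list with a single number between 0 and 1. Training a network means adjusting the values in `LAYERS` until those outputs match the examples you care about, which is the part the tutorials above walk through.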
What is really interesting about ANN is that the platform learns from continuous use and adapts to the language of its environment, making it flexible and robust in “theory”. The bot learns from what users type and from the user/space context, which is one of the main challenges faced by bot platforms.
One of the most talked about platforms is IBM Watson, which has made its API available for any developer to use. They have a lot of content around chatbots and even show you how to create a bot in 10 minutes. Most importantly, you get access to Watson’s infrastructure and simple API tools to build a service that can be flexible enough for any scenario.
Virtual agents can be created for voice or text conversations, and there are all sorts of tools you can use for Java, Node, Python, etc., so do not worry about being limited to just one language for development.
Simpler solutions like Hubot and Lita are a perfect starting ground. An up-and-coming competitor is Microsoft Bot Framework, which leverages Microsoft Cognitive Services, Bing Speech, Text Analytics and other machine learning APIs.
Once you create these virtual agents, they can interact on behalf of other team members or bring in information from different data sets that are relevant to the user space. At PlanningWith.Cards we have integrated JIRA with our Hipchat and Scrum Poker Confluence Add-on. A bot could easily grab the issues assigned to one of our team members and remind her to update their status based on how close the release date is; it could send a message on Hipchat or a notification in a Confluence page.
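The reminder scenario could be sketched roughly like this. This is only an outline under a few assumptions: the `Issue` class, its fields, the "Done" status and the three-day threshold are all hypothetical, and the actual calls to JIRA, Hipchat or Confluence are left out on purpose. The bot would fetch the issues through whatever integration you already have and post the resulting message to the channel of your choice.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Issue:
    """A hypothetical, trimmed-down view of an issue tracker entry."""
    key: str
    assignee: str
    status: str
    release_date: date


def issues_needing_update(issues, assignee, today, days_before_release=3):
    """Return the assignee's unfinished issues whose release date is near or past."""
    due = []
    for issue in issues:
        if issue.assignee != assignee or issue.status == "Done":
            continue
        days_left = (issue.release_date - today).days
        if days_left <= days_before_release:
            due.append(issue)
    return due


def describe_deadline(issue, today):
    """Turn the distance to the release date into a short phrase."""
    days_left = (issue.release_date - today).days
    if days_left < 0:
        return "release date has passed"
    return f"release in {days_left} day(s)"


def build_reminder(assignee, issues, today):
    """Compose the chat message the bot would post, or None if nothing is due."""
    if not issues:
        return None
    lines = [f"Hi {assignee}, please update the status of:"]
    for issue in issues:
        lines.append(f"- {issue.key} ({issue.status}), {describe_deadline(issue, today)}")
    return "\n".join(lines)
```

Keeping the selection logic separate from the message text makes it easy to send the same reminder as a chat message or as a page notification without touching the rules that decide who gets reminded.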
A remote team can benefit greatly from bots, because they can become the gatekeepers of the information available to everyone in the organization at any time. This lets an organization distribute critical information to stakeholders more efficiently for better decision making.