Building Chatbots – Introduction
Chatbots are a hot topic these days. In this tutorial, we're diving into the world of chatbots and how they are built.
What are chatbots?
Chatbots are systems that can hold a fairly complex conversation with humans. They go by different names: Conversational Agents or Dialog Systems. As you've probably guessed, chatbots use a lot of Natural Language Processing techniques in order to understand the human's requests.
The Holy Grail of chatbot builders is to pass the Turing Test: a human talking to the bot can't tell that they're talking to a machine. Although we are pretty far from that (especially from a Natural Language Generation point of view), great progress has been made.
Most of the chatbots built these days are goal-oriented agents, meaning they steer the conversation towards achieving a certain predefined goal. For example, a customer support agent can figure out what problem the user is facing and then solve it (or just open a ticket).
Chatbots can use several types of engines for understanding conversation. Most modern chatbots use one or more of these architectures:
- Rule-based Systems: handcrafted rules map user inputs to responses (a toy example follows this list)
- Information Retrieval Systems: search for the relevant information in a collection of texts, then present it to the user
- Transduction models: use deep learning models to understand the input and generate the output
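To make the first approach concrete, here's a toy rule-based bot in Python. The patterns and responses are invented for illustration; real rule-based systems are far more elaborate:

```python
import re

# Hand-written rules: each entry maps a regex pattern to a canned response.
RULES = [
    (re.compile(r"\b(hi|hello|hey)\b", re.I), "Hello! How can I help you?"),
    (re.compile(r"\brefund\b", re.I), "I can open a refund ticket for you."),
    (re.compile(r"\b(bye|goodbye)\b", re.I), "Goodbye!"),
]

def reply(message):
    # The first matching rule wins; fall through to a default answer.
    for pattern, response in RULES:
        if pattern.search(message):
            return response
    return "Sorry, I didn't get that."

print(reply("hi there"))         # Hello! How can I help you?
print(reply("I want a refund"))  # I can open a refund ticket for you.
```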
Natural Language Understanding
Natural Language Understanding (NLU for short) refers to the core of a chatbot: the part that deals with understanding what the human says. There is a very popular architecture that almost all NLU engines (both open source and proprietary) use.
This architecture implies processing the user input in two steps:

- Determine the intent: the overall goal of the utterance, chosen from a predefined set, e.g. `schedule-meeting`, `buy-tickets`, `complaint`, `ask-question`, `ask-for-refund`, `get-account-information`
- Extract entities (a process also called slot filling): each intent can support arguments. For example, the `schedule-meeting` intent can support a `when` slot of type `datetime` and a `who` slot of type `Person`.
Let’s take a quick and simple example:
- Hello MeetingBot:
[intent: "hello", slots: {}]
- Hi Andrew
- Can you schedule a meeting with John Smith for tomorrow morning?
[intent: "schedule-meeting", slots: {when: "tomorrow morning", who: "John Smith"}]
- I’ve booked meeting room B for 9am February 12th for 1h. I’ll send a notification to John Smith.
This is how most chatbots out there understand conversations. There is one more phase I didn't capture in my example: the slot resolution phase. We can't do much with the raw string "tomorrow morning"; we need to transform it into an actual `datetime`. The same goes for "John Smith": we need to map that string to an actual contact from our phone.
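To make slot resolution concrete, here's a deliberately naive sketch. The function and the "morning means 9am" convention are made up for this example; production systems usually delegate the job to a dedicated library like Facebook's Duckling:

```python
from datetime import datetime, timedelta

def resolve_when(raw_value, now=None):
    # Toy resolver: turn a raw `when` slot string into a concrete datetime.
    now = now or datetime.now()
    if raw_value == "tomorrow morning":
        # Arbitrary convention for this example: "morning" means 9am
        tomorrow = now + timedelta(days=1)
        return tomorrow.replace(hour=9, minute=0, second=0, microsecond=0)
    raise ValueError("Don't know how to resolve %r" % raw_value)

print(resolve_when("tomorrow morning"))  # e.g. 2019-02-12 09:00:00
```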
NLU Frameworks
There are a lot of frameworks out there. Here are the most popular ones:
- DialogFlow – from Google
- Wit.ai – from Facebook
- Luis.ai – from Microsoft
- IBM Watson – from IBM
- Amazon Lex – from Amazon
- Rasa NLU – Open Source
- Snips.ai – Open Source
The proprietary ones have a GUI you can use to train your engine: you create intents, then give various examples for each one, training the engine to understand your specific needs. You can also highlight the entities in those examples and assign each a type, and the engine will do the resolution automatically. Once trained, you interact with your engine via a REST API.
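The shape of that REST interaction is similar across vendors: you send an utterance and get back the intent and entities as JSON. Here's a purely illustrative Python call; the URL, auth header, and response fields are hypothetical, and every vendor's real API differs in the details:

```python
import requests

# Hypothetical endpoint and API key, for illustration only
response = requests.get(
    "https://nlu.example.com/v1/parse",
    params={"q": "schedule a meeting with John tomorrow"},
    headers={"Authorization": "Bearer YOUR_API_KEY"},
)
print(response.json())  # e.g. {"intent": "schedule-meeting", "entities": {...}}
```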
The open source ones can be used programmatically. Both Rasa NLU and Snips are Python libraries that can be easily installed, and your models stay local: no need for Google/Amazon/Facebook/IBM/Microsoft to know what your chatbot is doing. Still, the proprietary engines have some obvious advantages:
- No need to manage infrastructure
- No need to handle scaling
- Lower costs for infrastructure
- Customer Support
Here is a comparative study between the most popular NLU engines: Evaluating NLU Engines. Note that the paper does not take Snips.ai into consideration. The main takeaway is that the performance of the systems is very similar; choosing the open-source solution does not compromise accuracy.
Following this study, Snips did a side-by-side comparison of their own NLU engine and the commercial ones. You can read about the results here: Snips Benchmark. Main takeaway: Snips outperforms its proprietary competitors on various benchmarks.
Getting Started with RasaNLU
Let's dive into Natural Language Understanding with Rasa. It's a fairly simple process that goes like this:

1. Install Rasa NLU: `pip install rasa_nlu rasa_core`
2. Launch the NLU server: `python -m rasa_nlu.server --path=~/Desktop`
3. Test that the server is working properly: `curl 'http://localhost:5000/parse?q=hi'`
4. Create a config file containing the training set
5. Test how the trained model is performing
Here's the file from step 4:
```yaml
language: "en"
pipeline: "spacy_sklearn"
data: |
  ## intent:greetings
  - Hello there! My name is [John](person)!
  - Hi there!
  - Hi, I'm [Nate](person)
  - Hi
  - How are you?
  - hi there, this is [Eliza](person)

  ## intent:get_restaurants
  - what are some good restaurants around?
  - Are there any good [sushi](cuisine) places around?
  - Can you show me some interesting [burger](cuisine) places nearby
  - show me some [chinese](cuisine) places nearby
  - i want a [chinese](cuisine) restaurant nearby
  - I'm in the mood for some [mediterranean](cuisine) food
  - Want to try out a [vegan](cuisine) restaurant please

  ## intent:bye
  - thanks, bye
  - great, bbye
  - see you
  - bye
  - good bye
  - awesome, see you later
  - goodbye
```
This file tells the server we want to create a new model for the English language using the `spacy_sklearn` backend. Rasa NLU also offers a TensorFlow backend, but that only makes sense when we have a lot of training data. Let's name the file `rasa_nlu_model.yaml` and save it somewhere handy.
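As an aside, switching to the TensorFlow backend should be just a pipeline change in the same file. `tensorflow_embedding` was the template name Rasa NLU used around the time of writing, so double-check the docs for your version:

```yaml
language: "en"
pipeline: "tensorflow_embedding"  # instead of spacy_sklearn; needs more training data
```

Back to our `spacy_sklearn` file. Here's how to train a new model using it: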
```bash
curl --request POST \
     --header 'content-type: application/x-yml' \
     --data-binary @rasa_nlu_model.yaml \
     --url 'localhost:5000/train?project=restaurant_nlu'

{
  "info": "new model trained",
  "model": "model_20181024-233334"
}
```
Let’s now take our new model for a spin:
```bash
curl 'http://localhost:5000/parse?q=I+want+to+find+a+chinese+restaurant+nearby&project=restaurant_nlu'

{
  "intent": {
    "name": "get_restaurants",
    "confidence": 0.7882460089750741
  },
  "entities": [
    {
      "start": 17,
      "end": 24,
      "value": "chinese",
      "entity": "cuisine",
      "confidence": 0.8573729764467187,
      "extractor": "ner_crf"
    }
  ],
  "intent_ranking": [
    {
      "name": "get_restaurants",
      "confidence": 0.7882460089750741
    },
    {
      "name": "greetings",
      "confidence": 0.10729879803584634
    },
    {
      "name": "bye",
      "confidence": 0.10445519298907957
    }
  ],
  "text": "I want to find a chinese restaurant nearby",
  "project": "restaurant_nlu",
  "model": "model_20181024-233334"
}
```
Let’s do another one:
```bash
curl 'http://localhost:5000/parse?q=Hi+there,+I+am+George&project=restaurant_nlu'

{
  "intent": {
    "name": "greetings",
    "confidence": 0.6858263442513893
  },
  "entities": [],
  "intent_ranking": [
    {
      "name": "greetings",
      "confidence": 0.6858263442513893
    },
    {
      "name": "bye",
      "confidence": 0.2358127625728285
    },
    {
      "name": "get_restaurants",
      "confidence": 0.07836089317578213
    }
  ],
  "text": "Hi there, I am George",
  "project": "restaurant_nlu",
  "model": "model_20181024-233334"
}
```
Pretty simple, right? Notice that the results are not perfect: the name "George" was not extracted, for example. Providing more training examples would probably make the model more accurate.
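The HTTP server isn't the only way in: since Rasa NLU is a Python library, you can also train and parse in-process. A minimal sketch, assuming the `rasa_nlu` 0.x API current when this was written; note that the library takes the config and the training data as two separate files, and both filenames below are placeholders:

```python
from rasa_nlu import config
from rasa_nlu.model import Trainer
from rasa_nlu.training_data import load_data

# config.yml holds just the `language` and `pipeline` keys;
# training_data.md holds the `## intent:` blocks from our YAML file
trainer = Trainer(config.load("config.yml"))
interpreter = trainer.train(load_data("training_data.md"))

# Returns the same structure as the /parse HTTP endpoint
result = interpreter.parse("i want a chinese restaurant nearby")
print(result["intent"], result["entities"])
```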
Getting Started with Snips-NLU
Here's how to install Snips: `pip install snips-nlu`. We will also need to download some English linguistic resources: `python -m snips_nlu download en`.
Working with Snips is pretty similar to working with Rasa. Let's start by creating a dataset. In the case of Snips, we need different files for each intent and for each entity. It's worth mentioning the subtle difference between entities and slots: entities are real-world classes of objects, while slots are typed arguments of an intent. For example, `city` is an entity; `departure_city` is a slot of type `city`. Let's write the equivalent intent files for the previous example:
intent_greetings.txt
```
Hello there! My name is [name:personName](John)!
Hi there!
Hi, I'm [name:personName](Nate)
Hi
How are you?
hi there, this is [name:personName](Eliza)
```
intent_get-restaurants.txt
```
what are some good restaurants around?
Are there any good [restaurantCuisine:cuisine](sushi) places around?
Can you show me some interesting [restaurantCuisine:cuisine](burger) places nearby
show me some [restaurantCuisine:cuisine](chinese) places nearby
i want a [restaurantCuisine:cuisine](chinese) restaurant nearby
I'm in the mood for some [restaurantCuisine:cuisine](mediterranean) food
Want to try out a [restaurantCuisine:cuisine](vegan) restaurant please
```
intent_bye.txt
```
thanks, bye
great, bbye
see you
bye
good bye
awesome, see you later
goodbye
```
entity_cuisine.txt
```
thai
chinese
asian
american
european
spanish, tapas
greek
shawarma, kebab
sushi, sashimi
japanese
mediterranean
mexican, taco, burrito
italian, pizza, pasta
gluten-free, gluten free, GF
vegetarian
vegan
paleo
```
entity_personName.txt
```
Nate
Jane
John
Eliza
Helen
Anna
Chris
Roxanne
Dan
Mathew
```
Let’s now create the dataset in the Snips format. These files are just a convenient way for us to organize the intents and entities. Snips expects a single json file:
```bash
snips-nlu generate-dataset en \
    snips_dataset/intent_bye.txt \
    snips_dataset/intent_greetings.txt \
    snips_dataset/intent_get-restaurants.txt \
    snips_dataset/entity_cuisine.txt \
    snips_dataset/entity_personName.txt > restaurant_chatbot.json
```
Take a few minutes to look at the json file.
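In case you don't have the file in front of you, the generated structure looks roughly like this (heavily trimmed, and reproduced from the snips-nlu docs from memory, so treat it as a sketch rather than a reference):

```json
{
  "language": "en",
  "intents": {
    "get-restaurants": {
      "utterances": [
        {
          "data": [
            {"text": "i want a "},
            {"text": "chinese", "entity": "cuisine", "slot_name": "restaurantCuisine"},
            {"text": " restaurant nearby"}
          ]
        }
      ]
    }
  },
  "entities": {
    "cuisine": {
      "data": [{"value": "thai", "synonyms": []}],
      "use_synonyms": true,
      "automatically_extensible": true
    }
  }
}
```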
Let’s now actually train a Snips model. Here’s the easy command for doing just that:
```bash
snips-nlu train restaurant_chatbot.json restaurant_chatbot.model
```
Querying the engine goes like this:
```bash
snips-nlu parse restaurant_chatbot.model -q "What's the nearest indonesian restaurant?"

{
  "input": "What's the nearest indonesian restaurant?",
  "intent": {
    "intentName": "get-restaurants",
    "probability": 0.5999915913758308
  },
  "slots": [
    {
      "entity": "cuisine",
      "range": {
        "end": 29,
        "start": 19
      },
      "rawValue": "indonesian",
      "slotName": "restaurantCuisine",
      "value": {
        "kind": "Custom",
        "value": "indonesian"
      }
    }
  ]
}
```
Notice how Snips was able to detect that Indonesian is a cuisine, even though it wasn’t in the training dataset.
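The CLI is handy for experimenting, but in a real chatbot you'd more likely call Snips from Python. A minimal sketch, based on the `snips-nlu` Python API as documented at the time:

```python
import io
import json

from snips_nlu import SnipsNLUEngine
from snips_nlu.default_configs import CONFIG_EN

# Load the dataset produced by `snips-nlu generate-dataset`
with io.open("restaurant_chatbot.json") as f:
    dataset = json.load(f)

# Train an engine (the equivalent of `snips-nlu train`)
engine = SnipsNLUEngine(config=CONFIG_EN)
engine.fit(dataset)

# Parse an utterance (the equivalent of `snips-nlu parse`)
result = engine.parse("What's the nearest indonesian restaurant?")
print(result["intent"]["intentName"], result["slots"])
```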
Conclusions
- There is a plethora of NLU engines to choose from when building chatbots
- The basis for building a chatbot is having an NLU engine handy. Around it, we can build answers, interact with APIs, and maintain conversation state.
- There are both proprietary and open-source NLU engine alternatives. Both deserve attention.
- All of the engines provide pretty similar functionality. The differences are in the details and the formats used.