Uploading simple Q&A data

Updated 2 weeks ago by Matthew Clementson

Training your bot

When building a bot, you need training data to help the Hu:toma AI engine learn and generate relevant responses to your end users’ input. The Hu:toma platform offers two main ways to input training data:

  • Simple Q&A upload: Upload, copy and paste, or type historical conversations or conversation samples between end customers and the company. For now, the file must follow a few rules on how it is structured.

  • Intents and Entities: If you have a more complex dialog or a larger set of Questions & Answers you can instruct the Hu:toma AI engine to train on intents and entities. This is explained in detail in the intents and entities section.

Anatomy of a Training file

The current version of the Hu:toma platform supports the upload of .txt files containing examples of conversations. Let's look in detail at how you can build your own training file.

Single Turn Dialogs

A single turn dialog is a dialog that typically consists of one question and one answer. For example, the question 'what time is it' can typically be answered without further clarification. A single turn dialog training file is essentially a sequence of Q&A pairs, as follows.

Note that there is an empty line between each Q&A pair. This tells our system that the previous conversation can be considered complete and that no contextual information needs to be retained. We will explain this in detail in just a moment. Once you have built your training file, all you need to do is upload it.
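To make the layout concrete, here is a minimal Python sketch that renders a list of question/answer pairs in the single turn format described above. The helper name build_single_turn_file and the sample answers are illustrative assumptions and are not part of the Hu:toma platform.

```python
def build_single_turn_file(pairs):
    """Render (question, answer) pairs as single turn training text.

    Each pair becomes two lines (question, then answer), and pairs are
    separated from each other by one empty line.
    """
    blocks = [f"{question.strip()}\n{answer.strip()}" for question, answer in pairs]
    return "\n\n".join(blocks) + "\n"


# Hypothetical sample data, for illustration only.
sample_pairs = [
    ("What time is it?", "It is 10 o'clock."),
    ("Where are you based?", "We are based in Barcelona."),
]

training_text = build_single_turn_file(sample_pairs)
print(training_text)
```

The resulting text can be saved as a .txt file or pasted into the inline editor.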

After completing Step 1 (Creating a bot) and Step 2 (Basic bot details), you will enter the training phase, which looks like this:

Once you have selected the Simple Q&A option, you will see an inline editor. Here you can type your Q&A pairs directly, copy and paste them from a text file, or upload that text file by clicking "selecting them".

Once you have added your Q&A, click "Save" and Hu:toma will start training your bot.  

 You can also view a sample file by clicking on the link “this example” or watch an online tutorial video by clicking "training video tutorial".

The rules for your training file, at this point, are:
  • .txt files only
  • Maximum file size 0.5 MB (don’t worry, you can pack a lot of training data into that!)
  • Within a single pair, the question and the answer are separated by a line break
  • Question and answer pairs are separated by an empty line in Single Turn Dialogs
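The rules above can be checked before uploading. The following Python sketch is one possible pre-upload check, not an official Hu:toma tool: the function name check_training_file is hypothetical, "0.5 MB" is assumed to mean 512 * 1024 bytes, and the odd-line check is a heuristic of our own for spotting a question that has no answer.

```python
import os

MAX_BYTES = 512 * 1024  # assumption: "0.5 MB" interpreted as 524,288 bytes


def check_training_file(filename, text):
    """Return a list of human-readable problems; an empty list means no problem found."""
    problems = []

    # Rule: .txt files only.
    if os.path.splitext(filename)[1].lower() != ".txt":
        problems.append("file must have a .txt extension")

    # Rule: maximum file size (measured here on the UTF-8 encoded text).
    if len(text.encode("utf-8")) > MAX_BYTES:
        problems.append("file is larger than 0.5 MB")

    # Group consecutive non-empty lines into blocks; empty lines end a block.
    blocks, current = [], []
    for line in text.splitlines() + [""]:  # the extra "" flushes the last block
        if line.strip():
            current.append(line)
        elif current:
            blocks.append(current)
            current = []

    # Heuristic: questions and answers alternate, so every block should
    # contain an even number of lines.
    for number, block in enumerate(blocks, start=1):
        if len(block) % 2 != 0:
            problems.append(f"block {number} has an odd number of lines")

    return problems


print(check_training_file("faq.txt", "Hi\nHello\n\nBye\n"))
```

An empty list means the text passed all three checks. The size limit actually enforced by the platform may be computed differently.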

    We recommend adding 3-5 variations of the same question to help the AI engine learn better. For example, let’s take a look at a sample question from a Weather bot.

    "Hi, what’s the weather in Barcelona?"

    For the above example, it’s likely that other users who want the same information will phrase the question differently:

    "Weather in Barcelona"
    "Can you tell me the weather in Barcelona?"
    "Is it sunny in Barcelona?"
    "Will it rain in Barcelona?"

    Adding the above variations to your training file (with the expected responses) helps the AI engine respond more accurately.
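As a rough illustration, several phrasings of one question can be paired with the same expected response and written out as separate single turn pairs. In the Python sketch below, the function name expand_variations and the answer text are placeholders that we made up; only the question phrasings come from the example above.

```python
def expand_variations(questions, answer):
    """Pair every phrasing of a question with the same expected answer.

    Returns training text in which each question/answer pair is
    separated from the next by an empty line.
    """
    return "\n\n".join(f"{question}\n{answer}" for question in questions) + "\n"


questions = [
    "Hi, what's the weather in Barcelona?",
    "Weather in Barcelona",
    "Can you tell me the weather in Barcelona?",
]
answer = "It is sunny in Barcelona today."  # placeholder response

variation_text = expand_variations(questions, answer)
print(variation_text)
```

Phrasings that call for a different response, such as "Will it rain in Barcelona?", would get their own answer rather than sharing this one.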

    As soon as you upload your training file, you’ll see that the Training starts. Training happens in 2 phases. 

    Multi Turn Dialogs

    If you've ever ordered a pizza over the phone, you know you usually have to answer a few questions before they can complete your order. Typically, you will have to give your name and address, the type of pizza you want, and so on. Together, these questions form a multi-turn dialog: a dialog that requires follow-up questions before it is completed, with contextual information stored along the way so the dialog can move forward.

    Multi-turn dialogs are modelled in the platform very simply, by attaching Q&A pairs one after the other. A multi-turn sample training file might look like this:

    Good Morning
    Hi! How are you
    I am good. How are you?
    Not too bad

    As you can see, the question and answer pairs are grouped together with no empty line between them. This hints to our system that the training dialog will require additional logic for the follow-up questions to work. Training and uploading multi-turn dialogs works the same way as for single-turn dialogs.
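The difference between the two layouts can be sketched in Python by splitting a training text on empty lines and pairing up the lines of each block. This is only our reading of the file format as described in this article, not the platform's actual parser; the function name parse_dialogs and the second sample dialog are assumptions.

```python
def parse_dialogs(text):
    """Split training text into dialogs.

    Each dialog is a list of (question, answer) turns. A dialog with one
    turn is single-turn; a dialog with several turns is multi-turn.
    """
    dialogs, current = [], []
    for line in text.splitlines() + [""]:  # the extra "" flushes the last block
        if line.strip():
            current.append(line.strip())
        elif current:
            # Lines alternate question, answer, question, answer, ...
            # zip() silently drops a trailing line that has no partner.
            dialogs.append(list(zip(current[0::2], current[1::2])))
            current = []
    return dialogs


sample = """Good Morning
Hi! How are you
I am good. How are you?
Not too bad

What time is it?
It is 10 o'clock.
"""

for turns in parse_dialogs(sample):
    kind = "multi-turn" if len(turns) > 1 else "single-turn"
    print(kind, turns)
```

Here the first block is reported as multi-turn because its two pairs are not separated by an empty line, while the second block is a single-turn pair.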
