Rich Kirk edited this page Feb 2, 2023 · 2 revisions

What is Rasa?

Rasa is a framework for building task-oriented dialog systems.

There are two core parts to a Rasa model:

  1. Natural Language Understanding (NLU)
  2. Dialog Policy

Natural Language Understanding

This process takes in raw text in the form of a user query and produces a machine-readable representation of it (typically an intent plus any extracted entities).

There are two types of NLU:

  • Rule-based (e.g., regex)
    • Matches fixed patterns such as dates and emails
    • Applied strictly; does NOT extrapolate well to phrasings the patterns do not cover
  • Neural approach
    • Trains on labeled examples and learns a decision-making process
    • Can extrapolate well to unseen user input

Dialog Policy

Once the query has been processed, it is then time for the model to decide what to do next (e.g., ask a new question, execute code, wait for a response).

As with NLU, there are two types of dialog policy:

  • Rule-based
    • Dialog tree covering all possible paths
    • Does NOT extrapolate well
  • Neural approach
    • Picks the best next turn given the conversation so far, based on all the conversations it has been trained on
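
In Rasa, both kinds of policy are selected in the policies section of the config file. The fragment below is a minimal sketch, assuming the Rasa 3.x format; the max_history and epochs values are illustrative assumptions, not recommendations.

```yaml
# config.yml (fragment): dialog policies only
policies:
  # Rule-based: applies the fixed rules defined in the rules file
  - name: RulePolicy
  # Memorizes exact paths seen in the training stories
  - name: MemoizationPolicy
  # Neural: predicts the next action from the conversation so far
  - name: TEDPolicy
    max_history: 5   # assumed value: number of past turns considered
    epochs: 100      # assumed value: training iterations
```

Listing a rule-based and a neural policy together lets the rules handle the fixed exchanges while the neural policy covers paths the rules do not.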

File Configuration of Rasa

Domain file

Directory of everything the assistant is "aware" of.

This file contains:

  • responses: things the assistant can say back to the user
  • intents: categories of things the users may say (e.g. greet, search_job, is_bot)
  • slots: variables remembered over the course of a conversation (e.g., name of user)
  • entities: pieces of information extracted from incoming text
  • forms and actions: business logic, extends what the assistant can do
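
A minimal domain file covering most of these sections might look like the sketch below. It assumes the Rasa 3.x format; the response texts, the utter_iamabot response name, and the action_search_job custom action are hypothetical names chosen for illustration, and forms are omitted for brevity.

```yaml
# domain.yml (sketch)
version: "3.1"  # assumed format version

intents:
  - greet
  - search_job
  - is_bot

entities:
  - name

slots:
  name:
    type: text
    influence_conversation: false
    mappings:
      - type: from_entity
        entity: name

responses:
  utter_greet:
    - text: "Hello! How can I help you?"  # hypothetical wording
  utter_iamabot:
    - text: "I am a bot."                 # hypothetical wording

actions:
  - action_search_job  # hypothetical custom action
```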

Config file

Configures the NLU pipeline and the dialog policies. Includes customization for tokenizers, featurizers, synonym mappers, and more.
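
As an illustration, the NLU part of a config file could be sketched as follows. This assumes the Rasa 3.x format and an English assistant; the component choice and the epochs value are assumptions, one reasonable setup rather than the required one.

```yaml
# config.yml (fragment): NLU pipeline only
recipe: default.v1
language: en

pipeline:
  # Tokenizer: splits the user message on whitespace
  - name: WhitespaceTokenizer
  # Featurizers: turn tokens into numeric features
  - name: RegexFeaturizer
  - name: CountVectorsFeaturizer
  # Neural intent classifier and entity extractor
  - name: DIETClassifier
    epochs: 100  # assumed value
  # Maps extracted entity values to their synonyms
  - name: EntitySynonymMapper
```

Components run in the order listed, so the tokenizer has to come before the featurizers, and the featurizers before the classifier.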

Data/Rules file

Short pieces of conversation that ALWAYS go the same way. A rule typically pairs an intent with the action that must immediately follow it.

For example, whenever the user's intent is to say goodbye, a rule can make the assistant say goodbye back.
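
The goodbye rule could be written as in this sketch, assuming the Rasa 3.x format and that a goodbye intent and an utter_goodbye response are defined in the domain file (both names are assumptions).

```yaml
# data/rules.yml (sketch)
version: "3.1"  # assumed format version

rules:
  - rule: Say goodbye whenever the user says goodbye
    steps:
      - intent: goodbye        # assumed intent name
      - action: utter_goodbye  # assumed response name
```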

Data/Stories file

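Example conversation flows, written as paths of intents and actions, that the dialog policy learns from.

Each story has a name, given by - story: [name of story], and a variable number of steps, typically alternating between user intents and assistant actions. Response actions must start with utter_; custom actions are conventionally named with an action_ prefix.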
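
A complete story might look like the following sketch. It assumes the Rasa 3.x format; the story name and the action_search_job custom action are hypothetical.

```yaml
# data/stories.yml (sketch)
version: "3.1"  # assumed format version

stories:
  - story: greet and search for a job
    steps:
      - intent: greet
      - action: utter_greet          # response action
      - intent: search_job
      - action: action_search_job    # hypothetical custom action
```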

Stories may contain logical OR statements like:

- or:
    - intent: affirm
    - intent: thanks

Stories can also contain checkpoints, which mark points in a conversation where it can branch into several different stories:

- story: conversation
  steps:
  - intent: greet
  ...
  - checkpoint: ask_feedback

- story: user provides feedback
  steps:
  - checkpoint: ask_feedback
  - action: ...
  ...

- story: user does NOT provide feedback
  steps:
  - checkpoint: ask_feedback
  - action: ...
  ...

NOTE: Use or statements and checkpoints sparingly. Overusing them makes stories hard to follow and can slow down training.

Data/NLU file

Example ways the user can express an intent.

For example, if the intent is to greet, some NLU data may be 'Hello', 'What's up?', or 'Ciao!'.
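
In the NLU file, those greeting examples could be written as in this sketch, which assumes the Rasa 3.x format.

```yaml
# data/nlu.yml (sketch)
version: "3.1"  # assumed format version

nlu:
  - intent: greet
    examples: |
      - Hello
      - What's up?
      - Ciao!
```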

Entities

Important information that the chatbot could use later in the conversation.

For example, dates, location, and name can be helpful entities to recall.

To train entities, annotate the examples in the nlu.yml file. Wrap the text to be extracted in square brackets and follow it immediately with the entity label in parentheses.

- intent: phone_number
  examples: |
    - My phone number is [1234568901](phone_number)

There are three ways entities can be extracted:

  1. Pre-built models
    • Duckling can extract numbers, dates, email, etc.
    • SpaCy can extract names, locations, etc.
  2. Using regex to match specific and strict patterns (e.g., phone numbers)
  3. Machine Learning for custom entities
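
As a sketch of the regex option, the two fragments below define a pattern in the NLU file and enable a regex-based extractor in the config file. They assume the Rasa 3.x format and that phone numbers are exactly ten digits, which is an assumption made only to match the example above.

```yaml
# data/nlu.yml (fragment): the regex name must match the entity name
nlu:
  - regex: phone_number
    examples: |
      - \d{10}

# config.yml (fragment): add the extractor to the NLU pipeline
pipeline:
  - name: RegexEntityExtractor
    use_regexes: true
```

The two fragments belong in separate files; they are shown together here only to make the link between the pattern and the extractor visible.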

You can also add synonyms to nlu.yml so that different ways of writing the same thing map to a single entity value.
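
A synonym entry could look like the sketch below. The value New York City and its variants are hypothetical, and the mapping only takes effect if EntitySynonymMapper is part of the NLU pipeline.

```yaml
# data/nlu.yml (fragment)
nlu:
  - synonym: New York City   # value stored for the entity
    examples: |
      - NYC
      - nyc
      - the big apple
```

Synonyms are applied after extraction, so the training examples still need to show the variants annotated as entities.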

Slots

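Enable the assistant to collect and remember information over the course of a conversation.

Slots can be set either from extracted entities or by custom actions (e.g., with data from an API or a database). To fill a slot from an NLU entity, declare the entity and the slot mapping in the domain file: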

entities:
  - destination

slots:
  destination:
    type: text
    influence_conversation: false
    mappings:
    - type: from_entity
      entity: destination
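
For the custom-action route, a slot can instead be declared with a custom mapping, as in this sketch. It assumes the Rasa 3.x format; the user_location slot is hypothetical, and the custom action that sets its value is not shown.

```yaml
# domain.yml (fragment)
slots:
  user_location:        # hypothetical slot filled from an API or DB lookup
    type: text
    influence_conversation: false
    mappings:
      - type: custom    # value is set by a custom action, not by NLU
```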

General tips:

  • Use or statements and checkpoints in stories only rarely
  • Use as few intents as possible
  • Generate training data from real conversations (e.g., with rasa interactive)