Research Updates from Rasa: Transformers in NLU and Dialogue

Alan Nichol
Co-Founder & CTO, Rasa

We’ll cover two recent research projects from Rasa
● Why we do research at Rasa
● DIET: new NLU architecture
● TED: new dialogue policy
● Q&A
● More resources

OUR MISSION
Empower all makers to create AI
assistants that work for everyone

To do that, we’re building the standard infrastructure for conversational AI
@alanmnichol
● Open Source
● Community
● Applied Research

RASA COMMUNITY
Our community is friendly, global, and growing fast
● 2M+ downloads*
● 8,000+ forum members
● 300+ contributors
● Rasa X: downloaded in 135 countries
*Cumulative PyPI and GitHub downloads of Rasa open source tools

Check out rasa.com/research to see some of the projects we’re working on

Conversational AI requires NLU and Dialogue management
We’ll talk about the role of transformer architectures in both of these tasks

Dual Intent and Entity Transformer (DIET)

DIET is our new neural network architecture for NLU
💡 To understand how DIET works, check
our YouTube channel
What is DIET?
● New state-of-the-art neural network architecture for NLU
● Predicts intents and entities together
● Plug-and-play support for pretrained language models

How to use DIET in your Rasa project
Here’s an example config.yml
Ahead of the DIET model in the pipeline, you can specify any featurizer.
In our experiments, we use:
● Sparse features (aka no pre-trained model)
● GloVe (word vectors)
● BERT (large language model)
● ConveRT (pre-trained encoder for
conversations)
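A minimal sketch of what such a pipeline could look like, rendered as config.yml text from Python. The helper `render_config` is hypothetical, and the component names (`WhitespaceTokenizer`, `CountVectorsFeaturizer`, `ConveRTTokenizer`, `ConveRTFeaturizer`, `DIETClassifier`) and the epoch count are assumptions based on Rasa 1.8-era naming; check the docs for your Rasa version before copying them.

```python
# Hypothetical helper: renders an NLU pipeline as config.yml text.
# Featurizer steps come first; the DIET model is appended last.
def render_config(featurizer_steps, epochs=100):
    pipeline = list(featurizer_steps) + [{"name": "DIETClassifier", "epochs": epochs}]
    lines = ["language: en", "pipeline:"]
    for step in pipeline:
        items = list(step.items())
        first_key, first_value = items[0]
        lines.append(f"  - {first_key}: {first_value}")
        for key, value in items[1:]:
            lines.append(f"    {key}: {value}")
    return "\n".join(lines)


# Sparse features only (no pretrained model); names are assumptions.
SPARSE = [
    {"name": "WhitespaceTokenizer"},
    {"name": "CountVectorsFeaturizer"},
    {"name": "CountVectorsFeaturizer", "analyzer": "char_wb",
     "min_ngram": 1, "max_ngram": 4},
]

# ConveRT features; names are assumptions.
CONVERT = [
    {"name": "ConveRTTokenizer"},
    {"name": "ConveRTFeaturizer"},
]

print(render_config(SPARSE))
```

Swapping `SPARSE` for `CONVERT` changes only the steps ahead of the DIET model, which is the plug-and-play idea described above.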

Experiments on the NLU-benchmark dataset
● Repo is on GitHub
● Domain: human-robot interaction (smart home setting)
● 64 different intents
● 54 different entity types
● ~26k labelled examples
Previous state of the art:
● HERMIT NLU (Vanzo, Bastianelli, and Lemon @ SIGdial 2019)
● uses ELMo embeddings

Result 1: DIET outperforms SotA even without any pretrained embeddings
Previous state of the art: intent: 87.55 entities: 84.74

Result 2: GloVe embeddings perform better than BERT

Result 3: ConveRT embeddings perform best on the NLU-benchmark dataset

Result 4: DIET outperforms fine-tuning BERT

Which featurizer is best depends on your dataset, so try different ones!
At Rasa, we don’t believe in “one size fits all”
machine learning
● We aim to provide sensible defaults and
suggestions
● But it is even more important that Rasa models are easy to customize
Share your results and compare notes with 8,000+ Rasa developers at forum.rasa.com

Transformer Embedding Dialogue policy (TED)


Happy paths are best described in code
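As a rough illustration of a happy path written in code, here is a minimal sketch. The restaurant-booking intents and actions, and the names `HAPPY_PATH` and `next_action`, are all hypothetical and not part of Rasa.

```python
# Hypothetical happy path: each step expects one intent and replies with one action.
HAPPY_PATH = [
    ("greet", "utter_greet"),
    ("request_booking", "utter_ask_party_size"),
    ("inform_party_size", "utter_confirm_booking"),
]


def next_action(step, intent):
    """Return the scripted action, or None once the user leaves the script."""
    if step >= len(HAPPY_PATH):
        return None
    expected_intent, action = HAPPY_PATH[step]
    if intent != expected_intent:
        return None  # the user did something the script did not anticipate
    return action
```

The script is easy to read and test, but it returns `None` as soon as the user says anything unexpected.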

But real conversations don’t follow the happy path

Users will always surprise you

And will revisit topics as they please

You can’t anticipate all the ways users will act

Can we build a model that handles this?

People typically use a recurrent neural net (RNN) to model dialogue
[Figure: an RNN unrolled over three steps, with hidden states h1, h2, h3, outputs y1, y2, y3, and weights W]
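To make the recurrence concrete, here is a minimal sketch of a plain (Elman-style) RNN in pure Python. It assumes a tanh nonlinearity and no bias terms; the names `rnn_step`, `run_rnn`, `w_h`, and `w_x` are illustrative and do not come from any Rasa policy.

```python
import math


def rnn_step(h_prev, x, w_h, w_x):
    """One recurrent step: h_t = tanh(w_h . h_prev + w_x . x)."""
    size = len(h_prev)
    return [
        math.tanh(
            sum(w_h[i][j] * h_prev[j] for j in range(size))
            + sum(w_x[i][k] * x[k] for k in range(len(x)))
        )
        for i in range(size)
    ]


def run_rnn(inputs, w_h, w_x, hidden_size):
    """Feed the inputs in order, reusing the same weights at every step."""
    h = [0.0] * hidden_size
    states = []
    for x in inputs:
        h = rnn_step(h, x, w_h, w_x)
        states.append(h)
    return states
```

Every input is folded into the single state vector `h`, so the model has no direct way to ignore an irrelevant turn.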

But not all input should be treated equally
https://ai.googleblog.com/2017/08/transformer-novel-neural-network.html

Transformers (built on self-attention) are now state of the art for many tasks
https://distill.pub/2016/augmented-rnns/
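A minimal sketch of scaled dot-product self-attention in pure Python may help show how a model can weight inputs unequally. To stay short, it assumes a single head and omits the learned query, key, and value projections that a real transformer uses; the function names are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Each position attends over all positions (single head, no projections)."""
    dim = len(vectors[0])
    outputs, weights = [], []
    for query in vectors:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        row = softmax(scores)
        weights.append(row)
        outputs.append([
            sum(row[t] * vectors[t][d] for t in range(len(vectors)))
            for d in range(dim)
        ])
    return outputs, weights
```

Each row of `weights` sums to 1 and says how much one position draws on every other position, so dissimilar inputs can be down-weighted rather than forced through one running state.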

We found that the Transformer Embedding Dialogue policy can untangle sub-dialogues (details in the paper)

TED is available in Rasa 1.3 and up
The embedding policy (TED) is
● better at handling unseen edge cases
● less likely to get confused when users behave in highly unexpected ways
● used in combination with other policies
● becoming the new default ML policy (replacing KerasPolicy)
With all contextual assistants, please write tests!
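One way to picture "used in combination with other policies" is an ensemble in which every policy proposes an action with a confidence and the most confident proposal wins. The sketch below assumes ties are broken by a per-policy priority. The function `pick_action`, the priority numbers, and the action names are illustrative assumptions, not Rasa's actual implementation.

```python
def pick_action(predictions):
    """Choose the winning prediction.

    predictions: list of (policy_name, priority, action, confidence) tuples.
    Highest confidence wins; equal confidence falls back to higher priority.
    """
    policy_name, _priority, action, _confidence = max(
        predictions, key=lambda p: (p[3], p[1])
    )
    return action, policy_name


# Illustrative values only.
seen_before = [
    ("MemoizationPolicy", 3, "utter_greet", 1.0),
    ("TEDPolicy", 1, "utter_help", 0.7),
]
print(pick_action(seen_before))
```

In this sketch, a memorizing policy wins when the conversation matches the training stories exactly, and the TED prediction takes over when it does not.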

So we now have the algorithms to handle this

But you also need training data!
● Collect conversations between users and your assistant
● Review conversations and improve your assistant based on what you learn
● Ship updates using continuous integration & deployment

Deploy your minimum viable assistant on a server and improve it using Rasa X
● Rasa Open Source is an open source framework for natural language understanding, dialogue management, and integrations.
● Rasa X is a toolset used to improve a contextual assistant built using Rasa Open Source.
[Figure: quality of assistant increasing over four stages: build minimum viable assistant; improve by talking to the assistant; improve using conversations with test users; improve using conversations with real users. Tooling labels: Rasa Open Source (Local), Rasa X (Server)]


How can the transitions be effectively tested in a large dialogue tree, to ensure that the policy works as expected?

Will Rasa provide a way to select the best policy based on my
use case and training data?

Does Rasa support multi-label classification for intents and
entities?

Is there a way to do cross-domain transfer learning using Rasa? (For instance, adapting a healthcare assistant trained on healthcare terminology into an IT help desk assistant)

To get started, watch the Rasa Masterclass on YouTube

Further Reading
● Unpacking the TED Policy in Rasa Open Source (Rasa Blog)
● Introducing DIET: state-of-the-art architecture that outperforms fine-tuning BERT and is 6X faster to train (Rasa Blog)
● Rasa Algorithm Whiteboard - Diet Architecture 1: How it Works (YouTube)
● Rasa Algorithm Whiteboard - Diet Architecture 2: Design Decisions (YouTube)

Alan Nichol
Co-founder & CTO
alan@rasa.com
@alanmnichol
