Managing Dialog Strategy in Multiskill AI Assistant with Discourse Management
1. DIALOG STRATEGY MANAGEMENT IN MULTISKILL AI ASSISTANTS WITH DISCOURSE MANAGEMENT
Daniel Kornev, Chief Product Officer, DeepPavlov.ai
3. Dialog Management is complex
Step 1: Correctly identify the goal of the user’s utterance
Step 2: Correctly generate response(s)
Step 3: Correctly pick the best response
Alice: what’s happened in city hall?
Bot: [Which city hall [Entity Disambiguation]? Where (NYC | Local)? When (Today | Yesterday | at some time)?]
Domain:News: Occupy City Hall happened in July 2020 | Confidence: 0.85
Domain:Factoid: City Hall is the seat of New York City government | Confidence: 0.95
Retrieval: By 1969 City Hall was described as badly crowded because of Bellevue growing … | Confidence: 0.75
Alice: What’s happened in city hall?
Bot: City Hall is the seat of New York City government
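Step 3 in the example above can be sketched as picking the candidate with the highest confidence. This is only a minimal illustration: the `Candidate` class and the `pick_best` function are hypothetical names, and a real response selector may weigh more than a single score.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    skill: str         # which skill produced the response
    text: str          # the response text
    confidence: float  # the skill's own confidence, 0..1


def pick_best(candidates):
    """Return the candidate with the highest confidence (hypothetical rule)."""
    if not candidates:
        raise ValueError("no candidate responses")
    return max(candidates, key=lambda c: c.confidence)


# Candidates copied from the slide; the retrieval text is truncated there too.
candidates = [
    Candidate("Domain:News", "Occupy City Hall happened in July 2020", 0.85),
    Candidate("Domain:Factoid",
              "City Hall is the seat of New York City government", 0.95),
    Candidate("Retrieval",
              "By 1969 City Hall was described as badly crowded "
              "because of Bellevue growing …", 0.75),
]

best = pick_best(candidates)
print(f"Bot: {best.text}")
```

With these three candidates the factoid answer wins, which matches the Bot reply shown on the slide.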
4. DeepPavlov.ai
Dialog Manager was predominantly tactical (single-turn, reactive):
[Diagram: a conversation of alternating User and Bot turns (“Let’s chat”, then a greeting and a question, then further utterances), with the Bot turns labeled Skill #1, Skill #2, Skill #3, and Skill #4]
Pros: Retrieval skills helped us cover many societal topics (the goal of the challenge), so we always had something to say
Cons: skills couldn’t drive a conversation across several conversation turns
5. DeepPavlov.ai
Dialog Manager got an influx of scenario-driven skills:
[Diagram: a conversation of alternating User and Bot turns (“Let’s chat”, then a greeting and a question, then further utterances), with the turns grouped under Small Talk and Movies Skill]
Pros: Scenario-driven skills let us give users several conversation turns within the same context, plus limited understanding of the user inside human-curated scenarios
Cons: we didn’t control switching between scenarios
6. DeepPavlov.ai
link_to added rudimentary support of topic-switching to Dialog Manager (randomly proactive):
[Diagram: a conversation of alternating User and Bot turns (“Let’s chat”, then a greeting and a question, then further utterances), with the turns grouped under Small Talk and Movies Skill, joined by a LINK_TO transition]
Pros: Users got several islands of meaning within the conversation. Fine-tuned management of dialog between scenario-, generative-, and retrieval-based skills helped a lot.
Cons: we didn’t cover enough topics through scenarios
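A minimal sketch of a “randomly proactive” link_to, assuming a hand-written transition wins when one exists and the next skill is otherwise chosen at random. The skill names, the `MANUAL_TRANSITIONS` table, and the `link_to` signature are all hypothetical and are not taken from the DeepPavlov code.

```python
import random

# Hypothetical skill inventory and hand-written transitions.
ALL_SKILLS = ["small_talk", "movies", "books", "food"]
MANUAL_TRANSITIONS = {"small_talk": "movies"}


def link_to(current_skill, used_skills, rng=random):
    """Pick the next skill to link to, or None if nothing is left.

    A manual transition takes priority; otherwise the choice is random
    among skills that have not been used yet in this conversation.
    """
    manual = MANUAL_TRANSITIONS.get(current_skill)
    if manual is not None and manual not in used_skills:
        return manual
    options = [s for s in ALL_SKILLS
               if s != current_skill and s not in used_skills]
    return rng.choice(options) if options else None


print(link_to("small_talk", set()))            # manual transition
print(link_to("movies", {"small_talk"}))       # random: books or food
```

Passing a seeded `random.Random` instance as `rng` makes the random branch reproducible in tests.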
7. DeepPavlov.ai
Thanks to DFF, Dialog Manager got even more scenario-driven skills:
[Diagram: a longer conversation of alternating User and Bot turns (“Let’s chat”, then a greeting and a question, then further utterances), with the turns grouped under Greeting, Movies Skill, DFF X Skill, and DFF Y Skill, joined by LINK_TO transitions]
Pros: Users got even more islands of meaning within the conversation.
Cons: users demand more breadth (more topics), more depth, opinions, understanding!
8. DeepPavlov.ai
Thanks to wiki parser, Dialog Manager’s scenario-driven skills increased depth within their domains:
[Diagram: a longer conversation of alternating User and Bot turns (“Let’s chat”, then a greeting and a question, then further utterances), with the turns grouped under Greeting, Movies Skill, DFF X Skill, and DFF Y Skill, joined by LINK_TO transitions and supported by Wiki Parser]
Pros: Users got even more islands of meaning within the conversation.
Cons: users demand more breadth (more topics), more depth, opinions, understanding!
9. DeepPavlov.ai
Thanks to dialog acts, Dialog Manager can better react to user’s needs (reactive):
[Diagram: a longer conversation of alternating User and Bot turns (“Let’s chat”, then a greeting and a question, then further utterances), with the turns grouped under Greeting, Movies Skill, DFF X Skill, and DFF Y Skill, joined by LINK_TO transitions and supported by Wiki Parser]
Pros: we can handle user dialog acts with the corresponding responses at a higher level
But: dialog acts are incomplete, isolated, and single-turn
10. DeepPavlov.ai
Thanks to DFF wiki skill, Dialog Manager can increase coverage beyond the primary topics:
[Diagram: a longer conversation of alternating User and Bot turns (“Let’s chat”, then a greeting and a question, then further utterances), with the turns grouped under Greeting, Movies Skill, DFF X Skill, and DFF Y Skill, joined by LINK_TO transitions and supported by Wiki Parser and DFF Wiki Skill]
Pros: we can address many more topics for which we don’t have human-curated scenarios
But: lack of understanding of user questions, no opinion, no backstory
11. DeepPavlov.ai
▪ Aimless
• Bot isn’t aware of its own goals (dialog length, user’s mood, understanding and addressing user’s goals), and doesn’t take them into account
▪ Mostly Tactical
• Dialog Management is mostly single-turn-based (though we give priority to multi-turn scenario-driven skills)
▪ Mostly Reactive
• Response to Dialog Acts is reactive
• Topic Switching is reactive
• Link_to is mostly random (unless we have manual transitions)
▪ Mostly Selfless
• Little to no opinion is expressed by our bot in conversations with users
▪ Mostly Careless
• Bot mostly doesn’t relate to the user’s mood or discuss user’s emotions
▪ Goal-Aware
• Bot should be aware of its goals and actively pursue them (dialog length, user’s mood, understanding and addressing user’s goals)
▪ Strategic
• Dialog Management should focus on reaching the Bot’s goals and foresee every possible user step; each of its actions should support the Bot’s strategy
▪ Proactive
• Bot should know which Speech Functions are appropriate responses to the user’s Speech Function, and then pick the one that best supports its strategy
• Topic Switching should be used by the Bot strategically
• Link_to should use relationships between entities within topics and between topics, and should be used by the Bot strategically
▪ Be Opinionated
• Bot should be able to express its opinion, explain it, and stay coherent (not contradict itself except in minor things)
▪ Be Caring
• Bot should relate to the user’s mood and be able to discuss user’s emotions
12. DeepPavlov.ai
In a multi-turn conversation, the bot should plan strategically, across turns
Single-turn management is our tactics! To become strategic, we need a higher-level abstraction for acting across turns.
13. DeepPavlov.ai
Eggins and Martin (1997)
Discourse structure patterns operate across turns: thus overtly interactional & sequential
Discourse Management is a basis for acting across turns, thus becoming strategic
14. Eggins and Slade (1997)
Speech Functions control Discourse:
Speech Acts: Give information, Demand information
Discourse Moves
Speech Function Example:
open:initiate:give_opinion
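A minimal sketch of how a colon-separated Speech Function label like the one above could be split into hierarchy levels and compared against a prefix. The function names `parse_speech_function` and `is_under` are hypothetical, and the label format is assumed from the single example on the slide.

```python
def parse_speech_function(label):
    """Split a label such as 'open:initiate:give_opinion' into its levels."""
    parts = tuple(part.strip().lower() for part in label.split(":"))
    if any(not part for part in parts):
        raise ValueError(f"malformed speech function label: {label!r}")
    return parts


def is_under(label, prefix):
    """True if `label` sits at or below `prefix` in the hierarchy."""
    path = parse_speech_function(label)
    head = parse_speech_function(prefix)
    return path[:len(head)] == head


levels = parse_speech_function("open:initiate:give_opinion")
print(levels)
print(is_under("open:initiate:give_opinion", "open:initiate"))
print(is_under("open:initiate:give_opinion", "sustain"))
```

Comparing by prefix lets a dialog manager react to a whole family of moves (for example, everything under `open:initiate`) without listing each leaf.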
15. Eggins and Slade (1997)
Speech Functions have hierarchy based on the role in Discourse:
move
  open
    attend
    initiate
      give
        fact
        opinion
      demand
        open
          fact
          opinion
        closed
          fact
          opinion
  sustain
    continue
      monitor
      prolong
        elaborate
        extend
        enhance
      append
        elaborate
        extend
        enhance
    react
      respond
        support
          develop
            elaborate
            extend
            enhance
          engage
          register
          reply
            accept
            comply
            agree
            answer
            acknowledge
            affirm
        confront
          disengage
          reply
            decline
            non-comply
            disagree
            withhold
            disavow
            contradict
      rejoinder
        support
          track
            check
            confirm
            clarify
            probe
          response
            resolve
            repair
            acquiesce
        confront
          challenge
            detach
            rebound
            counter
          response
            unresolve
            refute
            re-challenge
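One way to hold such a hierarchy in code is a nested dictionary whose leaves are empty dictionaries. The sketch below encodes only the "open" branch and lists every leaf as a colon-separated path; the names `OPEN_BRANCH` and `leaf_paths` are hypothetical, and the sketch uses a colon at every level, whereas the example label on the earlier slide joins the last two levels with an underscore.

```python
# Only the "open" branch of the hierarchy, as an illustration.
OPEN_BRANCH = {
    "open": {
        "attend": {},
        "initiate": {
            "give": {"fact": {}, "opinion": {}},
            "demand": {
                "open": {"fact": {}, "opinion": {}},
                "closed": {"fact": {}, "opinion": {}},
            },
        },
    }
}


def leaf_paths(tree, prefix=()):
    """Yield the path to every leaf as a tuple of level names."""
    for name, children in tree.items():
        path = prefix + (name,)
        if children:
            yield from leaf_paths(children, path)
        else:
            yield path


paths = [":".join(p) for p in leaf_paths(OPEN_BRANCH)]
for p in paths:
    print(p)
```

The leaf paths are the labels a classifier would have to distinguish, so pruning a subtree from the dictionary directly shrinks the label set.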
16. Eggins and Slade (1997)
Removing the Speech Functions (SFs) that we don’t need to classify in users’ utterances in Alexa Prize
[Diagram: the Speech Function hierarchy from slide 15, repeated; the marking of the removed Speech Functions did not survive extraction]
18. DeepPavlov.ai
Topics != Discourses. Discourses can span across multiple scenarios (e.g., food skill + dff_wiki_skill):
[Diagram: a longer conversation of alternating User and Bot turns (“Let’s chat”, then a greeting and a question, then further utterances), with the turns grouped under Greeting, Movies Skill, DFF X Skill, and DFF Y Skill, joined by LINK_TO transitions, and the discourses marked D #1 (Small Talk), D #2 (Movies), D #3 (Subject A), D #4 (Subject B)]
Pros: Speech functions automatically track the existing discourse. When the Bot or the user switches the subject of conversation, a new discourse has started.
Cons: how do we detect subject switching? What is a discourse to us?
19. DeepPavlov.ai
Example
A Discourse is a combination of a key entity (subject), related entities (with the user’s and the bot’s relation to them), and topic(s)
Discourse #1
• Topics: Entertainment_Movies, Actors
• Key Entity (Subject): Science Fiction Movies
• Related Entities:
movies: Aliens, Terminator
actors: Sigourney Weaver, Arnold Schwarzenegger
Pros: We don’t limit ourselves to one topic (out of ~10, as in CoBot DialogAct Topics) and keep flexibility within each topic, because one topic can have myriads of entities to discuss. When what is being discussed moves too far from the Discourse, our or the user’s move is a change to a new Discourse.
But: why should bot propose a change of a Discourse?
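The Discourse definition above can be sketched as a small data structure. The membership rule in `belongs_to` (at least one shared topic or entity) is a hypothetical stand-in for “too far from Discourse”, which the slide does not define, and the user’s and bot’s relations to the entities are left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Discourse:
    topics: set
    key_entity: str
    # kind of entity -> set of entity names
    related_entities: dict = field(default_factory=dict)

    def known_entities(self):
        """Key entity plus all related entities, as one set."""
        names = {self.key_entity}
        for group in self.related_entities.values():
            names |= set(group)
        return names


def belongs_to(discourse, utterance_topics, utterance_entities):
    """Hypothetical rule: stay in the discourse if a topic or entity overlaps."""
    shares_topic = bool(discourse.topics & set(utterance_topics))
    shares_entity = bool(discourse.known_entities() & set(utterance_entities))
    return shares_topic or shares_entity


sci_fi = Discourse(
    topics={"Entertainment_Movies", "Actors"},
    key_entity="Science Fiction Movies",
    related_entities={
        "movies": {"Aliens", "Terminator"},
        "actors": {"Sigourney Weaver", "Arnold Schwarzenegger"},
    },
)

print(belongs_to(sci_fi, ["Entertainment_Movies"], ["Predator"]))
print(belongs_to(sci_fi, ["Food"], ["Pizza"]))
```

When `belongs_to` returns False, the move would count as a change to a new Discourse under this simplified rule.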
20. DeepPavlov.ai
Dialog Manager should act based on these 3 levels, where each higher level influences the lower level:
Level 1. Dialog: Bot Goals
• Understand User Interests & Conversation Goal(s)
• Address User Goal(s)
• Prolong Conversation
• Keep or improve user’s mood
• Address Bot Interests
Level 2. Discourse: Discourse Management
• Maintain existing or change Discourse
Level 3. Conversation Turn: Speech Function Management
• Pick the most appropriate Speech Function within chosen Discourse
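A minimal sketch of how the three levels could feed into each other: the goal constrains the discourse action, which constrains the speech function. Every rule, signal, and label in it is a hypothetical placeholder (including the `user_engaged` flag and the response table); it only illustrates the top-down flow, not an actual policy.

```python
from dataclasses import dataclass


@dataclass
class TurnContext:
    user_speech_function: str  # e.g. the output of a speech function classifier
    user_engaged: bool         # hypothetical engagement signal


# Hypothetical table: user speech function -> appropriate bot responses.
RESPONSE_TABLE = {
    "open:initiate:give_opinion": [
        "sustain:react:respond:support:reply:agree",
        "sustain:react:rejoinder:support:track:probe",
    ],
}
DEFAULT_RESPONSE = "sustain:react:respond:support:register"
NEW_DISCOURSE_OPENER = "open:initiate:demand:open:opinion"


def choose_goal(ctx):
    """Level 1 (Dialog): pick the bot goal to pursue."""
    return "prolong_conversation" if ctx.user_engaged else "improve_user_mood"


def choose_discourse_action(goal):
    """Level 2 (Discourse): maintain or change the discourse."""
    return "maintain" if goal == "prolong_conversation" else "change"


def choose_speech_function(user_sf, discourse_action):
    """Level 3 (Turn): pick a speech function within the chosen discourse."""
    if discourse_action == "change":
        return NEW_DISCOURSE_OPENER
    return RESPONSE_TABLE.get(user_sf, [DEFAULT_RESPONSE])[0]


def plan_turn(ctx):
    goal = choose_goal(ctx)
    action = choose_discourse_action(goal)
    return goal, action, choose_speech_function(ctx.user_speech_function, action)


print(plan_turn(TurnContext("open:initiate:give_opinion", True)))
print(plan_turn(TurnContext("open:initiate:give_opinion", False)))
```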
21. DeepPavlov.ai
How to express programmatically:
• Bot’s Own Interests: e.g., Bot wants to understand what love is, what telepathy is, why people want to go to Mars, etc.
• User’s Interests: through topic modeling
• User’s Conversation Goals: tell something (as to a preacher), get advice, get into a better mood, small talk, trivia, etc.
• User’s Mood: TBD (see affective computing)
• Success of Conversation: # of conversation turns & discourses, # of personal details shared, user mood changes, rating (where applicable)
22. DeepPavlov.ai
Discourse Management
- Speech Function Classifier
- Speech Function Predictor
- DFF Generic Responses Skill
etc.
- Personality Detector
- etc.
Existing Skills
- Basic Scenario-Driven Skills
- DFF Scenario-Driven Skills
- Generative Skills
- Retrieval Skills
24. DeepPavlov.ai
[Diagram: technology stack with the layers Multiskill orchestration, Conversational skills, NLP frameworks, and ML platforms, labeled Proprietary and Open Source]
▪ Multiskill Orchestration
• DeepPavlov Agent is an engine for conversational skill deployment and orchestration
▪ Conversational skills
• DeepPavlov Dream is a collection of pre-built conversational skills and a default distribution package for Dream AI Assistant
▪ NLP frameworks
• DeepPavlov Library provides pretrained models and a simple declarative approach to building NLP processing pipelines
▪ ML platforms
• TensorFlow and PyTorch as backends
25. DeepPavlov.ai
Web Demo: demo.deeppavlov.ai, select [Deepy]
TG Bot: @deeppavlov_dream_ai_bot
Build Your AI Assistant: github.com/deepmipt/deepy (Clone and build your own!)
Read us: medium.com/deeppavlov
Talk to us: forum.deeppavlov.ai
TG: @DeepPavlovDreamDiscussions
Twitter/TG: @DeepPavlov
27. DeepPavlov.ai
Example
Interest Modeling is a combination of a topic (with the user’s level of interest in it, 0…1), a list of “instance of” entities in that topic, generic activities for each of these “instance of” entities, and a level of interest (0…1) in each of the activities
Interest #1
• Topic: Sports
Instance Of   Activity     Level of Interest
Football      Playing      0.8
              Watching     0.9
              Discussing   0.9
Tennis        Playing      0.1
              Watching     0.8
              Discussing   0.7
Modeling user interests by conceptual clustering
Daniela Godoy, 2006
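The interest model in the example can be sketched as a nested mapping from “instance of” entity to activity to level of interest. The `Interest` class and its helper methods are hypothetical names; `topic_level` stays `None` because the slide gives no topic-level value for Sports.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Interest:
    topic: str
    # "instance of" entity -> activity -> level of interest in [0, 1]
    activities: dict = field(default_factory=dict)
    topic_level: Optional[float] = None  # not given in the example

    def level(self, instance, activity):
        """Level of interest for one entity/activity pair, 0.0 if unknown."""
        return self.activities.get(instance, {}).get(activity, 0.0)

    def favourite(self, activity):
        """Entity with the highest level of interest for one activity."""
        scored = {inst: acts[activity]
                  for inst, acts in self.activities.items()
                  if activity in acts}
        return max(scored, key=scored.get) if scored else None


sports = Interest("Sports", {
    "Football": {"Playing": 0.8, "Watching": 0.9, "Discussing": 0.9},
    "Tennis": {"Playing": 0.1, "Watching": 0.8, "Discussing": 0.7},
})

print(sports.favourite("Playing"))
print(sports.level("Tennis", "Watching"))
```

A dialog manager could use `favourite` to decide, for example, which sport to ask about when the conversation turns to playing rather than watching.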
28. DeepPavlov.ai
Example
Bot’s Current Task is the dialog fragment, spanning one or more conversation turns, devoted to the goal the bot is currently pursuing. It corresponds to our Active Skill and also includes the current discourse as context.
Task #1
• Goal: Prolong Conversation
• Skill: Small Talk
• Discourse #1
Task #2
• Goal: Identify User’s Interests
• Skill: Friendship
• Discourse #2
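The two tasks above can be written down as simple records. This is only an illustrative sketch: the `Task` class, the `discourse_id` field, and the `active_skill` lookup are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    goal: str          # the bot goal this dialog fragment pursues
    skill: str         # the Active Skill handling the fragment
    discourse_id: int  # the discourse that serves as context


tasks = [
    Task(goal="Prolong Conversation", skill="Small Talk", discourse_id=1),
    Task(goal="Identify User's Interests", skill="Friendship", discourse_id=2),
]


def active_skill(tasks, goal):
    """Return the skill of the first task pursuing `goal`, or None."""
    for task in tasks:
        if task.goal == goal:
            return task.skill
    return None


print(active_skill(tasks, "Prolong Conversation"))
```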