Managing Dialog Strategy In Multiskill AI Assistant.pdf

DIALOG STRATEGY MANAGEMENT IN
MULTISKILL AI ASSISTANTS WITH
DEEPPAVLOV
Daniel Kornev, CPO & Dilyara Baymurzina, Researcher @ DeepPavlov.ai


Skills for consumer
AI Assistants

?
Simple chatbots
Skills for consumer
AI Assistants

DeepPavlov.ai
Pre-Purchase
Post-Purchase
Surveys
Promotions
Campaigns
Customer Service
Technical Support
Product Usage
Billing & Payment
Account Management
Logistics
▪ Customer experience spans
multiple domains
• Surveys
• Promotions
• Campaigns
• Customer Service
• Technical Support
• …
▪ Every domain requires a specific
skill

Goal-oriented Bot
+
Chit-Chat Mode
+
Response Selector
Simple chatbots
Skills for consumer
AI Assistants
Vendor X

Simple chatbots
Skills for consumer
AI Assistants
Multiskill AI Assistants:
DeepPavlov

Grand challenge: create a socialbot that can engage in a fun, high-quality
conversation on popular societal topics for 20 minutes and
achieve an average rating of at least 4.0/5.0.
Alexa Prize 3 Winners:
1. Emora: $500K, 3.8/5.0, 7 min 32 sec, Emory University
2. Chirpy Cardinal: $100K, 3.17/5.0, Stanford University
3. Alquist: $50K (2nd in ’17, 3rd in ’19, ’20), Czech Technical University

ngc.nvidia.com/catalog/containers/partners:deeppavlov

• Open repository of NLP models and pipelines
  • Easy to find and reuse NLP components for developing new skills or extending existing ones
• Open repository of conversational skills
  • Alternative implementations of the most popular skills
• Open hub for AI Assistant distributions
  • General and domain/industry-specific distributions of skill sets

MEET GERTY 3000
Functionality Analysis
• Can help Sam with problems on the Moon Base Sarang? Yes ✔
• Can entertain Sam? Yes ✔
Main Question
• Can we emulate it with DeepPavlov DREAM? Yes ✔
© Copyright Sony Pictures Classics

[Figure: Deepy pipeline diagram. Components shown: Text OR Voice Input, NeMo ASR, Spell Checking, Emotion, BUILT-IN SKILL SELECTOR, Harvesters Status, Chit-Chat (AIML), RULE-BASED RESPONSE SELECTOR, TTS (NeMo).]

services:
  agent:
    [..]
    depends_on:
      - mongo
  harvesters_maintenance_skill:
    [..]
  mongo:
    [..]
  rule_based_response_selector:
    [..]
  nemo:
    [..]
    depends_on:
      - agent
  emotion_classification:
    [..]
  program_y:
    [..]
  spell_checking:
    [..]
Annotators
• Spell Checking
• Emotion Classification
Skills
• Harvesters Status Skill
• Chit-Chat Skill
Other Services
• NeMo ASR & TTS
• Rule-Based Response Selector
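
The depends_on keys above imply a start order for the containers. As a minimal sketch (not taken from the Deepy repository), Python's standard graphlib module can derive one valid order from the two dependencies shown; the services whose bodies are elided with [..] are assumed here to have no dependencies.

```python
from graphlib import TopologicalSorter

# Each service maps to the services it depends on (from the depends_on keys).
# Services whose configuration is elided ([..]) are ASSUMED to have no dependencies.
DEPENDS_ON = {
    "agent": {"mongo"},
    "nemo": {"agent"},
    "mongo": set(),
    "harvesters_maintenance_skill": set(),
    "rule_based_response_selector": set(),
    "emotion_classification": set(),
    "program_y": set(),
    "spell_checking": set(),
}


def start_order(depends_on):
    """Return one valid start order in which each service follows its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


order = start_order(DEPENDS_ON)
print(order)
```

Whatever order is printed, mongo precedes agent and agent precedes nemo; the position of the remaining services is arbitrary.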

services:
  agent:
    [..]
    depends_on:
      - mongo
  harvesters_maintenance_skill:
    [..]
  mongo:
    [..]
  rule_based_response_selector:
    [..]
  nemo:
    [..]
    depends_on:
      - agent
  emotion_classification:
    [..]
  program_y:
    [..]
  clone_tts:
    [..]
Services Groups
• Annotators
• Skills
• Response Annotators
• Response Selectors
Services Are Isolated
• Depend on groups (e.g., “skills”)
• Limited in what they see in dialog
• Invoke Agent’s State Manager
• Can run via HTTP or be Python-based
  "skills": {
    "harvesters_maintenance_skill": {
      "connector": {
        "protocol": "http",
        "url": "http://harvesters_maintenance_skill:3002/respond"
      },
      "dialog_formatter": "dp_formatters:full_dialog",
      "response_formatter": "dp_formatters:base_skill_formatter",
      "state_manager_method": "add_hypothesis",
      "previous_services": ["annotators"]
    },
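
To make the fields of a service entry concrete, here is a small sketch that parses the fragment above and summarizes how the agent would reach the skill. The helper describe_service is hypothetical (it is not part of DeepPavlov Agent), and the fragment is wrapped in braces only so that it parses as a standalone JSON object.

```python
import json

# The fragment from the slide, wrapped in braces so it is valid JSON on its own.
PIPELINE_FRAGMENT = """
{
  "skills": {
    "harvesters_maintenance_skill": {
      "connector": {
        "protocol": "http",
        "url": "http://harvesters_maintenance_skill:3002/respond"
      },
      "dialog_formatter": "dp_formatters:full_dialog",
      "response_formatter": "dp_formatters:base_skill_formatter",
      "state_manager_method": "add_hypothesis",
      "previous_services": ["annotators"]
    }
  }
}
"""


def describe_service(config, group, name):
    """Hypothetical helper: summarize one service entry of the pipeline config."""
    entry = config[group][name]
    return {
        "url": entry["connector"]["url"],
        "runs_after": entry["previous_services"],
        "state_update": entry["state_manager_method"],
    }


summary = describe_service(
    json.loads(PIPELINE_FRAGMENT), "skills", "harvesters_maintenance_skill"
)
print(summary)
```

Read this way, the entry says: call the skill over HTTP once the annotators group has finished, and store its output in the dialog state as a hypothesis.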

RASA Configs
Simple and easy to use
Intents
• What is (all) harvesters’ status?
• What is harvester status?
• Prepare rover for a trip
domain.yml – basic ontology for skill
  intents:
    - all_statuses_request
    - status_request
    [..]
    - trip_request
  responses:
    utter_status_request:
      - text: "The harvester {harv_id} is {harv_status}."
    [..]
nlu.md – training for intents & slots
  ## intent:all_statuses_request
  - What is the harvesters status?
  - What is the combines status?
  [..]
stories.md – training for dialogs
  ## harv_status + prepare_trip
  * status_request
    - utter_status_request
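
The utter_status_request template contains two slots, harv_id and harv_status. The sketch below only illustrates the substitution with plain Python string formatting; in practice the dialog framework fills the slots, and the slot values used here are made up for the example.

```python
# Response template copied from the domain.yml fragment on the slide.
TEMPLATE = "The harvester {harv_id} is {harv_status}."


def render_status(harv_id, harv_status):
    """Fill the utter_status_request template with slot values."""
    return TEMPLATE.format(harv_id=harv_id, harv_status=harv_status)


# Hypothetical slot values, for illustration only.
message = render_status("3", "idle")
print(message)  # The harvester 3 is idle.
```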

GoBotWrapper
• Works with GoBot
• Obtains data from DB
• Generates NLG
Tutorial in Google Colab
• domain.yml – basic ontology for skill
• nlu.md – training for intents & slots
• stories.md – training for dialogs
Full sample: use it to train your GoBot and save its output to your Skill
  [..]
  @app.route("/respond", methods=["POST"])
  def respond():
      [..]
      dialogs = request.json["dialogs"]
      for dialog in dialogs:
          # Use the spell-checked text of the latest human utterance.
          sentence = dialog["human_utterances"][-1]["annotations"].get("spelling_preprocessing")
          [..]
          uttr_resp, conf = gobot(sentence)
          response = gobot.getNlg(uttr_resp)
          responses.append(response)
          confidences.append(conf)
      return jsonify(list(zip(responses, confidences)))
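
The handler above reads a nested request body. The following sketch isolates that lookup so the expected shape is visible without running Flask. The function name and the example payload are assumptions; only the keys dialogs, human_utterances, annotations and spelling_preprocessing come from the slide, and the text field is included purely for illustration.

```python
def last_spellchecked_utterances(payload):
    """Return, per dialog, the spell-checked text of the latest human utterance."""
    sentences = []
    for dialog in payload["dialogs"]:
        last = dialog["human_utterances"][-1]
        # .get() yields None when the spell-checking annotation is absent.
        sentences.append(last["annotations"].get("spelling_preprocessing"))
    return sentences


# Hypothetical request body with a single dialog.
example_payload = {
    "dialogs": [
        {
            "human_utterances": [
                {
                    "text": "what is harvestr status",
                    "annotations": {
                        "spelling_preprocessing": "what is harvester status"
                    },
                }
            ]
        }
    ]
}

sentences = last_spellchecked_utterances(example_payload)
print(sentences)  # ['what is harvester status']
```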

<?xml version="1.0" encoding="UTF-8"?>
<aiml version="2.0">
  <category>
    <pattern>I AM ^ TIRED</pattern>
    <template>
      🙁
      <random>
        <li>Get some sleep, <get name="name"/>. You're very tired.</li>
        <li>Have a rest and be happy! How can I help you?</li>
      </random>
    </template>
  </category>
  [..]
</aiml>
Patterns
• Assistant Profile (Name, Place, etc.)
• Greeting scenario
• Topics
Dialog Processing
• Looks up patterns
• Picks a random pre-defined response
• If not sure, confidence is low (0.2)
• Returns response + confidence
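
The processing steps above (look up a pattern, pick a random reply, fall back to low confidence) can be approximated in a few lines. This is a regex-based sketch, not how an AIML interpreter matches internally. It assumes the AIML 2.0 wildcard meanings (^ for zero or more words, * for one or more), a confidence of 1.0 on a match and a made-up fallback reply; only the 0.2 fallback confidence comes from the slide, and the first reply is simplified to omit the name lookup.

```python
import random
import re

FALLBACK_CONFIDENCE = 0.2  # "not sure" value from the slide
MATCH_CONFIDENCE = 1.0     # ASSUMED value for a pattern hit


def compile_pattern(pattern):
    """Translate an AIML-style pattern into a regex over whitespace-separated words."""
    parts = []
    for token in pattern.split():
        if token == "^":
            parts.append(r"(?:\s+\S+)*")   # zero or more words
        elif token == "*":
            parts.append(r"(?:\s+\S+)+")   # one or more words
        else:
            parts.append(r"\s+" + re.escape(token))
    return re.compile("^" + "".join(parts) + r"\s*$", re.IGNORECASE)


def respond(text, categories, fallback="Sorry, I did not get that."):
    """Return (reply, confidence) for the first category whose pattern matches."""
    # Drop punctuation and add a leading space so every word is preceded by whitespace.
    normalized = " " + re.sub(r"[^\w\s]", "", text).strip()
    for pattern, replies in categories:
        if compile_pattern(pattern).match(normalized):
            return random.choice(replies), MATCH_CONFIDENCE
    return fallback, FALLBACK_CONFIDENCE


CATEGORIES = [
    ("I AM ^ TIRED", [
        "Get some sleep. You're very tired.",
        "Have a rest and be happy! How can I help you?",
    ]),
]

reply, confidence = respond("I am so very tired!", CATEGORIES)
print(reply, confidence)
```

Returning the confidence alongside the reply lets a downstream response selector weigh this skill against the others.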

DEMO TIME
Deepy as a Multiskill AI Assistant
> docker-compose up --build
> curl --location --request POST 'localhost:4242' --header 'Content-Type: application/json' --data-raw '{"user_id": "name", "payload": "what do I do here?"}'
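
For readers who prefer Python to curl, the same request can be built with the standard library. This sketch only constructs the request; sending it requires the containers to be running. The http:// scheme is an assumption, since the slide gives only localhost:4242, and build_request is a name chosen for this example.

```python
import json
import urllib.request


def build_request(user_id, text, url="http://localhost:4242"):
    """Build (but do not send) the POST request issued by the curl command."""
    body = json.dumps({"user_id": user_id, "payload": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request("name", "what do I do here?")
print(request.get_method(), request.full_url)

# To send it once the assistant is up:
# with urllib.request.urlopen(request) as reply:
#     print(reply.read().decode("utf-8"))
```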

Dream AI Assistant Demo
• Own scenarios + Wikidata & ODQA
Amazon Alexa
• Own scenarios (w/ 3rd Party integrations) + Evi + 3rd Party Skills
Yandex Alice
• Own scenarios (w/ 3rd Party integrations) + Yandex Search + 3rd Party Skills

Dialog Management is complex
Step 1: Correctly identify the goal of the user’s utterance
Step 2: Correctly generate response(s)
Step 3: Correctly pick the best response
Alice: What’s happened in city hall?
Bot: [Which city hall [Entity Disambiguation]? Where (NYC | Local)? When (Today | Yesterday | at some time)?]
Domain:News: Occupy City Hall happened in July 2020 | Confidence: 0.85
Domain:Factoid: City Hall is the seat of New York City government | Confidence: 0.95
Retrieval: By 1969 City Hall was described as badly crowded because of Bellevue growing … | Confidence: 0.75
Alice: What’s happened in city hall?
Bot: City Hall is the seat of New York City government
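
In its simplest form, Step 3 reduces to taking the hypothesis with the highest confidence, which is what the example above does. The sketch below reproduces that choice with the three hypotheses from the slide; the Hypothesis class and pick_best function are illustrative names, not part of DeepPavlov.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    skill: str
    text: str
    confidence: float


def pick_best(hypotheses):
    """Return the hypothesis with the highest confidence."""
    return max(hypotheses, key=lambda h: h.confidence)


candidates = [
    Hypothesis("Domain:News", "Occupy City Hall happened in July 2020", 0.85),
    Hypothesis("Domain:Factoid", "City Hall is the seat of New York City government", 0.95),
    Hypothesis("Retrieval", "By 1969 City Hall was described as badly crowded …", 0.75),
]

best = pick_best(candidates)
print(best.text)  # City Hall is the seat of New York City government
```

The factoid answer wins on confidence alone even though the user may have been asking about the news, which is why confidence-only selection is a weak basis for picking a response.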

At Skill Selector Level you may have more than one skill to choose from
Inside each Skill you may have more than one context to choose from:
• Story/Scenario (in goal-oriented systems)
• Scenario/Dialog Tree (in chat-based systems)
At Response Selector Level you may have more than one hypothesis to choose from
Challenge: How do we make sure that the response we’ve chosen best addresses the user’s goal?

Skill Selector
● Based on the annotated human utterance and the dialog state (in particular, topics and dialog acts), picks a list of skills that will try to generate response hypotheses
Skills
● Different types of skills: template-based, AIML, retrieval skills
● Each selected skill may generate zero, one, or several hypotheses
Response Selector
● Filters out inappropriate hypotheses based on toxicity and blacklisted-word annotations
● Picks the best hypothesis based on the dialog state, the annotated hypotheses, their confidences and evaluation scores
* DREAM technical report for the Alexa Prize 2019, Yuri Kuratov, et al. http://bit.ly/DP_Dream_TR_2020

Filtration
● Filters out inappropriate hypotheses based on toxicity and blacklisted-word annotations
Evaluation
● Calculates a final single-value score for each hypothesis from its confidence and conversation evaluation scores
Hand-written Heuristics
● Give priority to special cases, such as high-priority intents, by significantly increasing the final score
Penalties
● Decrease final scores for repetitions
Prompts
● Add a link-to question to the final response when it is a short reply with no request and comes from particular skills
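
The stages above can be read as a small scoring pipeline. The sketch below is one possible arrangement under stated assumptions: the 0.5/0.5 blend, the +1.0 boost and the -0.5 repetition penalty are invented for illustration, as are the dictionary field names. It is not the formula used in DREAM, and the Prompts stage is left out.

```python
def final_score(hyp, recent_bot_replies, priority_intents=frozenset()):
    """Combine evaluation, heuristic boost and repetition penalty into one score."""
    # Evaluation: blend skill confidence with a conversation-evaluation score
    # (equal weights are an ASSUMPTION).
    score = 0.5 * hyp["confidence"] + 0.5 * hyp["eval_score"]
    # Hand-written heuristic: boost responses to high-priority intents.
    if hyp.get("intent") in priority_intents:
        score += 1.0
    # Penalty: lower the score when the bot would repeat a recent reply.
    if hyp["text"] in recent_bot_replies:
        score -= 0.5
    return score


def select_response(hypotheses, recent_bot_replies, priority_intents=frozenset()):
    """Filter out flagged hypotheses, then return the highest-scoring one (or None)."""
    # Filtration: drop hypotheses flagged as toxic or containing blacklisted words.
    allowed = [h for h in hypotheses if not h["toxic"] and not h["blacklisted"]]
    if not allowed:
        return None
    return max(
        allowed,
        key=lambda h: final_score(h, recent_bot_replies, priority_intents),
    )


hypotheses = [
    {"text": "Rude reply", "confidence": 0.99, "eval_score": 0.9, "toxic": True, "blacklisted": False},
    {"text": "Hello!", "confidence": 0.9, "eval_score": 0.9, "toxic": False, "blacklisted": False},
    {"text": "Harvester 3 is idle.", "confidence": 0.6, "eval_score": 0.7, "toxic": False, "blacklisted": False},
]

chosen = select_response(hypotheses, recent_bot_replies={"Hello!"})
print(chosen["text"])  # Harvester 3 is idle.
```

In this run the toxic hypothesis is filtered out and the repeated greeting is penalized, so the lower-confidence status reply is chosen.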

● The final score depends on confidence, which is assigned by hand or by rules in template-based skills
● The final score formula was created empirically
● No dependency on dialog acts
● Almost exclusively single-turn dialog management
● Latency*
* Latency is partially solved by timeout management and software optimization, but in Conversational AI it ultimately requires powerful AI-optimized hardware such as NVIDIA GPU clusters

● Final score should not depend on confidence
● Final score should be calculated by one ranking model
● Some dialog acts require responses with particular dialog acts
● Priority to multi-turn scripted skills

There are at least 2 ways* to look at a Conversation:
Pragmatic Conversation
• Motivated by a clear pragmatic purpose. Aka task-oriented. Usually very short. Formal.
Casual Conversation
• NOT motivated by a clear pragmatic purpose. Can be, and often is, lengthy. Informal, can have humor. Aka chit-chat.
* S. Eggins & D. Slade, Analysing Casual Conversation. London: Cassell, 1997

There are at least 4 different approaches to classifying utterances & sentences:
Speech Acts*
• Work at utterance level. The hearer interprets the speaker’s intentions and infers the actions the speaker wants from the hearer.
Dialog Acts
• Work at sentence level. Ascribe each sentence’s dialog function to the entire utterance.
Speech Functions
• Work at utterance level. Similar to Speech Acts, but they characterize an utterance through its role in Discourse.
Utterance Acts**
• Work at utterance level but include body movements.
* Original authors were not concerned with Discourse
** Not applicable for us as we can’t see the person

[Figure: Speech Function annotation example with two columns, Discourse Move and Speech Act. Discourse Moves shown: Open.Initiate, Sustain.Continue, React.Respond.Support, React.Rejoinder.Support. Speech Acts shown: Give.Fact, Append.Elaborate, Reply.Acknowledge, Prolong.Elaborate, Demand.Closed.Opinion, Reply.Affirm, Track.Confirm.]

Eggins and Martin
(1997)
Casual Conversation is about people, not facts
Discourse Strategy Advice:
In a conversation w/ user:
Explore their interpersonal relations through
confronting moves

Eggins and Slade
(1997)
Speech Functions control Discourse:
Give
information
Demand
information
Speech Acts
Discourse Moves
Speech Function Example:
open:initiate:give_opinion
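
Speech Function labels are hierarchical, and the slides write them with both ':' and '.' as separators. A minimal sketch for normalizing such labels follows; the function names are illustrative and the approach is plain string splitting, not a DeepPavlov API.

```python
import re


def parse_speech_function(label):
    """Split a label into lower-cased levels; accepts '.' or ':' as separator."""
    return [level.lower() for level in re.split(r"[.:]", label) if level]


def same_function(a, b):
    """Compare two labels regardless of separator or letter case."""
    return parse_speech_function(a) == parse_speech_function(b)


levels = parse_speech_function("open:initiate:give_opinion")
print(levels)  # ['open', 'initiate', 'give_opinion']
```

Normalizing first makes it safe to compare a classifier's output with a predictor's output even when the two components format labels differently.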

[Figure: pipeline diagram (Text OR Voice Input, NeMo ASR, Spell Checking, Emotion, BUILT-IN SKILL SELECTOR, Harvesters Status, Chit-Chat (AIML), RULE-BASED RESPONSE SELECTOR, TTS (NeMo)) extended with three components: Speech Function Classifier, Discourse Management, Speech Function Predictor.]

Part I: Use Speech Function to understand user’s goal
Step 1: Classify the Speech Function of the user’s utterance
Step 2: Predict the Speech Function for the best response
Step 3: Identify whether you understand the user’s goal
Alice: What’s happened in city hall?
Speech Function: React.Rejoinder.Support.Track.Clarify
Speech Function Predictor: React.Rejoinder.Support.Track.Clarify | Confidence: 0.85
Speech Function Predictor: React.Respond.Support.Reply.Answer | Confidence: 0.79
Speech Function Predictor: React.Rejoinder.Confront.Response.Rechallenge | Confidence: 0.75
Speech Function Predictor: Predicts that it is a good idea to React.Rejoinder.Support.Track.Clarify
Bot: [Which city hall [Entity Disambiguation]? Where (NYC | Local)? When (Today | Yesterday | at some time)?]
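
Steps 2 and 3 can be sketched as picking the top-ranked prediction and checking whether it is a clarification. Treating a top-ranked Clarify as "goal not yet understood" is an interpretation made for this example, not something the slide states; the numbers are the ones shown above.

```python
CLARIFY = "React.Rejoinder.Support.Track.Clarify"


def best_prediction(predictions):
    """Return the (speech_function, confidence) pair with the highest confidence."""
    return max(predictions, key=lambda pair: pair[1])


def needs_clarification(predictions):
    """ASSUMED reading of Step 3: clarify first if that is the top prediction."""
    label, _ = best_prediction(predictions)
    return label == CLARIFY


predictions = [
    ("React.Rejoinder.Support.Track.Clarify", 0.85),
    ("React.Respond.Support.Reply.Answer", 0.79),
    ("React.Rejoinder.Confront.Response.Rechallenge", 0.75),
]

print(best_prediction(predictions))      # ('React.Rejoinder.Support.Track.Clarify', 0.85)
print(needs_clarification(predictions))  # True
```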

Part II: Use Speech Function Predictor to predict user’s response
Step 4: In each Skill, generate a response with the relevant Speech Function
Step 5: For each hypothesis, predict the Speech Function of the user’s response
Domain:Factoid: Which City Hall? | React.Rejoinder.Support.Track.Clarify | Confidence: 0.85
Domain:News: When? | React.Rejoinder.Support.Track.Clarify | Confidence: 0.95
…
Domain:Factoid: Which City Hall?
Speech Function Predictor: React.Respond.Confront.Reply.Disavow | Confidence: 0.65
Domain:News: When?
Speech Function Predictor: React.Respond.Confront.Reply.Disavow | Confidence: 0.64

Part III: Use Speech Function to understand user’s goal
Step 6: In Response Selector, ignore irrelevant hypotheses
Step 7: In Response Selector, identify which Conversation path you’re in
Step 8: In Response Selector, give the green light to the hypothesis that is best for the recognized Conversation type
User’s Utterance: React.Rejoinder.Support.Track.Clarify | Conversation Type: Casual
C. Type: Pragmatic
C. Type: Casual
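
Steps 6 to 8 can be sketched as two filters followed by a confidence comparison. Everything specific here is an assumption made for illustration: the field names, the idea that each hypothesis carries a conversation type, the fallback to the unfiltered pool, and the example hypotheses. The slides do not describe the actual selection rule at this level of detail.

```python
CLARIFY = "React.Rejoinder.Support.Track.Clarify"
ANSWER = "React.Respond.Support.Reply.Answer"


def select_by_discourse(hypotheses, expected_functions, conversation_type):
    """Keep relevant hypotheses, prefer the recognized conversation type, take the top one."""
    # Step 6: ignore hypotheses whose speech function was not predicted as suitable.
    relevant = [h for h in hypotheses if h["speech_function"] in expected_functions]
    # Steps 7-8: prefer hypotheses that fit the recognized conversation type.
    suited = [h for h in relevant if h["conversation_type"] == conversation_type]
    pool = suited or relevant
    return max(pool, key=lambda h: h["confidence"]) if pool else None


# Hypothetical hypotheses; the first two texts and confidences echo Part II.
hypotheses = [
    {"text": "Which City Hall?", "speech_function": CLARIFY,
     "conversation_type": "Pragmatic", "confidence": 0.85},
    {"text": "When?", "speech_function": CLARIFY,
     "conversation_type": "Pragmatic", "confidence": 0.95},
    {"text": "Which one do you mean?", "speech_function": CLARIFY,
     "conversation_type": "Casual", "confidence": 0.60},
    {"text": "I like city halls.", "speech_function": ANSWER,
     "conversation_type": "Casual", "confidence": 0.90},
]

chosen = select_by_discourse(hypotheses, {CLARIFY}, "Casual")
print(chosen["text"])  # Which one do you mean?
```

With the conversation type recognized as Casual, the casual clarification wins even though a pragmatic hypothesis has higher confidence, which is the kind of decision a confidence-only selector cannot make.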

DEMO TIME
Sneak Peek at Discourse-Driven Dialog Management in DeepPavlov Deepy

[Figure: technology stack with four layers (Multiskill orchestration, Conversational skills, NLP frameworks, ML platforms) and two columns (Proprietary, Open Source).]
▪ Multiskill orchestration
• DeepPavlov Agent is an engine for conversational skill deployment and orchestration
▪ Conversational skills
• DeepPavlov Dream is a collection of prebuilt conversational skills and a default distribution package for Dream AI Assistant
▪ NLP frameworks
• DeepPavlov Library provides pretrained models and a simple declarative approach to building NLP processing pipelines
▪ ML platforms
• TensorFlow and PyTorch as backends

Web Demo: demo.deeppavlov.ai, select [Deepy]
TG Bot: @deeppavlov_dream_ai_bot
Build Your AI Assistant: github.com/deepmipt/deepy (Clone and build your own!)
Read us: medium.com/deeppavlov
Talk to us: forum.deeppavlov.ai
TG: @DeepPavlovDreamDiscussions
Twitter/TG: @DeepPavlov
