The Fifth Dialog State Tracking Challenge (DSTC5)
Seokhwan Kim¹, Luis Fernando D’Haro¹, Rafael E. Banchs¹, Jason D. Williams², Matthew Henderson³, Koichiro Yoshino⁴
¹ Institute for Infocomm Research, Singapore. ² Microsoft Research, USA. ³ Google, USA. ⁴ Nara Institute of Science and Technology, Japan.
Problems
Goal
Human-human dialogs on tourist information in English and Chinese
Focusing on the problem of adaptation to a new language
Main Task
Dialog State Tracking (DST)
Pilot Tasks
Spoken Language Understanding (SLU)
Speech Act Prediction (SAP)
Spoken Language Generation (SLG)
End-to-end System (EES)
Datasets
Dialogs
Set    Task  Language  # dialogs  # utterances
Train  ALL   English          35        31,304   ← DSTC4 datasets
Dev    ALL   Chinese           2         3,130
Test   MAIN  Chinese          10        14,878
Test   SLU   Chinese           8        12,655
Test   SAP   Chinese           8        11,456
Test   SLG   Chinese           8        12,346
Translations
For each utterance, 5-best translations with word alignments were provided,
generated by English-to-Chinese and Chinese-to-English MT systems
The ontology for DSTC4 was given with its automatic translation to Chinese
Main Task: Dialog State Tracking
Task Definition
Dialog state tracking at the sub-dialog level
Input
Transcribed utterances from the beginning of the session to each timestep
Manually segmented by sub-dialogs and annotated with topic categories
Output
Frame structures defined with slot-value pairs
For 5 major topic categories: Accommodation, Attraction, Food, Shopping, Transportation
Example
Each sub-dialog is listed with its dialog state, followed by its utterances (Speaker: Utterance).

Dialog State: TOPIC: Attraction; TYPE OF PLACE: Ethnic enclave; NEIGHBORHOOD: Kampong Glam
Guide: 我介绍你这个甘榜格南。 (I recommend you this Kampong Glam.)
Tourist: 对。 (Right.)
Guide: 你看,它是个-它是马来村嘛 (You see, it is a- it’s a Malay Village)
Tourist: 对,甘榜- (Right, Kampong-)

Dialog State: TOPIC: Food; CUISINE: Malay cuisine; NEIGHBORHOOD: Kampong Glam
Guide: 它就卖了很多马来食物。 (It sells a lot of Malay food.)
Tourist: 比较有特色的食物, (It’s quite a unique food,)
Guide: 对,哦。 (Right.)
Guide: 马来食物,基本上,它是香。 (Malay food, basically, it smells very nice.)

Dialog State: TOPIC: Accommodation; INFO: Pricerange; NAME: V Hotel
Tourist: 那我们住宿呢? (Then, where do we stay?)
Guide: 我介绍一间呵,叫V Hotel的。 (Let me recommend to you, the V Hotel.)
Guide: 这个酒店,价格这个不贵。 (This hotel, the price is not expensive.)
Tourist: 好的。 (Okay.)

Dialog State: TOPIC: Transportation; INFO: Duration; TYPE: Walking; FROM: V Hotel; TO: Kampong Glam
Guide: 如果要去,我建议的这个马来文化村, (If you want to go, I suggest this Malay cultural village,)
Tourist: 马来村? (Malay village?)
Guide: 步行大概我看十五分钟吧。 (I think it takes fifteen minutes on foot.)
Tourist: 好。 (That’s good.)
Baselines
Fuzzy string matching between ontology entries and utterances (DSTC4)
Baseline 1: Translations in English with the original ontology in English
Baseline 2: Original utterances in Chinese with the translated ontology in Chinese
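The fuzzy-matching idea behind both baselines can be sketched roughly as follows. This is a minimal illustration only, not the released baseline code: the miniature ontology, the word-window comparison, the 0.8 threshold, and the function names are all assumptions made for the example, and Python's standard difflib stands in for whatever matcher the baseline actually uses.

```python
from difflib import SequenceMatcher

# Hypothetical miniature ontology (slot -> candidate values); not the DSTC5 ontology.
ONTOLOGY = {
    "NEIGHBORHOOD": ["Kampong Glam", "Chinatown", "Little India"],
    "CUISINE": ["Malay cuisine", "Chinese cuisine"],
}


def best_window_ratio(value, utterance):
    """Best similarity between `value` and any word window of the same length."""
    v_tokens = value.lower().split()
    u_tokens = utterance.lower().split()
    if not u_tokens:
        return 0.0
    target = " ".join(v_tokens)
    n = len(v_tokens)
    best = 0.0
    for i in range(max(1, len(u_tokens) - n + 1)):
        window = " ".join(u_tokens[i:i + n])
        best = max(best, SequenceMatcher(None, target, window).ratio())
    return best


def track(utterances, ontology=ONTOLOGY, threshold=0.8):
    """Accumulate every ontology value that fuzzily matches some utterance."""
    frame = {}
    for utterance in utterances:
        for slot, values in ontology.items():
            for value in values:
                if best_window_ratio(value, utterance) >= threshold:
                    frame.setdefault(slot, set()).add(value)
    return frame


sub_dialog = [
    "I recommend you this Kampong Glam.",
    "It sells a lot of Malay food.",
]
print(track(sub_dialog))
```

In this sketch "Kampong Glam" is found despite the trailing full stop, while "Malay food" does not reach the threshold for "Malay cuisine", which illustrates why pure surface matching leaves recall low.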
Evaluation
Schedules: (1) every turn; (2) only at the end of each sub-dialog
Metrics: (1) Frame-level Accuracy; (2) Slot-level Precision/Recall/F-measure
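The two metrics can be sketched as below, assuming each frame is a mapping from slot name to a list of values and that the caller passes only the turns selected by the chosen schedule. The function names and the micro-averaging over all slot-value pairs are assumptions for illustration; the official scoring script may differ in detail.

```python
def slot_value_pairs(frame):
    """Flatten a frame {slot: [values]} into a set of (slot, value) pairs."""
    return {(slot, value) for slot, values in frame.items() for value in values}


def evaluate(predictions, references):
    """Frame-level accuracy and slot-level precision/recall/F-measure."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must be aligned")
    exact = true_pos = n_pred = n_ref = 0
    for pred, ref in zip(predictions, references):
        p, r = slot_value_pairs(pred), slot_value_pairs(ref)
        exact += int(p == r)          # frame counts only if it matches exactly
        true_pos += len(p & r)        # slot-value pairs found in both
        n_pred += len(p)
        n_ref += len(r)
    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_ref if n_ref else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    accuracy = exact / len(references) if references else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}


references = [
    {"TYPE OF PLACE": ["Ethnic enclave"], "NEIGHBORHOOD": ["Kampong Glam"]},
    {"CUISINE": ["Malay cuisine"], "NEIGHBORHOOD": ["Kampong Glam"]},
]
predictions = [
    {"NEIGHBORHOOD": ["Kampong Glam"]},
    {"CUISINE": ["Malay cuisine"], "NEIGHBORHOOD": ["Kampong Glam"]},
]
scores = evaluate(predictions, references)
print(scores)
```

Because frame-level accuracy requires every slot and value of a frame to be correct, it is far stricter than the slot-level F-measure, which gives credit for each correct slot-value pair.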
Results (32 entries from 9 teams)
              Schedule 1            Schedule 2
Team  Entry   Accuracy  F-measure   Accuracy  F-measure
0     0       0.0250    0.1124      0.0321    0.1462      ← Baseline 1
0     1       0.0161    0.1475      0.0222    0.1871      ← Baseline 2
1     0       0.0397    0.3115      0.0551    0.3565
1     1       0.0386    0.3032      0.0597    0.3540
1     2       0.0393    0.3071      0.0551    0.3563
1     3       0.0387    0.3052      0.0597    0.3580
1     4       0.0417    0.3166      0.0612    0.3675
2     0       0.0736    0.3966      0.0964    0.4430
2     1       0.0567    0.3764      0.0712    0.4267
2     2       0.0529    0.3756      0.0681    0.4259
2     3       0.0788    0.4047      0.0956    0.4519
2     4       0.0699    0.4024      0.0872    0.4499
3     0       0.0351    0.2060      0.0505    0.2539
3     1       0.0303    0.2424      0.0367    0.2830
3     2       0.0289    0.2074      0.0406    0.2573
3     3       0.0341    0.2442      0.0451    0.2895
4     0       0.0583    0.3280      0.0765    0.3658
4     1       0.0407    0.3405      0.0413    0.3572
4     2       0.0515    0.3708      0.0635    0.3945
4     3       0.0552    0.3649      0.0681    0.3913
4     4       0.0454    0.3572      0.0559    0.3758
5     0       0.0330    0.2749      0.0520    0.3314
5     1       0.0187    0.1804      0.0230    0.1967
5     2       0.0183    0.1520      0.0168    0.1371
5     3       0.0313    0.1574      0.0413    0.1880
5     4       0.0093    0.0945      0.0115    0.0977
6     0       0.0389    0.2849      0.0482    0.3230
6     1       0.0340    0.3070      0.0383    0.3532
6     2       0.0491    0.2988      0.0643    0.3381
7     0       0.0092    0.0783      0.0107    0.0794
7     1       0.0085    0.0767      0.0115    0.0809
8     0       0.0192    0.1570      0.0214    0.1554
8     1       0.0068    0.0554      0.0069    0.0577
9     0       0.0231    0.1114      0.0314    0.1449
Pilot Task: Spoken Language Understanding
Task Definition
Input: Transcribed utterance at each timestep
Output
Speech Act: 4 main categories with 21 attributes
Semantic Tags: 8 main categories with subcategories, relative modifiers and from-to modifiers
Example
Input: 我介绍你这个甘榜格南。 (I recommend you this Kampong Glam.)
Speech Act: INI (RECOMMEND)
Semantic Tags: 我介绍你这个<LOC CAT=“CULTURAL”>甘榜格南</LOC>。
(I recommend you this <LOC CAT=“CULTURAL”>Kampong Glam</LOC>.)
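To work with the inline tag notation shown in this example, the tagged spans can be pulled out with a small regular expression. This is a sketch under assumptions: it handles only flat (non-nested) tags with quoted attributes as in the example, accepts both straight and typographic quotes, and the function names are made up for illustration.

```python
import re

# One flat tag: <NAME attrs>text</NAME>. Nested tags are not handled.
TAG_RE = re.compile(r'<(?P<tag>\w+)(?P<attrs>[^<>]*)>(?P<text>[^<>]*)</(?P=tag)>')
# Attribute values may be wrapped in straight or typographic double quotes.
ATTR_RE = re.compile(r'(\w+)=["“”]([^"“”]*)["“”]')


def extract_tags(tagged):
    """Return one dict per tagged span: tag name, attributes, covered text."""
    spans = []
    for match in TAG_RE.finditer(tagged):
        spans.append({
            "tag": match.group("tag"),
            "attrs": dict(ATTR_RE.findall(match.group("attrs"))),
            "text": match.group("text"),
        })
    return spans


def strip_tags(tagged):
    """Recover the plain utterance by dropping the tag markup."""
    return TAG_RE.sub(lambda match: match.group("text"), tagged)


tagged = 'I recommend you this <LOC CAT="CULTURAL">Kampong Glam</LOC>.'
print(extract_tags(tagged))
print(strip_tags(tagged))
```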
Baselines: SVM for Speech Acts and CRF for Semantic Tags
Evaluation Metrics: Precision/Recall/F-measure
Results on Speech Acts (12 entries from 4 teams)
              Guide                       Tourist
Team  Entry   P       R       F           P       R       F
0     0       0.4588  0.2480  0.3219      0.3694  0.1828  0.2446   ← SVM baseline
2     0       0.5450  0.3911  0.4554      0.5001  0.5501  0.5239
2     1       0.5305  0.3969  0.4540      0.5331  0.5263  0.5297
2     2       0.5533  0.3829  0.4526      0.5107  0.5425  0.5261
2     3       0.5127  0.4251  0.4648      0.5605  0.4999  0.5285
3     0       0.4279  0.3583  0.3900      0.4591  0.4241  0.4409
3     1       0.4340  0.3635  0.3956      0.4498  0.4119  0.4300
5     0       0.4085  0.3364  0.3690      0.5026  0.4484  0.4739
5     1       0.3905  0.3216  0.3527      0.4519  0.4031  0.4261
5     2       0.4639  0.3820  0.4190      0.4916  0.4385  0.4635
5     3       0.4540  0.3739  0.4101      0.4871  0.4346  0.4594
5     4       0.4459  0.3672  0.4028      0.4984  0.4446  0.4700
7     0       0.5007  0.2976  0.3733      0.5079  0.4156  0.4571
Results on Semantic Tags (8 entries from 3 teams)
              Guide                       Tourist
Team  Entry   P       R       F           P       R       F
0     0       0.4666  0.3187  0.3787      0.5259  0.2659  0.3532   ← CRF baseline
3     0       0.4650  0.3182  0.3779      0.5331  0.2620  0.3513
3     1       0.4650  0.3182  0.3779      0.5331  0.2620  0.3513
5     0       0.5006  0.2923  0.3691      0.5083  0.3110  0.3859
5     1       0.5469  0.1893  0.2813      0.5121  0.3081  0.3847
5     2       0.3577  0.2476  0.2926      0.3031  0.2237  0.2574
5     3       0.3486  0.2541  0.2939      0.2932  0.2149  0.2480
5     4       0.3395  0.2111  0.2603      0.2947  0.2072  0.2433
7     0       0.4400  0.3207  0.3710      0.4408  0.2926  0.3517
Pilot Task: Spoken Language Generation
Task Definition
Input: Speech act and semantic tags at each time step
Output: Generated utterance
Example
Input: INI (RECOMMEND), <LOC CAT=“CULTURAL”>Kampong Glam</LOC>
Output: 我介绍你这个甘榜格南。 (I recommend you this Kampong Glam.)
Baseline
Example-based language generation
Using k-nearest neighbors algorithm on speech acts and semantic tags
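An example-based generator of this kind might look roughly like the sketch below. Everything beyond the idea itself is an assumption: the stored examples, the Jaccard similarity over speech-act and tag features, the majority vote over the k nearest templates, and the slot-filling step are illustrative choices, and only the INI (RECOMMEND) label comes from the example above (the other labels are invented).

```python
from collections import Counter


def features(speech_act, attribute, tag_names):
    """Represent an input as a set of symbolic features."""
    feats = {f"act={speech_act}", f"attr={attribute}"}
    feats.update(f"tag={name}" for name in tag_names)
    return feats


def make_example(speech_act, attribute, tag_names, template):
    return {"features": features(speech_act, attribute, tag_names),
            "template": template}


# Hypothetical stored examples; the QST and RES rows use invented labels.
EXAMPLES = [
    make_example("INI", "RECOMMEND", ["LOC"], "I recommend you this <LOC>."),
    make_example("QST", "WHERE", ["LOC"], "Where is <LOC>?"),
    make_example("RES", "CONFIRM", [], "Right."),
]


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def generate(speech_act, attribute, tags, examples=EXAMPLES, k=1):
    """Pick the most common template among the k nearest examples and fill it.

    `tags` is a list of (tag name, surface text) pairs.
    """
    query = features(speech_act, attribute, [name for name, _ in tags])
    ranked = sorted(examples,
                    key=lambda ex: jaccard(query, ex["features"]),
                    reverse=True)
    votes = Counter(ex["template"] for ex in ranked[:k])
    utterance = votes.most_common(1)[0][0]
    for name, text in tags:
        utterance = utterance.replace(f"<{name}>", text)
    return utterance


print(generate("INI", "RECOMMEND", [("LOC", "Kampong Glam")]))
```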
Evaluation Metrics
BLEU: Geometric average of n-gram precision of system outputs to references
AM-FM: Linear interpolation of cosine similarity and normalized n-gram probability
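Both metrics can be illustrated with a short sketch. The BLEU function below is a simplified sentence-level, single-reference version with uniform weights; it also applies the brevity penalty that standard BLEU multiplies onto the geometric mean. The AM-FM function shows only the final interpolation step: the two component scores are taken as given, and the default weight of 0.5 is an assumption, not the value used in the challenge.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Brevity penalty times the geometric mean of clipped n-gram precisions."""
    log_sum = 0.0
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        total = sum(cand.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or clipped == 0:
            return 0.0  # no smoothing in this sketch
        log_sum += math.log(clipped / total) / max_n
    if len(candidate) >= len(reference):
        penalty = 1.0
    else:
        penalty = math.exp(1 - len(reference) / len(candidate))
    return penalty * math.exp(log_sum)


def am_fm(am_score, fm_score, weight=0.5):
    """Linear interpolation of the adequacy (AM) and fluency (FM) scores."""
    return weight * am_score + (1 - weight) * fm_score


score = bleu("a b c d".split(), "a b c e".split(), max_n=2)
print(round(score, 4))
```

In the usage line, the unigram precision is 3/4 and the bigram precision is 2/3, so the geometric mean is the square root of 0.5.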
Results (4 entries from 1 team)
              Guide             Tourist
Team  Entry   AM-FM   BLEU      AM-FM   BLEU
0     0       0.1981  0.3854    0.2602  0.5921   ← Baseline
5     0       0.2818  0.3264    0.3221  0.4850
5     1       0.3180  0.3371    0.3635  0.5249
5     2       0.2737  0.2852    0.3100  0.4741
5     3       0.2405  0.2758    0.4258  0.5302
* More details can be found in our paper in the SLT proceedings, on the DSTC5 official website (http://workshop.colips.org/dstc5/), and in the DSTC5 GitHub repository (https://github.com/seokhwankim/dstc5).
