Solve the most wicked text categorization problems
June 19, 2019
MEANINGCLOUD – 2019
Webinar
Before we get started…
Presenter: Antonio Matarranz, CMO
How to participate
• Send questions using the chat feature, or
• Click the “Raise your hand” button to speak and we will enable your mic
• Afterwards, you’ll be able to access a recording of the webinar and its contents as tutorials on our blog
Why this webinar?
In the real world, there are wicked text categorization problems
A new approach based on semantic analysis can solve them
Agenda
• Developing categorization models in the real world
• Categorization based on pure machine learning
• Deep Categorization API. Pre-defined models and vertical packs
• The new Deep Categorization Customization Tool. Semantic rule language
• Case Study: development of a categorization model
• Deep Categorization vs. Text Classification. When to use one or the other
• Agile model development process. Combination with machine learning
• Conclusions and Q&A
Text categorization in a perfect world
[Diagram: training texts tagged by humans → Model Training → Machine-Learning Categorization Model; input text → Model → categories]
1. Training: use machine learning to train a model from tagged corpora
   a) Collect a corpus of tagged texts
   b) Represent each text by a feature vector that models structure and semantics
   c) Train a classifier using any suitable supervised learning algorithm (SVM, Naïve Bayes, kNN, Deep Learning…)
2. Execution: categorize input text using the model
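The two phases above (training, then execution) can be sketched in a few lines. This is only an illustration, assuming a bag-of-words representation and a hand-written Naïve Bayes classifier; the training texts, category names and function names are invented for the example and do not come from the webinar.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive feature extractor)."""
    return text.lower().split()

def train(tagged_texts):
    """Phase 1: build per-category word counts from (text, category) pairs."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for text, category in tagged_texts:
        doc_counts[category] += 1
        word_counts[category].update(tokenize(text))
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocabulary

def categorize(text, model):
    """Phase 2: return the category with the highest log-probability."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    best_category, best_score = None, -math.inf
    for category in doc_counts:
        score = math.log(doc_counts[category] / total_docs)
        total_words = sum(word_counts[category].values())
        for word in tokenize(text):
            # Laplace (add-one) smoothing so unseen words do not zero the score
            score += math.log((word_counts[category][word] + 1)
                              / (total_words + len(vocabulary)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category

# Toy tagged corpus (hypothetical data)
training_texts = [
    ("the api returns an error code", "bug"),
    ("the plugin crashes with an error", "bug"),
    ("how much does the premium plan cost", "pricing"),
    ("what is the price of the plan", "pricing"),
]
model = train(training_texts)
print(categorize("the api gives an error", model))  # bug
```

Note that everything the model knows comes from the tagged texts: with no tagged corpus, there is nothing to train on.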
Advantages (and limitations) of machine learning
Advantages:
• Building models is easy and fast (provided that we have a sufficient training set)
• Easy adaptation to new domains
Limitations:
• Requires enough training data to be available
• “Black box” model where adding new knowledge is hard or impossible
• High “inertia”
• Does not justify its categorization results
Does it look familiar?
“This is our new taxonomy, but it can still be improved.”
“Training text? We do not have tagged texts.”
“It is important to differentiate Washington (the city) from Washington (the sports team), from Washington (the surname).”
“You have to change the names of all our plans and promotions for tomorrow.”
The real world is very difficult
WICKED PROBLEMS
• Categories are not defined or they are evolving
• We do not have an adequate training corpus
• Great precision is required to discriminate among categories
• Context in general is very dynamic
HUGE DEVELOPMENT, EXPLOITATION AND EVOLUTION COSTS
We need a different way of doing things
Agile Text Analytics
• Rapid Model Generation
• Incorporated Domain Knowledge
• Powerful Configuration and Refinement
• Quality Assurance
An inherently iterative and incremental process of continuous improvement
How we solve it
MeaningCloud: Meaning as a Service
Standard APIs (SaaS and on-premises)
Use it free at www.meaningcloud.com
The foundation of our solution:
Deep Categorization API
Our API for wicked categorization problems
Based on the meaning of the text
➢ Leverages the deep morphosyntactic and semantic analysis that MeaningCloud performs
Input text → Deep Categorization Model → Categories
Deep Categorization predefined models
• IAB 2.0: web content
• Voice of the Customer (*): customer feedback
• Voice of the Employee (*): employee feedback
• Intention Analysis (*): stage in customer journey
(*) Included in MeaningCloud’s Vertical Packs
Now totally customizable
Domain knowledge (+ training text) → Customization Tool → Deep Categorization Model
Input text → Deep Categorization Model → Categories
Categorization based on the meaning of the text
Use (generally) human-defined rules based on advanced pattern matching
1. Divide text into words
2. Normalization (stemming/lemmatization, case conversion, etc.)
3. Morphosyntactic and semantic analysis
4. Check and apply rules for detecting categories
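The four steps above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions: the lemma table, the rules and the category names are hypothetical, and step 3 is reduced to a placeholder, since real morphosyntactic and semantic analysis is far richer than a set of lemmas.

```python
import re

# Toy lemma table standing in for real lemmatization (hypothetical entries).
LEMMAS = {"bought": "buy", "buying": "buy", "buys": "buy", "iphones": "iphone"}

# Hypothetical rules: every lemma in the set must occur for the category to fire.
RULES = [
    ({"buy", "iphone"}, "Purchase>iPhone"),
    ({"refund"}, "Refund"),
]

def split_words(text):
    """Step 1: divide the text into words."""
    return re.findall(r"[A-Za-z']+", text)

def normalize(words):
    """Step 2: case conversion plus dictionary-based lemmatization."""
    return [LEMMAS.get(w.lower(), w.lower()) for w in words]

def analyze(lemmas):
    """Step 3: placeholder for morphosyntactic and semantic analysis."""
    return set(lemmas)

def apply_rules(features):
    """Step 4: return the categories whose conditions are all met."""
    return [category for required, category in RULES if required <= features]

def categorize(text):
    return apply_rules(analyze(normalize(split_words(text))))

print(categorize("I bought an iPhone"))          # ['Purchase>iPhone']
print(categorize("I will never buy an iPhone"))  # also fires: negation is ignored
```

Because step 3 is only a stub here, the sketch cannot tell a purchase from a refusal to purchase; that gap is what the analysis step has to close.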
A difficult endeavour…
I'm going to buy an iPhone
I bought an iPhone
I will never buy an iPhone
Washington? What Washington?
Semantic rule language
• Modularity and Reuse
• Operators and Expressions
• Use of Semantic Information
• Abstraction
<Rules> -> #Category
Rule language highlights (1)
• Literals, regular expressions and (multiword) phrases
• Logical (AND, OR, AND NOT) and proximity (NEAR) operators
• Lemmatization and grammatical function vs. exact word forms
L@produce vs. produces
[new L@product|L@service@N|L@process@N|L@value@N]~4 -> #Management>Innovation
• Macros to group words/semantic expressions and reuse them in different rules
MACRO {pet} = dog|cat|rabbit|turtle
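To make the lemma and proximity ideas concrete, here is a rough Python emulation of the #Management>Innovation rule. It assumes that "~4" means "within four words of each other", which the slide does not spell out; the lemma table is hypothetical, the @N grammatical-function filter is ignored, and none of this is the actual rule engine.

```python
# Toy lemma table (hypothetical) standing in for L@ lemma matching.
LEMMAS = {"products": "product", "services": "service",
          "processes": "process", "values": "value"}

INNOVATION_TARGETS = {"product", "service", "process", "value"}

def lemmatize(text):
    """Lowercase, split on whitespace and map known forms to their lemma."""
    return [LEMMAS.get(word, word) for word in text.lower().split()]

def near(tokens, first_terms, second_terms, window):
    """True if a term from each set occurs within `window` tokens of the other."""
    first_positions = [i for i, t in enumerate(tokens) if t in first_terms]
    second_positions = [i for i, t in enumerate(tokens) if t in second_terms]
    return any(abs(i - j) <= window
               for i in first_positions for j in second_positions)

def is_innovation(text, window=4):
    """Emulates: "new" close to the lemma product, service, process or value."""
    return near(lemmatize(text), {"new"}, INNOVATION_TARGETS, window)

print(is_innovation("We launched new and improved products"))                         # True
print(is_innovation("new office opened far away from where our main products ship"))  # False
```

Matching on lemmas means one rule covers "product" and "products"; the proximity window keeps unrelated mentions far apart in the text from triggering the category.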
Rule language highlights (2)
• Use of detected entities and concepts and their semantic types
S@Top>Organization>Company>FinancialCompany>BankingCompany
@instance AND NOT Bank_of_America -> #BankAmericaCompetitors
S@Top>LivingThing>Animal::{pet} -> #NonPetAnimal
• Geographical information
{travel} AND G@America>Canada -> #Travel>Canada
• Use of categories in rules (whether or not the text has been classified in a category can itself be used as a condition)
#SpeedAgility AND #Channel>App -> #SpeedAgilityWithApp
• Robustness to spelling mistakes (Bank of Amerca)
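As an illustration of what robustness to spelling mistakes involves, the sketch below uses fuzzy string matching from Python's standard library to map "Bank of Amerca" onto a known entity. This is a stand-in technique chosen for the example, not a description of how MeaningCloud handles typos; the entity list and the cutoff value are assumptions.

```python
import difflib

# Hypothetical dictionary of known entity names (lowercased).
KNOWN_ENTITIES = ["bank of america", "wells fargo", "citibank"]

def match_entity(mention, cutoff=0.85):
    """Return the closest known entity, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(mention.lower(), KNOWN_ENTITIES,
                                        n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(match_entity("Bank of Amerca"))  # bank of america
print(match_entity("Deutsche Bank"))   # None
```

The cutoff is the trade-off knob: lower values tolerate more typos but risk merging distinct names.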
Use case
Contact center ticket categorization
➢ Information request
➢ Prices and conditions
➢ Bugs - Website
➢ Bugs - APIs
➢ Bugs - Integrations
MeaningCloud contact center
From a ticket sample to the categorization model
Process
1. Write rules based on a basic knowledge of the categories
2. Use advanced features to multiply recall and precision
3. Apply iterative and incremental development to refine and adapt to dynamic scenarios
A simple case
Category: Bug Report – Web
• Rule: Validation email
I didn’t receive the validation mail
I’m still waiting for the confirmation email
I’m waiting on confirmation that you have received my e-mail
receive|wait AND "validation|confirmation e-?mail|mail"
Lemma: “I didn’t receive”, “I’m waiting”…
Literal multiword expression: “validation mail”, “confirmation email”…
Regular expression: “mail”, “email”, “e-mail”
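The validation email rule can be approximated with ordinary regular expressions to see how its three ingredients combine. This is a sketch, not the rule engine: lemma matching is emulated by listing inflected forms, and the function and pattern names are invented for the example.

```python
import re

# Lemma match emulated by listing inflected forms (a real engine would lemmatize).
VERB = re.compile(r"\b(?:receiv(?:e|es|ed|ing)|wait(?:s|ed|ing)?)\b", re.IGNORECASE)
# Multiword phrase: "validation" or "confirmation" followed by mail / email / e-mail.
PHRASE = re.compile(r"\b(?:validation|confirmation)\s+(?:e-?mail|mail)\b", re.IGNORECASE)

def validation_email_rule(text):
    """True when both sides of the AND are found in the text."""
    return bool(VERB.search(text)) and bool(PHRASE.search(text))

samples = [
    "I didn’t receive the validation mail",
    "I’m still waiting for the confirmation email",
    "I’m waiting on confirmation that you have received my e-mail",
]
for sample in samples:
    print(validation_email_rule(sample), sample)
```

In this emulation the first two sentences match and the third does not: it contains both verbs, but "confirmation" and "e-mail" are not adjacent, so the phrase condition fails.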
Including semantic information (1)
Category: Bug Report – APIs
• Rule: API error
Category: Bug Report - Integrations
• Rule: Integration error
I’m having issues with the sentiment API
I am trying to install the VoE plugin but keep receiving the error below
<MeaningCloud API mention> AND error|bug|issue|problem
<MeaningCloud Integration mention> AND error|bug|issue|problem
Including semantic information (2)
Creation of a custom dictionary
• Entities and concepts, with their semantic information
• Use them in rules
Top
  Product
    API: Topics Extraction, Text Classification, Sentiment Analysis, Deep Categorization, Summarization, …
    Integration: Excel add-in, GATE plug-in, Google Sheets add-on, RapidMiner extension, Zapier app, …
Including semantic information (3)
S@Top>Product>API AND error|bug|issue|problem
S@Top>Product>Integration AND error|bug|issue|problem
Any mention of an API product
Any mention of an Integration product
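A rough way to picture the two S@ rules is a dictionary that maps product names to a path in the type hierarchy, plus a check for whether any mentioned product sits at or under a given type. The sketch below assumes simple substring lookup and uses a handful of the products listed on the previous slide; the function names and the data structure are hypothetical.

```python
# Hypothetical custom dictionary: surface form -> semantic type path.
DICTIONARY = {
    "sentiment analysis": "Top>Product>API",
    "deep categorization": "Top>Product>API",
    "excel add-in": "Top>Product>Integration",
    "zapier app": "Top>Product>Integration",
}
ERROR_TERMS = ("error", "bug", "issue", "problem")

def semantic_types(text):
    """Return the type paths of every dictionary entry mentioned in the text."""
    lowered = text.lower()
    return {path for form, path in DICTIONARY.items() if form in lowered}

def mentions_type(text, type_prefix):
    """S@<type> emulation: true if a mentioned entity sits at or under the type."""
    return any(path == type_prefix or path.startswith(type_prefix + ">")
               for path in semantic_types(text))

def categorize(text):
    has_error = any(term in text.lower() for term in ERROR_TERMS)
    categories = []
    if has_error and mentions_type(text, "Top>Product>API"):
        categories.append("Bug Report - APIs")
    if has_error and mentions_type(text, "Top>Product>Integration"):
        categories.append("Bug Report - Integrations")
    return categories

print(categorize("There is a bug in the Zapier app"))         # ['Bug Report - Integrations']
print(categorize("I get an error from Deep Categorization"))  # ['Bug Report - APIs']
```

The rules never name a product: they refer only to the type, so the list of products lives in the dictionary and can change independently of the rules.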
Modularity and reuse applying macros
E.g.: error|bug|issue|problem appears in multiple contexts and rules
{error} = error|issue|problem|bug
{agent} = representative|agent|someone|engineer
S@Top>Product>API AND {error}
S@Top>Product>Integration AND {error}
Modular reuse
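Macro reuse can be pictured as plain text substitution before a rule is evaluated. The sketch below expands {name} references in a rule string using the two macro definitions from the slide; wrapping the expansion in parentheses is an assumption made for readability, not documented rule syntax.

```python
import re

# Macro definitions copied from the slide.
MACROS = {
    "error": "error|issue|problem|bug",
    "agent": "representative|agent|someone|engineer",
}

def expand_macros(rule):
    """Replace each {name} in a rule string with its parenthesized definition."""
    return re.sub(r"\{(\w+)\}", lambda m: "(" + MACROS[m.group(1)] + ")", rule)

print(expand_macros("S@Top>Product>API AND {error}"))
# S@Top>Product>API AND (error|issue|problem|bug)
```

Adding a synonym such as "failure" to the error macro would update every rule that uses {error} at once. An undefined macro name raises KeyError in this sketch.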
Using categories within rules
• Conflicts between categories
• Rules that depend on certain categories having been triggered
Hi, I’ve received an error message when using the sentiment analysis tool for Excel that says “you don’t have access to this sm/model yet”
Bug Report – APIs
or
Bug Report - Integrations
#BR-INT AND #BR-API -> #BR-API
If both categories are triggered, exclude Bug Report – APIs
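The conflict resolution described above amounts to a post-processing step over the set of triggered categories. A minimal sketch, assuming categories are represented as a set of the short codes used in the rule:

```python
def resolve_conflicts(categories):
    """Drop Bug Report - APIs when Bug Report - Integrations was also triggered."""
    resolved = set(categories)
    if {"BR-INT", "BR-API"} <= resolved:
        resolved.discard("BR-API")
    return resolved

print(resolve_conflicts({"BR-INT", "BR-API"}))  # {'BR-INT'}
print(resolve_conflicts({"BR-API"}))            # {'BR-API'}
```

For the Excel ticket above, both categories fire because the text mentions an API name and an integration; after resolution only the Integrations category remains.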
Including a new product without modifying rules
E.g., releasing a new API: Insight Engine
[Diagram: Verbatims → Deep Categorization API (Model + Dictionary) → Categories]
• Include “Insight Engine” in the dictionary
• Changes are propagated to the model without needing to modify anything
Advantages (and limitations) of semantic rules
Advantages:
• "White box" model, where adding new knowledge is easy
• Low "inertia"
• Errors are easy to correct
• Accuracy can be as high as desired
• Does not require tagging a training corpus
• Justifies categorization results
Limitations:
• The development of models requires effort (but less than manually tagging a training set)
• Adaptation to new domains is relatively expensive
Agile development process
API Comparison: Deep Categorization vs. Text Classification. When to use one or the other?
Text Classification API (Machine Learning + Basic Rules)
• Well defined and fixed categories
• Very big models
• Plenty of training texts are available
• Relatively static scenario
Deep Categorization API (Semantic Rules)
• Badly defined or evolving categories
• Models that are not too extensive
• Not enough training texts are available
• High precision is required to discriminate among categories
• Dynamic scenario
• The justification of categories is a necessity
Agile model development process. Combination with machine learning – Option 1
[Diagram labels: Input text; Machine-Learning (ML) Categorization (ML Model); Intermediate categories; Deep Categorization (Rule Model); Categories; Model Training (training texts); Model Editor (rule editor); automatic categorization engine; classifier training engine; classifier engine]
• Fast model development and high precision from the beginning
• Transparency, refinement and adaptation
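One possible reading of this diagram is a two-stage pipeline in which an ML classifier produces intermediate categories and a rule model then refines them into final categories. The sketch below follows that reading, which is an assumption; both stages are keyword stubs with invented names, standing in for a trained classifier and a real rule model.

```python
def ml_stage(text):
    """Stand-in for a trained classifier (hypothetical keyword stub)."""
    lowered = text.lower()
    if any(word in lowered for word in ("error", "bug", "problem")):
        return {"Bug Report"}
    if "price" in lowered:
        return {"Prices and conditions"}
    return set()

def rule_stage(text, intermediate):
    """Rules refine the intermediate categories or pass them through unchanged."""
    lowered = text.lower()
    final = set()
    for category in intermediate:
        if category == "Bug Report" and "api" in lowered:
            final.add("Bug Report - APIs")
        elif category == "Bug Report" and ("plugin" in lowered or "add-in" in lowered):
            final.add("Bug Report - Integrations")
        else:
            final.add(category)
    return final

def categorize(text):
    return rule_stage(text, ml_stage(text))

print(categorize("The sentiment API returns an error"))  # {'Bug Report - APIs'}
print(categorize("What is the price?"))                  # {'Prices and conditions'}
```

In this arrangement the ML stage supplies coverage quickly, while the rule stage stays editable and explains why a final category was chosen.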
Customer case: contact center call categorization in telco
• Automatic categorization of call summaries prepared by operators to extract the reason (root cause) of the call
• Goal: increase satisfaction and reduce calls to the contact center
• Challenges:
– Highly dimensional complex model
▪ 3 levels: functional area + reason + 2nd order reason / product
▪ 56 categories in level 1; 1,615 categories in total
– High semantic overlap
– Texts with incorrect capitalization and abundant typos
– Modular categories, need to reuse definitions
– Need for evolution over time
– 10 days
• Solution:
– Abundant use of macros and "virtual" categories
– Complex rules
– Expansion of rules using Word Embeddings to discover synonyms and related terms
– Final model with 800 macros and 2,395 rules
– Recall of 80% of the texts
– Final precision: 78% in level 1, 75% exact-match
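The figures above distinguish level-1 precision from exact-match precision in a hierarchical model. The sketch below shows one way such metrics could be computed; it assumes that "recall of 80% of the texts" means the share of texts that received any category (coverage), which is an interpretation, and the category paths and data are toy values, not results from the customer case.

```python
def coverage(predicted):
    """Share of texts that received any category at all."""
    return sum(p is not None for p in predicted) / len(predicted)

def precision(predicted, gold, level=None):
    """Share of categorized texts that are right, comparing only the first
    `level` segments of each category path (all segments when level is None)."""
    def cut(path):
        segments = path.split(">")
        return segments[:level] if level else segments
    assigned = [(p, g) for p, g in zip(predicted, gold) if p is not None]
    if not assigned:
        return 0.0
    return sum(cut(p) == cut(g) for p, g in assigned) / len(assigned)

# Toy data (hypothetical): None means no rule fired for that text.
predicted = ["Billing>Invoice", "Billing>Refund", None, "Network>Outage"]
gold = ["Billing>Invoice", "Billing>Invoice", "Network>Outage", "Network>Outage"]

print(coverage(predicted))                  # 0.75
print(precision(predicted, gold, level=1))  # 1.0
print(precision(predicted, gold))           # exact match: 2 of 3 correct
```

Level-1 precision is always at least as high as exact-match precision, because a prediction that is right at every level is also right at the first one.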
Customer case: categorization of emails in banking
• Automatic categorization of email messages in the contact center
• Goal: automatic routing to the area in charge
• Challenges:
– Model with 3 orthogonal dimensions (reason + product / service + satisfaction), 39 categories in total
– 3 different languages
– High semantic overlap
– Multi-label scenario (several labels allowed)
– 4 weeks
• Solution:
– One model per language
– Use of product / service dictionaries
– Abundant use of macros
– Rules with weights for relevance calculation
– Model with 590 - 733 rules, depending on language
– Final precision: 70% reason, 75% product / service, 93% satisfaction
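Rules with weights in a multi-label scenario can be pictured as each fired rule adding its weight to a per-category relevance score, with every category above a threshold being returned. The slide does not say how relevance is actually calculated, so the scoring scheme, keywords, weights and threshold below are all assumptions for illustration.

```python
# Hypothetical weighted rules: (keyword, category, weight).
WEIGHTED_RULES = [
    ("mortgage", "Product>Mortgage", 2.0),
    ("loan", "Product>Mortgage", 1.0),
    ("card", "Product>Card", 2.0),
    ("complaint", "Reason>Complaint", 2.0),
    ("unhappy", "Satisfaction>Negative", 1.5),
]

def relevance(text, threshold=1.5):
    """Sum the weights of fired rules per category; keep scores >= threshold."""
    lowered = text.lower()
    scores = {}
    for keyword, category, weight in WEIGHTED_RULES:
        if keyword in lowered:
            scores[category] = scores.get(category, 0.0) + weight
    return {c: s for c, s in scores.items() if s >= threshold}

print(relevance("I am unhappy with my mortgage loan"))
# {'Product>Mortgage': 3.0, 'Satisfaction>Negative': 1.5}
print(relevance("I need a loan"))  # {}
```

Several labels can be returned for one message, one per dimension, while weak evidence (a single low-weight rule) is filtered out by the threshold.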
Conclusion
Wicked text categorization problems HAPPEN
Give our agile development process a chance
Q & A time
Stay tuned to our blog and emails
We’ll be posting a recording of the webinar and its contents as tutorials soon
www.meaningcloud.com
Automating the extraction of Meaning from any information source.
+1 (646) 403-3104
3537 36th Street
New York, NY 11106
amatarranz@meaningcloud.com
Thank you for your attention!

Weitere ähnliche Inhalte

Ähnlich wie Solve the most wicked text categorization problems - MeaningCloud webinar

Pradeep Jain - Smart Procedures at the US Department of Defense
Pradeep Jain - Smart Procedures at the US Department of DefensePradeep Jain - Smart Procedures at the US Department of Defense
Pradeep Jain - Smart Procedures at the US Department of DefenseLavaConConference
 
Crafting a Compelling Data Science Resume
Crafting a Compelling Data Science ResumeCrafting a Compelling Data Science Resume
Crafting a Compelling Data Science ResumeArushi Prakash, Ph.D.
 
Point of View on Integrated Learning Ver 1
Point of View on Integrated Learning Ver 1Point of View on Integrated Learning Ver 1
Point of View on Integrated Learning Ver 1Krishnan Nilakantan
 
Design decisions in job architectures and competency modeling June 2020
Design decisions in job architectures and competency modeling June 2020Design decisions in job architectures and competency modeling June 2020
Design decisions in job architectures and competency modeling June 2020Steven Forth
 
Report on Summer Internship Project (Biswadeep Ghosh Hazra)
Report on Summer Internship Project (Biswadeep Ghosh Hazra)Report on Summer Internship Project (Biswadeep Ghosh Hazra)
Report on Summer Internship Project (Biswadeep Ghosh Hazra)Biswadeep Ghosh Hazra
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Nikki Chapple
 
Advanced Content Targeting & Personalization Within the Digital Experience Us...
Advanced Content Targeting & Personalization Within the Digital Experience Us...Advanced Content Targeting & Personalization Within the Digital Experience Us...
Advanced Content Targeting & Personalization Within the Digital Experience Us...Perficient, Inc.
 
Canonical Modeling for API Interoperability
Canonical Modeling for API InteroperabilityCanonical Modeling for API Interoperability
Canonical Modeling for API InteroperabilityTed Epstein
 
2024-02-24_Session 1 - PMLE_UPDATED.pptx
2024-02-24_Session 1 - PMLE_UPDATED.pptx2024-02-24_Session 1 - PMLE_UPDATED.pptx
2024-02-24_Session 1 - PMLE_UPDATED.pptxgdgsurrey
 
Adopting Data Science and Machine Learning in the financial enterprise
Adopting Data Science and Machine Learning in the financial enterpriseAdopting Data Science and Machine Learning in the financial enterprise
Adopting Data Science and Machine Learning in the financial enterpriseQuantUniversity
 
Managers guide to effective building of machine learning products
Managers guide to effective building of machine learning productsManagers guide to effective building of machine learning products
Managers guide to effective building of machine learning productsGianmario Spacagna
 
Domain Driven Design Introduction
Domain Driven Design IntroductionDomain Driven Design Introduction
Domain Driven Design Introductionwojtek_s
 
Importance of testing for the business
Importance of testing for the businessImportance of testing for the business
Importance of testing for the businessEggplant
 
i-lovelearning London 2016 | Netex 2017 Preview [EN]
i-lovelearning London 2016 | Netex 2017 Preview [EN]i-lovelearning London 2016 | Netex 2017 Preview [EN]
i-lovelearning London 2016 | Netex 2017 Preview [EN]Netex Learning
 
Experimentation to Industrialization: Implementing MLOps
Experimentation to Industrialization: Implementing MLOpsExperimentation to Industrialization: Implementing MLOps
Experimentation to Industrialization: Implementing MLOpsDatabricks
 
Building Generative AI-infused apps: what's possible and how to start
Building Generative AI-infused apps: what's possible and how to startBuilding Generative AI-infused apps: what's possible and how to start
Building Generative AI-infused apps: what's possible and how to startMaxim Salnikov
 
CWIN17 london digital ops model and transformation - max bocchini and ishit...
CWIN17 london   digital ops model and transformation - max bocchini and ishit...CWIN17 london   digital ops model and transformation - max bocchini and ishit...
CWIN17 london digital ops model and transformation - max bocchini and ishit...Capgemini
 
Improve Product Design with High Quality Requirements
Improve Product Design with High Quality RequirementsImprove Product Design with High Quality Requirements
Improve Product Design with High Quality RequirementsElizabeth Steiner
 

Ähnlich wie Solve the most wicked text categorization problems - MeaningCloud webinar (20)

Pradeep Jain - Smart Procedures at the US Department of Defense
Pradeep Jain - Smart Procedures at the US Department of DefensePradeep Jain - Smart Procedures at the US Department of Defense
Pradeep Jain - Smart Procedures at the US Department of Defense
 
Crafting a Compelling Data Science Resume
Crafting a Compelling Data Science ResumeCrafting a Compelling Data Science Resume
Crafting a Compelling Data Science Resume
 
Point of View on Integrated Learning Ver 1
Point of View on Integrated Learning Ver 1Point of View on Integrated Learning Ver 1
Point of View on Integrated Learning Ver 1
 
Design decisions in job architectures and competency modeling June 2020
Design decisions in job architectures and competency modeling June 2020Design decisions in job architectures and competency modeling June 2020
Design decisions in job architectures and competency modeling June 2020
 
Report on Summer Internship Project (Biswadeep Ghosh Hazra)
Report on Summer Internship Project (Biswadeep Ghosh Hazra)Report on Summer Internship Project (Biswadeep Ghosh Hazra)
Report on Summer Internship Project (Biswadeep Ghosh Hazra)
 
ChatGPT for Technical Writers
ChatGPT for Technical WritersChatGPT for Technical Writers
ChatGPT for Technical Writers
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 
Advanced Content Targeting & Personalization Within the Digital Experience Us...
Advanced Content Targeting & Personalization Within the Digital Experience Us...Advanced Content Targeting & Personalization Within the Digital Experience Us...
Advanced Content Targeting & Personalization Within the Digital Experience Us...
 
Canonical Modeling for API Interoperability
Canonical Modeling for API InteroperabilityCanonical Modeling for API Interoperability
Canonical Modeling for API Interoperability
 
2024-02-24_Session 1 - PMLE_UPDATED.pptx
2024-02-24_Session 1 - PMLE_UPDATED.pptx2024-02-24_Session 1 - PMLE_UPDATED.pptx
2024-02-24_Session 1 - PMLE_UPDATED.pptx
 
Adopting Data Science and Machine Learning in the financial enterprise
Adopting Data Science and Machine Learning in the financial enterpriseAdopting Data Science and Machine Learning in the financial enterprise
Adopting Data Science and Machine Learning in the financial enterprise
 
Managers guide to effective building of machine learning products
Managers guide to effective building of machine learning productsManagers guide to effective building of machine learning products
Managers guide to effective building of machine learning products
 
Building A Content Engine For Scale - Eda Kavlakoglu, IBM
Building A Content Engine For Scale - Eda Kavlakoglu, IBMBuilding A Content Engine For Scale - Eda Kavlakoglu, IBM
Building A Content Engine For Scale - Eda Kavlakoglu, IBM
 
Domain Driven Design Introduction
Domain Driven Design IntroductionDomain Driven Design Introduction
Domain Driven Design Introduction
 
Importance of testing for the business
Importance of testing for the businessImportance of testing for the business
Importance of testing for the business
 
i-lovelearning London 2016 | Netex 2017 Preview [EN]
i-lovelearning London 2016 | Netex 2017 Preview [EN]i-lovelearning London 2016 | Netex 2017 Preview [EN]
i-lovelearning London 2016 | Netex 2017 Preview [EN]
 
Experimentation to Industrialization: Implementing MLOps
Experimentation to Industrialization: Implementing MLOpsExperimentation to Industrialization: Implementing MLOps
Experimentation to Industrialization: Implementing MLOps
 
Building Generative AI-infused apps: what's possible and how to start
Building Generative AI-infused apps: what's possible and how to startBuilding Generative AI-infused apps: what's possible and how to start
Building Generative AI-infused apps: what's possible and how to start
 
CWIN17 london digital ops model and transformation - max bocchini and ishit...
CWIN17 london   digital ops model and transformation - max bocchini and ishit...CWIN17 london   digital ops model and transformation - max bocchini and ishit...
CWIN17 london digital ops model and transformation - max bocchini and ishit...
 
Improve Product Design with High Quality Requirements
Improve Product Design with High Quality RequirementsImprove Product Design with High Quality Requirements
Improve Product Design with High Quality Requirements
 

Mehr von MeaningCloud

More scalable and valuable market intelligence with deep text analytics - Mea...
More scalable and valuable market intelligence with deep text analytics - Mea...More scalable and valuable market intelligence with deep text analytics - Mea...
More scalable and valuable market intelligence with deep text analytics - Mea...MeaningCloud
 
Inteligencia de mercado más escalable y valiosa mediante la analítica profund...
Inteligencia de mercado más escalable y valiosa mediante la analítica profund...Inteligencia de mercado más escalable y valiosa mediante la analítica profund...
Inteligencia de mercado más escalable y valiosa mediante la analítica profund...MeaningCloud
 
Transform your customer feedback into action with deep text analytics - Meani...
Transform your customer feedback into action with deep text analytics - Meani...Transform your customer feedback into action with deep text analytics - Meani...
Transform your customer feedback into action with deep text analytics - Meani...MeaningCloud
 
Convierte el feedback de tus clientes en acción con la analítica profunda de ...
Convierte el feedback de tus clientes en acción con la analítica profunda de ...Convierte el feedback de tus clientes en acción con la analítica profunda de ...
Convierte el feedback de tus clientes en acción con la analítica profunda de ...MeaningCloud
 
Resuelve los problemas más complejos de categorización de texto - MeaningClou...
Resuelve los problemas más complejos de categorización de texto - MeaningClou...Resuelve los problemas más complejos de categorización de texto - MeaningClou...
Resuelve los problemas más complejos de categorización de texto - MeaningClou...MeaningCloud
 
NLP for Small Data - MeaningCloud at T3chFest 2019
NLP for Small Data  - MeaningCloud at T3chFest 2019NLP for Small Data  - MeaningCloud at T3chFest 2019
NLP for Small Data - MeaningCloud at T3chFest 2019MeaningCloud
 
Packs Verticales: adapta la analítica de texto a tu aplicación con solo un cl...
Packs Verticales: adapta la analítica de texto a tu aplicación con solo un cl...Packs Verticales: adapta la analítica de texto a tu aplicación con solo un cl...
Packs Verticales: adapta la analítica de texto a tu aplicación con solo un cl...MeaningCloud
 
Vertical Packs: adapt text analytics to your application with only one click;...
Vertical Packs: adapt text analytics to your application with only one click;...Vertical Packs: adapt text analytics to your application with only one click;...
Vertical Packs: adapt text analytics to your application with only one click;...MeaningCloud
 
How to extract health market intelligence from the voice of the patient - Mea...
How to extract health market intelligence from the voice of the patient - Mea...How to extract health market intelligence from the voice of the patient - Mea...
How to extract health market intelligence from the voice of the patient - Mea...MeaningCloud
 
Why you need Deep Semantic Analytics MeaningCloud webinar
Why you need Deep Semantic Analytics  MeaningCloud webinarWhy you need Deep Semantic Analytics  MeaningCloud webinar
Why you need Deep Semantic Analytics MeaningCloud webinarMeaningCloud
 
Por qué necesitas Deep Semantic Analytics - MeaningCloud webinar
Por qué necesitas Deep Semantic Analytics - MeaningCloud webinarPor qué necesitas Deep Semantic Analytics - MeaningCloud webinar
Por qué necesitas Deep Semantic Analytics - MeaningCloud webinarMeaningCloud
 
Integrate the most advanced text analytics into your predictive models - Mean...
Integrate the most advanced text analytics into your predictive models - Mean...Integrate the most advanced text analytics into your predictive models - Mean...
Integrate the most advanced text analytics into your predictive models - Mean...MeaningCloud
 
Incorpora la analitica de texto mas avanzada a tus modelos predictivos - Mean...
Incorpora la analitica de texto mas avanzada a tus modelos predictivos - Mean...Incorpora la analitica de texto mas avanzada a tus modelos predictivos - Mean...
Incorpora la analitica de texto mas avanzada a tus modelos predictivos - Mean...MeaningCloud
 
When to use the different text analytics tools - Meaning Cloud
When to use the different text analytics tools - Meaning CloudWhen to use the different text analytics tools - Meaning Cloud
When to use the different text analytics tools - Meaning CloudMeaningCloud
 
Cuándo usar las diferentes herramientas de analítica de texto - Meaningcloud
Cuándo usar las diferentes herramientas de analítica de texto - MeaningcloudCuándo usar las diferentes herramientas de analítica de texto - Meaningcloud
Cuándo usar las diferentes herramientas de analítica de texto - MeaningcloudMeaningCloud
 
Aprende a desarrollar clasificadores de texto a medida con MeaningCloud
Aprende a desarrollar clasificadores de texto a medida con MeaningCloudAprende a desarrollar clasificadores de texto a medida con MeaningCloud
Aprende a desarrollar clasificadores de texto a medida con MeaningCloudMeaningCloud
 
Entirely tailored sentiment analysis - MeaningCloud webinar
Entirely tailored sentiment analysis - MeaningCloud webinarEntirely tailored sentiment analysis - MeaningCloud webinar
Entirely tailored sentiment analysis - MeaningCloud webinarMeaningCloud
 
10 formas de aumentar los beneficios de los medios utilizando metadatos - pre...
10 formas de aumentar los beneficios de los medios utilizando metadatos - pre...10 formas de aumentar los beneficios de los medios utilizando metadatos - pre...
10 formas de aumentar los beneficios de los medios utilizando metadatos - pre...MeaningCloud
 
10 formas de aumentar los beneficios de los medios utilizando metadatos - pre...
10 formas de aumentar los beneficios de los medios utilizando metadatos - pre...10 formas de aumentar los beneficios de los medios utilizando metadatos - pre...
10 formas de aumentar los beneficios de los medios utilizando metadatos - pre...MeaningCloud
 
Intelligent Content for Media & Publishers
Intelligent Content for Media & PublishersIntelligent Content for Media & Publishers
Intelligent Content for Media & PublishersMeaningCloud
 

Mehr von MeaningCloud (20)

More scalable and valuable market intelligence with deep text analytics - Mea...
More scalable and valuable market intelligence with deep text analytics - Mea...More scalable and valuable market intelligence with deep text analytics - Mea...
More scalable and valuable market intelligence with deep text analytics - Mea...
 
Inteligencia de mercado más escalable y valiosa mediante la analítica profund...
Inteligencia de mercado más escalable y valiosa mediante la analítica profund...Inteligencia de mercado más escalable y valiosa mediante la analítica profund...
Inteligencia de mercado más escalable y valiosa mediante la analítica profund...
 
Transform your customer feedback into action with deep text analytics - Meani...
Transform your customer feedback into action with deep text analytics - Meani...Transform your customer feedback into action with deep text analytics - Meani...
Transform your customer feedback into action with deep text analytics - Meani...
 
Convierte el feedback de tus clientes en acción con la analítica profunda de ...
Convierte el feedback de tus clientes en acción con la analítica profunda de ...Convierte el feedback de tus clientes en acción con la analítica profunda de ...
Convierte el feedback de tus clientes en acción con la analítica profunda de ...
 
Resuelve los problemas más complejos de categorización de texto - MeaningClou...
Resuelve los problemas más complejos de categorización de texto - MeaningClou...Resuelve los problemas más complejos de categorización de texto - MeaningClou...
Resuelve los problemas más complejos de categorización de texto - MeaningClou...
 
NLP for Small Data - MeaningCloud at T3chFest 2019
NLP for Small Data  - MeaningCloud at T3chFest 2019NLP for Small Data  - MeaningCloud at T3chFest 2019
NLP for Small Data - MeaningCloud at T3chFest 2019
 
Packs Verticales: adapta la analítica de texto a tu aplicación con solo un cl...
Packs Verticales: adapta la analítica de texto a tu aplicación con solo un cl...Packs Verticales: adapta la analítica de texto a tu aplicación con solo un cl...
Packs Verticales: adapta la analítica de texto a tu aplicación con solo un cl...
 
Vertical Packs: adapt text analytics to your application with only one click;...
Vertical Packs: adapt text analytics to your application with only one click;...Vertical Packs: adapt text analytics to your application with only one click;...
Vertical Packs: adapt text analytics to your application with only one click;...
 
How to extract health market intelligence from the voice of the patient - Mea...
Solve the most wicked text categorization problems - MeaningCloud webinar

  • 1. Solve the most wicked text categorization problems. Webinar, June 19, 2019. MEANINGCLOUD – 2019
  • 2. Before we get started…
      – Presenter: Antonio Matarranz, CMO
      – How to participate: send questions using the chat feature, or click the “Raise your hand” button to speak and we will enable your mic
      – Afterwards, you’ll be able to access a recording of the webinar and its contents as tutorials on our blog
  • 3. Why this webinar? In the real world, there are wicked text categorization problems. A new approach based on semantic analysis can solve them.
  • 4. Agenda
      – Developing categorization models in the real world
      – Categorization based on pure machine learning
      – Deep Categorization API. Pre-defined models and vertical packs
      – The new Deep Categorization Customization Tool. Semantic rule language
      – Case study: development of a categorization model
      – Deep Categorization vs. Text Classification. When to use one or the other
      – Agile model development process. Combination with machine learning
      – Conclusions and Q&A
  • 5. Text categorization in a perfect world
      – 1. Training: use machine learning to train a model using tagged corpora
          ▪ Collect a corpus of tagged texts (humans tagging texts)
          ▪ Represent each text by a feature vector that models structure and semantics
          ▪ Train a classifier using any suitable supervised learning algorithm (SVM, Naïve Bayes, kNN, Deep Learning…)
      – 2. Execution: categorize input text using the model
      – Diagram: Training texts → Model Training → Machine-Learning Categorization Model; Input text → Model → Categories
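The two phases on this slide (training on tagged texts, then execution on new input) can be sketched in a few lines. This is only an illustration, not anything shown in the webinar: it assumes a bag-of-words representation and a multinomial Naïve Bayes classifier with add-one smoothing, and the four-text corpus and the category names are made up.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens (bag-of-words features)."""
    return re.findall(r"[a-z']+", text.lower())


def train(tagged_texts):
    """Training phase: count words per category from (text, category) pairs."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    vocabulary = set()
    for text, category in tagged_texts:
        tokens = tokenize(text)
        word_counts[category].update(tokens)
        doc_counts[category] += 1
        vocabulary.update(tokens)
    return word_counts, doc_counts, vocabulary


def categorize(text, model):
    """Execution phase: return the category with the highest log posterior."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    best_category, best_score = None, -math.inf
    for category in doc_counts:
        score = math.log(doc_counts[category] / total_docs)
        total_words = sum(word_counts[category].values())
        for token in tokenize(text):
            if token not in vocabulary:
                continue  # words never seen in training carry no evidence
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = word_counts[category][token] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Hypothetical tagged corpus; a real one would be far larger.
corpus = [
    ("I can't log in to the website", "bug"),
    ("the dashboard page shows an error", "bug"),
    ("how much does the premium plan cost", "pricing"),
    ("what is the price of extra requests", "pricing"),
]
model = train(corpus)
```

Calling `categorize("the website shows an error", model)` picks the category whose training texts share the most vocabulary with the input. The model only knows what the tagged corpus taught it, which is where the limitations on the next slide come from.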
  • 6. Advantages (and limitations) of machine learning
      – Advantages:
          ▪ Building models is easy and fast (provided that we have a sufficient training set)
          ▪ Easy adaptation to new domains
      – Limitations:
          ▪ Depends on the availability of enough training data
          ▪ “Black box” model where adding new knowledge is hard/impossible
          ▪ High “inertia”
          ▪ Does not justify categorization results
  • 7. Does it look familiar?
      – “This is our new taxonomy, but it can still be improved.”
      – “Training text? We do not have tagged texts.”
      – “It is important to differentiate Washington (the city) from Washington (the sports team), from Washington (the surname).”
      – “You have to change the names of all our plans and promotions for tomorrow.”
  • 8. The real world is very difficult: WICKED PROBLEMS
      – Categories are not defined or they are evolving
      – We do not have an adequate training corpus
      – Great precision is required to discriminate among categories
      – Context in general is very dynamic
      – Result: HUGE DEVELOPMENT, EXPLOITATION AND EVOLUTION COSTS
  • 9. We need a different way of doing things: Agile Text Analytics
      – Rapid Model Generation
      – Incorporated Domain Knowledge
      – Powerful Configuration and Refinement
      – Quality Assurance
      – An inherently iterative and incremental process of continuous improvement
  • 11. MeaningCloud: Meaning as a Service. Standard APIs (SaaS and on-premises). Use it free at www.meaningcloud.com
  • 12. The foundation of our solution: Deep Categorization API
      – Our API for wicked categorization problems
      – Based on the meaning of the text
      – Leverages the deep morphosyntactic and semantic analysis that MeaningCloud performs
      – Diagram: Input text → Deep Categorization Model → Categories
  • 13. Deep Categorization predefined models
      – IAB 2.0: web content
      – Voice of the Customer (*): customer feedback
      – Voice of the Employee (*): employee feedback
      – Intention Analysis (*): stage in customer journey
      – (*) Included in MeaningCloud’s Vertical Packs
  • 14. Now totally customizable
      – Diagram: Domain knowledge (+ training text) → Customization Tool → Deep Categorization Model; Input text → Model → Categories
  • 15. Categorization based on the meaning of the text. Use (generally) human-defined rules based on advanced pattern matching:
      – 1. Divide text into words
      – 2. Normalization (stemming/lemmatization, case conversion, etc.)
      – 3. Morphosyntactic and semantic analysis
      – 4. Check and apply rules for detecting categories
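A toy version of steps 1, 2 and 4 gives a feel for the pipeline. This is a sketch under strong assumptions, not MeaningCloud's engine: the lemma table, the rules and the category names are hypothetical, and step 3 (morphosyntactic and semantic analysis) is left out entirely.

```python
import re

# Toy lemma table standing in for real lemmatization (an assumption).
LEMMAS = {"bought": "buy", "buying": "buy", "buys": "buy", "iphones": "iphone"}

# Hypothetical rules: a category fires when all of its lemmas are present.
RULES = [
    ({"buy", "iphone"}, "Purchase>iPhone"),
    ({"refund"}, "Refund"),
]


def analyze(text):
    """Steps 1 and 2: split into words, lowercase, map each word to its lemma."""
    tokens = re.findall(r"\w+", text.lower())
    return [LEMMAS.get(token, token) for token in tokens]


def categorize(text):
    """Step 4: return every category whose required lemmas all appear."""
    lemmas = set(analyze(text))
    return [category for required, category in RULES if required <= lemmas]
```

Because step 3 is missing, `categorize("I will never buy an iPhone")` fires the same category as `categorize("I bought an iPhone")`. Telling those apart needs tense and negation, which is what the deeper analysis is for.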
  • 16. A difficult endeavour…
      – “I'm going to buy an iPhone”
      – “I bought an iPhone”
      – “I will never buy an iPhone”
      – “Washington? What Washington?”
  • 17. Semantic rule language: <Rules> -> #Category
      – Modularity and Reuse
      – Operators and Expressions
      – Use of Semantic Information
      – Abstraction
  • 18. Rule language highlights (1)
      – Literals, regular expressions and (multiword) phrases
      – Logical (AND, OR, AND NOT) and proximity (NEAR) operators
      – Lemmatization and grammatical function vs. exact word forms
          ▪ L@produce vs. produces
          ▪ [new L@product|L@service@N|L@process@N|L@value@N]~4 -> #Management>Innovation
      – Macros to group words/semantic expressions and reuse them in different rules
          ▪ MACRO {pet} = dog|cat|rabbit|turtle
  • 19. Rule language highlights (2)
      – Use of detected entities and concepts and their semantic types
          ▪ S@Top>Organization>Company>FinancialCompany>BankingCompany @instance AND NOT Bank_of_America -> #BankAmericaCompetitors
          ▪ S@Top>LivingThing>Animal::{pet}-> #NonPetAnimal
      – Geographical information
          ▪ {travel} AND G@America>Canada -> #Travel>Canada
      – Use of categories in rules (whether the text is or isn’t classified in a category can be used in the rules)
          ▪ #SpeedAgility AND #Channel>App -> #SpeedAgilityWithApp
      – Robustness to spelling mistakes (Bank of Amerca)
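The proximity idea behind NEAR can be illustrated with a small token-distance check. The slides do not define the exact semantics of NEAR or of the `~4` suffix, so reading it as "within a maximum number of tokens" is an assumption, and the helper `near` is hypothetical.

```python
import re


def near(text, first_terms, second_terms, max_distance):
    """True if a term from each set occurs within max_distance tokens of the other."""
    tokens = re.findall(r"\w+", text.lower())
    first_positions = [i for i, token in enumerate(tokens) if token in first_terms]
    second_positions = [i for i, token in enumerate(tokens) if token in second_terms]
    return any(
        abs(a - b) <= max_distance
        for a in first_positions
        for b in second_positions
    )
```

For example, `near("We launched a new cloud product today", {"new"}, {"product", "service"}, 4)` holds because the two terms are two tokens apart, while the same terms eight words apart would not match.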
  • 21. Contact center ticket categorization (MeaningCloud contact center). Categories:
      – Information request
      – Prices and conditions
      – Bugs - Website
      – Bugs - APIs
      – Bugs - Integrations
  • 22. From a ticket sample to the categorization model
  • 23. Process
      – 1. Write rules based on a basic knowledge of the categories
      – 2. Use advanced features to multiply recall and precision
      – 3. Apply iterative and incremental development to refine and adapt to dynamic scenarios
  • 24. A simple case. Category: Bug Report – Web. Rule: Validation email
      – Sample texts:
          ▪ “I didn’t receive the validation mail”
          ▪ “I’m still waiting for the confirmation email”
          ▪ “I’m waiting on confirmation that you have received my e-mail”
      – Rule: receive|wait AND "validation|confirmation e-?mail|mail"
          ▪ Lemma: “I didn’t receive”, “I’m waiting”…
          ▪ Literal multiword expression: “validation mail”, “confirmation email”…
          ▪ Regular expression: “mail”, “email”, “e-mail”
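The rule on this slide can be approximated with plain regular expressions. This is an emulation, not the Deep Categorization engine: the lemma table is a toy stand-in for lemmatization, and reading the quoted part as "(validation or confirmation) followed by (e-mail, email or mail)" is an assumption about the rule syntax.

```python
import re

# Toy lemma table for the two verbs in the rule (an assumption).
LEMMAS = {
    "received": "receive", "receiving": "receive", "receives": "receive",
    "waiting": "wait", "waited": "wait", "waits": "wait",
}

# Multiword phrase: validation/confirmation directly followed by a mail variant.
PHRASE = re.compile(
    r"\b(?:validation|confirmation)\s+(?:e-?mail|mail)\b", re.IGNORECASE
)


def lemmas(text):
    """Lowercase, split into words and map each word to its lemma."""
    return {LEMMAS.get(token, token) for token in re.findall(r"[a-z]+", text.lower())}


def matches_validation_email(text):
    """Emulates: receive|wait AND "validation|confirmation e-?mail|mail"."""
    has_verb = bool({"receive", "wait"} & lemmas(text))
    return has_verb and PHRASE.search(text) is not None
```

Under this reading the first two sample texts match. The third does not, because “confirmation” and “e-mail” are not adjacent there.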
  • 25. Including semantic information (1)
      – Category: Bug Report – APIs. Rule: API error
          ▪ Sample text: “I’m having issues with the sentiment API”
          ▪ <MeaningCloud API mention> AND error|bug|issue|problem
      – Category: Bug Report – Integrations. Rule: Integration error
          ▪ Sample text: “I am trying to install the VoE plugin but keep receiving the error below”
          ▪ <MeaningCloud Integration mention> AND error|bug|issue|problem
  • 26. Including semantic information (2). Creation of a custom dictionary
      – Entities and concepts, with their semantic information
      – Use them in rules
      – Top > Product > API: Topics Extraction, Text Classification, Sentiment Analysis, Deep Categorization, Summarization, …
      – Top > Product > Integration: Excel add-in, GATE plug-in, Google Sheets add-on, RapidMiner extension, Zapier app, …
  • 27. Including semantic information (3)
      – S@Top>Product>API AND error|bug|issue|problem (any mention of an API product)
      – S@Top>Product>Integration AND error|bug|issue|problem (any mention of an Integration product)
  • 28. Modularity and reuse applying macros
      – E.g.: error|bug|issue|problem appears in multiple contexts and rules
      – Macros:
          ▪ {error} = error|issue|problem|bug
          ▪ {agent} = representative|agent|someone|engineer
      – Modular reuse:
          ▪ S@Top>Product>API AND {error}
          ▪ S@Top>Product>Integration AND {error}
  • 29. Using categories within rules
      – Conflicts between categories
      – Rules that depend on certain categories having been triggered
      – Sample text: “Hi, I’ve received an error message when using the sentiment analysis tool for Excel that says ‘you don’t have access to this sm/model yet’”
      – Candidate categories: Bug Report – APIs or Bug Report – Integrations
      – Rule: #BR-INT AND #BR-API -> #BR-API (if both categories are triggered, exclude Bug Report – APIs)
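The conflict case above amounts to a post-processing step over the set of triggered categories. The sketch below is only an illustration in Python, not the rule language itself; the `EXCLUSIONS` table and the function name are hypothetical.

```python
# Each entry: (categories that must all be triggered, category to exclude).
EXCLUSIONS = [
    ({"BR-INT", "BR-API"}, "BR-API"),
]


def resolve_conflicts(categories):
    """Drop a category when the combination that conflicts with it is present."""
    resolved = set(categories)
    for required, excluded in EXCLUSIONS:
        # Test against the original set so exclusions do not depend on order.
        if required <= set(categories):
            resolved.discard(excluded)
    return resolved
```

With the sample ticket, both categories are triggered and only `BR-INT` survives. A ticket that triggers `BR-API` alone is left untouched.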
  • 30. Including a new product without modifying rules. E.g., releasing a new API: Insight Engine
      – Include “Insight Engine” in the dictionary
      – Changes are propagated to the model without needing to modify anything
      – Diagram: Verbatims → Deep Categorization API (Deep Categorization Model + Dictionary) → Categories
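The interplay of dictionary, macros and rules in the last few slides can be sketched as follows. Everything here is a simplified stand-in written for illustration: the dictionary is a plain substring lookup, the optional plural "s" is a crude substitute for lemmatization, and the names `DICTIONARY`, `MACROS` and `RULES` are hypothetical.

```python
import re

# Macro: one named alternation reused by several rules.
MACROS = {"error": "error|issue|problem|bug"}

# Hypothetical custom dictionary: product mention -> semantic type.
DICTIONARY = {
    "sentiment analysis": "Top>Product>API",
    "text classification": "Top>Product>API",
    "excel add-in": "Top>Product>Integration",
    "zapier app": "Top>Product>Integration",
}

# Rules refer to semantic types and macros, never to product names.
RULES = [
    ("Top>Product>API", "error", "Bug Report - APIs"),
    ("Top>Product>Integration", "error", "Bug Report - Integrations"),
]


def expand(macro_name):
    """Turn a macro into a regex; the optional 's' crudely covers plurals."""
    return re.compile(r"\b(?:" + MACROS[macro_name] + r")s?\b", re.IGNORECASE)


def semantic_types(text):
    """Semantic types of all dictionary entries mentioned in the text."""
    lowered = text.lower()
    return {sem for mention, sem in DICTIONARY.items() if mention in lowered}


def categorize(text):
    """Apply every rule: semantic type present AND macro term present."""
    types = semantic_types(text)
    return [
        category
        for sem, macro, category in RULES
        if sem in types and expand(macro).search(text)
    ]


text = "Insight Engine fails with an error"
before = categorize(text)  # product unknown, so no rule fires
DICTIONARY["insight engine"] = "Top>Product>API"  # only the dictionary changes
after = categorize(text)  # the existing API rule now fires
```

The rules are untouched between `before` and `after`; adding one dictionary entry is enough for the new product to be categorized.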
  • 31. Advantages (and limitations) of semantic rules
      – Advantages:
          ▪ “White box” model, where adding new knowledge is easy
          ▪ Low “inertia”
          ▪ Errors are easy to correct
          ▪ Accuracy can be as high as desired
          ▪ Does not require tagging a training corpus
          ▪ Justifies categorization results
      – Limitations:
          ▪ The development of models requires effort (but less than manually tagging a training set)
          ▪ Adaptation to new domains is relatively expensive
  • 33. API comparison: Deep Categorization vs. Text Classification. When to use one or the other?
      – Text Classification API (Machine Learning + Basic Rules)
          ▪ Well defined and fixed categories
          ▪ Very big models
          ▪ Plenty of training texts are available
          ▪ Relatively static scenario
      – Deep Categorization API (Semantic Rules)
          ▪ Badly defined or evolving categories
          ▪ Models that are not too extensive
          ▪ Not enough training texts are available
          ▪ High precision is required to discriminate among categories
          ▪ Dynamic scenario
          ▪ The justification of categories is a necessity
  • 34. Agile model development process. Combination with machine learning – Option 1
      – Machine-Learning (ML) Categorization: training texts, model training (classifier training engine), ML Model, classifier engine
      – Deep Categorization: Rule Model, model editor (rule editor), automatic categorization engine
      – Flow: Input text → Intermediate categories → Categories
      – Fast model development and high precision from the beginning
      – Transparency, refinement and adaptation
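One way to read this option is as a two-stage chain: a trained classifier proposes intermediate categories and a rule layer refines them. The sketch below assumes that reading. The classifier is replaced by a keyword stub so the example stays self-contained, and all names, patterns and categories are hypothetical.

```python
import re


def ml_model(text):
    """Stand-in for a trained classifier (assumption): returns intermediate categories."""
    lowered = text.lower()
    categories = set()
    if "error" in lowered or "bug" in lowered:
        categories.add("Bug")
    if "price" in lowered:
        categories.add("Pricing")
    return categories


# Rule layer: (required intermediate category, text pattern, final category).
RULES = [
    ("Bug", r"\bapi\b", "Bug>API"),
    ("Bug", r"\b(?:excel|plug-?in|add-in)\b", "Bug>Integration"),
]


def categorize(text):
    """Refine the intermediate categories with rules; fall back to them if no rule fires."""
    intermediate = ml_model(text)
    final = {
        category
        for required, pattern, category in RULES
        if required in intermediate and re.search(pattern, text, re.IGNORECASE)
    }
    return final or intermediate
```

The statistical stage gives coverage quickly, and the rule stage stays editable, so a misclassification can be fixed by changing a rule instead of retraining.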
  • 36. Customer case: contact center call categorization in telco
      – Automatic categorization of call summaries prepared by operators to extract the reason (root cause) of the call
      – Goal: increase satisfaction and reduce calls to the contact center
      – Challenges:
          ▪ Highly dimensional complex model: 3 levels (functional area + reason + 2nd order reason / product); 56 categories in level 1; 1,615 categories in total
          ▪ High semantic overlap
          ▪ Texts with incorrect capitalization and abundant typos
          ▪ Modular categories, need to reuse definitions
          ▪ Need for evolution over time
          ▪ 10 days
      – Solution:
          ▪ Abundant use of macros and “virtual” categories
          ▪ Complex rules
          ▪ Expansion of rules using word embeddings to discover synonyms and related terms
          ▪ Final model with 800 macros and 2,395 rules
          ▪ Recall of 80% of the texts
          ▪ Final precision: 78% in level 1, 75% exact-match
  • 37. Customer case: categorization of emails in banking
      – Automatic categorization of email messages in the contact center
      – Goal: automatic routing to the area in charge
      – Challenges:
          ▪ Model with 3 orthogonal dimensions (reason + product / service + satisfaction), 39 categories in total
          ▪ 3 different languages
          ▪ High semantic overlap
          ▪ Multi-label scenario (several labels allowed)
          ▪ 4 weeks
      – Solution:
          ▪ One model per language
          ▪ Use of product / service dictionaries
          ▪ Abundant use of macros
          ▪ Rules with weights for relevance calculation
          ▪ Model with 590 - 733 rules, depending on language
          ▪ Final precision: 70% reason, 75% product / service, 93% satisfaction
  • 38. Conclusion: Wicked text categorization problems HAPPEN. Give our agile development process a chance.
  • 39. Q & A time
  • 40. Stay tuned to our blog and emails. We’ll be posting a recording of the webinar and its contents as tutorials soon.
  • 41. Thank you for your attention! www.meaningcloud.com. Automating the extraction of Meaning from any information source. +1 (646) 403-3104. 3537 36th Street, New York, NY 11106. amatarranz@meaningcloud.com