How I Test AI Models
DEVDAY 2019
April 06, 2019
Minh Hoang
I'm:
• A tester
• A member of the Technology team
• Fond of new technology
• A challenge-taker
Objectives
Share the tools we use, the key metrics in AI testing, and how to evaluate an AI model.
Agenda
1 ABOUT A.I
• What is machine learning
• Myths & Facts about AI
• Myths & Facts about Chatbot
2 HOW I TEST THE AI MODEL
• The right metrics for evaluating the ML model
• How we test FAQ model
• Demo
3 TAKE AWAY
4 REFERENCES
• Tools & Libraries
About A.I
What Is Machine Learning?
Machine learning is the subfield of
computer science that gives
computers the ability to learn without
being explicitly programmed.
Myths And Facts About A.I
MYTH: Artificial intelligence and machine learning will wipe out all the jobs.
FACT: A.I is no different from other technological advances: it helps humans become more effective and processes more efficient.
MYTH: “Cognitive AI” technologies are able to understand and solve new problems the way the human brain can.
FACT: “Cognitive” technologies can’t solve problems they weren’t designed to solve.
MYTH: You need a Ph.D. to work in machine learning and data science.
FACT: Nowadays, plenty of documents and tutorials on the Internet can guide people into machine learning step by step.
What Is A Chatbot?
A computer program designed to
simulate conversation with human
users, especially over the Internet.
Myths And Facts About Chatbot
MYTH: Chatbots have only been around for a short while.
FACT: ELIZA, one of the most well-known chatbot therapists, was created about 50 years ago.
MYTH: Text or voice is the only way to interact with bots.
FACT: Chatbot platforms also let users interact through graphical interfaces or graphical widgets, and recent chatbot platforms follow this development approach.
MYTH: All chatbot platforms use AI.
FACT: Not all chatbot platforms use AI. Most are rule-based: they follow a simple, automated process, something along the lines of a decision tree.
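To make the rule-based point concrete, here is a minimal sketch of a bot that uses no machine learning at all: it only walks a small hand-written decision tree. The tree contents and the names `DECISION_TREE` and `reply` are invented for illustration and do not come from the talk.

```python
# Hypothetical rule-based bot: a hand-written decision tree, no ML involved.
DECISION_TREE = {
    "question": "Do you need help with billing or shipping?",
    "branches": {
        "billing": {"answer": "Please open Account > Invoices."},
        "shipping": {
            "question": "Is the order domestic or international?",
            "branches": {
                "domestic": {"answer": "Domestic orders arrive in 2-3 days."},
                "international": {"answer": "International orders arrive in 7-10 days."},
            },
        },
    },
}


def reply(user_inputs):
    """Walk the tree with the user's successive inputs and return the bot's last message."""
    node = DECISION_TREE
    for text in user_inputs:
        if "answer" in node:
            break  # a leaf was already reached; extra input is ignored
        branch = node["branches"].get(text.strip().lower())
        if branch is None:
            # No rule matches: a rule-based bot can only repeat its prompt.
            return "Sorry, I did not understand. " + node["question"]
        node = branch
    return node["answer"] if "answer" in node else node["question"]


print(reply([]))                        # the opening question
print(reply(["shipping", "domestic"]))  # a leaf answer
print(reply(["refund"]))                # input that no rule covers
```

Anything outside the predefined branches falls through to the fallback message, which is the practical difference from a model that generalizes from training data.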
How We Test The AI Model
The Right Metric For Evaluating ML Models
Regression
• MSPE
• MSAE
• R Square
• Adjusted R Square
Classification
• Precision – Recall
• ROC-AUC
• Accuracy
• Log-Loss
Unsupervised Models
• Rand Index
• Mutual Information
Others
• CV Error
• Heuristic methods to find K
• BLEU Score (NLP)
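As an illustration of the regression metrics named on this slide, the sketch below computes mean squared error, R Square and Adjusted R Square in plain Python. The sample values and the function names are assumptions made for the example; the talk does not show how these metrics were computed.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(y_true, y_pred, n_features):
    """R Square penalized for the number of input features used by the model."""
    n = len(y_true)
    r2 = r_squared(y_true, y_pred)
    return 1 - (1 - r2) * (n - 1) / (n - n_features - 1)


# Made-up sample data, for illustration only.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.5, 8.5]

print(mean_squared_error(y_true, y_pred))
print(r_squared(y_true, y_pred))
print(adjusted_r_squared(y_true, y_pred, n_features=1))
```

Adjusted R Square is never larger than R Square, because the penalty grows with every added feature.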
Confusion Matrix
                      Actual positive                   Actual negative
Predicted positive    True positive                     False positive (Type I errors)
Predicted negative    False negative (Type II errors)   True negative
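The four cells of the confusion matrix can be counted directly from paired actual and predicted labels. The sketch below assumes binary labels where 1 means positive; the function name `confusion_counts` and the sample labels are made up for the example.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count the four confusion-matrix cells for a binary classifier."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted positive, actually positive
        elif p == positive:
            fp += 1  # predicted positive, actually negative (Type I error)
        elif a == positive:
            fn += 1  # predicted negative, actually positive (Type II error)
        else:
            tn += 1  # predicted negative, actually negative
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


# Made-up labels, for illustration only.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]

counts = confusion_counts(actual, predicted)
print(counts)
```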
Commonly Used Metrics In Classification
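Accuracy:
• Percentage of total items classified correctly
• Formula: Accuracy = (TP + TN) / (TP + TN + FP + FN), where TP, TN, FP and FN are the true positive, true negative, false positive and false negative counts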
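A small sketch of accuracy computed from the four confusion-matrix counts (TP, TN, FP, FN). The counts are invented; the second call is an assumption added to show that a model predicting "negative" for everything can still score high accuracy when positives are rare, which is one reason to look at other metrics as well.

```python
def accuracy(tp, tn, fp, fn):
    """Share of all items that were classified correctly."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


# Balanced example: 6 of 8 items are classified correctly.
balanced = accuracy(tp=3, tn=3, fp=1, fn=1)

# Imbalanced example: 95 negatives, 5 positives, model always answers "negative".
always_negative = accuracy(tp=0, tn=95, fp=0, fn=5)

print(balanced)
print(always_negative)
```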
Recall/Sensitivity/TPR (True Positive Rate):
• Number of items correctly identified as positive out of all actual positives
• Formula: Recall = TP / (TP + FN)
Precision:
• Number of items correctly identified as positive out of total items identified as positive
• Formula: Precision = TP / (TP + FP)
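Precision and recall differ only in the denominator: precision divides by everything predicted positive, recall by everything actually positive. The sketch below uses invented counts; returning 0.0 when the denominator is zero is a convention chosen for this example.

```python
def precision(tp, fp):
    """Correct positive predictions out of all positive predictions."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Correct positive predictions out of all actual positives."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Made-up counts: 8 true positives, 2 false positives, 8 false negatives.
p = precision(tp=8, fp=2)
r = recall(tp=8, fn=8)

print(p)
print(r)
```

Here the model is right 80% of the time when it says "positive", but it finds only half of the actual positives.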
F1 Score:
• It is the harmonic mean of precision and recall
• Formula: F1 = 2 × Precision × Recall / (Precision + Recall)
Precision   Recall   F1
1           1        1
0.1         0.1      0.1
0.5         0.5      0.5
1           0.1      0.182
0.3         0.8      0.436
0.8         0.3      0.436
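The F1 values for these precision and recall pairs can be recomputed with the harmonic-mean formula. This is a minimal sketch; the function name `f1_score` is chosen for the example.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


pairs = [(1, 1), (0.1, 0.1), (0.5, 0.5), (1, 0.1), (0.3, 0.8), (0.8, 0.3)]

print(f"{'Precision':<11}{'Recall':<9}F1")
for p, r in pairs:
    print(f"{p:<11}{r:<9}{f1_score(p, r):.3f}")
```

Because the formula is symmetric, swapping precision and recall gives the same F1, and a low value on either side pulls the score down sharply (precision 1 with recall 0.1 yields only 0.182).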
What Is FAQ Model?
The Process To Test The FAQ Model
Prepare test data
• Crawl FAQ data
• Generate questions from FAQ data
Run test
• Train model with FAQ data
• Run test
Analyze result
• Pre-process the raw result
• Calculate metrics to evaluate the AI model in classification
• Visualize the metrics
Model result
• Select the threshold value
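The final step, selecting the threshold value, can be sketched as a sweep over candidate confidence thresholds. Everything here is an assumption for illustration: the talk does not say how the threshold was chosen. The sketch assumes each test result carries the model's confidence score and whether its top answer was correct, that the bot only answers when the score reaches the threshold, and that the best threshold is the one with the highest F1.

```python
def f1_at_threshold(records, threshold):
    """F1 when the bot answers only if its confidence is >= threshold.

    records: list of (confidence, is_correct) pairs (hypothetical format).
    TP: answered and correct; FP: answered but wrong;
    FN: a correct answer suppressed because confidence was too low.
    """
    tp = sum(1 for score, correct in records if score >= threshold and correct)
    fp = sum(1 for score, correct in records if score >= threshold and not correct)
    fn = sum(1 for score, correct in records if score < threshold and correct)
    denominator = 2 * tp + fp + fn
    # 2TP / (2TP + FP + FN) is algebraically the harmonic mean of precision and recall.
    return 2 * tp / denominator if denominator else 0.0


def best_threshold(records, candidates):
    """Return the candidate threshold with the highest F1."""
    return max(candidates, key=lambda t: f1_at_threshold(records, t))


# Made-up test results, for illustration only.
records = [
    (0.95, True), (0.90, True), (0.80, True), (0.70, False),
    (0.60, True), (0.40, False), (0.30, False), (0.20, True),
]
candidates = [0.25, 0.5, 0.75]

for t in candidates:
    print(t, round(f1_at_threshold(records, t), 3))
print("best:", best_threshold(records, candidates))
```

A low threshold lets wrong answers through (more false positives), while a high one suppresses correct answers (more false negatives); the sweep makes that trade-off visible.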
How We Define The Test Data Set?
• Collect FAQ question data (manually and automatically)
• Use NLTK to generate new question data (NLG)
• Self-defined question data
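One simple way to generate new question data is to swap words for synonyms. The sketch below is not the talk's implementation: it uses a small hand-written synonym table (`SYNONYMS`, hypothetical) so that it runs without any downloads, whereas a lexical resource such as WordNet, available through NLTK, could supply the synonyms instead.

```python
from itertools import product

# Hypothetical hand-written synonym table standing in for a lexical resource.
SYNONYMS = {
    "reset": ["reset", "change"],
    "password": ["password", "passcode"],
}


def generate_variants(question):
    """Return every question obtained by swapping known words for their synonyms."""
    words = question.lower().rstrip("?").split()
    # Each word contributes its synonym list, or just itself if it is unknown.
    options = [SYNONYMS.get(word, [word]) for word in words]
    return [" ".join(combo) + "?" for combo in product(*options)]


variants = generate_variants("How do I reset my password?")
for v in variants:
    print(v)
```

Each generated variant keeps the expected answer of the original FAQ question, so one crawled question yields several labeled test cases.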
How We Evaluate The AI Model?
Train with domain X and run the test defined for domain X.
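Running the test defined for a domain can be sketched as a loop that sends every test question to the model and records what came back. The model interface is an assumption: `ask` is a hypothetical callable returning an answer id and a confidence score, and `fake_ask` is a stand-in so the example runs offline. In a real setup `ask` might wrap an HTTP call to the chatbot API, whose endpoint and payload are not given in the talk.

```python
def run_test(ask, test_cases):
    """Send each test question to the model and collect raw results.

    ask: callable taking a question and returning (answer_id, confidence).
    test_cases: list of {"question": ..., "expected": ...} dicts (hypothetical format).
    """
    results = []
    for case in test_cases:
        answer_id, confidence = ask(case["question"])
        results.append({
            "question": case["question"],
            "expected": case["expected"],
            "predicted": answer_id,
            "confidence": confidence,
        })
    return results


def fake_ask(question):
    """Stand-in for the trained FAQ model, used only to make the sketch runnable."""
    if "password" in question:
        return ("faq_password", 0.9)
    return ("faq_unknown", 0.2)


# Made-up test cases for a single domain.
test_cases = [
    {"question": "how do i reset my password?", "expected": "faq_password"},
    {"question": "where is my order?", "expected": "faq_shipping"},
]

results = run_test(fake_ask, test_cases)
for row in results:
    print(row)
```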
How We Analyze The Result?
• Pre-process the raw result.
• Calculate metrics to evaluate the AI model in classification.
• Visualize the metrics.
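A possible shape for the pre-processing and metric steps is sketched below. It normalizes the raw answer text, then uses `difflib.SequenceMatcher` from the Python standard library to decide whether the model's answer matches the expected one. The result format, the 0.9 similarity cut-off and the helper names are assumptions; the talk lists difflib among its tools but does not show how it was used.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase the text and collapse repeated whitespace."""
    return " ".join(text.lower().split())


def is_match(expected, predicted, min_ratio=0.9):
    """Treat two answers as the same if their similarity ratio is high enough."""
    ratio = SequenceMatcher(None, normalize(expected), normalize(predicted)).ratio()
    return ratio >= min_ratio


# Made-up raw results, for illustration only.
raw_results = [
    {"expected": "Open Settings and choose Reset Password.",
     "predicted": "open settings  and choose reset password."},
    {"expected": "Orders ship within 2 days.",
     "predicted": "Our office is closed on Sunday."},
]

matches = [is_match(r["expected"], r["predicted"]) for r in raw_results]
share_correct = sum(matches) / len(matches)

print(matches)
print(share_correct)
```

The resulting list of correct and incorrect answers is the input for the classification metrics and for any chart built on top of them.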
Demo
Take Away
• Know the main metrics for evaluating an ML model.
• Know how to test a classification AI model.
• Whether working on ___ projects (AI, blockchain, VR, etc.) is difficult depends on your self-learning skills and adaptability.
• Use automation to reduce the time and effort needed to prepare test data.
Tools & Libraries
• API: requests and Postman.
• AI/ML: nltk, difflib, plot.ly, pandas and numpy.
Question & Answer

Weitere ähnliche Inhalte

Was ist angesagt?

Artificial Intelligence Overview PowerPoint Presentation Slides
Artificial Intelligence Overview PowerPoint Presentation Slides Artificial Intelligence Overview PowerPoint Presentation Slides
Artificial Intelligence Overview PowerPoint Presentation Slides SlideTeam
 
Principles of Artificial Intelligence & Machine Learning
Principles of Artificial Intelligence & Machine LearningPrinciples of Artificial Intelligence & Machine Learning
Principles of Artificial Intelligence & Machine LearningJerry Lu
 
ARTIFICIAL INTELLIGENCE BASIC PPT
ARTIFICIAL INTELLIGENCE BASIC PPTARTIFICIAL INTELLIGENCE BASIC PPT
ARTIFICIAL INTELLIGENCE BASIC PPTRohitYemul1
 
Lesson 1 intro to ai
Lesson 1   intro to aiLesson 1   intro to ai
Lesson 1 intro to aiankit_ppt
 
The State of Artificial Intelligence in 2018: A Good Old Fashioned Report
The State of Artificial Intelligence in 2018: A Good Old Fashioned ReportThe State of Artificial Intelligence in 2018: A Good Old Fashioned Report
The State of Artificial Intelligence in 2018: A Good Old Fashioned ReportNathan Benaich
 
Artificial intelligence
Artificial intelligenceArtificial intelligence
Artificial intelligenceSai Nath
 
artificial intelligence
artificial intelligenceartificial intelligence
artificial intelligencevallibhargavi
 
Applications of Artificial Intelligence
Applications of Artificial IntelligenceApplications of Artificial Intelligence
Applications of Artificial IntelligenceMehr Un Nisa Manjotho
 
Artificial intelligence - An Overview
Artificial intelligence - An OverviewArtificial intelligence - An Overview
Artificial intelligence - An OverviewGiri Dharan
 
15 Pros and 5 Cons of Artificial Intelligence in the Classroom
15 Pros and 5 Cons of Artificial Intelligence in the Classroom15 Pros and 5 Cons of Artificial Intelligence in the Classroom
15 Pros and 5 Cons of Artificial Intelligence in the ClassroomLiveTiles
 
International Journal of Artificial Intelligence & Applications (IJAIA)
International Journal of Artificial Intelligence & Applications (IJAIA)International Journal of Artificial Intelligence & Applications (IJAIA)
International Journal of Artificial Intelligence & Applications (IJAIA)gerogepatton
 
Applications of artificial intelligence (AI) models for management decision m...
Applications of artificial intelligence (AI) models for management decision m...Applications of artificial intelligence (AI) models for management decision m...
Applications of artificial intelligence (AI) models for management decision m...The Higher Education Academy
 
Issues on Artificial Intelligence and Future (Standards Perspective)
Issues on Artificial Intelligence  and Future (Standards Perspective)Issues on Artificial Intelligence  and Future (Standards Perspective)
Issues on Artificial Intelligence and Future (Standards Perspective)Seungyun Lee
 
GTU GeekDay 2019 Limitations of Artificial Intelligence
GTU GeekDay 2019 Limitations of Artificial IntelligenceGTU GeekDay 2019 Limitations of Artificial Intelligence
GTU GeekDay 2019 Limitations of Artificial IntelligenceKürşat İNCE
 
AI, Machine Learning and Deep Learning - The Overview
AI, Machine Learning and Deep Learning - The OverviewAI, Machine Learning and Deep Learning - The Overview
AI, Machine Learning and Deep Learning - The OverviewSpotle.ai
 
Artificial intelligence and its impact on jobs and employment
Artificial intelligence and its impact on jobs and employmentArtificial intelligence and its impact on jobs and employment
Artificial intelligence and its impact on jobs and employmentafp11saurabhj
 
Artificial Intelligence with Python | Edureka
Artificial Intelligence with Python | EdurekaArtificial Intelligence with Python | Edureka
Artificial Intelligence with Python | EdurekaEdureka!
 
Analytics and Big Data Analytics
Analytics and Big Data AnalyticsAnalytics and Big Data Analytics
Analytics and Big Data AnalyticsInside Analysis
 

Was ist angesagt? (20)

Artificial Intelligence - Overview
Artificial Intelligence - OverviewArtificial Intelligence - Overview
Artificial Intelligence - Overview
 
Artificial Intelligence Overview PowerPoint Presentation Slides
Artificial Intelligence Overview PowerPoint Presentation Slides Artificial Intelligence Overview PowerPoint Presentation Slides
Artificial Intelligence Overview PowerPoint Presentation Slides
 
Principles of Artificial Intelligence & Machine Learning
Principles of Artificial Intelligence & Machine LearningPrinciples of Artificial Intelligence & Machine Learning
Principles of Artificial Intelligence & Machine Learning
 
ARTIFICIAL INTELLIGENCE BASIC PPT
ARTIFICIAL INTELLIGENCE BASIC PPTARTIFICIAL INTELLIGENCE BASIC PPT
ARTIFICIAL INTELLIGENCE BASIC PPT
 
Lesson 1 intro to ai
Lesson 1   intro to aiLesson 1   intro to ai
Lesson 1 intro to ai
 
The State of Artificial Intelligence in 2018: A Good Old Fashioned Report
The State of Artificial Intelligence in 2018: A Good Old Fashioned ReportThe State of Artificial Intelligence in 2018: A Good Old Fashioned Report
The State of Artificial Intelligence in 2018: A Good Old Fashioned Report
 
Artificial intelligence
Artificial intelligenceArtificial intelligence
Artificial intelligence
 
artificial intelligence
artificial intelligenceartificial intelligence
artificial intelligence
 
Applications of Artificial Intelligence
Applications of Artificial IntelligenceApplications of Artificial Intelligence
Applications of Artificial Intelligence
 
Artificial intelligence - An Overview
Artificial intelligence - An OverviewArtificial intelligence - An Overview
Artificial intelligence - An Overview
 
15 Pros and 5 Cons of Artificial Intelligence in the Classroom
15 Pros and 5 Cons of Artificial Intelligence in the Classroom15 Pros and 5 Cons of Artificial Intelligence in the Classroom
15 Pros and 5 Cons of Artificial Intelligence in the Classroom
 
International Journal of Artificial Intelligence & Applications (IJAIA)
International Journal of Artificial Intelligence & Applications (IJAIA)International Journal of Artificial Intelligence & Applications (IJAIA)
International Journal of Artificial Intelligence & Applications (IJAIA)
 
Applications of artificial intelligence (AI) models for management decision m...
Applications of artificial intelligence (AI) models for management decision m...Applications of artificial intelligence (AI) models for management decision m...
Applications of artificial intelligence (AI) models for management decision m...
 
Issues on Artificial Intelligence and Future (Standards Perspective)
Issues on Artificial Intelligence  and Future (Standards Perspective)Issues on Artificial Intelligence  and Future (Standards Perspective)
Issues on Artificial Intelligence and Future (Standards Perspective)
 
GTU GeekDay 2019 Limitations of Artificial Intelligence
GTU GeekDay 2019 Limitations of Artificial IntelligenceGTU GeekDay 2019 Limitations of Artificial Intelligence
GTU GeekDay 2019 Limitations of Artificial Intelligence
 
Introduction of ai
Introduction of aiIntroduction of ai
Introduction of ai
 
AI, Machine Learning and Deep Learning - The Overview
AI, Machine Learning and Deep Learning - The OverviewAI, Machine Learning and Deep Learning - The Overview
AI, Machine Learning and Deep Learning - The Overview
 
Artificial intelligence and its impact on jobs and employment
Artificial intelligence and its impact on jobs and employmentArtificial intelligence and its impact on jobs and employment
Artificial intelligence and its impact on jobs and employment
 
Artificial Intelligence with Python | Edureka
Artificial Intelligence with Python | EdurekaArtificial Intelligence with Python | Edureka
Artificial Intelligence with Python | Edureka
 
Analytics and Big Data Analytics
Analytics and Big Data AnalyticsAnalytics and Big Data Analytics
Analytics and Big Data Analytics
 

Ähnlich wie [DevDay2019] How do I test AI models? - By Minh Hoang, Senior QA Engineer at KMS

From c# Into Machine Learning
From c# Into Machine LearningFrom c# Into Machine Learning
From c# Into Machine LearningDev Raj Gautam
 
[DSC Europe 22] On the Aspects of Artificial Intelligence and Robotic Autonom...
[DSC Europe 22] On the Aspects of Artificial Intelligence and Robotic Autonom...[DSC Europe 22] On the Aspects of Artificial Intelligence and Robotic Autonom...
[DSC Europe 22] On the Aspects of Artificial Intelligence and Robotic Autonom...DataScienceConferenc1
 
Essential concepts for machine learning
Essential concepts for machine learning Essential concepts for machine learning
Essential concepts for machine learning pyingkodi maran
 
Hacking Predictive Modeling - RoadSec 2018
Hacking Predictive Modeling - RoadSec 2018Hacking Predictive Modeling - RoadSec 2018
Hacking Predictive Modeling - RoadSec 2018HJ van Veen
 
[QE 2018] Paul Gerrard – Automating Assurance: Tools, Collaboration and DevOps
[QE 2018] Paul Gerrard – Automating Assurance: Tools, Collaboration and DevOps[QE 2018] Paul Gerrard – Automating Assurance: Tools, Collaboration and DevOps
[QE 2018] Paul Gerrard – Automating Assurance: Tools, Collaboration and DevOpsFuture Processing
 
Artificial Intelligence.pptx learn and practice
Artificial Intelligence.pptx learn and practiceArtificial Intelligence.pptx learn and practice
Artificial Intelligence.pptx learn and practicePavankalayankusetty
 
NLP & Machine Learning - An Introductory Talk
NLP & Machine Learning - An Introductory Talk NLP & Machine Learning - An Introductory Talk
NLP & Machine Learning - An Introductory Talk Vijay Ganti
 
NLP & Machine Learning - An Introductory Talk
NLP & Machine Learning - An Introductory Talk NLP & Machine Learning - An Introductory Talk
NLP & Machine Learning - An Introductory Talk Vijay Ganti
 
MLSEV. Machine Learning: Business Perspective
MLSEV. Machine Learning: Business PerspectiveMLSEV. Machine Learning: Business Perspective
MLSEV. Machine Learning: Business PerspectiveBigML, Inc
 
How to Build an AI System A Complete Guide.pdf
How to Build an AI System A Complete Guide.pdfHow to Build an AI System A Complete Guide.pdf
How to Build an AI System A Complete Guide.pdfLaura Miller
 
How to Build an AI System A Complete Guide.pdf
How to Build an AI System A Complete Guide.pdfHow to Build an AI System A Complete Guide.pdf
How to Build an AI System A Complete Guide.pdfLaura Miller
 
W1_Lec01_Lec02_Introduction.pptx
W1_Lec01_Lec02_Introduction.pptxW1_Lec01_Lec02_Introduction.pptx
W1_Lec01_Lec02_Introduction.pptxJavaid Iqbal
 
Reinforcement Learning In AI Powerpoint Presentation Slide Templates Complete...
Reinforcement Learning In AI Powerpoint Presentation Slide Templates Complete...Reinforcement Learning In AI Powerpoint Presentation Slide Templates Complete...
Reinforcement Learning In AI Powerpoint Presentation Slide Templates Complete...SlideTeam
 
Ai demystified for HR and TA leaders
Ai demystified for HR and TA leadersAi demystified for HR and TA leaders
Ai demystified for HR and TA leadersAntonia Macrides
 
Better Service Management with Artificial Intelligence
Better Service Management with Artificial IntelligenceBetter Service Management with Artificial Intelligence
Better Service Management with Artificial IntelligenceTOPdesk
 
Better Service Management with AI
Better Service Management with AIBetter Service Management with AI
Better Service Management with AITOPdesk
 
Whats Next for Machine Learning
Whats Next for Machine LearningWhats Next for Machine Learning
Whats Next for Machine LearningOgilvy Consulting
 

Ähnlich wie [DevDay2019] How do I test AI models? - By Minh Hoang, Senior QA Engineer at KMS (20)

AI Meets HR
AI Meets HRAI Meets HR
AI Meets HR
 
From c# Into Machine Learning
From c# Into Machine LearningFrom c# Into Machine Learning
From c# Into Machine Learning
 
[DSC Europe 22] On the Aspects of Artificial Intelligence and Robotic Autonom...
[DSC Europe 22] On the Aspects of Artificial Intelligence and Robotic Autonom...[DSC Europe 22] On the Aspects of Artificial Intelligence and Robotic Autonom...
[DSC Europe 22] On the Aspects of Artificial Intelligence and Robotic Autonom...
 
Essential concepts for machine learning
Essential concepts for machine learning Essential concepts for machine learning
Essential concepts for machine learning
 
Hacking Predictive Modeling - RoadSec 2018
Hacking Predictive Modeling - RoadSec 2018Hacking Predictive Modeling - RoadSec 2018
Hacking Predictive Modeling - RoadSec 2018
 
Artificial Intelligence.pptx
Artificial Intelligence.pptxArtificial Intelligence.pptx
Artificial Intelligence.pptx
 
[QE 2018] Paul Gerrard – Automating Assurance: Tools, Collaboration and DevOps
[QE 2018] Paul Gerrard – Automating Assurance: Tools, Collaboration and DevOps[QE 2018] Paul Gerrard – Automating Assurance: Tools, Collaboration and DevOps
[QE 2018] Paul Gerrard – Automating Assurance: Tools, Collaboration and DevOps
 
Artificial Intelligence.pptx learn and practice
Artificial Intelligence.pptx learn and practiceArtificial Intelligence.pptx learn and practice
Artificial Intelligence.pptx learn and practice
 
NLP & Machine Learning - An Introductory Talk
NLP & Machine Learning - An Introductory Talk NLP & Machine Learning - An Introductory Talk
NLP & Machine Learning - An Introductory Talk
 
NLP & Machine Learning - An Introductory Talk
NLP & Machine Learning - An Introductory Talk NLP & Machine Learning - An Introductory Talk
NLP & Machine Learning - An Introductory Talk
 
MLSEV. Machine Learning: Business Perspective
MLSEV. Machine Learning: Business PerspectiveMLSEV. Machine Learning: Business Perspective
MLSEV. Machine Learning: Business Perspective
 
How to Build an AI System A Complete Guide.pdf
How to Build an AI System A Complete Guide.pdfHow to Build an AI System A Complete Guide.pdf
How to Build an AI System A Complete Guide.pdf
 
How to Build an AI System A Complete Guide.pdf
How to Build an AI System A Complete Guide.pdfHow to Build an AI System A Complete Guide.pdf
How to Build an AI System A Complete Guide.pdf
 
W1_Lec01_Lec02_Introduction.pptx
W1_Lec01_Lec02_Introduction.pptxW1_Lec01_Lec02_Introduction.pptx
W1_Lec01_Lec02_Introduction.pptx
 
Reinforcement Learning In AI Powerpoint Presentation Slide Templates Complete...
Reinforcement Learning In AI Powerpoint Presentation Slide Templates Complete...Reinforcement Learning In AI Powerpoint Presentation Slide Templates Complete...
Reinforcement Learning In AI Powerpoint Presentation Slide Templates Complete...
 
Ai demystified for HR and TA leaders
Ai demystified for HR and TA leadersAi demystified for HR and TA leaders
Ai demystified for HR and TA leaders
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
Better Service Management with Artificial Intelligence
Better Service Management with Artificial IntelligenceBetter Service Management with Artificial Intelligence
Better Service Management with Artificial Intelligence
 
Better Service Management with AI
Better Service Management with AIBetter Service Management with AI
Better Service Management with AI
 
Whats Next for Machine Learning
Whats Next for Machine LearningWhats Next for Machine Learning
Whats Next for Machine Learning
 

Mehr von DevDay.org

[DevDay2019] Lean UX - By Bryant Castro, Bryant Castro at Wizeline
[DevDay2019] Lean UX - By  Bryant Castro,  Bryant Castro at Wizeline[DevDay2019] Lean UX - By  Bryant Castro,  Bryant Castro at Wizeline
[DevDay2019] Lean UX - By Bryant Castro, Bryant Castro at WizelineDevDay.org
 
[DevDay2019] Why you'll lose without UX Design - By Szilard Toth, CTO at e·pi...
[DevDay2019] Why you'll lose without UX Design - By Szilard Toth, CTO at e·pi...[DevDay2019] Why you'll lose without UX Design - By Szilard Toth, CTO at e·pi...
[DevDay2019] Why you'll lose without UX Design - By Szilard Toth, CTO at e·pi...DevDay.org
 
[DevDay2019] Things i wish I knew when I was a 23-year-old Developer - By Chr...
[DevDay2019] Things i wish I knew when I was a 23-year-old Developer - By Chr...[DevDay2019] Things i wish I knew when I was a 23-year-old Developer - By Chr...
[DevDay2019] Things i wish I knew when I was a 23-year-old Developer - By Chr...DevDay.org
 
[DevDay2019] Designing design teams - Christopher Nguyen, UX Manager at Wizeline
[DevDay2019] Designing design teams - Christopher Nguyen, UX Manager at Wizeline[DevDay2019] Designing design teams - Christopher Nguyen, UX Manager at Wizeline
[DevDay2019] Designing design teams - Christopher Nguyen, UX Manager at WizelineDevDay.org
 
[DevDay2019] Growth Hacking - How to double the benefits of your startup with...
[DevDay2019] Growth Hacking - How to double the benefits of your startup with...[DevDay2019] Growth Hacking - How to double the benefits of your startup with...
[DevDay2019] Growth Hacking - How to double the benefits of your startup with...DevDay.org
 
[DevDay2019] Collaborate or die: The designers’ guide to working with develop...
[DevDay2019] Collaborate or die: The designers’ guide to working with develop...[DevDay2019] Collaborate or die: The designers’ guide to working with develop...
[DevDay2019] Collaborate or die: The designers’ guide to working with develop...DevDay.org
 
[DevDay2019] How AI is changing the future of Software Testing? - By Vui Nguy...
[DevDay2019] How AI is changing the future of Software Testing? - By Vui Nguy...[DevDay2019] How AI is changing the future of Software Testing? - By Vui Nguy...
[DevDay2019] How AI is changing the future of Software Testing? - By Vui Nguy...DevDay.org
 
[DevDay2019] Hands-on Machine Learning on Google Cloud Platform - By Thanh Le...
[DevDay2019] Hands-on Machine Learning on Google Cloud Platform - By Thanh Le...[DevDay2019] Hands-on Machine Learning on Google Cloud Platform - By Thanh Le...
[DevDay2019] Hands-on Machine Learning on Google Cloud Platform - By Thanh Le...DevDay.org
 
[DevDay2019] Micro Frontends Architecture - By Thang Pham, Senior Software En...
[DevDay2019] Micro Frontends Architecture - By Thang Pham, Senior Software En...[DevDay2019] Micro Frontends Architecture - By Thang Pham, Senior Software En...
[DevDay2019] Micro Frontends Architecture - By Thang Pham, Senior Software En...DevDay.org
 
[DevDay2019] Power of Test Automation and DevOps combination - One click savi...
[DevDay2019] Power of Test Automation and DevOps combination - One click savi...[DevDay2019] Power of Test Automation and DevOps combination - One click savi...
[DevDay2019] Power of Test Automation and DevOps combination - One click savi...DevDay.org
 
[DevDay2019] How to quickly become a Senior Engineer - By Tran Anh Minh, CEO ...
[DevDay2019] How to quickly become a Senior Engineer - By Tran Anh Minh, CEO ...[DevDay2019] How to quickly become a Senior Engineer - By Tran Anh Minh, CEO ...
[DevDay2019] How to quickly become a Senior Engineer - By Tran Anh Minh, CEO ...DevDay.org
 
[Devday2019] Dev start-up - By Le Trung, Founder & CEO at Hifiveplus and Edu...
[Devday2019]  Dev start-up - By Le Trung, Founder & CEO at Hifiveplus and Edu...[Devday2019]  Dev start-up - By Le Trung, Founder & CEO at Hifiveplus and Edu...
[Devday2019] Dev start-up - By Le Trung, Founder & CEO at Hifiveplus and Edu...DevDay.org
 
[DevDay2019] Web Development In 2019 - A Practical Guide - By Hoang Nhu Vinh,...
[DevDay2019] Web Development In 2019 - A Practical Guide - By Hoang Nhu Vinh,...[DevDay2019] Web Development In 2019 - A Practical Guide - By Hoang Nhu Vinh,...
[DevDay2019] Web Development In 2019 - A Practical Guide - By Hoang Nhu Vinh,...DevDay.org
 
[DevDay2019] Opportunities and challenges for human resources during the digi...
[DevDay2019] Opportunities and challenges for human resources during the digi...[DevDay2019] Opportunities and challenges for human resources during the digi...
[DevDay2019] Opportunities and challenges for human resources during the digi...DevDay.org
 
[DevDay2019] Python Machine Learning with Jupyter Notebook - By Nguyen Huu Th...
[DevDay2019] Python Machine Learning with Jupyter Notebook - By Nguyen Huu Th...[DevDay2019] Python Machine Learning with Jupyter Notebook - By Nguyen Huu Th...
[DevDay2019] Python Machine Learning with Jupyter Notebook - By Nguyen Huu Th...DevDay.org
 
[DevDay2019] Do you dockerize? Are your containers safe? - By Pham Hong Khanh...
[DevDay2019] Do you dockerize? Are your containers safe? - By Pham Hong Khanh...[DevDay2019] Do you dockerize? Are your containers safe? - By Pham Hong Khanh...
[DevDay2019] Do you dockerize? Are your containers safe? - By Pham Hong Khanh...DevDay.org
 
[DevDay2019] Develop a web application with Kubernetes - By Nguyen Xuan Phong...
[DevDay2019] Develop a web application with Kubernetes - By Nguyen Xuan Phong...[DevDay2019] Develop a web application with Kubernetes - By Nguyen Xuan Phong...
[DevDay2019] Develop a web application with Kubernetes - By Nguyen Xuan Phong...DevDay.org
 
[DevDay2019] Paradigm shift towards effective Scrum - By Tam Doan, Agile Coac...
[DevDay2019] Paradigm shift towards effective Scrum - By Tam Doan, Agile Coac...[DevDay2019] Paradigm shift towards effective Scrum - By Tam Doan, Agile Coac...
[DevDay2019] Paradigm shift towards effective Scrum - By Tam Doan, Agile Coac...DevDay.org
 
[DevDay2019] JAM Stack - By Ngo Thi Ni, Web Developer at Agility IO
[DevDay2019] JAM Stack - By Ngo Thi Ni, Web Developer at Agility IO[DevDay2019] JAM Stack - By Ngo Thi Ni, Web Developer at Agility IO
[DevDay2019] JAM Stack - By Ngo Thi Ni, Web Developer at Agility IODevDay.org
 
[DevDay2019] Layering GraphQL on top of existing infrastructure - By Phan Tha...
[DevDay2019] Layering GraphQL on top of existing infrastructure - By Phan Tha...[DevDay2019] Layering GraphQL on top of existing infrastructure - By Phan Tha...
[DevDay2019] Layering GraphQL on top of existing infrastructure - By Phan Tha...DevDay.org
 

Mehr von DevDay.org (20)

[DevDay2019] Lean UX - By Bryant Castro, Bryant Castro at Wizeline
[DevDay2019] Lean UX - By  Bryant Castro,  Bryant Castro at Wizeline[DevDay2019] Lean UX - By  Bryant Castro,  Bryant Castro at Wizeline
[DevDay2019] Lean UX - By Bryant Castro, Bryant Castro at Wizeline
 
[DevDay2019] Why you'll lose without UX Design - By Szilard Toth, CTO at e·pi...
[DevDay2019] Why you'll lose without UX Design - By Szilard Toth, CTO at e·pi...[DevDay2019] Why you'll lose without UX Design - By Szilard Toth, CTO at e·pi...
[DevDay2019] Why you'll lose without UX Design - By Szilard Toth, CTO at e·pi...
 
[DevDay2019] Things i wish I knew when I was a 23-year-old Developer - By Chr...
[DevDay2019] Things i wish I knew when I was a 23-year-old Developer - By Chr...[DevDay2019] Things i wish I knew when I was a 23-year-old Developer - By Chr...
[DevDay2019] Things i wish I knew when I was a 23-year-old Developer - By Chr...
 
[DevDay2019] Designing design teams - Christopher Nguyen, UX Manager at Wizeline
[DevDay2019] Designing design teams - Christopher Nguyen, UX Manager at Wizeline[DevDay2019] Designing design teams - Christopher Nguyen, UX Manager at Wizeline
[DevDay2019] Designing design teams - Christopher Nguyen, UX Manager at Wizeline
 
[DevDay2019] Growth Hacking - How to double the benefits of your startup with...
[DevDay2019] Growth Hacking - How to double the benefits of your startup with...[DevDay2019] Growth Hacking - How to double the benefits of your startup with...
[DevDay2019] Growth Hacking - How to double the benefits of your startup with...
 
[DevDay2019] Collaborate or die: The designers’ guide to working with develop...
[DevDay2019] Collaborate or die: The designers’ guide to working with develop...[DevDay2019] Collaborate or die: The designers’ guide to working with develop...
[DevDay2019] Collaborate or die: The designers’ guide to working with develop...
 
[DevDay2019] How AI is changing the future of Software Testing? - By Vui Nguy...
[DevDay2019] How AI is changing the future of Software Testing? - By Vui Nguy...[DevDay2019] How AI is changing the future of Software Testing? - By Vui Nguy...
[DevDay2019] How AI is changing the future of Software Testing? - By Vui Nguy...
 
[DevDay2019] Hands-on Machine Learning on Google Cloud Platform - By Thanh Le...
[DevDay2019] Hands-on Machine Learning on Google Cloud Platform - By Thanh Le...[DevDay2019] Hands-on Machine Learning on Google Cloud Platform - By Thanh Le...
[DevDay2019] Hands-on Machine Learning on Google Cloud Platform - By Thanh Le...
 
[DevDay2019] Micro Frontends Architecture - By Thang Pham, Senior Software En...
[DevDay2019] Micro Frontends Architecture - By Thang Pham, Senior Software En...[DevDay2019] Micro Frontends Architecture - By Thang Pham, Senior Software En...
[DevDay2019] Micro Frontends Architecture - By Thang Pham, Senior Software En...
 
[DevDay2019] Power of Test Automation and DevOps combination - One click savi...
[DevDay2019] Power of Test Automation and DevOps combination - One click savi...[DevDay2019] Power of Test Automation and DevOps combination - One click savi...
[DevDay2019] Power of Test Automation and DevOps combination - One click savi...
 
[DevDay2019] How to quickly become a Senior Engineer - By Tran Anh Minh, CEO ...
[DevDay2019] How to quickly become a Senior Engineer - By Tran Anh Minh, CEO ...[DevDay2019] How to quickly become a Senior Engineer - By Tran Anh Minh, CEO ...
[DevDay2019] How to quickly become a Senior Engineer - By Tran Anh Minh, CEO ...
 
[DevDay2019] How do I test AI models? - By Minh Hoang, Senior QA Engineer at KMS

  • 1. How I Test AI Model. DEVDAY 2019, April 06, 2019
  • 2. I'm Minh Hoang: a tester, a member of the Technology team, fond of new technology, and a challenge-taker.
  • 3. Objectives: Share the tools used, the key metrics in AI testing, and how to evaluate an AI model.
  • 4. Agenda: (1) About A.I: What is machine learning; Myths & Facts about AI; Myths & Facts about Chatbot. (2) How I Test The AI Model: The right metrics for evaluating the ML model; How we test FAQ model; Demo. (3) Take Away. (4) References: Tools & Libraries.
  • 6. What Is Machine Learning? Machine learning is the subfield of computer science that gives computers the ability to learn without being explicitly programmed.
  • 7. Myths And Facts About A.I
      Myth: Artificial intelligence and machine learning will wipe out all the jobs. Fact: A.I is no different from other technological advances in that it helps humans become more effective and processes more efficient.
      Myth: "Cognitive AI" technologies are able to understand and solve new problems the way the human brain can. Fact: "Cognitive" technologies can't solve problems they weren't designed to solve.
      Myth: You need a Ph.D. to work in machine learning & data science. Fact: Nowadays, many documents and tutorials on the Internet help people approach the machine learning world step by step.
  • 8. What Is Chatbot? A computer program designed to simulate conversation with human users, especially over the Internet.
  • 9. Myths And Facts About Chatbot
      Myth: Chatbots have only been around for a short while. Fact: ELIZA is one of the most well-known Chatbot therapists, and the bot was created about 50 years ago.
      Myth: Text or voice is the only way to interact with Bots. Fact: Chatbot platforms allow users to interact with them via graphical interfaces or graphical widgets, and recent Chatbot platforms follow this development approach.
      Myth: All Chatbot platforms use AI. Fact: Not all Chatbot platforms use AI. Most Chatbot platforms are rule-based and follow a simple, autonomous process, something along the lines of a decision tree.
  • 10. How We Test The AI Model
  • 11. The Right Metric For Evaluating ML Models. Regression: MSPE, MSAE, R Square, Adjusted R Square. Classification: Precision-Recall, ROC-AUC, Accuracy, Log-Loss. Unsupervised Models: Rand Index, Mutual Information. Others: CV Error, Heuristic methods to find K, BLEU Score (NLP).
  • 12. Commonly Used Metrics In Classification: Confusion Matrix
                           Actual positive                   Actual negative
      Predicted positive   True positive                     False positive (Type I error)
      Predicted negative   False negative (Type II error)    True negative
  • 13. Commonly Used Metrics In Classification: Accuracy. Percentage of total items classified correctly. Formula: Accuracy = (TP + TN) / (TP + TN + FP + FN)
  • 14. Commonly Used Metrics In Classification: Recall/Sensitivity/TPR (True Positive Rate). Number of items correctly identified as positive out of all items that are actually positive. Formula: Recall = TP / (TP + FN)
  • 15. Commonly Used Metrics In Classification: Precision. Number of items correctly identified as positive out of total items identified as positive. Formula: Precision = TP / (TP + FP)
  • 16. Commonly Used Metrics In Classification: F1 Score. It is the harmonic mean of precision and recall. Formula: F1 = 2 × Precision × Recall / (Precision + Recall)
      Precision   Recall   F1
      1           1        1
      0.1         0.1      0.1
      0.5         0.5      0.5
      1           0.1      0.182
      0.3         0.8      0.436
      0.8         0.3      0.436
  • 17. What Is FAQ Model?
  • 18. The Process To Test FAQ Model
      Prepare test data: crawl FAQ data; generate questions from FAQ data.
      Run test: train the model with FAQ data; run the test.
      Analyze result: pre-process the raw result; calculate metrics to evaluate the AI model in classification; visualize the metrics.
      Model result: select the threshold value.
  • 19. How We Define Test Data Set? Collect FAQ question data (manually and automatically). Use NLTK to generate new question data (NLG). Add self-defined question data.
  • 20. How We Evaluate The AI Model? Train with domain X and run the test defined for domain X.
  • 21. How We Analyze The Result? Pre-process the raw result. Calculate metrics to evaluate the AI model in classification. Visualize the metrics.
  • 22. Demo
  • 24. Take Away • Know the main metrics for evaluating an ML model. • Know how to test a classification AI model. • It is up to your self-learning skills and adaptability to decide whether working on ___ projects (AI, blockchain, VR, etc.) is difficult. • Use automation to reduce the time and effort needed to prepare test data.
  • 26. Tools & Libraries • API: requests and Postman. • AI/ML: nltk, difflib, plot.ly, pandas and numpy.
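
The classification metrics on slides 12 to 16 and the "select the threshold value" step on slide 18 can be sketched in plain Python. This is a minimal sketch, not the code shown in the talk: the `(actual_positive, score)` sample layout, the function names, the sample data, and the choice of F1 as the selection criterion are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical test result: (is the item actually positive?, model confidence score).
Sample = Tuple[bool, float]


def confusion_counts(samples: Iterable[Sample], threshold: float) -> Tuple[int, int, int, int]:
    """Return (TP, FP, FN, TN), predicting positive when score >= threshold."""
    tp = fp = fn = tn = 0
    for actual_positive, score in samples:
        predicted_positive = score >= threshold
        if predicted_positive and actual_positive:
            tp += 1
        elif predicted_positive:
            fp += 1  # Type I error
        elif actual_positive:
            fn += 1  # Type II error
        else:
            tn += 1
    return tp, fp, fn, tn


def metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Accuracy, precision, recall and F1; a ratio with a zero denominator is reported as 0.0."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def best_threshold(samples: List[Sample], candidates: Iterable[float]) -> float:
    """Pick the candidate threshold with the highest F1 (assumed criterion)."""
    return max(candidates, key=lambda t: metrics(*confusion_counts(samples, t))["f1"])


# Made-up sample data, for illustration only.
SAMPLES: List[Sample] = [
    (True, 0.95), (True, 0.80), (True, 0.55), (True, 0.30),
    (False, 0.60), (False, 0.40), (False, 0.20), (False, 0.10),
]

if __name__ == "__main__":
    for t in (0.25, 0.5, 0.75):
        print(t, confusion_counts(SAMPLES, t), metrics(*confusion_counts(SAMPLES, t)))
    print("selected threshold:", best_threshold(SAMPLES, [0.25, 0.5, 0.75]))
```

Maximizing F1 is only one possible criterion; a team that cares more about avoiding wrong answers than about missed questions might instead pick the lowest threshold that keeps precision above a target.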

Editor's Notes

  1. Artificial intelligence and machine learning will wipe out all the jobs: Technology has been threatening jobs and displacing jobs throughout history. Telephone switching technology replaced human operators. Automatic call directors replaced receptionists. Word processing and voicemail replaced secretaries, email replaced inter-office couriers. Call center technology innovation has added efficiency and effectiveness at various stages of standing up customer service capabilities—from recruiting new reps using machine learning to screen resumes, to selecting the right training program based on specific learning styles, to call routing based on sentiment of the caller and disposition of the rep, to integration of various information sources and channels of communication. In each of these processes, technology augmentation enhanced the capabilities of humans. Were some jobs replaced? Perhaps, but more jobs were created, albeit requiring different skills. The use of AI-driven chatbots and virtual assistants is another iteration of this ongoing evolution. It needs to be thought of as augmentation rather than complete automation and replacement. Humans engage, machines simplify. There will always be the need for humans in the loop to interact with humans at some level. Bots and digital workers will enable the “super CSR” of the future and enable increasing levels of service with declining costs. At the same time, the information complexity of our world is increasing and prompting the need for human judgment. Some jobs will be lost, but the need and desire for human interaction at critical decision points will increase, and the CSR’s role will change from answering rote questions to providing better customer service at a higher level, especially for interactions requiring emotional engagement and judgment. 
“Cognitive AI” technologies are able to understand and solve new problems the way the human brain can: Cognitive AI simulates how a human might deal with ambiguity and nuance; however, we are a long way from AI that can extend learning to new problem areas. AI is only as good as the data on which it is trained, and humans still need to define the scenarios and use cases under which it will operate. Within those scenarios, cognitive AI offers significant value, but AI cannot define new scenarios in which it can successfully operate. This capability is referred to as “general AI” and there is much debate about when, if ever, it will emerge. For computers to answer broad questions and approach problems the way that humans do will require technological breakthroughs that are not yet on the horizon.
  2. RMSE: Root Mean Square Error. MAE (Mean Absolute Error): the average of the absolute differences between the predicted values and the observed values. BLEU: Bilingual Evaluation Understudy.
  3. Recall or Sensitivity or TPR (True Positive Rate): Number of items correctly identified as positive out of all actual positives, TP/(TP+FN). It is defined as the proportion of true positives among the points that are actually positive. Specificity or TNR (True Negative Rate): Number of items correctly identified as negative out of all actual negatives, TN/(TN+FP). Precision: Number of items correctly identified as positive out of total items identified as positive, TP/(TP+FP). It is defined as the proportion of true positives among the points classified as positive. False Positive Rate or Type I Error: Number of items wrongly identified as positive out of all actual negatives, FP/(FP+TN). False Negative Rate or Type II Error: Number of items wrongly identified as negative out of all actual positives, FN/(FN+TP).
  4. Recall or Sensitivity or TPR (True Positive Rate): Number of items correctly identified as positive out of all actual positives, TP/(TP+FN). It is defined as the proportion of true positives among the points that are actually positive. In other words, it is the rate at which the model correctly predicts positive values.
  5. Precision: Number of items correctly identified as positive out of total items identified as positive, TP/(TP+FP). It is defined as the proportion of true positives among the points classified as positive. In other words, it is the model's ability to classify positives correctly.
  6. Precision: Number of items correctly identified as positive out of total items identified as positive, TP/(TP+FP). It is defined as the proportion of true positives among the points classified as positive. In other words, it is the model's ability to classify positives correctly. Recall or Sensitivity or TPR (True Positive Rate): Number of items correctly identified as positive out of all actual positives, TP/(TP+FN). It is defined as the proportion of true positives among the points that are actually positive. In other words, it is the rate at which the model correctly predicts positive values (it reflects how much positive data is missed).
     Model 1: ideal.
     Model 2: poor, because it predicts positive values with low precision and also misses positive values.
     Model 3: balanced.
     Model 4: precision is perfect, but the rate of finding positives is low. Example: the data set has 100 positive values, but the model predicts only 1 value as positive, and that value is correctly predicted as positive.
     Model 5: precision is low, but the rate of finding positives is high. Example: the data set has 100 positive values, the model predicts 80 values as positive, but only 10 of them are actually positive.
     Model 6: precision is high, but the rate of finding positives is low. Example: the data set has 100 positive values, the model predicts 30 values as positive, and 20 of them are actually positive.
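
The six precision/recall pairs from the F1 slide can be checked with a few lines of Python. This is a minimal sketch for illustration; the function name `f1_score` and the `MODELS` list are ours, and only the precision/recall values come from the slide.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) for models 1 to 6, taken from the F1 slide.
MODELS = [(1.0, 1.0), (0.1, 0.1), (0.5, 0.5), (1.0, 0.1), (0.3, 0.8), (0.8, 0.3)]

if __name__ == "__main__":
    for number, (precision, recall) in enumerate(MODELS, start=1):
        score = f1_score(precision, recall)
        print(f"Model {number}: precision={precision}, recall={recall}, F1={score:.3f}")
```

Because the harmonic mean is dominated by the smaller of its two inputs, a model with perfect precision but very low recall (model 4) still scores poorly, and swapping precision and recall (models 5 and 6) leaves F1 unchanged.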