Swiss Group for Artificial Intelligence and Cognitive Science 
Intelligent Systems and Applications Workshop 2014, University of Basel 
Watson Technical Deep Dive 
© 2013 IBM Corporation 
@RomeoKienzler, IBM Innovation Center Zurich
(part of) my role @IBM 
● Accelerate cognitive computing 
● In Switzerland 
● Through 
● Academia 
● Startups/ISVs 
● Cloud 
Watson in the cloud: bit.ly/go4bluemix 
What Watson is not 
● Search Engine 
● Database System 
● HAL9000 
What Watson is 
● Cognitive System (Marketing) 
● Combination of 
● Information Retrieval 
● NLP 
● Structured + Unstructured Data ! 
● Runs on UIMA 
● Based on supervised learning 
What is a parser? 
● Annotate sentence with 
● Tags 
● Relationships 
● Probabilistic (e.g. Stanford) 
● Rule based 
● (English) Slot Grammar (ESG) 
Slot Grammar 
● Simplified 
● Lexicalist character 
● High focus on words 
● Low focus on structure 
● Assign words to slots 
I can resist everything except temptation 
● Subject (I) 
● Verb (can resist) 
● Object (everything except temptation) 
PAS Builder 
● Predicate-Argument Structure 
● Downstream of ESG 
● Reduces complexity of ESG 
“John opened Bill's door (with his key) 
John's key opened Bill's door 
Bill's door opened 
Bill's door was opened (by John)” 
OPEN (John door key) 
| | | 
Agent Theme Instrument 
Many ESG trees reduce to same PAS 
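The reduction of several surface forms to one structure can be sketched as follows. This is a toy illustration only: the `PAS` class, the `to_pas` function and the hand-written parse dictionaries are hypothetical and do not reflect the real ESG output or Watson's PAS builder.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PAS:
    """Toy predicate-argument structure with the three roles from the slide."""
    predicate: str
    agent: Optional[str] = None
    theme: Optional[str] = None
    instrument: Optional[str] = None


def to_pas(parse: dict) -> PAS:
    """Map a hand-written toy parse (hypothetical slot names) to a PAS."""
    verb = parse["verb"]
    subj, obj = parse.get("subj"), parse.get("obj")
    if parse.get("passive", False):
        # Passive: the surface subject is the theme, the by-phrase is the agent.
        return PAS(verb, agent=parse.get("by"), theme=subj,
                   instrument=parse.get("with"))
    if obj is None:
        # Intransitive ("Bill's door opened"): the surface subject is the theme.
        return PAS(verb, theme=subj)
    if parse.get("subj_is_instrument", False):
        # "John's key opened Bill's door": the subject fills the instrument role.
        return PAS(verb, theme=obj, instrument=subj)
    return PAS(verb, agent=subj, theme=obj, instrument=parse.get("with"))


active = {"verb": "open", "subj": "John", "obj": "door"}
passive = {"verb": "open", "subj": "door", "by": "John", "passive": True}
print(to_pas(active) == to_pas(passive))
```

Both parses yield the same frozen dataclass instance, which is the sense in which different trees collapse to one PAS.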
Relationships 
● Relationship Extractor 
● Combination of 
● Manual pattern specifications 
~30 types, high precision 
● Statistical methods 
~7000 types, low precision 
● SVMs on DBpedia/Wikipedia 
Relationships (2) 
“The Screwtape Letters” from a senior devil to 
an under devil are by this man better known for 
children’s books 
author(“this man”,“The Screwtape Letters”) 
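A manual pattern specification of the high-precision kind could look roughly like the sketch below. The regular expression and the function name `extract_author` are illustrative assumptions; the slides do not show Watson's actual pattern language.

```python
import re

# Hypothetical hand-written pattern: a quoted work, then "is/are by this <noun>".
AUTHOR_PATTERN = re.compile(
    r'[“"](?P<work>[^”"]+)[”"].*?\b(?:is|are) by (?P<author>this \w+)'
)


def extract_author(clue):
    """Return ("author", author_mention, work) or None if the pattern fails."""
    match = AUTHOR_PATTERN.search(clue)
    if match is None:
        return None
    return ("author", match.group("author"), match.group("work"))


clue = ("“The Screwtape Letters” from a senior devil to an under devil "
        "are by this man better known for children’s books")
print(extract_author(clue))
```

A pattern like this fires rarely but is almost always right when it does, which matches the "few types, high precision" trade-off.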
Ingestion 
● Corpus creation 
● Input format: TREC 
(Text Retrieval Conference) 
● Multiple HTML pages in one 
HDFS file 
● Parallel ingestion process 
(LiteScale) 
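Splitting one multi-document input file into individual documents might be sketched like this. It assumes the classic TREC SGML-style layout with `<DOC>`, `<DOCNO>` and `<TEXT>` tags; the exact tag set used in Watson's ingestion is not given in the slides, and `split_trec` is a hypothetical helper.

```python
import re

DOC_RE = re.compile(r"<DOC>(.*?)</DOC>", re.DOTALL)
DOCNO_RE = re.compile(r"<DOCNO>(.*?)</DOCNO>", re.DOTALL)
TEXT_RE = re.compile(r"<TEXT>(.*?)</TEXT>", re.DOTALL)


def split_trec(blob):
    """Return (docno, text) pairs for every well-formed <DOC> in the blob."""
    docs = []
    for body in DOC_RE.findall(blob):
        docno = DOCNO_RE.search(body)
        text = TEXT_RE.search(body)
        if docno is None or text is None:
            continue  # skip documents missing an id or a body
        docs.append((docno.group(1).strip(), text.group(1).strip()))
    return docs


sample = """<DOC>
<DOCNO> page-1 </DOCNO>
<TEXT>
<html><body>First page</body></html>
</TEXT>
</DOC>
<DOC>
<DOCNO> page-2 </DOCNO>
<TEXT>
<html><body>Second page</body></html>
</TEXT>
</DOC>"""
print(split_trec(sample))
```

Because each `<DOC>` is independent, chunks of such a file can be handed to parallel workers.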
Dictionary 
● Started with Wikipedia corpus 
● Keyword → Text structure 
● Transformation of free text 
● into Keyword → Text 
● optimization objective 
Knowledge Expansion 
● Follow links in content 
● Identify content keywords and link 
to new content 
● → generate more content in 
Keyword → Text form 
Question Analysis 
● Named entity recognition 
● Type identification / focus extraction 
● ESG/PAS 
● Relationship detection 
Question Analysis 
1) Extract focus 
2) Map to LAT 
3) Broad Type classification 
4) Detect if special handling is 
needed (e.g. nested question) 
Query Decomposition 
● Keyword identification 
● LAT (Lexical Answer Type) 
● IBM Pat. US20120078890 for 
confidence estimation of LAT 
● optimization objective: choosing 
keywords out of nontrivial set of 
words based on ML 
Query Decomposition 
In 1894 C.W. Post created his warm cereal 
drink Postum in this Michigan city 
● Focus: this Michigan city 
● LAT: city 
● Keywords: 1894, C.W. Post, 
created, warm, cereal, drink, 
Postum, Michigan, City 
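Focus and keyword extraction for this clue can be approximated with simple heuristics, as in the sketch below. It covers only those two steps. The stopword list, the "this plus up to two words" focus rule and the function names are assumptions made for illustration; the slides state that Watson chooses keywords with machine learning, which this sketch does not attempt.

```python
import re

# Tiny hypothetical stopword list, just enough for the example clue.
STOPWORDS = {"in", "his", "this"}


def extract_focus(clue):
    """Heuristic: the focus is 'this' plus up to two following words."""
    match = re.search(r"\bthis(?:\s+\w+){1,2}", clue)
    return match.group(0) if match else None


def extract_keywords(clue):
    """Keep every token that is not a stopword, preserving order."""
    tokens = re.findall(r"[A-Za-z0-9][A-Za-z0-9.]*", clue)
    return [t for t in tokens if t.lower() not in STOPWORDS]


clue = ("In 1894 C.W. Post created his warm cereal drink Postum "
        "in this Michigan city")
print(extract_focus(clue))
print(extract_keywords(clue))
```

The tokenizer splits "C.W. Post" into two tokens, whereas the slide treats it as one keyword; recognising such multiword names is where named entity recognition comes in.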
Primary Search 
● Lucene and Indri search engines 
● Search the Keyword → Text documents 
generated during preprocessing 
● Keyword associated with each retrieved 
document is added to the candidate 
answer list 
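The idea that a retrieved document's keyword becomes a candidate answer can be sketched with a tiny in-memory inverted index. This stands in for Lucene and Indri and does not use their APIs; the documents, the overlap-count ranking and the function names are illustrative assumptions.

```python
from collections import Counter, defaultdict


def build_index(docs):
    """Map each lowercased token to the set of document keywords containing it."""
    index = defaultdict(set)
    for keyword, text in docs.items():
        for token in text.lower().split():
            index[token].add(keyword)
    return index


def candidate_answers(index, query_keywords, top_k=3):
    """Rank document keywords by how many query keywords their text matches."""
    hits = Counter()
    for kw in query_keywords:
        for doc_keyword in index.get(kw.lower(), ()):
            hits[doc_keyword] += 1
    return [doc_keyword for doc_keyword, _ in hits.most_common(top_k)]


# Toy Keyword -> Text documents (hypothetical corpus content).
docs = {
    "Battle Creek": "city in michigan where c.w. post created postum",
    "Detroit": "largest city in michigan",
}
index = build_index(docs)
print(candidate_answers(index, ["Postum", "Michigan", "city", "created"]))
```

Real engines score with term weighting rather than raw overlap counts, but the output has the same shape: a ranked list of document keywords to be treated as candidate answers.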
Hypothesis generation 
Supporting Evidence Retrieval 
Unlike most sea animals, in the sea horse this pair 
of sense organs can move independently of one 
another 
Question decomposition: 
Which [sense organ] of [Sea Horse] move independently? 
Hypothesis generation: 
A Sea Horse can move its eyes independently. 
A Sea Horse can move its ears independently. 
A Sea Horse can move its skin independently. 
A Sea Horse can move its nose independently. 
A Sea Horse can move its tongue independently. 
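Generating one hypothesis per candidate amounts to filling the decomposed question with each candidate answer. A minimal sketch, assuming a hand-written template string (how Watson actually rewrites the clue is not shown in the slides):

```python
def generate_hypotheses(template, candidates):
    """Fill the answer slot of the template once per candidate answer."""
    return [template.format(answer=candidate) for candidate in candidates]


template = "A Sea Horse can move its {answer} independently."
candidates = ["eyes", "ears", "skin", "nose", "tongue"]
for hypothesis in generate_hypotheses(template, candidates):
    print(hypothesis)
```

Each resulting sentence can then be searched against the corpus to collect supporting evidence.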
http://angelalmassey.com/SHC/about.html 
Supporting Evidence 
● Generated Candidate Answer is 
● ESG'd 
● PAS'd 
● searched against corpus 
● LATs used to determine whether 
a candidate answer is an 
instance of the answer types 
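The LAT check can be pictured as a lookup in a type hierarchy. The sketch below uses a small hypothetical is-a table and returns a hard yes/no, whereas a real system would more plausibly produce a soft score from several type resources.

```python
# Hypothetical is-a table, for illustration only.
IS_A = {
    "eyes": "sense organ",
    "ears": "sense organ",
    "sense organ": "organ",
    "Battle Creek": "city",
}


def is_instance_of(candidate, lat, is_a=IS_A):
    """Walk up the toy is-a chain and report whether `lat` is reached."""
    seen = set()
    node = candidate
    while node in is_a and node not in seen:
        seen.add(node)  # guard against accidental cycles in the table
        node = is_a[node]
        if node == lat:
            return True
    return False


print(is_instance_of("eyes", "sense organ"))
print(is_instance_of("Battle Creek", "sense organ"))
```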
Scoring 
● Optimization objective 
(confidence estimation framework) 
● Relational (PRISMATIC, DBpedia) 
● Taxonomic, Geospatial 
● Temporal, Source Reliability 
● Gender, Name consistency 
● Passage Support 
● Theory consistency 
Scoring challenges 
● Feature significance different for 
● Different questions 
● Different question classes 
● Very heterogeneous features 
● Normalization problem 
● Missing features 
● Class imbalance 
Merging and ranking 
1. John Fitzgerald Kennedy, 2. Kennedy, 3. JFK 
● Different Scores 
● Merge to canonical form 
● Morphological 
● Pattern-based 
● Table Lookup 
● Partially generated from Wikipedia 
disambiguation pages 
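The table-lookup variant of merging can be sketched as below. The lookup table is hypothetical, and combining the scores of merged variants by taking the maximum is an assumption; the slides do not say how Watson combines them.

```python
# Hypothetical lookup table from surface variants to a canonical form.
CANONICAL = {
    "jfk": "John Fitzgerald Kennedy",
    "kennedy": "John Fitzgerald Kennedy",
    "john fitzgerald kennedy": "John Fitzgerald Kennedy",
}


def merge_candidates(scored):
    """Collapse variants onto their canonical form, keeping the best score."""
    merged = {}
    for answer, score in scored:
        canonical = CANONICAL.get(answer.strip().lower(), answer)
        # Assumption: the merged score is the maximum over all variants.
        merged[canonical] = max(score, merged.get(canonical, 0.0))
    return merged


print(merge_candidates([("JFK", 0.4), ("Kennedy", 0.6), ("Nixon", 0.2)]))
```

Without merging, the three Kennedy variants would compete against each other in the final ranking.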
Example 
MYTHING IN ACTION: One legend says this 
was given by the Lady of the Lake & thrown 
back in the lake on King Arthur’s death. 
● Watson merged sword + Excalibur 
to “sword” (canonical form) 
● Preserved relation 
● more_specific(sword)->Excalibur 
ML in Ranking 
● Experiments with logistic regression, support 
vector machines, linear and nonlinear 
kernels, ranking SVM, boosting, single and 
multilayer neural nets, decision trees, locally 
weighted learning 
● Finally: 
regularized logistic regression 
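For readers unfamiliar with the final model choice, here is a small L2-regularized logistic regression trained by batch gradient descent. It is a generic textbook sketch in pure Python, not Watson's training code; the feature values, learning rate and regularization strength are made up for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(X, y, l2=0.1, lr=0.5, epochs=500):
    """Fit weights w and bias b; the L2 penalty applies to w only."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (g / n + l2 * wj) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def confidence(w, b, x):
    """Estimated probability that the candidate with features x is correct."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


# Toy data: two feature scores per candidate, label 1 = correct answer.
X = [[1.0, 0.9], [0.9, 0.8], [0.1, 0.2], [0.2, 0.1]]
y = [1, 1, 0, 0]
w, b = train(X, y)
print(confidence(w, b, [1.0, 0.9]) > confidence(w, b, [0.1, 0.2]))
```

The model output is a probability, which is why it can serve directly as a confidence estimate for ranking candidate answers.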
Normalization 
● Q set of all candidate answers 
● Feature x_ij 
● j feature, i answer 
● missing values imputed 
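One plausible reading of this slide is a per-question standardization of each feature j over the candidate set Q, with missing values filled in first. The sketch below assumes z-score standardization and mean imputation; neither choice is spelled out in the slide text, so treat both as assumptions.

```python
import math


def standardize(Q):
    """Z-score each feature column over the candidates of one question.

    Q is a list of feature vectors (rows i = answers, columns j = features);
    None marks a missing value, imputed here with the column mean.
    """
    d = len(Q[0])
    out = [list(row) for row in Q]
    for j in range(d):
        observed = [row[j] for row in Q if row[j] is not None]
        mu = sum(observed) / len(observed) if observed else 0.0
        var = (sum((v - mu) ** 2 for v in observed) / len(observed)
               if observed else 0.0)
        sigma = math.sqrt(var)
        for i, row in enumerate(Q):
            value = mu if row[j] is None else row[j]
            # A constant column carries no ranking signal, so map it to 0.
            out[i][j] = (value - mu) / sigma if sigma > 0 else 0.0
    return out


print(standardize([[1.0, None], [3.0, 2.0]]))
```

Standardizing within a question makes heterogeneous feature scales comparable across candidates of that question.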
Ranking 
● Based on training set n > 10K 
● IBM SPSS Modeler 
Evidence Sources 
Automatic Learning 
● Read through text semantically 
● Statistically rank annotated text 
● generate new knowledge 
● Inventors patent inventions 0.8 
● officials submit resignations 0.7 
● people earn degrees at schools 0.9 
● fluid is a liquid 0.6 
● liquid is a fluid 0.5 
● vessels sink 0.7 
● people sink 8-balls (0.5) (in pool/0.8) 
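Scores like the ones above could come from counting how often a frame occurs in parsed text. The sketch below assumes the score is a relative frequency of the object given subject and verb; the slides do not define how the numbers are computed, and the triples are made up.

```python
from collections import Counter


def frame_confidence(triples):
    """Score each (subject, verb, object) frame by count / count(subject, verb)."""
    frame_counts = Counter(triples)
    prefix_counts = Counter((s, v) for s, v, _ in triples)
    return {frame: count / prefix_counts[frame[:2]]
            for frame, count in frame_counts.items()}


# Hypothetical triples extracted from parsed sentences.
triples = ([("inventors", "patent", "inventions")] * 4
           + [("inventors", "patent", "designs")])
scores = frame_confidence(triples)
print(scores[("inventors", "patent", "inventions")])
```

Frames that recur across many sentences get high scores and can be treated as newly learned background knowledge.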
Next steps 
● The “Jeopardy!” Watson was 
● Open domain 
● Large training set 
● New “Watsons” are 
● Closed domain 
● Small, but growing training set 
Demo 
● bit.ly/go4bluemix 
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
 
ETHICAL HACKING dddddddddddddddfnandni.pptx
ETHICAL HACKING dddddddddddddddfnandni.pptxETHICAL HACKING dddddddddddddddfnandni.pptx
ETHICAL HACKING dddddddddddddddfnandni.pptx
 
Film cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasaFilm cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasa
 

Watson Technical Deep Dive Cognitive System Overview

  • 1. Swiss Group for Artificial Intelligence and Cognitive Science Intelligent Systems and Applications Workshop 2014, University of Basel Watson Technical Deep Dive © 2013 IBM Corporation @RomeoKienzler, IBM Innovation Center Zurich
  • 2. (part of) my role @IBM ● Accelerate cognitive computing ● In Switzerland ● Through ● Academia ● Startups/ISVs ● Cloud Watson in the cloud: bit.ly/go4bluemix
  • 3. What Watson is not ● Search Engine ● Database System ● HAL9000
  • 4.
  • 5. What Watson is ● Cognitive System (Marketing) ● Combination of ● Information Retrieval ● NLP ● Structured + Unstructured Data ! ● Runs on UIMA ● Based on supervised learning
  • 6. What is a parser? ● Annotate sentence with ● Tags ● Relationships ● Probabilistic (e.g. Stanford) ● Rule based ● (E) Slot Grammar
  • 7. Slot Grammar ● Simplified ● Lexicalist character ● High focus on words ● Low focus on structure ● Assign words to slots. Example: "I can resist everything except temptation" ● Subject (I) ● Verb (can resist) ● Object (everything except temptation)
  • 8. PAS - Builder ● Predicate-Argument Structure ● Downstream to ESG ● Reduces complexity of ESG. Example: “John opened Bill's door (with his key)”; “John's key opened Bill's door”; “Bill's door opened”; “Bill's door was opened (by John)” all map to OPEN(John, door, key) with the roles Agent (John), Theme (door), Instrument (key). Many ESG trees reduce to the same PAS
  • 9. Relationships ● Relationship Extractor ● Combination of ● Manual pattern specifications: ~30 types, high precision ● Statistical methods: ~7000 types, low precision ● SVMs on DBpedia/Wikipedia
  • 10. Relationships (2) “The Screwtape Letters” from a senior devil to an under devil are by this man better known for children’s books → author(“this man”,“The Screwtape Letters”)
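A manual pattern specification of the kind mentioned on slides 9 and 10 might be sketched as follows. This is a minimal illustration, assuming a single hypothetical regular expression over the raw clue text; the slides indicate that Watson works on ESG/PAS parses, so `AUTHOR_PATTERN` and `extract_author` are illustrative names, not part of any Watson API.

```python
import re

# Hypothetical surface pattern: a quoted title, followed later by
# "is/are/was/were by this <noun>". Real patterns would match parse trees.
AUTHOR_PATTERN = re.compile(
    r"[“\"](?P<work>[^”\"]+)[”\"].*?\b(?:is|are|was|were) by (?P<author>this \w+)"
)

def extract_author(clue):
    """Return ('author', focus_phrase, work_title), or None if nothing matches."""
    match = AUTHOR_PATTERN.search(clue)
    if match is None:
        return None
    return ("author", match.group("author"), match.group("work"))

clue = ("“The Screwtape Letters” from a senior devil to an under devil "
        "are by this man better known for children’s books")
relation = extract_author(clue)
print(relation)
```

A single hand-written pattern like this is precise but covers very little, which is consistent with the slide's contrast between a few high-precision manual types and many low-precision statistical ones.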
  • 11. Ingestion
  • 12. Ingestion ● Corpus creation ● Input format: TREC (Text Retrieval Conference) ● Multiple HTML pages in one HDFS file ● Parallel ingestion process (LiteScale)
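Splitting one file that holds many TREC-style records could look like the sketch below. It assumes the classic TREC text layout with `<DOC>`, `<DOCNO>` and `<TEXT>` tags; the exact schema used for Watson ingestion is not given on the slide, and the sample records and identifiers are made up for illustration.

```python
import re

DOC_RE = re.compile(r"<DOC>(.*?)</DOC>", re.DOTALL)
DOCNO_RE = re.compile(r"<DOCNO>(.*?)</DOCNO>", re.DOTALL)
TEXT_RE = re.compile(r"<TEXT>(.*?)</TEXT>", re.DOTALL)

def split_trec(blob):
    """Yield (docno, text) pairs from a string holding many TREC documents."""
    for doc in DOC_RE.finditer(blob):
        body = doc.group(1)
        docno = DOCNO_RE.search(body)
        text = TEXT_RE.search(body)
        if docno is None or text is None:
            continue  # skip malformed records instead of failing the whole file
        yield docno.group(1).strip(), text.group(1).strip()

# Toy input: two HTML pages packed into one TREC-style blob (hypothetical IDs).
sample = """<DOC>
<DOCNO> page-001 </DOCNO>
<TEXT>
<html><body>Postum was created in Battle Creek.</body></html>
</TEXT>
</DOC>
<DOC>
<DOCNO> page-002 </DOCNO>
<TEXT>
<html><body>Excalibur is a legendary sword.</body></html>
</TEXT>
</DOC>"""

docs = list(split_trec(sample))
for docno, text in docs:
    print(docno, len(text))
```

Because each record is self-delimiting, a file can be cut at `</DOC>` boundaries and handed to independent workers, which is what makes a parallel ingestion process straightforward.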
  • 13. Dictionary ● Started with the Wikipedia corpus ● Keyword → Text structure ● Transformation of free text ● into Keyword → Text ● optimization objective
  • 14. Knowledge Expansion ● Follow links in content ● Identify content keywords and link to new content ● → generate more content in Keyword → Text form
  • 15. Question Analysis
  • 16. Question Analysis ● Named entity recognition ● Type identification / Extract focus ● ESG/PAS ● Relationship detection
  • 17. Question Analysis 1) Extract focus 2) Map to LAT 3) Broad Type classification 4) Detect if special handling is needed (e.g. nested question)
  • 18. Query Decomposition
  • 19. Query Decomposition ● Keyword identification ● LAT (Lexical Answer Type) ● IBM Pat. US20120078890 for confidence estimation of LAT ● optimization objective: choosing keywords out of nontrivial set of words based on ML
  • 20. Query Decomposition "In 1894 C.W. Post created his warm cereal drink Postum in this Michigan city" ● Focus: this Michigan City ● LAT: Michigan ● Keywords: 1894, C.W. Post, created, warm, cereal, drink, Postum, Michigan, City
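The focus and keyword lists on slide 20 can be approximated with a very small sketch. It assumes a naive "this ..." pattern for the focus and a tiny hand-picked stopword list for keywords; slide 19 says Watson selects keywords with machine learning, so this is a stand-in, and `STOPWORDS`, `find_focus` and `keywords` are illustrative names.

```python
import re

# Tiny illustrative stopword list; a real system would not hard-code this.
STOPWORDS = {"in", "his", "this", "the", "a", "an", "of"}

def find_focus(clue):
    """Naive focus detection: 'this' plus up to two following words."""
    match = re.search(r"\bthis(?: \w+){1,2}", clue)
    return match.group(0) if match else None

def keywords(clue):
    """Keep every token that is not a stopword, preserving clue order."""
    tokens = re.findall(r"[A-Za-z0-9][\w.]*", clue)
    return [t for t in tokens if t.lower() not in STOPWORDS]

clue = ("In 1894 C.W. Post created his warm cereal drink Postum "
        "in this Michigan city")
focus = find_focus(clue)
kws = keywords(clue)
print(focus)
print(kws)
```

Unlike the slide, this tokenizer splits "C.W. Post" into two tokens; grouping them would need the named entity recognition step listed under Question Analysis.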
  • 21. Query Decomposition
  • 22. Primary Search
  • 23. Primary Search ● Lucene and Indri search engine ● Preprocessing generated Keyword → Text based documents ● Keyword associated with found document added to candidate answer list
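The idea that the keyword attached to a retrieved document becomes a candidate answer can be sketched with a toy inverted index. This is not Lucene or Indri; it is a minimal stand-in that ranks documents by how many query terms they contain, and the three documents are invented for illustration.

```python
import re
from collections import Counter, defaultdict

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Map each term to the set of document keywords (titles) containing it."""
    index = defaultdict(set)
    for title, text in docs.items():
        for term in tokenize(text):
            index[term].add(title)
    return index

def candidates(index, query_keywords, top_k=3):
    """Rank document keywords by the number of distinct query terms they match."""
    hits = Counter()
    for term in {t for kw in query_keywords for t in tokenize(kw)}:
        for title in index.get(term, ()):
            hits[title] += 1
    return [title for title, _ in hits.most_common(top_k)]

# Toy Keyword -> Text corpus (hypothetical content).
docs = {
    "Battle Creek": "Battle Creek is a city in Michigan where C.W. Post created the cereal drink Postum.",
    "Detroit": "Detroit is the largest city in Michigan.",
    "Postum": "Postum is a roasted grain drink created by C.W. Post.",
}
index = build_index(docs)
query = ["1894", "C.W.", "Post", "created", "warm", "cereal", "drink",
         "Postum", "Michigan", "city"]
ranked = candidates(index, query)
print(ranked)
```

The returned titles are only candidates; the later evidence and scoring stages decide which one is actually the answer.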
  • 24. Hypothesis generation
  • 25. Supporting Evidence Retrieval "Unlike most sea animals, in the sea horse this pair of sense organs can move independently of one another." Question decomposition: Which [sense organ] of [Sea Horse] move independently? Hypothesis generation: A Sea Horse can move its eyes independently. A Sea Horse can move its ears independently. A Sea Horse can move its skin independently. A Sea Horse can move its nose independently. A Sea Horse can move its tongue independently.
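The hypothesis list on slide 25 is the decomposed question with each candidate answer substituted for the focus. A minimal sketch, assuming the question has already been rewritten into a template with a hypothetical `{answer}` placeholder:

```python
def generate_hypotheses(template, candidate_answers):
    """Fill the focus slot of the template with each candidate answer."""
    return [template.format(answer=c) for c in candidate_answers]

template = "A Sea Horse can move its {answer} independently."
candidate_answers = ["eyes", "ears", "skin", "nose", "tongue"]

hypotheses = generate_hypotheses(template, candidate_answers)
for h in hypotheses:
    print(h)
```

Each generated sentence can then be treated as a query in its own right when looking for supporting passages.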
  • 27. Supporting Evidence ● Generated Candidate Answer is ● ESG'd ● PAS'd ● searched against corpus ● LATs used to determine whether a candidate answer is an instance of the answer types
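The two checks on slide 27 can be illustrated with a rough sketch: a passage-support score based on plain term overlap, and a type check against a lookup table. Both are simplifying assumptions; the slide says the real comparison runs on ESG/PAS output, and `TYPES`, `passage_support` and `matches_lat` are hypothetical names with invented data.

```python
import re

def terms(text):
    return set(re.findall(r"[a-z]+", text.lower()))

def passage_support(hypothesis, passages):
    """Best fraction of hypothesis terms found in any single passage (0..1)."""
    h = terms(hypothesis)
    if not h or not passages:
        return 0.0
    return max(len(h & terms(p)) / len(h) for p in passages)

# Hypothetical type table standing in for a taxonomy lookup.
TYPES = {"eyes": {"sense organ"}, "ears": {"sense organ"}, "fins": {"body part"}}

def matches_lat(candidate, lat):
    """True if the candidate is listed as an instance of the answer type."""
    return lat in TYPES.get(candidate, set())

passages = ["The sea horse can move its eyes independently of one another."]
eyes_score = passage_support("A Sea Horse can move its eyes independently.", passages)
ears_score = passage_support("A Sea Horse can move its ears independently.", passages)
print(eyes_score, ears_score)
print(matches_lat("eyes", "sense organ"), matches_lat("fins", "sense organ"))
```

Term overlap alone barely separates the two hypotheses, which hints at why many different scorers are combined in the next stage.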
  • 28. Supporting Evidence
  • 29. Scoring
  • 30. Scoring ● Optimization objective (confidence estimation framework) ● Relational (PRISMATIC, DBpedia) ● Taxonomic, Geospatial ● Temporal, Source Reliability ● Gender, Name consistency ● Passage Support ● Theory consistency
  • 31. Scoring challenges ● Feature significance different for ● Different questions ● Different question classes ● Very heterogeneous features ● Normalization problem ● Missing features ● Class imbalance
  • 32. Merging and ranking
  • 33. Merging and ranking 1. John Fitzgerald Kennedy 2. Kennedy 3. JFK ● Different Scores ● Merge to canonical form ● Morphological ● Pattern-based ● Table Lookup ● Partially generated from Wikipedia disambiguation pages
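The table-lookup variant of merging might be sketched as below. The lookup table, the scores, and the choice to keep the maximum score of the merged variants are all assumptions made for illustration; the slide does not say how the differing scores are combined.

```python
# Hypothetical lookup table mapping normalized surface forms to a canonical form.
CANONICAL = {
    "jfk": "John Fitzgerald Kennedy",
    "kennedy": "John Fitzgerald Kennedy",
    "john fitzgerald kennedy": "John Fitzgerald Kennedy",
}

def normalize(surface):
    """Lowercase and drop surrounding whitespace and trailing punctuation."""
    return surface.strip().strip(",.").lower()

def merge_candidates(scored):
    """Merge variants of one answer, keeping the best score (an assumption)."""
    merged = {}
    for surface, score in scored:
        key = CANONICAL.get(normalize(surface), surface.strip())
        merged[key] = max(score, merged.get(key, float("-inf")))
    return sorted(merged.items(), key=lambda kv: kv[1], reverse=True)

# Made-up scores for the three variants on the slide plus one distractor.
scored = [("John Fitzgerald Kennedy", 0.41), ("Kennedy,", 0.35),
          ("JFK", 0.52), ("Nixon", 0.20)]
merged = merge_candidates(scored)
print(merged)
```

Without this step the three Kennedy variants would compete with each other and dilute the evidence for a single answer.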
  • 34. Example MYTHING IN ACTION: One legend says this was given by the Lady of the Lake & thrown back in the lake on King Arthur’s death. ● Watson merged sword + Excalibur to “sword” (canonical form) ● Preserved relation ● more_specific(sword)->Excalibur
  • 35. ML in Ranking ● Experiments with logistic regression, support vector machines, linear and nonlinear kernels, ranking SVM, boosting, single and multilayer neural nets, decision trees, locally weighted learning ● Finally: regularized logistic regression
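Regularized logistic regression over answer features can be sketched in plain Python as follows. This is a minimal batch gradient descent with an L2 penalty; the two features, the six training rows, and the hyperparameters are invented for illustration and say nothing about Watson's actual feature set or training procedure.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logreg(X, y, l2=0.1, lr=0.5, epochs=500):
    """Batch gradient descent on mean log-loss plus (l2 / 2) * ||w||^2."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        for j in range(d):
            w[j] -= lr * (grad_w[j] / n + l2 * w[j])  # bias is not regularized
        b -= lr * grad_b / n
    return w, b

def confidence(w, b, x):
    """Estimated probability that the candidate with features x is correct."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Toy data: features = [passage_support, type_match]; label 1 = correct answer.
X = [[0.9, 1.0], [0.8, 1.0], [0.7, 0.0], [0.2, 0.0], [0.1, 1.0], [0.3, 0.0]]
y = [1, 1, 0, 0, 0, 0]
w, b = train_logreg(X, y)
strong = confidence(w, b, [0.9, 1.0])
weak = confidence(w, b, [0.2, 0.0])
print(strong > weak)
```

The output of `confidence` is a probability, so candidates can be ranked by it directly and the top value doubles as a confidence estimate for the final answer.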
  • 36. Normalization ● Q set of all candidate answers ● Feature x_ij ● j feature, i answer ● missing values imputed
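The formula from slide 36 did not survive in the transcript, so the following is only a sketch under two stated assumptions: each feature j is z-score standardized across the candidate answers i in Q for one question, and a missing value is imputed with the column mean before standardizing.

```python
import math

def standardize(rows):
    """rows[i][j] is feature j of candidate answer i; None marks a missing value."""
    n_features = len(rows[0])
    out = [list(r) for r in rows]
    for j in range(n_features):
        present = [r[j] for r in rows if r[j] is not None]
        mean = sum(present) / len(present) if present else 0.0
        var = sum((v - mean) ** 2 for v in present) / len(present) if present else 0.0
        std = math.sqrt(var)
        for i, r in enumerate(rows):
            v = mean if r[j] is None else r[j]  # assumed imputation: column mean
            out[i][j] = (v - mean) / std if std > 0 else 0.0
    return out

# Toy candidate set Q: three answers, two features, one missing value.
Q = [[0.2, 10.0], [0.4, None], [0.6, 30.0]]
Z = standardize(Q)
for row in Z:
    print(row)
```

Standardizing within one question puts heterogeneous scorers on a comparable scale, which speaks to the normalization problem listed among the scoring challenges.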
  • 37. Ranking ● Based on training set n > 10K ● IBM SPSS Modeler
  • 38. Evidence Sources
  • 39. Automatic Learning ● Read through text semantically ● Statistically rank annotated text ● generate new knowledge ● Inventors patent inventions 0.8 ● officials submit resignations 0.7 ● people earn degrees at schools 0.9 ● fluid is a liquid 0.6 ● liquid is a fluid 0.5 ● vessels sink 0.7 ● people sink 8-balls (0.5) (in pool/0.8)
  • 40. Next steps ● “Jeopardy!” - Watson was ● Open domain ● Large training set ● New “Watsons” are ● Closed domain ● Small, but growing training set
  • 41. Demo ● bit.ly/go4bluemix
  • 42. References [1] Jeffrey Kabot, “Deep Parsing” [2] Richard Nordquist, “slot and filler” [3] The Journal of Research and Development, Vol 56, 2012