IBM Cognitive Seminar, March 2015: WatsonSim (final)

IBM Watson in the Classroom
Wlodek Zadrozny (UNC Charlotte/formerly IBM Research)
Sean Gallagher (UNC Charlotte)
Watson Polymath Ideas
Valeria de Paiva (Nuance)
Lawrence S. Moss (Indiana University)
WatsonSim development
Walid Shalaby
Adarsh Avadhani and others

Similar work on Watson in the classroom
RPI
Columbia U.
UT Austin
CMU(?)
???

“Watson became possibly the first nonhuman millionaire by besting its human competition”
Arguably, this event started the era of “cognitive computing”

The Jeopardy! Challenge: Solved in 2011
No replication of the solution as of 2015
- Broad/Open Domain
- Complex Language
- High Precision
- Accurate Confidence
- High Speed
Sample clues:
- $600: In cell division, mitosis splits the nucleus & cytokinesis splits this liquid cushioning the nucleus
- $200: If you're standing, it's the direction you should look to check out the wainscoting.
- $2000: Of the 4 countries in the world that the U.S. does not have diplomatic relations with, the one that’s farthest north
- $1000: The first person mentioned by name in ‘The Man in the Iron Mask’ is this hero of a previous book by the same author.
Based on a slide from IBM

Two Research Challenges:
- Replication of IBM Watson performance
- Understanding why Watson works
Watson heterogeneity is perfect for introducing students to IR and NLP
Opportunity for MS and undergrad student research
Data sets are available on J-Archive
Deeper research questions: we know how Watson works, but we don’t know why it works
The Watson architecture was described in detail in the IBM J. R&D and in patents

Watson in the classroom: Spring 2014
- Motivation: teaching computer science using solved challenges
- Content: “Semantic Technologies in IBM Watson” (provided by IBM)
- Students (20): upper-level undergrad (1/3), MS (2/3), 1 PhD
- Since we didn’t have any code, we decided to build a Watson simulator
  - Students followed the idea of the IBM Watson architecture, simplifying whenever possible, e.g. no UIMA
- Used as a way to learn:
  - Information Retrieval (Bing, Google, Lucene, Indri)
  - Elements of machine learning (using Weka, logistic regression)
  - Elements of NLP: why NLU is difficult, POS tagging, parsing (with OpenNLP)
  - Data preparation: regular expressions, polite data crawling, etc.
  - …
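The data-preparation item above (regular expressions) can be illustrated with a minimal sketch. It assumes clue text scraped from a web page still carries HTML tags and entities; the function name `clean_clue` and the sample input are hypothetical and are not taken from WatsonSim.

```python
import html
import re

# Hypothetical helper patterns, for illustration only.
TAG_RE = re.compile(r"<[^>]+>")   # any HTML tag
SPACE_RE = re.compile(r"\s+")     # runs of whitespace, including newlines


def clean_clue(raw: str) -> str:
    """Strip HTML tags, decode entities, and collapse whitespace."""
    no_tags = TAG_RE.sub(" ", raw)       # replace each tag with a space
    decoded = html.unescape(no_tags)     # e.g. "&amp;" becomes "&"
    return SPACE_RE.sub(" ", decoded).strip()


if __name__ == "__main__":
    sample = '<td class="clue">mitosis splits the nucleus &amp;\n cytokinesis splits this</td>'
    print(clean_clue(sample))
```

A real pipeline would likely need more rules (quotes, media links, category notes); this only shows the general shape of the step.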

Teaching Objective: Learning IR and NLP
All students should learn all the technologies involved.
Grading should be based on the degree of mastery.

The Resulting System as of Spring 2014
Shaded components were more complete.

WatsonSim Accuracy
- Bing, Lucene and Indri as search
- Wikipedia, Wikiquote, and Shakespeare as sources
- Using n-gram scores, parse tree comparisons, LAT matching
- SVM-based score aggregation
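The slides do not say how the n-gram scores were computed. As one plausible form, here is a minimal sketch that scores a passage by the fraction of the question's word n-grams it contains; the function names and the choice of bigrams are assumptions for illustration, not the WatsonSim implementation.

```python
import re
from typing import List, Set, Tuple


def tokens(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(words: List[str], n: int) -> Set[Tuple[str, ...]]:
    """Return the set of word n-grams; empty if there are fewer than n words."""
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def ngram_overlap(question: str, passage: str, n: int = 2) -> float:
    """Fraction of the question's n-grams that also occur in the passage."""
    q = ngrams(tokens(question), n)
    if not q:
        return 0.0
    p = ngrams(tokens(passage), n)
    return len(q & p) / len(q)


if __name__ == "__main__":
    print(ngram_overlap("the iron mask", "The Man in the Iron Mask"))
```

A score like this would be one feature among several (parse tree comparisons, LAT matching) passed to the aggregation step.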

Integrating contributions of different teams was often challenging
- We logged over 3500 runs, recording accuracies
- Peak accuracy is around 26.6% for the top candidate
- 36.6% for the top three candidates
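The two figures above are top-1 and top-3 accuracies. A minimal sketch of that measurement follows, assuming each question has a ranked candidate list and a single gold answer compared by case-insensitive exact match. The function name and the toy data are illustrative only, and real Jeopardy! answer matching would need to be more forgiving.

```python
from typing import Sequence


def top_k_accuracy(ranked: Sequence[Sequence[str]], gold: Sequence[str], k: int) -> float:
    """Fraction of questions whose gold answer is among the first k candidates."""
    if len(ranked) != len(gold):
        raise ValueError("ranked and gold must have the same length")
    if not gold:
        return 0.0
    hits = 0
    for candidates, answer in zip(ranked, gold):
        # Case-insensitive exact match: a simplifying assumption.
        if answer.lower() in {c.lower() for c in candidates[:k]}:
            hits += 1
    return hits / len(gold)


if __name__ == "__main__":
    # Toy data, invented for illustration; not results from the logged runs.
    ranked = [
        ["cytoplasm", "nucleus"],
        ["up", "down"],
        ["Cuba", "Iran", "Bhutan", "North Korea"],
    ]
    gold = ["cytoplasm", "down", "North Korea"]
    print(top_k_accuracy(ranked, gold, 1), top_k_accuracy(ranked, gold, 3))
```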

Current Status: Individual Study
- Started the Watson MOOC
- Reading Watson papers
- Extending WatsonSim as a practicum for individual studies
  - Adding new scorers
  - Adding question analyzers and classifiers
  - Adding new sources
- Code available on GitHub: https://github.com/SeanTater/uncc2014watsonsim

Plans
- Cognitive computing class
- Use as a vehicle for individual study and undergrad/MS research
- Research (proposal) around the “why” question

Two Research Challenges:
- Replication of IBM Watson performance
- Understanding why Watson works
Deeper research questions: we know how Watson works, but we don’t know why it works
Main points:
1. Despite IBM Watson’s success, we don’t know why it works.
2. Figuring it out can be an open collaborative project, Polymath style?

Why should NLP researchers care?
- We have a mismatch between the academic/scientific theory of meaning and NLU and what technical experience seems to be telling us
- Inadequate theories might be limiting our ability to make progress in NLP
- There are interesting research questions, given Watson’s unorthodox approach to QA

A challenge in representing meaning of text?
Schuetze 2013 argues that “meaning is heterogeneous and that semantic theory will always consist of distinct modules that are governed by different principles”
- We don’t have a formal theory that would support this view
- A formal model of Watson might be a starting point
The(?) meaning is constructed in multiple steps

Some questions for a formal model of Watson?
- What is the role of search in computing meaning?
  - In Watson, formally speaking, it constrains the entities that end up in the correct discourse model.
  - Can any collection of text be reorganized that way? Dynamically?
- Semantics of the deferred type evaluation/deferred meaning computation?
- Formal model of scoring?
  - Meaning through interaction? Or intersection of constraints?
- Is a Watson-like model fundamental? Or an engineering feat?

Other sample research topics
- Role of formal semantics and theorem proving
  - “the overall performance of QA systems is directly related to the depth of NLP resources”
  - “the prover boosts the performance of the QA system on TREC questions by 30%” D. Moldovan in 2003
  - Similar results reported by MacCartney and Manning 2009
- Could interactive theorem provers, e.g. Coq and HOL, be adapted to improve the performance of QA systems, and NLP systems in general? (Same question for automated provers.)
- Role of natural logics

Conclusions
- A formal model of Watson might help in building more realistic theories of natural language understanding
- The Polymath model might be useful in creating a formal theory/model of Watson
- More realistic theories might help in building better NLP systems
- Experiments with Watson-models are possible
- Now, experiments with IBM Watson in the ‘cloud’ are possible
- Watson can be the basis for teaching NLP
