This document presents an overview of using machine learning methods for question classification. It discusses past research that has used features such as words, part-of-speech tags, and named entities with classifiers like SNoW and SVMs. Later work incorporated semantic features from WordNet, such as hypernyms of the question's head word. The document outlines a plan to experiment with various feature types and classifiers using existing question classification datasets and open-source NLP tools. The goal is to automatically generate semantic features to improve over prior approaches.
1. Question Classification
Using Machine Learning Methods
Jennifer Lee
CSI 5386 Project Presentation
Fall 2008
2. Motivation
• An important step in QA
– To classify the question to the anticipated type
of the answer (semantically).
– More challenging than common search tasks.
• Q: What Canadian city has the largest
population?
Answer type: city
3. The Ambiguity Problem
• What is bipolar disorder?
• What do bats eat?
• What is the pH scale?
• Hard to categorize these questions into a
single class
– Need multiple class labels for a single
question.
4. Why Machine Learning?
• Manually constructing sets of rules that map a
question to its type is not efficient.
– Requires the analysis of a large number of
questions.
– Mapping questions into fine classes requires
the use of lexical items (specific words).
• A learned classifier enables one to define only a
small number of “type” features.
• Can be trained on a new taxonomy.
5. Li and Roth (2002):
Learning Question Classifiers
• Uses the SNoW learning architecture.
– Hierarchical classifiers
– 6 coarse classes: ABBREVIATION, ENTITY,
DESCRIPTION, HUMAN, LOCATION,
NUMERIC VALUE.
– 50 fine classes.
6. Li and Roth (cont)
• UIUC question classification dataset
– 5,500 training questions (from TREC 8 and 9,
including 500 rare questions).
– 500 test questions from TREC 10.
• Six primitive feature types:
– Words, POS tags, chunks, named entities,
head chunks, and semantically related words.
• Semantically related word list for each question
– e.g., “away” belongs to the class Rel(distance).
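As a rough illustration of these primitive features, a minimal sketch in Python (the related-word lexicon and the Rel(count) class name below are hypothetical stand-ins for the paper's hand-built lists):

```python
# Tiny hypothetical lexicon standing in for Li and Roth's
# hand-built lists of semantically related words.
RELATED = {
    "far": "Rel(distance)",
    "away": "Rel(distance)",
    "population": "Rel(count)",  # hypothetical class name
}

def extract_features(question):
    """Return word features plus semantic-class features
    for a single question string."""
    words = question.lower().rstrip("?").split()
    feats = [f"word={w}" for w in words]
    feats += [f"class={RELATED[w]}" for w in words if w in RELATED]
    return feats
```

In the real system these sparse features would be fed to the SNoW classifier; here they are just listed.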
7. Zhang and Lee (2003):
Question Classification using SVM
• Two kinds of features:
– Bag-of-words and bag-of-n-grams.
• SVM with a tree kernel
– Uses LIBSVM (Chang and Lin, 2001).
– Take advantage of the syntactic structures of
questions.
– Compared with Nearest Neighbors, Naïve
Bayes, Decision Tree, and SNoW.
9. Huang et al. (2008):
QC using Head Words and their Hypernyms
• In contrast to Li and Roth's, a compact feature set
was proposed:
– Head word
– Use WordNet to augment the semantic
features.
– Adopts Lesk's word sense disambiguation
algorithm.
12. Plan for the project
• Experiment with different feature types:
– Head chunks, semantic features for the head
chunk, named entities, word n-grams, and
word shape features.
• Use WordNet to automate the generation of
semantic features
– Find hypernyms.
– Apply Lesk's WSD to the head chunk.
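The simplified Lesk step could be sketched as follows (the glosses and sense names below are toy examples; WordNet would supply the real ones):

```python
def simplified_lesk(word, context, glosses):
    """Pick the sense whose gloss shares the most words with the
    question context (simplified Lesk; no stemming or stopwords)."""
    context_words = set(context.lower().split())
    best, best_overlap = None, -1
    for sense, gloss in glosses.items():
        overlap = len(context_words & set(gloss.lower().split()))
        if overlap > best_overlap:
            best, best_overlap = sense, overlap
    return best
```

The chosen sense then determines which hypernym chain to extract as semantic features.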
14. Resources
• Java interface to WordNet:
– http://wordnet.princeton.edu/links#SQL
• A syntactic parser for extracting the head
chunk feature:
– Berkeley parser (Petrov and Klein, 2007).
• Use the Ngram Statistics Package
16. References
• Li, X. and D. Roth. 2002. Learning Question
Classifiers. The 19th International Conference on
Computational Linguistics, vol. 1, pp. 1–7.
• Zhang, D. and W. S. Lee. 2003. Question
Classification using Support Vector Machines.
The ACM SIGIR Conference on Information
Retrieval, pp. 26–32.
• Huang, Z., M. Thint, and Z. Qin. 2008.
Question Classification using Head Words and
their Hypernyms. In EMNLP 2008.
17. References (cont)
• Roth, D., G. Kao, X. Li, R. Nagarajan, V.
Punyakanok, N. Rizzolo, W. Yih, C. O. Alm, and
L. G. Moran. 2002. Learning Components for a
Question Answering System. In TREC 2001.
• Brown, J. (IR Lab). Entity-Tagged Language
Models for Question Classification in a
QA System.
• Metzler, D. and W. B. Croft. 2003. Analysis of
Statistical Question Classification for Fact-Based
Questions.