Taking up the Gaokao Challenge:
An Information Retrieval Approach
Gong Cheng, Weixi Zhu, Ziwei Wang, Jianghui Chen, Yuzhong Qu
National Key Laboratory for Novel Software Technology
Nanjing University, China
Websoft
What is Gaokao?
• National Higher Education Entrance Examination,
a.k.a. China’s SAT, but
  • more difficult
  • testing more subjects:
    • Chinese literature, Mathematics, English language, and
    • Humanities (History, Geography, Political Education)
      or Natural Sciences (Physics, Chemistry, Biology)
Why is Gaokao difficult for the computer?
• Questions in Gaokao challenge QA systems:
• multiple sentences to understand
• domain-specific expressions to parse (e.g., quotes, formulas, maps)
• unspecified, seemingly unbounded resources to search
Overview of the Approach
Human: Recollecting Relevant Knowledge → Drawing Evidence → Reasoning
Computer: Retrieving Pages → Ranking and Filtering Pages → Assessing Options
Stage 1: Retrieving Pages
• Retrieving Concept Pages
(Step 1/2: leftmost longest title matching)
The Effectiveness of Confucianism chapter of the Xunzi said:
(The King of Zhou) ruled the land and founded seventy-one (feudal) states,
of which fifty-three (governors) were from the Ji family.
It shows that in the Zhou Dynasty, feoffments were mainly granted to relatives.
The King of Zhou would invest those relatives with
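The slide does not give an implementation of leftmost longest title matching, so the following is only a minimal sketch of the general technique. The function name, the character-level scan, and the tiny title set are assumptions made for illustration; the real system would match against the full set of page titles.

```python
def leftmost_longest_matches(text, titles):
    """Scan left to right; at each position keep the longest title starting there.

    `titles` is a hypothetical set of page titles (an assumption, not the
    authors' data). Matching is done on characters, which also suits text
    without whitespace between words.
    """
    max_len = max((len(t) for t in titles), default=0)
    matches = []
    i = 0
    while i < len(text):
        found = None
        # Try the longest span first so that longer titles win over shorter ones.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in titles:
                found = candidate
                break
        if found:
            matches.append((i, found))
            i += len(found)  # continue after the match, so matches never overlap
        else:
            i += 1
    return matches


titles = {"Zhou", "Zhou Dynasty", "King of Zhou", "Xunzi"}
text = "in the Zhou Dynasty, the King of Zhou"
print(leftmost_longest_matches(text, titles))
# [(7, 'Zhou Dynasty'), (25, 'King of Zhou')]
```

Note that "Zhou" alone is never reported here: both occurrences are covered by a longer title that starts earlier or at the same position.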
Stage 1: Retrieving Pages
• Retrieving Concept Pages
(Step 2/2: context-based disambiguation)
Feudalism may refer to:
• Feudalism (in China), existed during the Shang and the Zhou dynasty, and was replaced by
centralization of authority during and after the Qin dynasty…
• Feudalism (in Europe), prevailed in the Middle Ages from the 5th to the 15th century…
• Feudalism (in Japan), originated from ritsuryo in the Heian period from the 8th to the 12th
century…
In the example question above, a matched title leads to a disambiguation page.
Context helps disambiguation.
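How the context is compared with each candidate sense is not stated on the slide. As one possible reading, the sketch below picks the sense whose description shares the most non-stopword words with the question. The overlap measure, the stopword list, and the function names are assumptions; the candidate descriptions are taken from the disambiguation page shown above.

```python
import re

# A tiny, hand-picked stopword list (an assumption for this sketch only).
STOPWORDS = {"the", "in", "to", "of", "and", "from", "was", "by", "it", "that"}


def content_words(s):
    """Lowercased alphabetic words with stopwords removed."""
    return set(re.findall(r"[a-z]+", s.lower())) - STOPWORDS


def disambiguate(context, candidates):
    """Return the candidate title whose description overlaps most with the context."""
    ctx = content_words(context)
    return max(candidates, key=lambda title: len(ctx & content_words(candidates[title])))


candidates = {
    "Feudalism (in China)": "existed during the Shang and the Zhou dynasty, and was "
                            "replaced by centralization of authority during and after the Qin dynasty",
    "Feudalism (in Europe)": "prevailed in the Middle Ages from the 5th to the 15th century",
    "Feudalism (in Japan)": "originated from ritsuryo in the Heian period from the 8th to the 12th century",
}
context = "It shows that in the Zhou Dynasty, feoffments were mainly granted to relatives."
print(disambiguate(context, candidates))
# Feudalism (in China)
```

Here only the Chinese sense shares content words ("zhou", "dynasty") with the question, so it is selected.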
Stage 1: Retrieving Pages
• Retrieving Quote Pages
(exact content match)
In the example question above, a quote matches the content of a page.
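Exact content matching for quotes can be pictured as a substring test against page text. The sketch below assumes whitespace normalization before comparison and uses a made-up two-page corpus; both the helper names and the page contents are hypothetical.

```python
def normalize(s):
    """Collapse runs of whitespace so line breaks do not defeat the match."""
    return " ".join(s.split())


def quote_pages(quote, pages):
    """Return titles of pages whose content contains the quote verbatim."""
    q = normalize(quote)
    return [title for title, content in pages.items() if q in normalize(content)]


# Hypothetical page contents, for illustration only.
pages = {
    "Xunzi (book)": "... ruled the land and founded seventy-one states,\n"
                    "of which fifty-three were from the Ji family ...",
    "Zhou dynasty": "The Zhou dynasty followed the Shang dynasty.",
}
quote = "founded seventy-one states, of which fifty-three were from the Ji family"
print(quote_pages(quote, pages))
# ['Xunzi (book)']
```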
Stage 2: Ranking and Filtering Pages
• Centrality-based Ranking
(centrality = cosine similarity to the center of retrieved pages)
• Three kinds of vector space:
• words in a page
• links in a page
• categories of a page
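A minimal sketch of centrality-based ranking, assuming that the "center" is the sum of the retrieved pages' vectors and that each page is scored by its cosine similarity to that center. Only one vector space (word counts) is shown; how the three spaces are weighted or combined is not stated on the slide, and the example vectors are invented.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_by_centrality(page_vectors):
    """Order page titles by similarity to the center of all retrieved pages."""
    center = Counter()
    for vec in page_vectors.values():
        center.update(vec)  # adds counts; cosine ignores the overall scale
    return sorted(page_vectors, key=lambda p: cosine(page_vectors[p], center), reverse=True)


# Hypothetical word-count vectors for three retrieved pages.
page_vectors = {
    "Zhou dynasty": Counter(zhou=3, king=2, feudal=1),
    "Fengjian": Counter(zhou=2, feudal=2, king=1),
    "Heian period": Counter(japan=3, heian=2),
}
print(rank_by_centrality(page_vectors))
```

The page about the Heian period shares no terms with the other two, so it ends up farthest from the center and is ranked last.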
Stage 2: Ranking and Filtering Pages
• Domain-based Filtering
(within historical categories,
i.e., categories whose names contain the word
history, and their descendant categories)
• Relevance-based Ranking
(relevance to the stem and options)
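Domain-based filtering as described above can be sketched as a graph traversal: start from every category whose name contains "history", collect all descendant categories, and keep only pages that belong to one of them. The category graph, page assignments, and function names below are hypothetical.

```python
from collections import deque


def historical_categories(subcats):
    """Categories whose name contains 'history', plus all of their descendants.

    `subcats` maps a category name to its direct subcategories.
    """
    all_cats = set(subcats) | {c for kids in subcats.values() for c in kids}
    seeds = [c for c in all_cats if "history" in c.lower()]
    seen = set(seeds)
    queue = deque(seeds)
    while queue:  # breadth-first walk down the category tree
        cat = queue.popleft()
        for child in subcats.get(cat, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def filter_pages(page_cats, allowed):
    """Keep pages that have at least one allowed category."""
    return [page for page, cats in page_cats.items() if allowed & set(cats)]


# Hypothetical category graph and page-to-category assignments.
subcats = {
    "History of China": ["Zhou dynasty", "Qin dynasty"],
    "Zhou dynasty": ["Zhou dynasty kings"],
    "Physics": ["Mechanics"],
}
page_cats = {
    "King Wu of Zhou": ["Zhou dynasty kings"],
    "Newton's laws of motion": ["Mechanics"],
}
allowed = historical_categories(subcats)
print(filter_pages(page_cats, allowed))
# ['King Wu of Zhou']
```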
Stage 3: Assessing Options
• truth of an option
= extent to which question/pages can entail it
= extent to which pages from the stem can entail it
+ extent to which pages from it can entail the stem
(entailment = relevance measurement)
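The scoring rule above adds two directions of support: pages retrieved from the stem supporting the option, and pages retrieved from the option supporting the stem. The sketch below follows that structure but substitutes a deliberately crude relevance measure (the share of words covered) and a toy retrieval function; the corpus, the stem wording, and the options are all invented for illustration and are not the actual exam question or the authors' relevance measure.

```python
import re


def words(s):
    return set(re.findall(r"\w+", s.lower()))


def relevance(pages, text):
    """Placeholder entailment score: share of the words in `text` found in `pages`."""
    target = words(text)
    if not target:
        return 0.0
    seen = set()
    for page in pages:
        seen |= words(page)
    return len(target & seen) / len(target)


def truth(stem, option, retrieve):
    """Support for the option from the stem's pages plus support for the stem from the option's pages."""
    return relevance(retrieve(stem), option) + relevance(retrieve(option), stem)


def best_option(stem, options, retrieve):
    return max(options, key=lambda o: truth(stem, o, retrieve))


# Toy corpus and retrieval: a page is retrieved when its title occurs in the text.
corpus = {
    "zhou": "the Zhou kings granted hereditary land to relatives",
    "qin": "the Qin dynasty replaced fiefs with appointed officials",
}


def retrieve(text):
    return [content for title, content in corpus.items() if title in words(text)]


stem = "In the Zhou dynasty the king would invest relatives with"
options = ["hereditary land", "appointed officials"]
print(best_option(stem, options, retrieve))
# hereditary land
```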
Experiments
• Real-life questions in Gaokao or mock Gaokao
• QS-A: 123 questions, answerable based on Wikipedia
• QS-B: 454 questions, out of the scope of Wikipedia
Correctly answered questions:
QS-A: 43.09%
QS-B: 31.28%
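The slide reports percentages only. As a quick consistency check, the absolute counts they imply can be recovered from the question-set sizes; the counts printed below are inferred by rounding and are not stated in the slides.

```python
def implied_correct(n_questions, percent):
    """Number of correct answers implied by a reported percentage (rounded)."""
    return round(n_questions * percent / 100)


reported = {"QS-A": (123, 43.09), "QS-B": (454, 31.28)}
for name, (n, pct) in reported.items():
    k = implied_correct(n, pct)
    # Recompute the percentage from the inferred count to confirm it round-trips.
    print(f"{name}: {k}/{n} = {100 * k / n:.2f}%")
# QS-A: 53/123 = 43.09%
# QS-B: 142/454 = 31.28%
```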
Details can be found
in our poster!
