SlideShare ist ein Scribd-Unternehmen logo
1 von 18
 INTRODUCTION
 BACKGROUND & PROBLEM FORMULATION
 VISION-BASEDWEB ENTITYEXTRACTION
 STATISTICALSNOWBALL FOR PATTERNDISCOVERY
 INTERACTIVEENTITYINFORMATION INTEGRATION
 CONCLUSION
 REFERENCES
 Theneed forcollecting andunderstanding Web informationabout a real-world
entity (such as a personor a product)iscurrently fulfilledmanually through
search engines.
 Informationabouta single entity mightappear in thousands of Web pages. Even
if a searchenginecould findall therelevantWeb pages about an entity,the user
would needtoshift through allthese pages to geta complete view of the entity.
 EntityCube: Anautomatically generated entity relationshipgraphbasedon
knowledgeextracted frombillionsofwebpages
 Entity Retrieval: Entity search engines can return a ranked list of entities
most relevant for a user query.
 Entity Relationship/Fact Mining and Navigation: Entity search engines
enable users to explore highly relevant information during searches to
discover interesting relationships/facts about the entities associated with
their queries.
 Prominence Ranking: Entity search engines detect the popularity of an
entity and enable users to browse entities in different categories ranked
by their prominence during a given time period.
A. Web Entities : We define the concept of Web Entity as the principal data
units about which Web information is to be collected, indexed and
ranked. Web entities are usually recognizable concepts such as people,
organization, locations, products, papers, conferences, or journals, which
have relevance to the application domain. Different types of entities are
used to representthe informationfordifferentconcepts.
B. Entity Search Engine:
 First, a crawler fetches web data related to the targeted entities & the crawled
data is classifiedinto differententity types.
 The information is put into the web entity store & entity search engines can be
constructed based on thestructured informationin theentitystore.
Visual Layout Features :
 Web pages usually contain many explicitor implicit visual separators
such as lines, blank area, image, font size and color, element size &
position.
Text Features :
 Text content is themost natural featuretouse for entity extraction.
 Inwebpages, thereare a lotofHTML elementswhichonly contain veryshort
text fragments.
 We do not furthersegment these shorttext fragmentsintoindividual words.
Instead, we consider them asthe atomic labelingunits for webentity
extraction. For long text sentences/paragraphswithin webpages,however, we
furthersegment them intotext fragmentsusing algorithmslikeSemi-CRF.
 The web information about a single entity may be distributed in diverse web
sources, the web entity extraction task should integrate all the knowledge
pieces extracted fromdifferentwebpages.
 The most challenging problem in entity information integration is name
disambiguation.
 Solve the name disambiguation problem with users we have introduced the
novel entityframework‘iknoweb’.
iKnoweb:
 iKnowebinteractivelyinvolves human intelligence forentity knowledge
miningproblems.
 It is acrowdsourcing approach whichcombine both the power ofknowledge
miningalgorithmsand user contributions.
iKnoweb Overview:
 One importantconcept we propose in iKnowebis MaximumRecognition Units
(MRU),whichservesas atomic units inthe interactivename disambiguation
process.
 MRU: It is a groupof knowledgepieceswhicharefully automatically assigned
tothe same entity identifierwith100% confidence that theyrefertothe same
entity.
 Detecting Maximum RecognitionUnits
 Question Generation
 MRU and Question Re-Ranking
 Network Effects
 InteractionOptimization
 Solves the name disambiguation problems together withusers in both
Microsoft Academic Search and EntityCube.
The statistical snowballwork to automatically discover text patterns
from billions of web pages leveraging the information redundancy
property of the Web is introduced. iKnoweb, an interactive knowledge
mining framework, which collaborates with the end users to connect the
extracted knowledge pieces mined from Web and builds an accurate
entity knowledge web.
 Eugene Agichtein, Luis Gravano: Snowball: extractingrelations from large
plain-textcollections. In Proceedings of the fifth ACM conference on
Digital libraries, pp.85-94, June 02-07, 2000, San Antonio, Texas, United
States.
 C.Cortes andV.Vapnik.Support-vector networks. MachineLearing, Vol.
20, Nr. 3 (1995),p.273-297,1995.
Statistical entity extraction from web

Weitere ähnliche Inhalte

Ähnlich wie Statistical entity extraction from web

Semantic Search Engine That Reads Your Mind
Semantic Search Engine That Reads Your MindSemantic Search Engine That Reads Your Mind
Semantic Search Engine That Reads Your MindAmit Sheth
 
Web Search Engine, Web Crawler, and Semantics Web
Web Search Engine, Web Crawler, and Semantics WebWeb Search Engine, Web Crawler, and Semantics Web
Web Search Engine, Web Crawler, and Semantics WebAatif19921
 
Cluster Based Web Search Using Support Vector Machine
Cluster Based Web Search Using Support Vector MachineCluster Based Web Search Using Support Vector Machine
Cluster Based Web Search Using Support Vector MachineCSCJournals
 
SEMANTIC CONTENT MANAGEMENT FOR ENTERPRISES AND NATIONAL SECURITY
SEMANTIC CONTENT MANAGEMENT FOR ENTERPRISES AND NATIONAL SECURITYSEMANTIC CONTENT MANAGEMENT FOR ENTERPRISES AND NATIONAL SECURITY
SEMANTIC CONTENT MANAGEMENT FOR ENTERPRISES AND NATIONAL SECURITYAmit Sheth
 
Semantic Publishing and Entity SEO - Conteference 20-11-2022
Semantic Publishing and Entity SEO - Conteference 20-11-2022Semantic Publishing and Entity SEO - Conteference 20-11-2022
Semantic Publishing and Entity SEO - Conteference 20-11-2022Massimiliano Geraci
 
Intelligent Semantic Web Search Engines: A Brief Survey
Intelligent Semantic Web Search Engines: A Brief Survey  Intelligent Semantic Web Search Engines: A Brief Survey
Intelligent Semantic Web Search Engines: A Brief Survey dannyijwest
 
Intelligent Semantic Web Search Engines: A Brief Survey
Intelligent Semantic Web Search Engines: A Brief Survey  Intelligent Semantic Web Search Engines: A Brief Survey
Intelligent Semantic Web Search Engines: A Brief Survey dannyijwest
 
The need for sophistication in modern search engine implementations
The need for sophistication in modern search engine implementationsThe need for sophistication in modern search engine implementations
The need for sophistication in modern search engine implementationsBen DeMott
 
Searchland: Search quality for Beginners
Searchland: Search quality for BeginnersSearchland: Search quality for Beginners
Searchland: Search quality for BeginnersValeria de Paiva
 
Humantics | Optimizing Your Content Strategy in an Entity-Driven World
Humantics | Optimizing Your Content Strategy in an Entity-Driven WorldHumantics | Optimizing Your Content Strategy in an Entity-Driven World
Humantics | Optimizing Your Content Strategy in an Entity-Driven WorldGrant Simmons
 
Design Issues for Search Engines and Web Crawlers: A Review
Design Issues for Search Engines and Web Crawlers: A ReviewDesign Issues for Search Engines and Web Crawlers: A Review
Design Issues for Search Engines and Web Crawlers: A ReviewIOSR Journals
 
Quality, Quantity, Web and Semantics
Quality, Quantity, Web and SemanticsQuality, Quantity, Web and Semantics
Quality, Quantity, Web and SemanticsZemanta
 
Quality, quantity, web and semantics
Quality, quantity, web and semanticsQuality, quantity, web and semantics
Quality, quantity, web and semanticsAndraz Tori
 

Ähnlich wie Statistical entity extraction from web (20)

Semantic Search Engine That Reads Your Mind
Semantic Search Engine That Reads Your MindSemantic Search Engine That Reads Your Mind
Semantic Search Engine That Reads Your Mind
 
Web Search Engine, Web Crawler, and Semantics Web
Web Search Engine, Web Crawler, and Semantics WebWeb Search Engine, Web Crawler, and Semantics Web
Web Search Engine, Web Crawler, and Semantics Web
 
Peerbelt_Presentation
Peerbelt_PresentationPeerbelt_Presentation
Peerbelt_Presentation
 
Cluster Based Web Search Using Support Vector Machine
Cluster Based Web Search Using Support Vector MachineCluster Based Web Search Using Support Vector Machine
Cluster Based Web Search Using Support Vector Machine
 
SEMANTIC CONTENT MANAGEMENT FOR ENTERPRISES AND NATIONAL SECURITY
SEMANTIC CONTENT MANAGEMENT FOR ENTERPRISES AND NATIONAL SECURITYSEMANTIC CONTENT MANAGEMENT FOR ENTERPRISES AND NATIONAL SECURITY
SEMANTIC CONTENT MANAGEMENT FOR ENTERPRISES AND NATIONAL SECURITY
 
Web Content Mining
Web Content MiningWeb Content Mining
Web Content Mining
 
Web content mining
Web content miningWeb content mining
Web content mining
 
Semantic Publishing and Entity SEO - Conteference 20-11-2022
Semantic Publishing and Entity SEO - Conteference 20-11-2022Semantic Publishing and Entity SEO - Conteference 20-11-2022
Semantic Publishing and Entity SEO - Conteference 20-11-2022
 
Intelligent Semantic Web Search Engines: A Brief Survey
Intelligent Semantic Web Search Engines: A Brief Survey  Intelligent Semantic Web Search Engines: A Brief Survey
Intelligent Semantic Web Search Engines: A Brief Survey
 
Intelligent Semantic Web Search Engines: A Brief Survey
Intelligent Semantic Web Search Engines: A Brief Survey  Intelligent Semantic Web Search Engines: A Brief Survey
Intelligent Semantic Web Search Engines: A Brief Survey
 
Search V Next Final
Search V Next FinalSearch V Next Final
Search V Next Final
 
The need for sophistication in modern search engine implementations
The need for sophistication in modern search engine implementationsThe need for sophistication in modern search engine implementations
The need for sophistication in modern search engine implementations
 
Searchland: Search quality for Beginners
Searchland: Search quality for BeginnersSearchland: Search quality for Beginners
Searchland: Search quality for Beginners
 
Humantics | Optimizing Your Content Strategy in an Entity-Driven World
Humantics | Optimizing Your Content Strategy in an Entity-Driven WorldHumantics | Optimizing Your Content Strategy in an Entity-Driven World
Humantics | Optimizing Your Content Strategy in an Entity-Driven World
 
Grant Simmons - Advanced Search Summit Napa 2021
Grant Simmons - Advanced Search Summit Napa 2021Grant Simmons - Advanced Search Summit Napa 2021
Grant Simmons - Advanced Search Summit Napa 2021
 
Design Issues for Search Engines and Web Crawlers: A Review
Design Issues for Search Engines and Web Crawlers: A ReviewDesign Issues for Search Engines and Web Crawlers: A Review
Design Issues for Search Engines and Web Crawlers: A Review
 
Quality, Quantity, Web and Semantics
Quality, Quantity, Web and SemanticsQuality, Quantity, Web and Semantics
Quality, Quantity, Web and Semantics
 
Quality, quantity, web and semantics
Quality, quantity, web and semanticsQuality, quantity, web and semantics
Quality, quantity, web and semantics
 
search engines
search enginessearch engines
search engines
 
G017254554
G017254554G017254554
G017254554
 

Kürzlich hochgeladen

Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfDr Vijay Vishwakarma
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.pptRamjanShidvankar
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentationcamerronhm
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024Elizabeth Walsh
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...Poonam Aher Patil
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.christianmathematics
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...pradhanghanshyam7136
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and ModificationsMJDuyan
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfPoh-Sun Goh
 
Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the ClassroomPooky Knightsmith
 
How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17Celine George
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17Celine George
 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxDr. Ravikiran H M Gowda
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxRamakrishna Reddy Bijjam
 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxPooja Bhuva
 
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptxExploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptxPooja Bhuva
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfSherif Taha
 
Google Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxDr. Sarita Anand
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibitjbellavia9
 

Kürzlich hochgeladen (20)

Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentation
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and Modifications
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the Classroom
 
How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptx
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptx
 
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptxExploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
Google Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptx
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibit
 

Statistical entity extraction from web

  • 1.
  • 2.  INTRODUCTION  BACKGROUND & PROBLEM FORMULATION  VISION-BASEDWEB ENTITYEXTRACTION  STATISTICALSNOWBALL FOR PATTERNDISCOVERY  INTERACTIVEENTITYINFORMATION INTEGRATION  CONCLUSION  REFERENCES
  • 3.  Theneed forcollecting andunderstanding Web informationabout a real-world entity (such as a personor a product)iscurrently fulfilledmanually through search engines.  Informationabouta single entity mightappear in thousands of Web pages. Even if a searchenginecould findall therelevantWeb pages about an entity,the user would needtoshift through allthese pages to geta complete view of the entity.
  • 4.  EntityCube: Anautomatically generated entity relationshipgraphbasedon knowledgeextracted frombillionsofwebpages
  • 5.  Entity Retrieval: Entity search engines can return a ranked list of entities most relevant for a user query.  Entity Relationship/Fact Mining and Navigation: Entity search engines enable users to explore highly relevant information during searches to discover interesting relationships/facts about the entities associated with their queries.  Prominence Ranking: Entity search engines detect the popularity of an entity and enable users to browse entities in different categories ranked by their prominence during a given time period.
  • 6. A. Web Entities : We define the concept of Web Entity as the principal data units about which Web information is to be collected, indexed and ranked. Web entities are usually recognizable concepts such as people, organization, locations, products, papers, conferences, or journals, which have relevance to the application domain. Different types of entities are used to representthe informationfordifferentconcepts.
  • 7. B. Entity Search Engine:  First, a crawler fetches web data related to the targeted entities & the crawled data is classifiedinto differententity types.  The information is put into the web entity store & entity search engines can be constructed based on thestructured informationin theentitystore.
  • 8. Visual Layout Features :  Web pages usually contain many explicitor implicit visual separators such as lines, blank area, image, font size and color, element size & position.
  • 9. Text Features :  Text content is themost natural featuretouse for entity extraction.  Inwebpages, thereare a lotofHTML elementswhichonly contain veryshort text fragments.  We do not furthersegment these shorttext fragmentsintoindividual words. Instead, we consider them asthe atomic labelingunits for webentity extraction. For long text sentences/paragraphswithin webpages,however, we furthersegment them intotext fragmentsusing algorithmslikeSemi-CRF.
  • 10.
  • 11.  The web information about a single entity may be distributed in diverse web sources, the web entity extraction task should integrate all the knowledge pieces extracted fromdifferentwebpages.  The most challenging problem in entity information integration is name disambiguation.  Solve the name disambiguation problem with users we have introduced the novel entityframework‘iknoweb’.
  • 12. iKnoweb:  iKnowebinteractivelyinvolves human intelligence forentity knowledge miningproblems.  It is acrowdsourcing approach whichcombine both the power ofknowledge miningalgorithmsand user contributions. iKnoweb Overview:  One importantconcept we propose in iKnowebis MaximumRecognition Units (MRU),whichservesas atomic units inthe interactivename disambiguation process.  MRU: It is a groupof knowledgepieceswhicharefully automatically assigned tothe same entity identifierwith100% confidence that theyrefertothe same entity.
  • 13.
  • 14.  Detecting Maximum RecognitionUnits  Question Generation  MRU and Question Re-Ranking  Network Effects  InteractionOptimization
  • 15.  Solves the name disambiguation problems together withusers in both Microsoft Academic Search and EntityCube.
  • 16. The statistical snowballwork to automatically discover text patterns from billions of web pages leveraging the information redundancy property of the Web is introduced. iKnoweb, an interactive knowledge mining framework, which collaborates with the end users to connect the extracted knowledge pieces mined from Web and builds an accurate entity knowledge web.
  • 17.  Eugene Agichtein, Luis Gravano: Snowball: extractingrelations from large plain-textcollections. In Proceedings of the fifth ACM conference on Digital libraries, pp.85-94, June 02-07, 2000, San Antonio, Texas, United States.  C.Cortes andV.Vapnik.Support-vector networks. MachineLearing, Vol. 20, Nr. 3 (1995),p.273-297,1995.