Linked Data for Information Extraction 
Challenge 2014 
Tasks and Results 
Robert Meusel and Heiko Paulheim
Task 
Creation of an information extraction system that scrapes
structured information from HTML web sites.
▪ The training dataset was created from HTML pages that are
annotated with Microformats hCard.
▪ The data is a subset of the WebDataCommons Microformats
dataset.
▪ The original data is provided by the Common Crawl Foundation,
the largest publicly available collection of web crawls.
Linked Data for Information Extraction Challenge 2014 - Task and Results
The Common Crawl Foundation (CC)
▪ Non-profit foundation dedicated to building and maintaining
an open crawl of the Web
▪ 9 crawl corpora from 2008 to 2014 available so far
▪ Crawling strategies:
• Earlier crawls used BFS (with link discovery), seeded with a large list of seeds
ranked by PageRank; current crawls are gathered using a seed list of more than
6 billion URLs from the blekko search index
• As a result, all crawls represent the popular part of the Web
▪ Data availability
• CC provides three different datasets for each crawl
• All data can be freely downloaded from AWS S3
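The BFS strategy mentioned above can be illustrated with a minimal sketch. It is not Common Crawl's crawler: the function name `bfs_crawl`, the `get_links` callback, and the toy link graph `LINKS` are all hypothetical, and no network access takes place.

```python
from collections import deque

def bfs_crawl(seeds, get_links, max_pages):
    """Visit pages breadth-first, starting from a ranked seed list.

    get_links(url) is a hypothetical callback returning the outgoing
    links discovered on a page (the "link discovery" step).
    """
    queue = deque(dict.fromkeys(seeds))  # keep seed order, drop duplicates
    seen = set(queue)
    order = []
    while queue and len(order) < max_pages:
        url = queue.popleft()
        order.append(url)
        for link in get_links(url):
            if link not in seen:         # enqueue each page only once
                seen.add(link)
                queue.append(link)
    return order

# Toy link graph standing in for the Web (invented for illustration).
LINKS = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["d.example"],
    "c.example": ["d.example", "a.example"],
    "d.example": [],
}
order = bfs_crawl(["a.example"], lambda u: LINKS.get(u, []), max_pages=10)
print(order)
```

Because pages close to the seeds are visited first, a crawl that stops early mostly covers pages near highly ranked seeds, which fits the remark that the crawls represent the popular part of the Web.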
The WebDataCommons Project
Extraction of Structured Data from the Common Crawl Corpora
▪ Extracts information annotated with the markup languages
Microformats, Microdata, and RDFa
▪ So far, three datasets have been gathered from the crawls of 2010,
2012, and 2013
[Figure labels: RDFa, Microdata, Microformats]
Extracting the Data
▪ Webmasters mark up their information directly within the HTML
page using one of the three markup languages
▪ Using Any23 (http://any23.apache.org/), this information is
extracted as RDF triples
Any23 
_:node1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Product> .
_:node1 <http://schema.org/Product/name> "Predator Instinct FG Fu\u00DFballschuh"@de .
_:node1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Offer> .
_:node1 <http://schema.org/Offer/price> "\u20AC 219,95"@de .
_:node1 <http://schema.org/Offer/priceCurrency> "EUR"@de .
…
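To show how such output can be consumed, here is a minimal sketch that splits simple triple lines into subject, predicate, and object and decodes `\uXXXX` escapes. It assumes the literals carry standard N-Triples escapes; the regular expression is a simplification (no datatype suffixes, no long-form `\U` escapes), and the names `parse_line` and `unescape` are illustrative, not part of Any23.

```python
import re

# One simple statement: subject, predicate, object, terminating dot.
# Simplified on purpose: blank nodes, IRIs, and literals with an
# optional language tag only.
TRIPLE = re.compile(
    r'^(_:\w+|<[^>]*>)\s+'                                   # subject
    r'(<[^>]*>)\s+'                                          # predicate
    r'(_:\w+|<[^>]*>|"(?:[^"\\]|\\.)*"(?:@[\w-]+)?)\s*\.$'   # object
)

def unescape(text):
    """Decode \\uXXXX escapes into the characters they stand for."""
    return re.sub(r'\\u([0-9A-Fa-f]{4})',
                  lambda m: chr(int(m.group(1), 16)), text)

def parse_line(line):
    """Return (subject, predicate, object) for one triple line."""
    m = TRIPLE.match(line.strip())
    if m is None:
        raise ValueError(f"not a simple triple: {line!r}")
    s, p, o = m.groups()
    return s, p, unescape(o)

lines = [
    '_:node1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Product> .',
    r'_:node1 <http://schema.org/Product/name> "Predator Instinct FG Fu\u00DFballschuh"@de .',
    r'_:node1 <http://schema.org/Offer/price> "\u20AC 219,95"@de .',
]
triples = [parse_line(l) for l in lines]
for t in triples:
    print(t)
```

For real data a full RDF parser is the safer choice; the sketch only makes the structure of the lines visible.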
The Original Dataset of 2013
▪ Over 1.7 million domains using at least one markup language
▪ Over 17 billion quads with over 4 billion records (typed entities)
▪ hCard is the most dominant among domains
Extraction of Challenge Dataset
▪ A subset of over 10k web pages was selected from the corpus,
including over 450k extracted triples (annotated with MF hCard)
• Training: 9 877 web pages / 373 501 triples
• Test: 2 379 web pages / 85 248 triples
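As a quick arithmetic check of the figures above, the following sketch sums the two splits and derives the training share. Only the four counts come from the slide; the variable names are illustrative.

```python
# Counts as stated on the slide.
train_pages, train_triples = 9_877, 373_501
test_pages, test_triples = 2_379, 85_248

total_pages = train_pages + test_pages        # consistent with "over 10k"
total_triples = train_triples + test_triples  # consistent with "over 450k"

# Share of the data that ends up in the training split.
train_page_share = train_pages / total_pages
train_triple_share = train_triples / total_triples

print(total_pages, total_triples)
print(round(train_page_share, 3), round(train_triple_share, 3))
```

The split therefore puts roughly four fifths of both pages and triples into the training set.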
Creation of the Gold Standard
▪ Input: annotated HTML pages and triples (extracted with Any23)
▪ After extraction of the triples, all hCard tags are replaced
• Replacement by randomly generated tags
  • stable per page, but different across pages
• Replacement of comments, as CMSs like to add comments such as
<!-- here is the name of the company -->
▪ Output
• Training:
  • Annotated HTML page
  • Cleaned HTML page
  • Triples
• Testing:
  • Cleaned HTML page
  • Triples (not public)
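The cleaning step can be sketched as follows. This is not the organizers' code: the list of hCard class names is a small illustrative subset, comments are simply removed (the slides only say they are "replaced"), and the name `clean_page` is hypothetical.

```python
import re
import secrets

# A few hCard class names; the full list used for the challenge is not
# given in the slides, so this subset is an assumption.
HCARD_CLASSES = {"vcard", "fn", "org", "adr", "tel", "email", "url",
                 "street-address", "locality", "postal-code", "country-name"}

CLASS_ATTR = re.compile(r'class="([^"]*)"')
COMMENT = re.compile(r'<!--.*?-->', re.DOTALL)

def clean_page(html):
    """Hide hCard markup in one page and drop HTML comments."""
    mapping = {}  # created per call: stable within a page, new for each page

    def replace_classes(match):
        tokens = []
        for name in match.group(1).split():
            if name in HCARD_CLASSES:
                # Reuse the token if this class was already seen on the page.
                name = mapping.setdefault(name, "c" + secrets.token_hex(4))
            tokens.append(name)
        return 'class="' + " ".join(tokens) + '"'

    without_comments = COMMENT.sub("", html)
    return CLASS_ATTR.sub(replace_classes, without_comments), mapping

page = ('<div class="vcard box"><!-- here is the name of the company -->'
        '<span class="fn">ACME</span><span class="fn org">ACME</span></div>')
cleaned, mapping = clean_page(page)
print(cleaned)
```

Since the mapping is rebuilt for every page, a system cannot learn a fixed substitute for `vcard` or `fn` from the training pages and has to rely on the page content and structure instead.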
Overview: Dataset Creation and Evaluation Process 
Evaluation
▪ Methodology: A triple in the submission and a triple in the
reference (extracted by Any23 from the original test HTML pages)
are considered equal if they belong to the same page and have
the same predicate and object.
▪ Baseline: Every page has at least one statement declaring that
there is one VCard, so the baseline outputs this statement for each page:
_:1 rdf:type hcard:Vcard .
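The matching rule can be turned into precision, recall, and F-measure as in the sketch below. It assumes micro-averaging over all pages, which the slides do not state; the function name `evaluate`, the property names `hcard:fn` and `hcard:tel`, and the toy data are invented for illustration.

```python
def evaluate(gold, submitted):
    """Compare per-page sets of (predicate, object) pairs.

    gold and submitted map a page id to a set of pairs; subjects are
    ignored, matching the rule that only predicate and object count.
    """
    tp = fp = fn = 0
    for page in gold.keys() | submitted.keys():
        g = gold.get(page, set())
        s = submitted.get(page, set())
        tp += len(g & s)   # statements found in both
        fp += len(s - g)   # submitted but not in the reference
        fn += len(g - s)   # in the reference but not submitted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

TYPE, VCARD = "rdf:type", "hcard:Vcard"
gold = {
    "page1": {(TYPE, VCARD), ("hcard:fn", "ACME Corp"), ("hcard:tel", "555-0100")},
    "page2": {("hcard:fn", "Jane Doe")},  # toy page without an explicit type
}
# Baseline: one VCard type statement for every page.
baseline = {page: {(TYPE, VCARD)} for page in gold}

precision, recall, f1 = evaluate(gold, baseline)
print(precision, recall, f1)
```

On this toy data the baseline is right on one page and wrong on the other, so its precision is 0.5, while its recall is low because it never predicts names or other attributes.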
Challenge Results
▪ We received one submission (which you will hear about in a few
minutes)
▪ The submission outperforms the baseline for recall and F-measure
▪ The gold standard is not perfect: the data also contains names
and other attributes without a type being given (whenever
webmasters did not model this). Even a perfect extraction system
would not reach a precision of 1.
Outlook: LD4IE Challenge 2015
▪ Include more classes (e.g. Microdata and/or RDFa)
▪ Add negative examples to create a more realistic setting
• as of today, systems can assume that there is something to extract in every test sample
• challenge: making sure that the negative examples contain no data that is
merely not marked up
▪ Improve the representativeness of the challenge dataset
• Widespread CMSs automatically allow marking up of articles, posts, etc.
• Eliminate such bias, if present, in future challenges

Linked Data for Information Extraction Challenge - Tasks and Results @ ISWC 2014
