3. What will we cover?
• What is structured data?
• Why do we care?
• Markup
• Guidelines
• Tools
• Conclusion
4. What is Structured Data?
The markup we add to our templates to place our HTML content within a machine-readable structured context
Gary Larson
5. Why should we care?
• It’s all about data
• Semantic web
• Big data
• Open data
• Linked data
– Facebook social / open graph
– Google knowledge graph
7. More specific benefits
• Gives the content context
http://www.windsorstar.com/life/Shaggy+dodges+police+handlers+roams+downtown+Moines+after/7639285/story.html
http://www.cargurus.com/Cars/2010-Dodge-Ram-Pickup-2500-Pictures-c22019_pi35843510?picturesTabFilter=EXTERIOR
8. More specific benefits
• Gives the content context
• Recommended by big search engines
• Rich snippets
• SEO, SERPs
10. RDFa
• Resource Description Framework in attributes
• Based on RDF
• 2004 - Proposed
• 2012 - RDFa 1.1, supports non-XML documents
• Entities, properties and values (triples)
• Most complex of the 3 types
11. Example HTML
<div>
<h1>Bomberman DS review</h1>
By Stuart Andrews
Reviewed 27 Jul 2005 04:00
<div>8 / 10</div>
<div>Back in the glory years of the SNES, Bomberman
was the Counter-Strike or Halo 2 of the
day...</div>
</div>
12. RDFa example
<div xmlns:v="http://rdf.data-vocabulary.org/#"
typeof="v:Review">
<h1 property="v:itemreviewed">Bomberman DS review</h1>
By <span property="v:reviewer">Stuart Andrews</span>
Reviewed <span property="v:dtreviewed">27 Jul 2005
04:00</span>
<div rel="v:rating">
<span property="v:rating">8</span> / <span
property="v:best">10</span>
<meta content="1" property="v:worst" /></div>
<div property="v:description">Back in the glory years of
the SNES, Bomberman was the Counter-Strike or Halo 2 of
the day…</div>
</div>
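As a rough sketch of what "entities, properties and values (triples)" means for the markup above, the review can be written out as subject / predicate / object rows. This is an illustration only: `_:review` and `_:rating` are made-up blank-node labels, `describe` is a hypothetical helper, and the way the rating is nested is an assumption about how a parser would read the `rel="v:rating"` block. The description triple is left out for brevity.

```javascript
// Namespace bound to the "v" prefix by xmlns:v in the markup above.
const V = "http://rdf.data-vocabulary.org/#";

// Each row is [subject, predicate, object].
// "_:review" and "_:rating" are hypothetical blank-node labels.
const triples = [
  ["_:review", "rdf:type", V + "Review"],
  ["_:review", V + "itemreviewed", "Bomberman DS review"],
  ["_:review", V + "reviewer", "Stuart Andrews"],
  ["_:review", V + "dtreviewed", "27 Jul 2005 04:00"],
  ["_:review", V + "rating", "_:rating"], // link to the nested entity
  ["_:rating", V + "rating", "8"],
  ["_:rating", V + "best", "10"],
  ["_:rating", V + "worst", "1"],
];

// Collect every property of one subject into a plain object,
// shortening the namespace back to its "v:" prefix.
function describe(subject) {
  const out = {};
  for (const [s, p, o] of triples) {
    if (s === subject) out[p.replace(V, "v:")] = o;
  }
  return out;
}

console.log(describe("_:rating"));
```

Reading the data this way makes it easier to see why the rating is a linked entity: it is the object of one triple and the subject of three others.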
13. Microformats
• Uses html classes and rel attributes
• 2004 - concept introduced
• Simplest of the 3 types
• http://microformats.org/
14. Microformats example
<div class="hreview">
<h1 class="item">Bomberman DS review</h1>
By <a class="reviewer" rel="author" href="/stuart-andrews">Stuart Andrews</a>
Reviewed <span class="dtreviewed">27 Jul 2005 04:00<span
class="value-title" title="2005-07-27"></span></span>
<div><span class="rating">8</span> / <span
class="best">10</span></div>
<div class="description">Back in the glory years of the
SNES, Bomberman was the Counter-Strike or Halo 2 of the
day...</div>
</div>
15. Microformats 2 example
<div class="h-review">
<h1 class="p-item">Bomberman DS review</h1>
By <a class="p-reviewer" rel="author" href="/stuart-andrews">Stuart Andrews</a>
Reviewed <span class="dt-reviewed">27 Jul 2005 04:00<span
class="value-title" title="2005-07-27"></span></span>
<div><span class="p-rating">8</span> / <span class="p-best">10</span></div>
<div class="e-description">Back in the glory years of the
SNES, Bomberman was the Counter-Strike or Halo 2 of the
day...</div>
</div>
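The prefixes in the microformats 2 class names carry meaning: `h-` marks a root, `p-` a plain text property, `u-` a URL, `dt-` a datetime and `e-` embedded markup. A minimal sketch of how a consumer might sort class names by prefix follows. `classifyClasses` is a hypothetical helper written for this example, not part of any microformats library, and it does no validation beyond the prefix check.

```javascript
// Prefix meanings for microformats 2 class names.
const PREFIXES = {
  "h-": "root",
  "p-": "text property",
  "u-": "URL property",
  "dt-": "datetime property",
  "e-": "embedded markup property",
};

// Split a class attribute value and keep only names that start with
// a microformats 2 prefix. Hypothetical helper for illustration.
function classifyClasses(classAttr) {
  const result = [];
  for (const name of classAttr.split(/\s+/).filter(Boolean)) {
    const prefix = Object.keys(PREFIXES).find((p) => name.startsWith(p));
    if (prefix) {
      result.push({ name, kind: PREFIXES[prefix], bare: name.slice(prefix.length) });
    }
  }
  return result;
}

console.log(classifyClasses("h-review teaser"));
console.log(classifyClasses("dt-reviewed"));
```

Because the check is prefix-only, an unrelated styling class that happens to start with `p-` or `e-` would also match, so a real parser needs stricter rules than this sketch.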
16. Microdata
• Much younger
• Extension to HTML5
• Compromise between the complexity of RDFa and easy but limited microformats
• Some browsers add enhanced features
• hCalendar - add to calendar
• hCard - add to address book
17. Microdata example
<div itemscope itemtype="http://data-vocabulary.org/Review">
<h1 itemprop="itemreviewed">Bomberman DS review</h1>
By <span itemprop="reviewer">Stuart Andrews</span>
Reviewed <time itemprop="dtreviewed" datetime="2005-07-27">27 Jul 2005 04:00</time>
<div><span itemprop="rating">8</span> / <span
itemprop="best">10</span></div>
<div itemprop="description">Back in the glory years
of the SNES, Bomberman was the Counter-Strike or
Halo 2 of the day...</div>
</div>
18. Microdata DOM API
var cats = document.getItems("http://example.com/feline");
• Limited browser support
• MicrodataJS - lib and jQuery plugin that emulates the DOM API
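Where the DOM API is not available, the same lookup can be approximated with standard selectors. The sketch below is an assumption about how such an emulation might work, not MicrodataJS's actual code: `getItems` here is a hypothetical standalone function that takes the document as its first argument and only handles a single type URL.

```javascript
// Hypothetical helper: return top-level microdata items whose itemtype
// includes the given URL. `doc` is anything with querySelectorAll,
// for example the browser's `document`.
function getItems(doc, type) {
  return Array.from(doc.querySelectorAll("[itemscope]")).filter((el) => {
    // itemtype may hold several space-separated URLs.
    const types = (el.getAttribute("itemtype") || "").split(/\s+/);
    // Top-level items are those that are not a property of another item.
    return !el.hasAttribute("itemprop") && types.includes(type);
  });
}

// In a browser:
// const cats = getItems(document, "http://example.com/feline");
```

Passing the document in as a parameter keeps the function easy to exercise outside a browser with a stand-in object.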
19. schema.org
• Collaboration of Google, Bing, Yahoo! and Yandex - 2011
• Shared markup vocabulary
• Based on microdata
• Accounts for 99% of microdata markup
• http://schema.org
20. Schema.org example
<div itemscope itemtype="http://schema.org/Review">
<h1 itemprop="name">Bomberman DS review</h1>
By <span itemprop="author">Stuart Andrews</span>
Reviewed <time itemprop="dateCreated" datetime="2005-07-27">27 Jul 2005 04:00</time>
<div itemprop="reviewRating" itemscope
itemtype="http://schema.org/Rating">
<span itemprop="ratingValue">8</span> / <span
itemprop="bestRating">10</span>
<meta itemprop="worstRating" content="1"/></div>
<div itemprop="reviewBody">Back in the glory years of
the SNES, Bomberman was the Counter-Strike or Halo
2 of the day…</div>
</div>
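When checking markup like this, it can help to picture the item a parser would pull out of it. The object shape below is an assumption made for illustration, not the output format of any particular tool, and `checkRating` is a hypothetical helper. `reviewBody` is omitted for brevity.

```javascript
// Roughly what the microdata above describes, written as a plain object.
const review = {
  type: "http://schema.org/Review",
  properties: {
    name: "Bomberman DS review",
    author: "Stuart Andrews",
    dateCreated: "2005-07-27",
    reviewRating: {
      type: "http://schema.org/Rating",
      properties: { ratingValue: "8", bestRating: "10", worstRating: "1" },
    },
  },
};

// Return a list of problems with a Rating item; an empty list means
// the three values are numeric and the rating sits inside its scale.
function checkRating(rating) {
  const { ratingValue, bestRating, worstRating } = rating.properties;
  const [value, best, worst] = [ratingValue, bestRating, worstRating].map(Number);
  const problems = [];
  if ([value, best, worst].some(Number.isNaN)) {
    problems.push("non-numeric rating field");
  } else if (value < worst || value > best) {
    problems.push("ratingValue outside worst..best");
  }
  return problems;
}

console.log(checkRating(review.properties.reviewRating));
```

A check like this does not replace the search engines' own testing tools; it only catches obvious slips before the markup is published.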
21. So what should we use?
• Depends
• Overall aim
• Data complexity
• Other page markup
22. So what should we use?
• RDFa
– represent complex data
– require specific ontologies
• Microformats
– easy to implement
– require browser enhancements
• Microdata / schema.org
– search engine focused
– unified vocabulary for most common ontologies
I would recommend checking schema.org first, then checking the other types if this doesn’t meet your requirements
23. Guidelines
• Don’t mark up hidden content, use meta tags instead
• Mark up as much as you can accurately
• Required video fields (Google)
• Take care mixing vocabularies and entities, especially when nesting
• Always always test - markup and rich snippet preview
Bill Watterson
24. Tools
• Google rich snippets testing tool
• Bing testing tool
• Data Highlighter - stopgap only
• Webmaster tools structured data tab
• Browser plugins
– Microdata.reveal – Chrome
– Operator – Firefox
• Many more
25. Any other business
Facebook open graph
Twitter cards
Google custom search markup - Pagedata, meta tags, page dates
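For Facebook Open Graph, the data goes into `<meta>` tags whose `property` starts with `og:`. A minimal sketch of generating those tags follows. `openGraphTags` and `escapeAttr` are hypothetical helpers, not a Facebook API, and the URL is a placeholder.

```javascript
// Escape characters that would break an HTML attribute value.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build one <meta> tag per Open Graph property.
// Hypothetical helper for illustration only.
function openGraphTags(props) {
  return Object.entries(props)
    .map(([key, value]) => `<meta property="og:${key}" content="${escapeAttr(value)}" />`)
    .join("\n");
}

console.log(openGraphTags({
  title: "Bomberman DS review",
  type: "article",
  url: "http://example.com/reviews/bomberman-ds", // placeholder URL
}));
```

The resulting tags belong in the page's `<head>`. See http://ogp.me/ for the properties the protocol expects.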
26. Conclusion
http://www.graphicshunt.com/funny/images/pimpin_aint_easy-12621.htm
But it's worth it – get pimping that content!
27. Thank you!
Links
- Open / Linked Data
http://bitly.com/bundles/loonytoons/1
- Structured Data Info and Tools
http://bitly.com/bundles/loonytoons/4
Me
- @loonytoons
- http://loonyblurb.net
Editor's notes
First order of business, two caveats: no lolcats, and I’m not an expert. Structured data markup, vocabularies and ontologies are constantly updating, as is how browsers handle them.
Publishing organisation based on content / data. Ordered database, with a confusing layer of HTML on top. Structured data in this context refers to the markup that gives our data context.
Google want to move from being an information engine to being a knowledge engine. Google Places and info panel. Search for "leonardo" and "leonardo da vinci". Background to the current environment we are operating in.
Help search engines find the data and place it accurately in context. Reviews, events, etc.
Search engines have been playing down the SEO benefits, but it's a no-brainer. If it's clearer what your page is about, they can target search terms much better.
Structured data already exists outside of HTML. RDF is part of that. Ontologies already exist. This is transposing all of that to HTML.
Structured data for the web, hence complex because of its extension and history.
Namespace for the entity. Properties and values. Rating is a linked entity. There are many other ontologies (BBC, New York Times). This is not limited to RDFa, but only some ontologies are represented by the other markup types. They are flexible and extensible; you can create your own. The Facebook Open Graph protocol is actually a minimal implementation of RDFa, and you can see it looks very similar: http://ogp.me/ Most complex: grown out of RDF, lots of ontologies and prefixes, no centralised place to discover all this, lots of ways of joining and representing data. Also: datatyping (typeof), associating more than one type per object, embeddability in languages other than HTML, ability to easily publish and mix vocabularies.
Microformats.org demo
Easy to use, but it gets lost in the markup. It could also cause you some styling issues when added, as it uses very generic names. Under constant development; microformats 2 has just come out. Among other things, this adds prefixes to the class names to make it clearer that they relate to microformats.
The microformats 2 prefixes: 'h-*' for root class names, e.g. 'h-card'; 'p-*' for simple (text) properties, e.g. 'p-name'; 'u-*' for URL properties, e.g. 'u-photo'; 'dt-*' for datetime properties, e.g. 'dt-bday'; 'e-*' for embedded markup properties, e.g. 'e-note'.
Easy to see why they wanted a compromise
Microdata API: in 2012, supported by Opera and Firefox. MicrodataJS: JS lib and jQuery plugin that emulates the DOM API. Not too much to say about this, mostly because of schema.org (next slide).
Provides good docs - demo
This is really the schema to use: search engine backing, easy to implement, good docs. Schema.org is designed to be easily interchanged with RDFa 1.1. You can extend it, and if an extension gains traction it will be moved into core.
Mix and match. Use RDFa if you need to. RDFa: complex data, unique / unusual ontologies, representing general data structures for use elsewhere on the web. Microformats: simple; use if you want the browser functionality. Don't use for complex data setups. Schema.org: simple-ish to implement, rich snippet / search engine focused.
Troubleshooting tips and form: http://support.google.com/webmasters/bin/request.py?contact_type=rich_snippets_feedback It can take a while for rich snippets to work on your site even though they appear in the test tool. This might also be related to Google’s perceived “value” of your site.
Schemacreator.com. Python libs, a node.js parser, and others. Microdata: JavaScript API (HTML5).
Schema.org FAQ quote: “but over time you can expect that more data will be used in more ways. In addition, since the markup is publicly accessible from your web pages, other organizations may find interesting new ways to make use of it as well”. You can also get involved: discussions, ontologies, etc.