Entity-Based Search – What is it? What are entities? Where do they come from? And how does Google use them? This presentation answers questions about the history of Google, structured data, the Knowledge Graph, how RankBrain ties into it, and what SEOs need to know about it all.
16. #SearchLeeds @schachin Kristine Schachinger
Unstructured data (or unstructured information) is information that
either does not have a pre-defined data model or is not organized in a
pre-defined manner. Unstructured information is typically text-heavy,
but may contain data such as dates, numbers, and facts as well.
https://www.google.co.uk/search?q=definition+unstructured+data&oq=definition+unstructured+data&aqs=chrome..69i57j0l5.5175j0j7&sourceid=chrome&ie=UTF-8
18. #SearchLeeds @schachin Kristine Schachinger
TF-IDF
Term Frequency Inverse
Document Frequency
i.e., how often a keyword appears on a page, weighted by how rare that keyword is across all pages
https://moz.com/blog/7-advanced-seo-concepts
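To make the TF-IDF idea concrete, here is a minimal sketch in Python. It assumes whitespace tokenization and the plain textbook weighting (term frequency multiplied by the log of inverse document frequency). The function name `tf_idf` and the sample documents are made up for illustration, and nothing here is meant to describe how Google actually scores pages.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: score} dict per document (textbook TF-IDF)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            # Term frequency in this document times inverse document frequency.
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores

# Hypothetical mini-corpus, for illustration only.
docs = [
    "small dogs are small",
    "large dogs are loyal",
    "cats are independent",
]
scores = tf_idf(docs)
```

A word such as "are" that appears in every document scores zero, while "small", which is frequent in one document and absent from the others, scores highest in that document.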
19. #SearchLeeds @schachin Kristine Schachinger
As queries number in the trillions,
unstructured data becomes inefficient.
Data needs structure.
22. #SearchLeeds @schachin Kristine Schachinger
Graph-based knowledge representation has been researched for
decades and the term knowledge graph does not constitute a new
technology. Rather, it is a buzzword reinvented by Google and
adopted by other companies and academia to describe different knowledge
representation applications.
Knowledge Graphs
http://ceur-ws.org/Vol-1695/paper4.pdf
23. #SearchLeeds @schachin Kristine Schachinger
Enter Semantic Search
https://web.archive.org/web/20090516213508/http://blog.searchenginewatch.com/090512-201139
24. #SearchLeeds @schachin Kristine Schachinger
https://web.archive.org/web/20090516213508/http://blog.searchenginewatch.com/090512-201139
What is Semantic Search?
27. #SearchLeeds @schachin Kristine Schachinger
Google Squared
Google Squared returns search results in a spreadsheet format. It
structures the unstructured data on web pages. So a search for Small
Dogs returns results with names, description, size, weight, origin,
etc., in columns and rows.
Google is looking for data structures on the web that imply facts, and
then grabbing it for Squared results. “It takes an incredible amount of
compute power to create one of those squares,” she says.
~Techcrunch
https://techcrunch.com/2009/05/12/what-is-google-squared-it-is-how-google-will-crush-wolfram-alpha-exclusive-video/
28. #SearchLeeds @schachin Kristine Schachinger
https://searchengineland.com/up-close-google-squared-19313
Before the Knowledge Graph
30. #SearchLeeds @schachin Kristine Schachinger
Google Squared
“Check out this example below, a Square for the search “dog breeds.” It’s cool that you can
add major or minor medical concerns to the list of columns, but the selection of examples is
really strange.
Call it structured data if you like,
I call it a surefire recipe for making a bad dog buying decision.”
https://readwrite.com/2009/06/03/google_squared_is_live_who_knew_structured_data_co/
34. #SearchLeeds @schachin Kristine Schachinger
https://searchengineland.com/up-close-google-squared-19313
“Strings to Things”
The Holy Grail of Search?
NLP (Natural Language Processing).
36. #SearchLeeds @schachin Kristine Schachinger
However, these were the early stages of Google
moving search from strings (unstructured data),
or the “bag of words” approach,
to “things” (structured data).
“Strings to Things”
37. #SearchLeeds @schachin Kristine Schachinger
“Things” are known objects
with known (or learned) relationships.
“Strings to Things”
38. #SearchLeeds @schachin Kristine Schachinger
https://searchengineland.com/up-close-google-squared-19313
Before THE Knowledge Graph – Wonder Wheel
42. #SearchLeeds @schachin Kristine Schachinger
Knowledge Graphs are based on known relationships.
THE Knowledge Graph is Google’s graph database.
THE Knowledge Graph
43. #SearchLeeds @schachin Kristine Schachinger
The Knowledge Graph (Google) is seeded by things known.
Instead of just text without meaning, The KG is a relational
graph with known objects and mapped relationships.
THE Knowledge Graph
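As a rough illustration of "known objects and mapped relationships", the sketch below stores facts as subject, predicate, object triples and looks up relationships by name. The class `TinyGraph`, the predicate names, and the sample facts are all hypothetical. It is a toy model of the general idea of a graph of entities, not a description of Google's implementation.

```python
from collections import defaultdict

class TinyGraph:
    """Toy graph: nodes are entities, edges are named relationships."""

    def __init__(self):
        self._edges = defaultdict(list)

    def add(self, subject, predicate, obj):
        # Store one fact as a (predicate, object) edge on the subject node.
        self._edges[subject].append((predicate, obj))

    def related(self, subject, predicate):
        # Return every object linked to the subject by this predicate.
        return [o for p, o in self._edges.get(subject, []) if p == predicate]

# Hypothetical facts, for illustration only.
graph = TinyGraph()
graph.add("Leeds", "type", "Place")
graph.add("Leeds", "located_in", "England")
graph.add("England", "type", "Place")
graph.add("England", "part_of", "United Kingdom")

where = graph.related("Leeds", "located_in")
```

Because "Leeds" is stored as a node with typed edges rather than as a bare string, a query can follow the edges to "England" and onward, which is the point of moving from strings to things.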
44. #SearchLeeds @schachin Kristine Schachinger
"Four years ago this July, Google acquired Metaweb,
bringing Freebase and linked open data to Google," wrote
Google software engineer Barak Michener.
http://www.eweek.com/database/google-releases-cayley-open-source-graph-database
THE Knowledge Graph Seeds
45. #SearchLeeds @schachin Kristine Schachinger
The seeds also include trusted sources such as the
CIA World Factbook, Wikipedia, Wikidata, etc.
http://www.eweek.com/database/google-releases-cayley-open-source-graph-database
THE Knowledge Graph Seeds
47. #SearchLeeds @schachin Kristine Schachinger
The Knowledge Graph enables you to search for things, people or places
that Google knows about—landmarks, celebrities, cities, sports teams,
buildings, geographical features, movies, celestial objects, works of art
and more—and instantly get information that’s relevant to your query
THE Knowledge Graph
51. #SearchLeeds @schachin Kristine Schachinger
Knowledge Graph entities
The Knowledge Graph has millions of entries that describe real-world entities like people, places, and things. These
entities form the nodes of the graph.
The following are some of the types of entities found in the Knowledge Graph:
Book
BookSeries
EducationalOrganization
Event
GovernmentOrganization
LocalBusiness
Movie
MovieSeries
MusicAlbum
MusicGroup
MusicRecording
Organization
Periodical
Person
Place
SportsTeam
TVEpisode
TVSeries
VideoGame
VideoGameSeries
WebSite
58. #SearchLeeds @schachin Kristine Schachinger
Hummingbird
The name was derived from
the speed and accuracy of the hummingbird.
“Strings to Things”
59. #SearchLeeds @schachin Kristine Schachinger
Hummingbird Arrives 2013
Google moves from matching keyword terms to
trying to process natural language queries.
“Strings to Things”
61. #SearchLeeds @schachin Kristine Schachinger
Hummingbird adds a semantic layer to the search
algorithms utilizing structured data.
“Strings to Things”
62. #SearchLeeds @schachin Kristine Schachinger
Hummingbird adds a
semantic layer to the
search algorithms, such as
synonyms and close
variants.
https://moz.com/blog/7-advanced-seo-concepts
63. #SearchLeeds @schachin Kristine Schachinger
Hummingbird adds a
semantic layer to the
search algorithms that
uses “semantic distance
and term relationships”.
https://moz.com/blog/7-advanced-seo-concepts
64. #SearchLeeds @schachin Kristine Schachinger
Hummingbird adds a
semantic layer to the
search algorithms that
uses “phrase-based
indexing and
co-occurrence.”
https://moz.com/blog/7-advanced-seo-concepts
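Co-occurrence can be pictured as counting how often two terms show up together. The sketch below is a simplified illustration under stated assumptions: each input string is treated as one context, tokens are split on whitespace, and the function name `cooccurrence` and the sample phrases are invented. It is not a reconstruction of Hummingbird or of phrase-based indexing as Google implements it.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(sentences):
    """Count how many sentences each unordered pair of terms shares."""
    pairs = Counter()
    for sentence in sentences:
        # Sort the unique terms so each pair has one canonical order.
        terms = sorted(set(sentence.lower().split()))
        pairs.update(combinations(terms, 2))
    return pairs

# Hypothetical phrases, for illustration only.
sentences = [
    "hummingbird search algorithm",
    "google search algorithm",
    "hummingbird feeder",
]
counts = cooccurrence(sentences)
```

Here "algorithm" and "search" co-occur twice, while "feeder" and "search" never do, which hints at how term relationships can separate the algorithm from the bird.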
65. #SearchLeeds @schachin Kristine Schachinger
Page Segmentation.
This part of the
algorithm determines
meaning through
placement.
https://moz.com/blog/7-advanced-seo-concepts
66. #SearchLeeds @schachin Kristine Schachinger
Entity Salience.
This part of the
algorithm determines
meaning through known
relationships.
https://moz.com/blog/7-advanced-seo-concepts
67. #SearchLeeds @schachin Kristine Schachinger
Google doesn’t process
Natural Language.
This means we must add an “interpreter”.
68. #SearchLeeds @schachin Kristine Schachinger
Enter Structured Data & Schema
https://web.archive.org/web/20090516213508/http://blog.searchenginewatch.com/090512-201139
70. #SearchLeeds @schachin Kristine Schachinger
What is Structured Data?
Structured data for SEO purposes is on-page markup that
enables search engines to better understand the information
currently on your site’s web pages, and then use this information
to improve search results listing by better matching user intent.
71. #SearchLeeds @schachin Kristine Schachinger
What is Structured Data?
This structured data is defined by using schema to act as the
interpreter. This is the definition we add to the page using
schema code.
Google supports three formats:
• RDFa
• Microdata
• JSON-LD
72. #SearchLeeds @schachin Kristine Schachinger
Schema
JSON-LD is the recommended schema format.
JSON-LD stands for JavaScript Object Notation for Linked Data.
It is a way to implement schema outside the HTML mark-up
structure. RDFa and Microdata require the code to be implemented
within the HTML itself.
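For a sense of what this looks like in practice, here is a small sketch that builds a JSON-LD block in Python and wraps it in the `application/ld+json` script element. The helper name `json_ld_script` and the event details are hypothetical placeholders. `Event`, `Place`, `name`, `startDate`, and `location` are schema.org terms, but check Google's guides for the properties each rich result actually requires.

```python
import json

def json_ld_script(data):
    """Serialize a dict as a JSON-LD <script> element."""
    payload = json.dumps(data, indent=2)
    # Escape "</" so a string value cannot close the script element early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'

# Hypothetical event, for illustration only.
event = {
    "@context": "https://schema.org",
    "@type": "Event",
    "name": "Example Search Conference",
    "startDate": "2030-06-14",
    "location": {
        "@type": "Place",
        "name": "Example Arena",
    },
}
snippet = json_ld_script(event)
```

The resulting `snippet` is a single self-contained element, so it can be dropped into a template without touching the visible mark-up.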
73. #SearchLeeds @schachin Kristine Schachinger
Schema
The benefit is that it is kept separate from the HTML structure, which
makes it easier to write, implement, and maintain.
For a good breakdown of what JSON-LD is at the code level,
Portent’s JSON-LD Implementation Guide is very helpful.
https://www.portent.com/blog/seo/json-ld-implementation-guide.htm
75. #SearchLeeds @schachin Kristine Schachinger
Schema
IMPORTANT! Test your JSON-LD.
Use the Google Structured Data Testing Tool.
https://search.google.com/structured-data/testing-tool
76. #SearchLeeds @schachin Kristine Schachinger
Schema
NOTE: this tool only tells you if the markup is syntactically valid, NOT if
you are using the proper schema.
Make sure to check with Google’s Guides on schema
implementation. Improper use or implementation can result in
a manual action.
• https://developers.google.com/search/docs/guides/intro-structured-data
• https://developers.google.com/search/docs/guides/prototype
77. #SearchLeeds @schachin Kristine Schachinger
Schema
IMPORTANT! Your JSON-LD content MUST match what is on the
page exactly. If they differ, you will likely get a manual action, as
Google sees this as cloaking.
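One way to catch mismatches before publishing is a quick self-check that compares JSON-LD string values against the visible page text. The sketch below is only a rough aid under stated assumptions: it uses Python's standard `html.parser`, does a naive case-insensitive substring match, and the names `PageParser`, `string_values`, and `missing_from_page` are invented. Values such as ISO dates or URLs will not appear literally in the page, so expect false positives. It says nothing about how Google itself evaluates a page.

```python
import json
from html.parser import HTMLParser

class PageParser(HTMLParser):
    """Collect visible text and JSON-LD blocks from an HTML page."""

    def __init__(self):
        super().__init__()
        self.visible, self.json_ld = [], []
        self._skip, self._is_json_ld, self._buffer = False, False, []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip, self._buffer = True, []
            self._is_json_ld = (tag == "script" and
                                dict(attrs).get("type") == "application/ld+json")

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            if self._is_json_ld:
                self.json_ld.append("".join(self._buffer))
            self._skip = self._is_json_ld = False

    def handle_data(self, data):
        # Script and style content is not visible text.
        (self._buffer if self._skip else self.visible).append(data)

def string_values(node):
    """Yield every string value in nested JSON, skipping @-keywords."""
    if isinstance(node, dict):
        for key, value in node.items():
            if not key.startswith("@"):
                yield from string_values(value)
    elif isinstance(node, list):
        for item in node:
            yield from string_values(item)
    elif isinstance(node, str):
        yield node

def missing_from_page(html):
    """Return JSON-LD string values not found in the visible text."""
    parser = PageParser()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.visible).split()).lower()
    return [value
            for block in parser.json_ld
            for value in string_values(json.loads(block))
            if " ".join(value.split()).lower() not in text]

# Hypothetical page whose markup names a venue the visible text does not.
page = """<html><body>
<h1>Example Search Conference</h1>
<p>Held at Example Arena.</p>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Event",
 "name": "Example Search Conference",
 "location": {"@type": "Place", "name": "Another Venue"}}
</script>
</body></html>"""
mismatches = missing_from_page(page)  # ["Another Venue"]
```

An empty list means every marked-up string was found on the page. Anything returned is worth reviewing by hand before the page goes live.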
87. #SearchLeeds @schachin Kristine Schachinger
Uses Structured Data, Entities & Known Relationships
Person, Place, Thing = Noun = Entities.
Nouns (persons, places, things) are what we call entities. Entities are
known to Google and their meaning is defined in the databases Google references.
RankBrain
95. #SearchLeeds @schachin Kristine Schachinger
Adding semantic mark-up
(structured data via schema) allows us to tell
Google what WE SAY our site is about and WHAT
RELATIONSHIPS we define within it.
96. #SearchLeeds @schachin Kristine Schachinger
We can act as the interpreter and help “teach”
Google the context of our content.
98. #SearchLeeds @schachin Kristine Schachinger
We can help give Google a clearer understanding.
And in the end that helps us help Google better
answer the questions users ask and better
surface our content where we want it.
We give our data meaning Google understands.