Google's search algorithms have evolved from relying solely on keyword matching and link analysis to incorporating semantic understanding enabled by knowledge graphs and machine learning. Over time, Google has moved from processing unstructured "bags of words" to understanding entities and their relationships in order to better match user intent. The introduction of techniques like Hummingbird and the Knowledge Graph allowed Google to incorporate semantic interpretations and contextual information into search rankings.
@schachin #StateOfSearch | Kristine Schachinger
In ONE SECOND today, there were
http://www.internetlivestats.com/google-search-statistics/
Unstructured data (or unstructured information) is information that
either does not have a pre-defined data model or is not organized in a
pre-defined manner. Unstructured information is typically text-heavy,
but may contain data such as dates, numbers, and facts as well.
https://www.google.co.uk/search?q=definition+unstructured+data
This is known as the “Bag of Words” approach.
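As a minimal sketch of the "Bag of Words" idea described above (not Google's actual implementation): a document is reduced to unordered word counts, so all word order and context are thrown away.

```python
from collections import Counter

def bag_of_words(text: str) -> Counter:
    """Reduce a document to unordered word counts, discarding order and context."""
    return Counter(text.lower().split())

# Two sentences with opposite meanings produce identical bags of words:
a = bag_of_words("the man bit the dog")
b = bag_of_words("the dog bit the man")
print(a == b)  # True: the model cannot tell them apart
```

This is exactly the limitation semantic search moves past: the bag keeps the counts but loses the meaning.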
“Graph-based knowledge representation has been
researched for decades and the term knowledge graph
does not constitute a new technology.
Rather, it is a buzzword reinvented by Google and
adopted by other companies and academia to describe
different knowledge representation applications.”
Knowledge Graphs
http://ceur-ws.org/Vol-1695/paper4.pdf
The Knowledge Graph enables you to search for things, people or places
that Google knows about—landmarks, celebrities, cities, sports teams,
buildings, geographical features, movies, celestial objects, works of art
and more—and instantly get information that’s relevant to your query
THE Knowledge Graph
Knowledge Graph entities
The Knowledge Graph has millions of entries that describe real-world entities like people, places, and things. These
entities form the nodes of the graph.
The following are some of the types of entities found in the Knowledge Graph:
Book
BookSeries
EducationalOrganization
Event
GovernmentOrganization
LocalBusiness
Movie
MovieSeries
MusicAlbum
MusicGroup
MusicRecording
Organization
Periodical
Person
Place
SportsTeam
TVEpisode
TVSeries
VideoGame
VideoGameSeries
WebSite
THE Knowledge Graph ENTITIES
Google's Knowledge Graph is seeded with known things. Instead of just
text without meaning, the KG is a relational graph of known objects
and mapped relationships.
THE Knowledge Graph
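A relational graph of "known objects and mapped relationships" can be sketched as a set of (subject, relation, object) triples. The entities and relations below are illustrative inventions, not Google's actual data:

```python
# Hypothetical mini knowledge graph as (subject, relation, object) triples.
triples = [
    ("Leonardo da Vinci", "painted", "Mona Lisa"),
    ("Mona Lisa", "located_in", "Louvre"),
    ("Louvre", "located_in", "Paris"),
]

def related(entity):
    """Return every (relation, object) pair mapped from a known entity."""
    return [(r, o) for s, r, o in triples if s == entity]

print(related("Mona Lisa"))  # [('located_in', 'Louvre')]
```

Unlike a bag of words, the graph can answer "where is the Mona Lisa?" because the relationship is an explicit, traversable edge.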
KEY FACTOR word2vec:
Vector space models (VSMs) represent (embed) words in
a continuous vector space where semantically similar
words are mapped to nearby points ('are embedded
nearby each other').
Hummingbird
https://www.tensorflow.org/tutorials/representation/word2vec
“…words that appear in the same contexts share semantic meaning. The different
approaches that leverage this principle can be divided into two categories:
count-based methods (e.g. Latent Semantic Analysis), and predictive
methods (e.g. neural probabilistic language models).”
Hummingbird
https://www.tensorflow.org/tutorials/representation/word2vec
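"Semantically similar words are mapped to nearby points" can be shown with cosine similarity. The 2-D vectors below are hand-made toys for illustration; real word2vec embeddings are learned from data and have hundreds of dimensions:

```python
import math

# Toy hand-made 2-D "embeddings" (illustrative only, not learned vectors).
vectors = {
    "king":  [0.90, 0.80],
    "queen": [0.85, 0.82],
    "apple": [0.10, 0.90],
}

def cosine(u, v):
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

# "king" sits much nearer "queen" than "apple" in this toy space:
print(cosine(vectors["king"], vectors["queen"]) >
      cosine(vectors["king"], vectors["apple"]))  # True
```

The same nearness test is what lets a vector space model treat "king" and "queen" as related even when the query never uses both words.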
Entity Salience. This part of the algorithm determines meaning through known relationships.
2018-19: Google adds the “topic layer” to the Knowledge Graph (categorical classification).
https://moz.com/blog/7-advanced-seo-concepts
So Hummingbird moves from strict word-count-based modeling
(i.e., keyword counts) to probabilistic modeling
(i.e., predictive interpretation)
via known word vectors + nodes (relationships).
Hummingbird
What is Structured Data?
Structured data, for SEO purposes, is on-page markup that
enables search engines to better understand the information
on your site's web pages, and then use that information
to improve search results listings by better matching user intent.
What is Structured Data?
This structured data is defined using schema, which acts as the
interpreter. It is the definition we add to the page using
schema code.
Google allows 3 types:
• RDFa
• Microdata
• JSON-LD (preferred)
Schema
JSON-LD is the recommended schema code.
JSON-LD stands for JavaScript Object Notation for Linked Data.
It is a way to implement schema outside the HTML markup
structure; RDFa and Microdata require the code to be implemented
inline in the HTML.
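As an illustration of the JSON-LD format the deck recommends, here is a minimal schema.org example (the headline, author name, and date are hypothetical). On a real page it would sit inside a `<script type="application/ld+json">` tag, outside the visible HTML markup:

```json
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "What is Structured Data?",
  "author": {
    "@type": "Person",
    "name": "Jane Doe"
  },
  "datePublished": "2019-11-01"
}
```

Google's Rich Results Test can validate markup like this before it goes live.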
Schema
The benefit is that it sits outside the HTML structure, which
makes it easier to write, implement, and maintain.
Resources:
For a good breakdown of what JSON-LD looks like at the code level,
Portent's JSON-LD Implementation Guide is very helpful.
https://www.portent.com/blog/seo/json-ld-implementation-guide.htm
And Google has a section in the Developer Guides
https://developers.google.com/search/docs/guides/intro-structured-data
We can help give Google a clearer understanding.
That helps us help Google better answer the questions users ask,
and better surface our content for those users.
We give our data meaning.
Google understands.
• Words go in.
• Words get assigned a mathematical address in a vector.
• Similar and related words sit close to each other in the vector space.
• Words are retrieved based on your query and the words it locates in the “best fit” vector.
• These word “interpretations” are used to return results.
• If the relationships are weak or unknown, enter Rank Brain.
• Behind the scenes, data is continually fed into the machine learning process to make those results more relevant the next time.
Rank Brain – Known Relationships.
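The retrieval steps in the bullets above can be sketched as a toy pipeline: embed the query by averaging word vectors, then rank documents by vector similarity. The vectors, documents, and query here are all invented for illustration:

```python
import math

# Illustrative hand-made word vectors; a real system learns these from data.
word_vecs = {
    "cheap": [0.90, 0.10], "budget": [0.88, 0.15],
    "hotel": [0.20, 0.90], "inn":    [0.25, 0.85],
}

def embed(text):
    """Average the vectors of known words: a crude stand-in for query embedding."""
    vecs = [word_vecs[w] for w in text.split() if w in word_vecs]
    return [sum(c) / len(vecs) for c in zip(*vecs)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

docs = ["budget inn", "cheap hotel"]
query = "cheap hotel"
best = max(docs, key=lambda d: cosine(embed(query), embed(d)))
print(best)  # "cheap hotel" wins, but "budget inn" scores nearly as high
```

The point is the near-tie: "budget inn" shares no keywords with the query, yet its vectors land almost on top of it, which is how a vector-based system can surface pages that never use the query's exact words.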
Rank Brain vs Neural Matching.
Rank Brain = concepts
Neural Matching = linking words to concepts
“…neural matching – AI method to better connect words to concepts.” – Google
Rank Brain vs Neural Matching.
A Google patent related to Rank Brain and Neural Matching
describes a system that uses traditional ranking factors to decide
what is relevant, but NOT what is in the top 10.
The top results may then be re-ordered post-retrieval according to
“ad hoc retrieval” methods and “dynamic relevancy.”
https://www.searchenginejournal.com/google-neural-matching/271125/
Sesame Street and Search: Why is BERT Special?
• Language modeling is an effective task for using unlabeled data to pretrain neural networks in NLP
• Traditional language models take the previous n tokens and predict the next one. In contrast, BERT trains a
language model that takes both the previous and next tokens into account when predicting.
• BERT is also trained on a next sentence prediction task to better handle tasks that require reasoning about
the relationship between two sentences (e.g. question answering)
• BERT uses the Transformer architecture for encoding sentences.
• BERT performs better when given more parameters, even on small datasets.
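Why "both the previous and next tokens" matters can be shown with a toy count-based stand-in for masked prediction (a deliberate simplification; BERT itself is a neural network, and the two-sentence corpus is invented):

```python
from collections import Counter

# Tiny invented corpus; a real model trains on billions of words.
corpus = [
    "the bank of the river",
    "the bank approved the loan",
]
tokens = [s.split() for s in corpus]

def predict(left=None, right=None):
    """Count-based stand-in for masked prediction: score candidates for a
    blank given its left neighbor, right neighbor, or both (BERT-style)."""
    counts = Counter()
    for sent in tokens:
        for i, w in enumerate(sent):
            if left is not None and (i == 0 or sent[i - 1] != left):
                continue
            if right is not None and (i == len(sent) - 1 or sent[i + 1] != right):
                continue
            counts[w] += 1
    return counts

# Left context alone ("the _") is ambiguous; adding right context disambiguates.
print(predict(left="the"))              # bank, river, and loan are all candidates
print(predict(left="the", right="of"))  # only "bank" fits both sides
```

A left-to-right model only ever sees the first case; conditioning on both sides is the bidirectionality that makes BERT's language modeling stronger.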
Write holistic content.
Use terms that are semantically related.
For a detailed explanation, see Google's video: https://www.youtube.com/watch?v=vzoe2G5g-w4&feature=youtu.be&t=32m19s
Write holistic content.
DOES YOUR CONTENT HAVE DEPTH AND BREADTH?
For a detailed explanation, see Google's video: https://www.youtube.com/watch?v=vzoe2G5g-w4&feature=youtu.be&t=32m19s
Takeaways.
• Think Search Queries NOT Simple Keywords
• Write in natural, conversational language
• Write using holistic content
• Focus on depth and breadth with related terms
• Add Structured Data
• Use well-formed text (i.e., questions) when you can.
• Learn how to code TensorFlow
https://codelabs.developers.google.com/codelabs/cloud-tensorflow-mnist/#0
• NLU vs NLP: What’s the Difference?
https://www.bmc.com/blogs/nlu-vs-nlp-natural-language-understanding-processing/
• BERT: State-of-the-Art Pre-training for NLP
https://ai.googleblog.com/2018/11/open-sourcing-bert-state-of-art-pre.html
• GitHub repo for BERT
https://github.com/google-research/bert
• GOOGLE PAPER: BERT: Pre-training of Deep Bidirectional Transformers for
Language Understanding
https://arxiv.org/abs/1810.04805
• Google Brings in BERT to Improve its Search Results
https://techcrunch.com/2019/10/25/google-brings-in-bert-to-improve-its-search-results/
• Google Blog on BERT
https://www.blog.google/products/search/search-language-understanding-bert/
• Answering Questions Using the Knowledge Graph
https://gofishdigital.com/answering-questions-using-knowledge-graphs/
More About BERT and NLP.