SEO &
STATISTICS
BUSINESS WORLD APPLICATIONS
Micah Fisher-Kirshner
What is
Search Engine Optimization
The practice of increasing the quantity and quality
of traffic to your website through organic search
engine results via improvements in content, links,
user experience, and technical site structure.
The Boring Explanation
I forge
global fame
The Mysterious Explanation
Powerful
Ominous
May be a double-edged sword…
How SEO Needs Data Science
Today’s Google is
complicated
Multi-algorithmic
Content algorithms
Link algorithms
Technical structure
With much more:
UX, Domains, by Industries, by Intent,
by Device, etc.
Multifactorial
Machine learning
Neural networks
Time-based
Offline processes
Algorithms on
Content
Machine learning to identify low-quality sites
based on content quality patterns at scale,
originally needing offline processes
Panda
Algorithms on
Links
Machine learning to detect patterns of
artificially bought links at scale, originally needing
offline processes
Penguin
Algorithms on
Technical Structure
A multifactorial system that understands what is an ad, what
the page structures are, and what is above the fold,
penalizing pages with too much ad space
Page Layout (Ads)
Algorithms on
User Intents
A neural network that better understands the intent
of a search, such as the use of prepositions like
“to” in a phrase
BERT
Algorithms on
Devices
Separated out what ranking factors mattered on
desktop queries vs mobile queries
Mobilegeddon
Micah Fisher-Kirshner
Work Life
• 10+ years doing SEO
• Director of SEO & Content @ Turn/River Capital
• President & Founder @ BayAreaSearch.org
Personal Life
• Econometrics background
• Modern nerd
• High school buddy of your professor (unfortunately)
Doing Good Data Science
Necessity of
good models
Step-by-step
Transparent
Methodological
Industry expertise
Deferential
Data collection
Interaction effects
Interpretations
Categories
Visualizations
Reviews
The Right
Data Collection
Consistent
• Mobile vs desktop
• Time-series vs one-time
Removals
• Source quality
• Low correlations
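To make the "Consistent" points concrete, here is a minimal Python sketch that keeps mobile and desktop rankings separate and summarizes each over several days instead of trusting a single snapshot. The observations, the query, and the `typical_positions` helper are all made up for illustration; real rank data would come from whatever tracking source you trust.

```python
from collections import defaultdict
from statistics import median

# Hypothetical rank observations: (query, device, date, position).
# All values are invented for illustration only.
observations = [
    ("blue bicycles", "mobile", "2020-01-01", 4),
    ("blue bicycles", "mobile", "2020-01-02", 5),
    ("blue bicycles", "mobile", "2020-01-03", 12),  # a one-day blip
    ("blue bicycles", "desktop", "2020-01-01", 2),
    ("blue bicycles", "desktop", "2020-01-02", 2),
    ("blue bicycles", "desktop", "2020-01-03", 3),
]


def typical_positions(rows):
    """Group by (query, device) and take the median position over time."""
    grouped = defaultdict(list)
    for query, device, _date, position in rows:
        grouped[(query, device)].append(position)
    return {key: median(values) for key, values in grouped.items()}


summary = typical_positions(observations)
for (query, device), position in sorted(summary.items()):
    print(f"{query!r} on {device}: median position {position}")
```

The median is used here only because it is less sensitive to a one-day blip than a single reading or a mean; any robust summary over a time window would serve the same purpose.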
On device
rankings
On
temporal
effects
Factoring
Interactions
Positives
• Authority
• Branding
Negatives
• Violations
• Spam
On
generating
branding
Having a Solid
Interpretation
Reasonable
• Incredulous
• Endogeneity
Outliers
• Large websites
• Esoteric subjects
On
critical
thinking
external links
attribution model
technical bug
brand validation
social shares
direct traffic
Google penalty
buying ads
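The speaker notes for this slide give the example of crediting social shares for a ranking gain that external links may actually explain. A minimal simulation of that situation, written in Python with only the standard library, is sketched below. Everything in it is an assumption made for illustration: links are made to drive both shares and the ranking score, and the numbers have nothing to do with real Google data.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(xs, ys, zs):
    """Correlation of x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Assumed data-generating process: links influence both shares and ranking;
# shares have no direct effect on ranking at all.
links = [rng.gauss(0, 1) for _ in range(5000)]
shares = [link + rng.gauss(0, 1) for link in links]
rank_score = [link + rng.gauss(0, 1) for link in links]

raw = pearson(shares, rank_score)
controlled = partial_corr(shares, rank_score, links)
print(f"shares vs ranking, raw correlation:       {raw:.2f}")
print(f"shares vs ranking, controlling for links: {controlled:.2f}")
```

In this setup the raw correlation looks respectable while the correlation that controls for links sits near zero, which is the pattern to rule out before claiming that shares themselves matter.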
On
real-world
situations
Defining
Categories
Groups
• Search volume
• Positional
Subsets
• Query intent
• Heteroskedasticity
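Heteroskedasticity, data that fans outwards, can be checked roughly by fitting a line and comparing the spread of the residuals between groups. The Python sketch below does this on simulated data where the noise is assumed to grow with ranking position; the metric, the coefficients, and the split at position 25 are all hypothetical choices for illustration, not a formal statistical test.

```python
import random
from statistics import mean, pvariance

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Simulated data: some hypothetical page metric y versus ranking position,
# with noise whose standard deviation is assumed to grow with position.
xs, ys = [], []
for position in range(1, 51):
    for _ in range(40):
        xs.append(position)
        ys.append(2.0 + 0.5 * position + rng.gauss(0, 0.2 * position))


def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope


intercept, slope = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Compare residual spread for shallow positions against deep positions.
top = [r for x, r in zip(xs, residuals) if x <= 25]
deep = [r for x, r in zip(xs, residuals) if x > 25]
variance_ratio = pvariance(deep) / pvariance(top)
print(f"residual variance, deep vs top positions: {variance_ratio:.1f}x")
```

A ratio well above 1 suggests that a single error range does not describe all positions equally, which is one reason to analyze positional groups separately.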
On
desires
On
fluctuations
The best place to hide a dead body is page two of Google
Proper
Visualizations
Graphs
• Scatterplots
• Error ranges
Numbers
• Regression formats
• Confidence levels
On
terminology
highly
significant
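To put a number on why a correlation of 0.2 should not be called "highly significant", the short Python sketch below reports the share of variance such a correlation accounts for and an approximate 95% confidence interval using the Fisher z-transformation. The sample sizes are arbitrary choices for illustration, and `describe_correlation` is a helper name invented here.

```python
import math


def describe_correlation(r, n, z_crit=1.96):
    """Return (r squared, approximate confidence interval) for Pearson's r.

    Uses the Fisher z-transformation; z_crit=1.96 gives roughly 95%.
    """
    r_squared = r * r
    z = math.atanh(r)
    standard_error = 1 / math.sqrt(n - 3)
    low = math.tanh(z - z_crit * standard_error)
    high = math.tanh(z + z_crit * standard_error)
    return r_squared, (low, high)


for sample_size in (50, 1000):
    r_squared, (low, high) = describe_correlation(0.2, sample_size)
    print(
        f"n={sample_size}: r=0.2 accounts for {r_squared:.0%} of variance, "
        f"interval roughly {low:.2f} to {high:.2f}"
    )
```

With a small sample the interval includes zero, and even with a large sample the relationship accounts for only a small share of the variance, so "statistically detectable" and "highly important" are different claims.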
On
studies
Your Work
Reviewed
Validation
• Peer review
• Avoiding overfits
Predictive
• Consistent
• Reusable
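One common way to check for overfitting is to hold back part of the data and compare errors on the data a model was built from against data it has not seen. The Python sketch below contrasts a simple straight-line fit with a deliberately overfit "memorizer" (a nearest-neighbour lookup) on simulated data; the data, the models, and the 50/50 split are assumptions chosen for illustration.

```python
import random

rng = random.Random(2)  # fixed seed so the sketch is repeatable

# Simulated data: y depends linearly on x, plus noise.
points = []
for _ in range(1000):
    x = rng.uniform(0, 10)
    points.append((x, 3.0 + 2.0 * x + rng.gauss(0, 1)))
train, test = points[:500], points[500:]


def fit_line(rows):
    """Least-squares straight line; returns a prediction function."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    sxy = sum((x - mx) * (y - my) for x, y in rows)
    sxx = sum((x - mx) ** 2 for x, _ in rows)
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def memorize(rows):
    """Overfit model: answer with the y of the closest training x."""
    return lambda x: min(rows, key=lambda row: abs(row[0] - x))[1]


def mse(predict, rows):
    """Mean squared prediction error."""
    return sum((y - predict(x)) ** 2 for x, y in rows) / len(rows)


line, memo = fit_line(train), memorize(train)
results = {
    "line": (mse(line, train), mse(line, test)),
    "memorizer": (mse(memo, train), mse(memo, test)),
}
for name, (train_error, test_error) in results.items():
    print(f"{name}: train MSE {train_error:.2f}, held-out MSE {test_error:.2f}")
```

The memorizer looks perfect on the data it was built from and worse than the simple line on held-out data, which is the signature a reviewer would look for.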
On
avoiding
chagrin
On
not doing
harm
On
selection
bias
Difficulty with SEO Data Science
Missing
features
Working with what you have
Costly analyses
Missing tools
Large uncertainties
Company priorities
Need to be heard more?
Need more enticing copy?
Need to improve what is important?
Need clarity in what you do?
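A starting plan
\[
\begin{aligned}
y = {} & \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \dots \\
& + \beta_5 x_1 x_2 + \beta_6 x_1 x_3 + \beta_7 x_2 x_3 + \dots \\
& + \beta_8 x_1 x_2 x_4 + \dots + \beta_n x_n + \varepsilon
\end{aligned}
\]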
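With multiple areas
\[
\begin{aligned}
y = {} & \beta_0 + \sum \beta_2(\text{CONTENT}) \\
& + \sum \beta_3(\text{LINKS}) + \sum \beta_4(\text{DOMAIN}) \\
& + \sum \beta_5(\text{STRUCTURE}) + \sum \beta_6(\text{UX}) + \varepsilon
\end{aligned}
\]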
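Each have tens of factors
\[
\begin{aligned}
y = \dots & + \beta_{2a}(\text{exact phrase title tag}) + \beta_{2b}(\text{title tag length}) + \beta_{2c}(\text{order in title tag}) \\
& + \beta_{2d}(\text{brand name used}) + \beta_{2e}(\text{title tag relevance}) + \beta_{2f}(\text{exact phrase h1}) \\
& + \beta_{2g}(\text{exact phrase largest font}) + \beta_{2h}(\text{h1 relevance}) + \beta_{2i}(\text{largest font relevance}) \\
& + \beta_{2j}(\text{exact phrase body copy}) + \beta_{2k}(\text{BM25 score}) + \beta_{2l}(\text{Flesch-Kincaid readability}) \\
& + \beta_{2m}(\text{exact phrase first 100 words}) + \beta_{2n}(\text{exact phrase URL}) + \beta_{2o}(\text{URL relevance}) \\
& + \beta_{2p}(\text{exact phrase title tag} \times \text{order in title tag}) \\
& + \beta_{2q}(\text{exact phrase title tag} \times \text{largest font} \times \text{URL}) \\
& + \beta_{2r}(\text{exact phrase title tag} \times \text{BM25 score}) + \dots + \varepsilon
\end{aligned}
\]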
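The regression with interaction effects on these slides can be tried out at toy scale. The Python sketch below simulates two hypothetical, centered content factors (stand-ins for something like title tag relevance and BM25 score), generates an outcome from known coefficients including one interaction term, and recovers those coefficients by ordinary least squares through the normal equations. The factor ranges, coefficients, and noise level are invented for illustration and say nothing about how Google actually weights anything.

```python
import random


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [value] for row, value in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_ols(rows, ys):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(xtx, xty)


rng = random.Random(3)  # fixed seed so the sketch is repeatable
true_betas = [1.0, 2.0, -1.0, 0.5]  # intercept, x1, x2, x1 * x2 (assumed)

design, ys = [], []
for _ in range(500):
    x1 = rng.uniform(-1, 1)  # hypothetical centered factor 1
    x2 = rng.uniform(-1, 1)  # hypothetical centered factor 2
    row = [1.0, x1, x2, x1 * x2]  # the last column is the interaction term
    design.append(row)
    ys.append(sum(b * v for b, v in zip(true_betas, row)) + rng.gauss(0, 0.1))

betas = fit_ols(design, ys)
for name, estimate in zip(["intercept", "x1", "x2", "x1*x2"], betas):
    print(f"{name}: {estimate:.3f}")
```

With tens of factors per area and their interactions, the design matrix grows quickly, which is where a proper statistics library and far more data than this toy example become necessary.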
On
fluctuations
Changing
of the
guard
Thank you!
Twitter, LinkedIn, and the Internet
micahfk

SEO & Statistics Presentation by Micah Fisher-Kirshner for UC Davis Graduate Students in 2020

Editor's notes

  1. Hi and thank you for having me here today to talk about SEO & Statistics in a business world setting, particularly around optimizing for Google’s search engine.
  2. So first, a simple introduction to what SEO, search engine optimization, is. The boring explanation is that it is a field of marketing where we work to increase the quantity and quality of search engine traffic to one’s website from the non-paid, aka organic, search results through optimizations around content, links, user experience, and technical site structure. https://searchengineland.com/guide/seo
  3. The more mysterious explanation is that /I forge global fame/, which just sounds so much more _powerful_, _ominous_, but may be a double-edged sword. But really, I just spend my days helping people rank higher for their articles or product pages.
  4. With that intro out of the way, let’s open our books to Chapter 1 on how SEO needs data science… that’s how you do this in class, right Mike?
  5. Historically Google used to claim that it had a very complicated and heavy algorithm, but honestly, it mostly came down to having enough links with the right anchor text dictating what would rank highly on Google, hence this cartoon’s joke.
  6. Today, that is certainly no longer the case, as Google has developed a multi-algorithmic system that uses many of the recent advances in statistics. It is multifactorial, uses machine learning and now even neural networks, and takes a time-based approach on a number of processes that have sometimes had to be done offline and later pulled back in. And these systems are used across the numerous areas of what we look at with SEO: Content, Links, the Technical Structure of a website, and much more (UX, Domains, and whether by industries, intent, devices, etc.). Let’s take a quick look into this to understand where this is used and provide a better understanding of the work SEO has to go through to determine what is going on with a website on Google.
  7. One of the first and most well-known machine learning systems was used around content, nicknamed Panda, and determined at scale which sites were low quality. Here’s an example page of what would make for a low-quality site if this were constant across the website. Imagine you had typed in “blue bicycles” on Google and landed on this page here. Tell me, what do you find frustrating, lacking, or odd about this? So, I’m looking for an actual blue bicycle, not a design of one on a bed sheet, so the relevance of this page is way off. There is also only one result, providing a poor choice of products, especially in a color that probably leans more towards green than a true blue. And if a website was filled with a lot of this kind of junk, Google was able to determine that this website should, overall, be downgraded across its search results. https://www.searchenginejournal.com/google-algorithm-history/panda-update/ https://www.seo-theory.com/google-panda/
  8. The link-focused machine learning system, called Penguin, came out afterwards to determine at scale who was likely violating Google’s guidelines by artificially buying links. Imagine Google, upon crawling the web, finds a number of webpages with a pattern like this article here. What do you notice that seems out of place? So, you have an article talking about health and getting fit with random sets of links talking about payday loans. And if you see this consistently going to the same website over and over, the likelihood is that this is a purposefully driven campaign by the site to boost its own rankings. How do they know that part? *shrugs* It’s probably why they later modified this algorithm to just ignore the links instead. https://webmasters.googleblog.com/2012/04/another-step-to-reward-high-quality.html
  9. Annoying ads are often the bane of the Internet, and the amount was getting too much for even Google, forcing it to develop an algorithm that could determine based on what showed above the fold, whether there were too many ads being displayed and penalizing those pages from showing up on Google as users became frustrated with trying to find out where the main content really was. https://www.searchenginejournal.com/google-algorithm-history/page-layout/ https://webmasters.googleblog.com/2012/01/page-layout-algorithm-improvement.html
  10. One of the more recent advances, and personal favorite of mine, is Google finally getting to understand the importance in the use of prepositions with user’s intent. In this example, they were finally able to recognize that the “to” here meant that this was a Brazilian looking for a visa to travel to the US rather than vice-versa. Something that, to humans, is relatively simple, but required the use of neural networks to finally begin solving. Interestingly, unlike most of their major changes, this was one of those that SEOs like myself could do nothing about optimizing, because it was about developing a better set of search results based on the actual user intent vs based on finding better results by factoring in what matters more on a website. https://blog.google/products/search/search-language-understanding-bert
  11. For our final example, Google stays relevant by recognizing that the majority of users now use some form of a mobile device when browsing the Internet. As such, it developed systems that separated out factors that mattered on the mobile, but not the desktop, such as font size, and size of ads allowed. https://www.searchenginejournal.com/google-algorithm-history/mobile-friendly-update/
  12. So, a little segue to why I’m here today talking about SEO and Statistics. I’ve been doing SEO for over 10 years and am the Director of SEO and Content at Turn/River Capital, a growth equity firm based in San Francisco focused heavily on using data to help our portfolio companies grow. I’m also the Founder and President of BayAreaSearch.org, an SEO meet-up association based in San Francisco as well. Personally, I have an econometrics background with a Master’s from UCSD and, as you can probably guess from the photo here, I’m a bit of a nerd who loves anything complicated. Oh, and I’m a high school buddy of your professor.
  13. Alright, let’s jump into the use of good data science with SEO and the things we need to keep in mind.
  14. Given the complications in today’s world with Google, it really requires a step-by-step process of building good models: data collection, interactions, interpretations, categorization of factors, visuals, and reviews. All of these are founded upon being transparent and methodological, working in conjunction with those in the industry while being deferential in how we talk about what we find. Let’s go into why this is the case.
  15. To the class here, why is it important, in general and in SEO, for us to have the right data collection? What themes do you think are important with collecting data? Well, we need to make sure that what we collect is consistent, i.e.: are we looking at mobile or desktop results – because Google varies the rankings on that. We need to make sure that we’re doing this over a time period instead of a one-time snapshot as blips in the data can happen. Source quality with where we get our data is very important as well, especially if we find that many of the variables are poorly correlated, which can affect our model and interpretations thereafter.
  16. For an example of device differentiation and why it is important to highlight that difference, here’s how the query looks on the desktop and how it differs on the mobile device. We see two pages from the same site and even a video set of results higher up on mobile than we do on the desktop, suggesting different user intents as well.
  17. And, depending on our query terms, we have to keep in mind why it is important to run it repeatedly over time – if we use the example of Halloween school masks, well, what you’re looking for when it gets close to Halloween does dramatically differ and is something that Google takes into account as a result.
  18. So, jumping to the interactions that matter, what might be some areas in business that affect your view of those companies? What might be some online areas that Google might use to approximate this? Well, on the positive side, if you are seen as an authority in the space, knowing what you are talking about, this would be an area that Google wants to capture positively. The same goes for whether you’ve created a known brand: if people expect to see you come up for a certain topic, Google needs to make sure it accounts for that. On the flip side, if you are committing any kind of legal or guideline violations and just doing a lot of spam, it has to have ways to develop factors to account for this and reduce your ability to win out through such nefarious tactics.
  19. What can that look like? Well, on the branding side – if you’re well known as a college for international relations such that users are commonly searching for the brand plus the phrase, with topics highlighting it in a positive way, it may be a function of how Google will try to include it in a set of options when you start looking for the generic phrases like getting a Master’s in International Relations…
  20. Nonetheless, it behooves us to think critically about what our interpretations of the data really are. What kind of problems have you noticed with analysts on data and what examples do you think might play a part in the SEO industry? Generally, your interpretations need to be reasonable, or at least backed up with an extraordinary amount of proof otherwise, and you will always need to double-check that you haven’t introduced endogeneity, for example where the outcome you’re trying to explain also affects one of the independent variables you’re using to explain it. Plus, there are always outliers of some kind: large sites can often get away with doing more good or bad things because they have a larger pool to work with. And if you’re looking at fairly esoteric topics, there may just not be enough information out there to really develop good algorithms around.
  21. So, let’s talk about our critical thinking skills, because this has sadly occurred too often in the SEO industry on these kinds of topics, leading heavily to this kind of “it must be aliens” mentality without thinking about what may really be playing a part. For example, a belief that social shares matter when it may just be the external links gained afterwards that impact your rankings. Or that you have data problems with the attribution model you’re using, given the lack of resources put into your model.
  22. And really, whether what you’re modelling is applicable to the real-world. A real-world query for buying university clothing against a random string of letters is not going to generate a comparable effect of what you’re trying to analyze. The type of data that goes into Google’s algorithm requires a lot of content or information behind it and if it’s new or esoteric, the algorithms being used will in turn differ.
  23. Then there’s the need to understand whether what has been analyzed works across different categories, grouping by anything from how often a query has been searched to how high up on Google a query ranks. Further, when bucketing into subsets, the users’ intent behind the query matters, and all of that would need to be determined, especially if you come up against the dreaded heteroskedasticity issue of data that fans outwards.
  24. To give you an idea of what I mean by user intents: a user looking for software will have a set of pages and features that differ from one looking for information. Let me ask you, what do you expect to see on a page when looking for software vs looking for information?
  25. And with positions as a potential group, well, the deeper you go on Google, the more things fluctuate moving about as people don’t often go that deep which may mean that different algorithms play a part past the first page of Google. And this has created a classic joke within the SEO community that the best place to hide a dead body is page two of Google.
  26. Now, data analysis is only a part of what you are doing with data science; how you properly display your work makes a huge difference in getting your points understood. What do you think are areas that matter when it comes to showing your work? We can basically break it down into two parts: your graphs, where scatterplots and your error ranges are shown, and your numbers, where you use proper regression layouts and notes about your confidence intervals.
  27. So your terminology matters greatly – unfortunately within our industry, there’s been an abuse of the use of “highly” and “significant” when referencing basic correlation studies of “0.2” and less. All because they feel the use of Pearson’s coefficient and an analysis from that is sufficient to call it.
  28. The reason we see that in the SEO world is that there aren’t data scientists with knowledge of SEO (or partners with good SEOs) – so we get these kinds of results claiming causations or reasons for what factors lead to rankings based off one to one correlations. But leaving aside any knowledge of SEO, what are some issues with these three graphs? Well, as I mentioned before we see low correlation numbers being tossed out as highly significant, or just omitted entirely under a nebulous “important/not important”, to customized graphs where you spend 30 minutes just trying to understand what they are trying to get across.
  29. Which leads us to the final important theme today, getting your work reviewed! What are the things and reasons for getting your work reviewed? Well, being validated through peer reviews showcases you did your work correctly as well as making sure you avoid overfitting your data to make it work versus letting the data speak for itself. But it also helps to be used for a predictive measure, assuming that what you’ve created is consistent as well as reusable for others to leverage going forward. And there’s more to consider….
  30. Well-reviewed work avoids embarrassment. Nearly a decade ago, we had a prominent industry figure present a correlation study showing that there was a correlation with Facebook Share data, only to be slapped down in the next session by a speaker saying they don’t even see that data in the first place. Reputation matters when it comes to what you do, which means you need to dot your ‘i’s and cross your ‘t’s.
  31. And there’s the ethical component to consider for yourself and the industry. Unfortunately, a lot of SEO does involve thoughts about how to get links to get my site to rank higher for my desired queries without considering the negative impact they are doing within and without the industry by pushing bogus studies that promote themselves as being so accurate because it analyzed a million results.
  32. And we’ve only really talked about Google today, mainly because most Americans use it. Yet that in and of itself can be a selection bias depending on what you’re analyzing. Be honest, who here uses Bing over Google? Few/None. Well, what are the demographics behind that? Young, techy, and highly educated. Which can lead to problems when you’re trying to analyze claims of bias on politically sensitive topics without taking that into account.
  33. All that together still doesn’t fully grasp the difficulty that comes with doing data science in SEO.
  34. We often have to work with what we have because of missing features. Yes, we’re working to determine what websites need – be it being heard more, better enticing copy, improving what is supposed to be important, or being clearer in what a website does. However, it is costly with the tools we have, and unfortunately we still after over a decade of doing SEO, don’t have all the tools we really need to do a real job of analyzing Google, which of course means large uncertainties, and in a business environment, projects with uncertainties become a lower priority.
  35. So let’s tie it back to some simplistic math and what one would ideally want to do with studying how something ranks on Google for SEO – a linear regression with interaction effects.
  36. We would need separate areas for each major area that we optimize for, obviously including by device, intent, etc. not shown here.
  37. And determine within each area, such as the Content one here, how many of those factors we think matter in order to build the model we are trying to “hack” into.
  38. Yet, as I mentioned, the tool sets early on were small and even as more have emerged, the right kind of tools still don’t really exist. I use this example for the marketing technology landscape as an exaggeration of what has been going on, because more tools are out there and choice paralysis and understanding which ones to use does make it difficult without that industry expertise.
  39. But again, most of us fell into being SEOs, few with advanced degrees, much less ones around statistics, so that makes it difficult for the field to expand in the ways that we really need to do proper analyses. https://outspokenmedia.com/seo/survey-results-educational-background-of-digital-marketers-seos-revealed/
  40. Fortunately, that is changing – that frustration is being turned around with a new group of developer-focused SEOs building open-source automations in Python or R to help supplement what is missing. And that’s where data science comes in to help really build out these new tools and analyze what is really going on behind the scenes with Google – all of course to help improve the user experience of what users want when searching on Google. https://github.com/jamesaphoenix/Python_For_SEO/ https://www.semrush.com/blog/python-content-briefs-seo/ https://www.searchenginejournal.com/python-seo-data-reference-guide/287927/ https://canonicalized.com/log-file-analysis-seo/
  41. Thank you for your time today and feel free to follow me around the web if interested under my moniker: micahfk