Get practical technical SEO advice from SEO experts: Hosted by Jason Barnard, "The Brand SERPs guy" with three great speakers: Tom Pool, Technical SEO Director at BlueArray; Faisal Anderson, Technical SEO EMEA @ LiveArea and Julien Deneuville, Owner / Freelance consultant @ Databulle.
In this short ~20-minute talk they present bite-sized technical SEO advice covering everything you need to know (and more) about server log file analysis and crawling for SEO. Their talks are offered free to the SEO community working from home during the coronavirus pandemic.
Watch a recording of the stream to go with these slides here:
https://www.youtube.com/watch?v=Mw3YEYsVQOE
Visit https://www.authoritas.com for more SEO advice and SEO tools and data to help you drive more organic traffic to your ecommerce stores.
SEO Server Log File Analysis - What You Should Be Looking For - Tea-Time SEO Series of Daily SEO Live Talks
1.
2. Log File Analysis - What You Should Be
Looking For
HOST: Jason Barnard
Speakers:
● Tom Pool
● Faisal Anderson
● Julien Deneuville
AUTHORITAS
● SEO Jo Blogs - Growth Marketer
● Carrie Shepherd - Marketing Executive
3. Faisal Anderson
● Technical SEO EMEA @ LiveArea
● SEO for global eCommerce brands
● Loves Python and Tech SEO
● Speaker, trainer and Moz contributor
@faisalanderson
4. Tom Pool
● Technical SEO Director at BlueArray
● Look after the technical output of the agency
● Love speaking & training on all things SEO (especially Tech SEO)
● Don’t drink tea (or coffee) but cake is good!
@cptntommy
5. ● Owner / Freelance consultant @
Databulle
● Technical SEO / Data / Python
● OnCrawl ambassador
● Also speaker / trainer / event
organizer
● I live in Reims, France 🍾
Julien Deneuville
@diije
6. Server Log File Analysis - What Should You Be Looking For?
● Communicate what data you need clearly (Understand your client’s
setup).
● Take a structured approach to Log File Analysis (Crawl Behaviour, Crawl
Budget Waste, Site Health).
● Use automation and data visualisation to explain ROI and a clear picture
of the issue to clients.
Top Tips from Faisal Anderson
7. Communicate what data you need clearly.
● Depending on the server configuration of your client, log files will be in
different places.
● Ensure that when communicating what log files you need, you know:
○ Whether they are using load balancing, which could split log files across
different servers.
○ Whether a CDN will store log files elsewhere.
○ Whether you are getting the correct hostname for your log data (e.g.
https vs http).
Tip 1 - Understand your Client’s setup
8. Crawl Behaviour, Crawl Budget Waste, Site Health
● Separating your analysis into stages can help you diagnose a problem
whose origins are not clear, or structure a general audit.
● I use a three stage process:
○ Analyse Crawl Behaviour
○ Analyse areas of Crawl Budget Waste
○ Analyse the Site Health (how the site responds to crawling)
● Use the right tool for the job (Excel vs Screaming Frog Log File Analyser vs
Jupyter + Python)
Tip 2 - Take a structured approach
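Faisal’s three-stage process starts with getting raw logs into a workable shape. A minimal sketch of the Jupyter + Python route, assuming an Apache combined-format log (the regex, field names and sample line here are illustrative assumptions, not from the talk):

```python
# Parse raw access-log lines into a DataFrame so each analysis stage
# (crawl behaviour, budget waste, site health) can filter the same table.
# Assumes Apache "combined" log format.
import re
import pandas as pd

LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" '
    r'(?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

def parse_log(lines):
    """Return a DataFrame with one row per parseable log line."""
    rows = [m.groupdict() for m in (LINE_RE.match(line) for line in lines) if m]
    df = pd.DataFrame(rows)
    df["status"] = df["status"].astype(int)
    return df

# Invented sample line for illustration only.
sample = [
    '66.249.66.1 - - [01/May/2020:10:00:00 +0000] "GET /page-a HTTP/1.1" '
    '200 1234 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
]
df = parse_log(sample)
print(df[["path", "status", "user_agent"]])
```

For small files Excel or the Screaming Frog Log File Analyser may be quicker; a DataFrame like this pays off once logs run to millions of lines.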
9. Using Automation and Data Visualisation
● As with all SEO, we need stakeholder buy-in for any decision. Using
visualisation we can show clearer ROI for abstract SEO concepts: “X metric
leads to Y outcome”, not “X is wrong”.
e.g. URLs by click-depth/subdirectory vs requests, status codes over
time.
● Automate with Python and visualise with Seaborn
Tip 3 - Demonstrate Clear ROI
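The “status codes over time” view above can be sketched as a simple daily pivot, which a chart library such as Seaborn can then plot for stakeholders. The data below is invented for illustration:

```python
# Aggregate parsed log rows into a daily status-code pivot - the table
# behind a "status codes over time" chart (e.g. seaborn.lineplot).
import pandas as pd

# Hypothetical pre-aggregated hit counts.
hits = pd.DataFrame({
    "date": ["2020-05-01", "2020-05-01", "2020-05-02", "2020-05-02"],
    "status": [200, 404, 200, 404],
    "count": [120, 5, 110, 40],
})

pivot = hits.pivot_table(index="date", columns="status",
                         values="count", aggfunc="sum").fillna(0)
print(pivot)
# A jump in 404s from one day to the next tells the "X metric leads
# to Y outcome" story far better than "X is wrong".
```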
10. Server Log File Analysis - What Should You Be Looking For?
● Crawl Budget Wastes
● Real Googlebots / Pretend Googlebots
● Combining Crawl Data with Logs can really supercharge insights
● Referrer data can be a goldmine - if logs are set up to capture this!
● Bonus - Learn Pandas with Python for ease of data manipulation
Top Tips from Tom Pool
11. Some areas include:
● Duplicate content, due to parameters, forms or other weird things
● Most & Least crawled URLs - why? Why are pages being crawled
more/less? Is there maybe some waste occurring?
● Consider linking more from your most-linked pages to less-crawled ones - IF
THIS MAKES SENSE (don’t link from your most popular page to a T&Cs page
for an obscure service)
Tip 1 - Crawl Budget Wastes
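Surfacing most/least crawled URLs and parameter-driven duplicates is a few lines of Pandas over parsed log data (column names and paths below are invented for illustration):

```python
# Count requests per URL, then strip query strings to expose crawl
# budget spent on parameterised duplicates of the same page.
import pandas as pd

log = pd.DataFrame({
    "path": ["/a", "/a", "/a", "/b", "/c?session=123", "/c?session=456"],
})

counts = log["path"].value_counts()
print("Most crawled:", counts.head(3).to_dict())

# Parameterised duplicates often hide here: compare by canonical path.
stripped = log["path"].str.split("?").str[0].value_counts()
print("By canonical path:", stripped.to_dict())
```

If `/c` attracts many hits only via session parameters, that is crawl budget waste worth chasing.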
12. Real Googlebots / Pretend Googlebots
● Make sure that the logs you are looking at have been verified.
● I’ve been guilty of crawling as Googlebot - I’m sure there is some fake
data in your logs too!
● Don’t make assumptions based on fake data.
Tip 2 - Real Googlebots / Pretend Googlebots
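Google’s documented way to verify a hit is a reverse DNS lookup followed by a forward-confirmation. A sketch of that check (DNS calls are slow, so in practice you would cache results per IP):

```python
# Verify a "Googlebot" hit: reverse-DNS the IP, check the hostname is
# on a Google domain, then forward-confirm the hostname resolves back
# to the same IP. Anything failing this is a pretend Googlebot.
import socket

def hostname_is_google(hostname):
    """Check a reverse-DNS hostname against Google's crawler domains."""
    return hostname.endswith((".googlebot.com", ".google.com"))

def is_real_googlebot(ip):
    """Reverse-then-forward DNS confirmation for one IP address."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)
    except socket.herror:
        return False
    if not hostname_is_google(hostname):
        return False
    # Forward-confirm: the hostname must resolve back to the same IP.
    return ip in {a[4][0] for a in socket.getaddrinfo(hostname, None)}
```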
13. Combining Crawl Data with Logs can really supercharge
insights
● Use [crawl data] x [log data] to really amp things up
● Are there URLs found in one dataset that aren’t found in the other? Why?
● Does internal linking found in the crawl match up with log data?
○ I’ve seen instances where the pages crawled by Google reflect the
information architecture (IA) of the site
Tip 3 - Combining Crawl Data with Logs
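The [crawl data] x [log data] join is a one-liner with Pandas’ merge indicator; URLs below are invented examples:

```python
# Outer-join crawl URLs against logged URLs; the merge indicator
# flags URLs found in only one dataset.
import pandas as pd

crawl = pd.DataFrame({"url": ["/a", "/b", "/orphan-free"]})
logs = pd.DataFrame({"url": ["/a", "/b", "/orphan-crawled"], "hits": [50, 3, 9]})

merged = crawl.merge(logs, on="url", how="outer", indicator=True)
# left_only  = in the crawl but never requested - why no bot interest?
# right_only = requested but not reachable by crawling - orphan pages?
print(merged[merged["_merge"] != "both"])
```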
14. Referrer data can be a goldmine - if logs are set up to capture
this!
● Referrer data is awesome
● If possible, set logs up to capture this data
● Then you can see where requests have come from
● Can also ID popular entry & exit pages
● Which site sends the most referral traffic
● Not just bots! User data also!
Tip 4 - Referrer Data
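Once logs capture the referrer field, finding which sites send the most traffic is a quick aggregation. A sketch with invented sample data (`-` is the conventional value when no referrer was sent):

```python
# Count requests per referring host from the log's referrer field.
from urllib.parse import urlparse
import pandas as pd

log = pd.DataFrame({
    "referrer": [
        "https://www.google.com/", "https://example.com/blog",
        "https://example.com/", "-",  # "-" means no referrer sent
    ],
})

hosts = (log["referrer"]
         .where(log["referrer"] != "-")  # drop hits with no referrer
         .dropna()
         .map(lambda r: urlparse(r).netloc))
print(hosts.value_counts().to_dict())
```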
15. Investigate, Investigate & Investigate.
● Investigate as much as possible.
● Once you think you’re done, investigate some more.
● Utilise all the tools at your disposal. A personal favourite is Pandas
(Python) for data manipulation.
Tip 4.5 - Investigate, Investigate & Investigate
16. Server Log File Analysis - What Should You Be Looking For?
● Check if your data is reliable
● Use logs as a monitoring tool
● Work with IT / Data teams
● Custom KPI: hits / session
Top Tips from Julien Deneuville
17. Compare volumes
● Are the required data fields there?
○ You need at least URL or path, response code, user agent, timestamp
○ Bonus: IP address, referrer, method, response weight, response time, ...
● Are there as many hits as expected?
○ Compare number of distinct URLs with your crawl of the website
○ Compare number of hits with the data from (old) Google Search Console
Tip 1 - Reliable data
18. Watch bot behaviour
● Crawl errors
● Crawl per page type
● User agent (desktop vs mobile)
Tip 2 - Logs for monitoring
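Monitoring crawl errors per page type can be sketched as a groupby over parsed log rows; the path convention below (`/product/…`, `/blog/…`) is a made-up example:

```python
# Bucket bot hits by page type and compute an error rate per bucket -
# a simple recurring monitoring view built from log data.
import pandas as pd

hits = pd.DataFrame({
    "path": ["/product/1", "/product/2", "/blog/a", "/blog/b"],
    "status": [200, 404, 200, 200],
})

# First path segment as the page type (site-specific assumption).
hits["page_type"] = hits["path"].str.split("/").str[1]
errors = (hits.assign(is_error=hits["status"] >= 400)
              .groupby("page_type")["is_error"].mean())
print(errors.to_dict())  # share of error responses per page type
```

Splitting the same view by user agent separates desktop from mobile Googlebot.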
19. Share tools, data and costs
● You definitely need to work with IT to access log files
● Your IT team probably already monitors server logs (for purposes other
than SEO)
● Your Data / BI team might have tools that you can use
● If they don’t, share the costs with them for a premium tool such as
OnCrawl
○ SEO monitoring for you
○ Monitoring / QA for the devs
○ Valuable data for BI
○ ...
Tip 3 - Work with other teams
20. Hits per session
● How many Googlebot hits are needed
for each SEO visit?
● Quick overview of which parts of
your site need attention
Tip 4 - Custom KPI
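Julien’s custom KPI divides Googlebot hits by SEO visits per site section. A sketch with invented numbers:

```python
# Hits-per-visit KPI: Googlebot hits (from logs) divided by SEO visits
# (from analytics), per site section. All figures are invented.
import pandas as pd

bot_hits = pd.DataFrame({"section": ["blog", "product"], "hits": [900, 300]})
seo_visits = pd.DataFrame({"section": ["blog", "product"], "visits": [100, 300]})

kpi = bot_hits.merge(seo_visits, on="section")
kpi["hits_per_visit"] = kpi["hits"] / kpi["visits"]
print(kpi)
# A high ratio (blog: 9.0 here) means Google spends far more crawl
# effort on that section than its SEO traffic justifies - worth a look.
```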
21. Thank you - over to Q and A
● All great tips from our experts:
● Tom Pool
● Faisal Anderson
● Julien Deneuville
@faisalanderson
@cptntommy
@diije
22. “Google For Jobs”
● Marco Bonomo
● Matt Hunt
Thursday 7th May 2020 @ 4 p.m. SEO Advice, tea and cake with...