BrightonSEO, July 2021 - To better understand a website's content, search engines developed Web Rendering Services (WRSs) and can now render pages more or less as a normal user's browser does. These Web Rendering Services are closely connected to the other phases of the crawling-indexing-ranking pipeline: if rendering fails, it may affect all of them. In this session, Giacomo will guide you through why rendering can be a problem even for non-JavaScript pages, how to manually debug page rendering, the difference between understanding WRSs' capabilities and debugging problems on a website, and finally how to test pages at scale.
2. Hi, I’m Giacomo. Technical
Director at Verve Search.
Technical background and
previous experience in
development.
@giacomozecchini
#brightonSEO
3. Today we are going to talk about
rendering errors, the challenges
of debugging at scale and a new
approach to solve these issues.
4. The search engine's rendering
process is very similar to
Schrödinger's cat paradox.
https://en.wikipedia.org/wiki/Schrödinger's_cat
5. A hypothetical page may be
considered simultaneously both
correctly rendered (alive) and
not correctly rendered (dead).
7. Search engines get web pages
and put them in web rendering
services.
https://developers.google.com/search/docs/guides/javascript-seo-basics
8. Inside the web rendering
services, the pages are rendered
similarly to a browser.
https://developers.google.com/search/docs/guides/javascript-seo-basics
9. Then, the search engines can
extract all information they
need from those rendered
pages.
https://developers.google.com/search/docs/guides/javascript-seo-basics
10. This is an oversimplification of a
complex process.
https://www.youtube.com/watch?v=Qxd_d9m9vzo
11. If you want to know more about
this, I’d suggest watching Martin
Splitt’s TechSEO Boost 2019
talk.
https://www.youtube.com/watch?v=Qxd_d9m9vzo
17. When is a page not
correctly rendered?
18. A page is “not correctly
rendered” when it is not possible
for the WRS to get an asset or
when an error blocks the
rendering process.
19. It's not only pages with JavaScript
that have problems!
20. Let's have a look at a few
examples...
21. HTTP / DNS / Network errors
https://developers.google.com/search/docs/advanced/crawling/http-network-errors
[Diagram: search engine with Crawler, WRS and Cache]
* Icons made by Freepik from www.flaticon.com
22. Robots.txt blocks a resource
https://developers.google.com/search/docs/advanced/robots/intro
[Diagram: search engine with Crawler, WRS and Cache]
25. Cache mismatch, user
permission for specific
features (e.g. geolocation),
service worker registration,
JavaScript syntax errors, etc.
26. What if a page is not
correctly rendered?
27. If WRS can’t get your CSS the
page layout won’t be correct and
you may also have Mobile
Usability issues.
28. If WRS can’t get or execute your
JS files correctly, your page may
be blank or broken.
29. Eventually, WRS may need to
render your page again, which
means slower indexing.
46. I started my research by using
some JavaScript to collect the
information I needed and print it
on the page, in a hidden <div>.
47. <html>
  …
  <div id="info" style="display:none"></div>
  …
  <script>
    …
    function getInformation(){
      // do stuff!
    }
    …
    var div = document.getElementById("info");
    var p = document.createElement("p");
    p.innerText = getInformation();
    div.appendChild(p);
    …
  </script>
  …
</html>
This prints the information you need
in the DIV at rendering time; you can
then read it in the HTML shown by
Search Console's "View crawled page".
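The slide leaves the body of getInformation() open ("do stuff!"). As a purely illustrative sketch, one might collect a few facts about the rendering environment and return them as a string for the hidden DIV. The fields chosen here (user agent, viewport size, presence of the service worker and geolocation APIs) and the optional env parameter are assumptions made for the example, not what the talk actually collected.

```javascript
// Hypothetical body for getInformation(); the collected fields are assumptions.
// `env` defaults to the global object so the function can also be exercised
// outside a browser by passing a fake environment.
function getInformation(env = globalThis) {
  const nav = env.navigator || {};
  const info = {
    userAgent: nav.userAgent || 'unknown',
    viewport: (env.innerWidth || 0) + 'x' + (env.innerHeight || 0),
    // Feature detection only: this checks that the API exists,
    // not that a permission was granted or a worker was registered.
    serviceWorker: 'serviceWorker' in nav,
    geolocation: 'geolocation' in nav,
  };
  // A JSON string is easy to copy out of the crawled page HTML.
  return JSON.stringify(info);
}
```

In the page from slide 47, the returned string would end up inside the paragraph appended to the "info" DIV.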
48. But waiting for a page to be
crawled, rendered and indexed
again is time consuming and not
scalable.
49. It’s a nice way of discovering
new things but you still have to
manually check all pages.
50. Then, I thought of using 1x1 px
images, appending errors or
information in the URL:
https://www.example.com/image.jpg?u=page_url&e=error
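A minimal sketch of the tracking-pixel idea, assuming the u and e parameter names and the image URL from the slide. The function names buildPixelUrl and reportViaPixel are hypothetical. It only illustrates the idea as first conceived; the talk goes on to explain why it does not work with Google's WRS.

```javascript
// Builds the pixel URL with the page URL and the error as query parameters.
// The base URL and the `u` / `e` parameter names come from the slide.
function buildPixelUrl(pageUrl, error, base = 'https://www.example.com/image.jpg') {
  const url = new URL(base);
  // searchParams.set() takes care of the URL encoding.
  url.searchParams.set('u', pageUrl);
  url.searchParams.set('e', error);
  return url.toString();
}

// Browser-only: requesting the image is what would leave a line
// in the server access log.
function reportViaPixel(pageUrl, error) {
  const img = new Image(1, 1);
  img.src = buildPixelUrl(pageUrl, error);
}
```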
51. The idea was to look in the
server access log and find all
errors that occurred during the
rendering.
52. But Google’s WRS doesn’t
download images during the
rendering of a page.
60. What if one of those JavaScript files sends a non-cacheable
POST request to an external server?!
[Diagram: search engine (Crawler, Chromium instance), POST request, server]
61. There are multiple ways of
sending POST requests in JS:
Fetch API
https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API/Using_Fetch
Navigator.sendBeacon()
https://developer.mozilla.org/en-US/docs/Web/API/Navigator/sendBeacon
XMLHttpRequest.send()
https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequest/send
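The three options listed above can be sketched side by side. This is an illustration under assumptions, not the implementation used in the talk: the helper name postReport, the payload shape and the order in which the APIs are tried are all hypothetical, and the slides do not say which API a given WRS supports.

```javascript
// Hypothetical helper: tries each of the three APIs in turn and returns
// the name of the one it used. `env` defaults to the global object so a
// fake environment can be passed in when testing.
function postReport(endpoint, payload, env = globalThis) {
  const body = JSON.stringify(payload);

  // 1. Navigator.sendBeacon(): queues a small asynchronous POST.
  //    It returns false if the browser could not queue the request.
  if (env.navigator && typeof env.navigator.sendBeacon === 'function') {
    if (env.navigator.sendBeacon(endpoint, body)) return 'beacon';
  }

  // 2. Fetch API: fire and forget, ignoring network failures.
  if (typeof env.fetch === 'function') {
    env.fetch(endpoint, { method: 'POST', body: body, keepalive: true })
      .catch(function () {});
    return 'fetch';
  }

  // 3. XMLHttpRequest.send(): the oldest of the three.
  if (typeof env.XMLHttpRequest === 'function') {
    const xhr = new env.XMLHttpRequest();
    xhr.open('POST', endpoint, true); // true = asynchronous
    xhr.send(body);
    return 'xhr';
  }

  return 'none';
}
```

A string body is sent as text/plain by all three APIs, so the receiving server would need to parse the JSON itself.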
63. TIME | URL | CATEGORY | ERROR
25/10/1985 09:00:00 | https://www.example.com | Fetch | https://www.example.com/style.css
21/10/2015 07:28:00 | https://www.example.com/about.html | Fetch | https://www.example.com/app.js
12/11/1955 06:38:00 | https://www.example.com | Javascript | File: https://www.example.com/app.js Line: 3 Col: 2 Error: Uncaught ReferenceError: APP is not defined
When you have everything in a database you can query the tables
and do all your analysis. You can also have automatic alerts, etc.
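As a rough illustration of that kind of analysis, the sketch below counts logged errors per column value and applies a simple alert threshold. The rows mirror the example table above; in practice they would come from a database query. The function names countBy and shouldAlert and the threshold value are assumptions.

```javascript
// Rows copied from the example table; a real setup would read them
// from the database that stores the POSTed messages.
const rows = [
  { time: '25/10/1985 09:00:00', url: 'https://www.example.com',
    category: 'Fetch', error: 'https://www.example.com/style.css' },
  { time: '21/10/2015 07:28:00', url: 'https://www.example.com/about.html',
    category: 'Fetch', error: 'https://www.example.com/app.js' },
  { time: '12/11/1955 06:38:00', url: 'https://www.example.com',
    category: 'Javascript',
    error: 'File: https://www.example.com/app.js Line: 3 Col: 2 ' +
           'Error: Uncaught ReferenceError: APP is not defined' },
];

// Counts rows per distinct value of the given column.
function countBy(records, key) {
  const counts = {};
  for (const record of records) {
    counts[record[key]] = (counts[record[key]] || 0) + 1;
  }
  return counts;
}

// Hypothetical alert rule: true once a category reaches the threshold.
function shouldAlert(records, category, threshold) {
  return (countBy(records, 'category')[category] || 0) >= threshold;
}

console.log(countBy(rows, 'category')); // { Fetch: 2, Javascript: 1 }
console.log(shouldAlert(rows, 'Fetch', 2)); // true
```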
68. Debugging example #2
Know if there is a problem
downloading CSS or JS files
69. <html>
  <head>
    <meta charset="utf-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <script>
      …
      window.addEventListener('error', function(err) {
        if (isDownloadError(err)){
          sendMessageToServer(err);
        }
      }, true);
      …
    </script>
    …
  </head>
  …
</html>
If there is an error and it's a CSS or
JS load error you can send a
message back to the server. This
works for HTTP/DNS/Network errors,
Robots.txt, fetch timeouts, etc.
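The slide does not show isDownloadError() or sendMessageToServer(). One possible shape for them is sketched below. It relies on resource load failures reaching a capture-phase "error" listener with the failing element as event.target, whereas runtime JavaScript errors target the window. The endpoint URL, the buildMessage helper and the message fields (modelled on the table in slide 63) are assumptions.

```javascript
// Assumed endpoint on a separate host; replace with your own debugging server.
const DEBUG_ENDPOINT = 'https://debug.example.net/log';

// True when the error event comes from a <script> or stylesheet <link>
// that failed to load. Runtime JS errors target `window`, which has no tagName.
function isDownloadError(event) {
  const el = event && event.target;
  if (!el || typeof el.tagName !== 'string') return false;
  const tag = el.tagName.toUpperCase();
  if (tag === 'SCRIPT') return Boolean(el.src);
  if (tag === 'LINK') return /\bstylesheet\b/i.test(el.rel || '');
  return false;
}

// Hypothetical message shape, following the TIME / URL / CATEGORY / ERROR table.
function buildMessage(event, pageUrl) {
  const el = event.target;
  return {
    time: new Date().toISOString(),
    url: pageUrl,
    category: 'Fetch',
    error: el.src || el.href || '',
  };
}

// Browser-only: POSTs the message and ignores any failure.
function sendMessageToServer(event) {
  const body = JSON.stringify(buildMessage(event, window.location.href));
  // 'no-cors' lets a text/plain POST reach another origin without CORS headers.
  fetch(DEBUG_ENDPOINT, { method: 'POST', body: body, mode: 'no-cors', keepalive: true })
    .catch(function () {});
}

if (typeof window !== 'undefined') {
  // `true` selects the capture phase: load errors do not bubble.
  window.addEventListener('error', function (event) {
    if (isDownloadError(event)) sendMessageToServer(event);
  }, true);
}
```

The listener has to be registered before the CSS and JS files it is meant to watch, which is why the slide places the script high in the head.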
71. There are some products out
there but all of them focus on
users and not on search
engines.
72. Search engines are different
and you need to solve
different problems.
73. You should be careful adding
new code to your website!
74. Web Performance issues
You don’t want to slow down
the user experience with
something you need only for
search engines.
75. Web Performance issues
Check for the User-Agent and
run the script only for search
engines.
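A minimal sketch of such a User-Agent check, assuming that matching the "Googlebot" and "bingbot" tokens is good enough for the purpose. The token list is deliberately short and non-exhaustive, and the function names are hypothetical.

```javascript
// Assumed, non-exhaustive list of crawler tokens to look for.
const SEARCH_ENGINE_TOKENS = /Googlebot|bingbot/i;

function isSearchEngine(userAgent) {
  return SEARCH_ENGINE_TOKENS.test(userAgent || '');
}

// Runs the debugging setup only for search engines, so that regular
// users never pay for the extra listeners and requests.
function runForSearchEnginesOnly(userAgent, startDebugging) {
  if (!isSearchEngine(userAgent)) return false;
  startDebugging();
  return true;
}

if (typeof window !== 'undefined') {
  runForSearchEnginesOnly(navigator.userAgent, function () {
    // Install the error listeners and reporting code here.
  });
}
```

The User-Agent string can be spoofed, so this check is about limiting overhead for users rather than about security.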
76. Crawl budget
You don’t want to consume
your crawl budget on these
requests.
77. Crawl budget
Host your debugging server on
a different domain or
subdomain.
78. There are many other possible
problems; you just need to find
a solution for each of them.