2. About me
Entrepreneur:
> 2002: Gamona.de
> 2006: JobAustralia Ltd.
> 2008: Evenity GmbH
> 2012: OnPage.org GmbH
Side notes:
> 2003/2004: World travel
> 2013: Young Entrepreneur of the Year,
Startup of the Year (by eco, Association of the German Internet Industry),
Best Bavarian Startup (by KfW Bank)
Studies:
> Business Informatics
(University of Marburg / University of Hagen)
> Dialog- & Online-Marketing
(Bavarian Academy of Marketing)
4. Agenda
- Basics
- Title Tag
- Good Architecture
- Webmaster Tools
- Advanced
- Duplicate Content
- The Crawl Budget
- Blocking / Deindexing Content
5. Basics: Title Tag
What is the „Title Tag“?
<!DOCTYPE HTML>
<html lang="de">
<head>
<title>Der offizielle Online-Shop des Hofbräuhaus München | Fanartikel online exklusiv shoppen</title>
...
8. Basics: Title Tag
- Myth Buster: The Title Tag is one of the primary ranking factors
- True story: The title has a high impact on click-through rates („CTR“)!
- Side effect: A good CTR and a low bounce rate lead to
better rankings.
By the way: Those metrics are called „User Intent Data“
- Keyword scoring: A keyword mention in the title probably carries
a weighting bonus (compared to a mention in the body).
9. Basics: Title Tag
Bad Title vs. Better Title
Titles with Focus on User Intent
Rule of thumb:
The title should invite users to click, but don't
promise anything your site can't deliver
10. Basics: Title Tag
Special Title for Social Networks
Why?
Stronger focus on the call to action. Adopt the language of the audience.
<meta property="og:title" content="Zieh dir die neusten Board…" />
15. Agenda
- Basics
- Title Tag
- Good Architecture
- Webmaster Tools
- Advanced
- Duplicate Content
- The Crawl Budget
- Blocking / Deindexing Content
16. Advanced: Duplicate Content
What is „Duplicate Content“?
hofbraeuhaus-shop.de = www.hofbraeuhaus-shop.de = www.hofbraeuhaus-shop.de/
Three different URLs, each serving 100% identical content
17. And even more …
www.hofbraeuhaus-shop.de/index.php
https://www.hofbraeuhaus-shop.de/index.php
https://hofbraeuhaus-shop.de/index.php
https://hofbraeuhaus-shop.de/
and so on … in total 8 different URLs for the same page (!)
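The slides do not itemize all eight URLs. A plausible reading, assumed here, is 2 protocols x 2 hostnames x 2 paths. The Python sketch below enumerates those variants and maps each one to a single preferred URL, which is roughly what a permanent redirect rule on the server would do. The preferred scheme and host (`https`, `www.`) and the helper names `url_variants` and `canonicalize` are illustrative choices, not taken from the slides.

```python
from itertools import product
from urllib.parse import urlsplit, urlunsplit

# Assumptions for illustration: the site prefers https + www,
# and /index.php serves the same content as /.
CANONICAL_SCHEME = "https"
CANONICAL_HOST = "www.hofbraeuhaus-shop.de"
INDEX_FILES = {"/index.php"}


def url_variants(bare_host):
    """Enumerate protocol x hostname x path variants of the homepage."""
    schemes = ("http", "https")
    hosts = (bare_host, "www." + bare_host)
    paths = ("/", "/index.php")
    return [f"{s}://{h}{p}" for s, h, p in product(schemes, hosts, paths)]


def canonicalize(url):
    """Map any variant of a URL on this one site to the preferred URL."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if path in INDEX_FILES:
        path = "/"
    # Scheme and host are forced to the preferred values; the query is kept.
    return urlunsplit((CANONICAL_SCHEME, CANONICAL_HOST, path, parts.query, ""))


variants = url_variants("hofbraeuhaus-shop.de")
targets = {canonicalize(u) for u in variants}
```

All eight variants collapse into one target, so a crawler following redirects built on this mapping would only need to index a single URL for the page.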
18. Advanced: Duplicate Content
- Myth Buster: Duplicate Content leads to ranking penalties
- True story: A lot of duplicate content will bore Search Engines, and
they will rather crawl other domains with more „original“ content
- This means: Duplicate Content does not cause a penalty,
but it will lower your „Crawl Budget“, which means that
fewer pages get the chance to be crawled and indexed.
- Side note: As soon as anyone on the internet links to the wrong
(sub-)domain or protocol (http/https) and you do not handle it
correctly, you run into duplicate content (DC) problems!
19. Advanced: The Crawl Budget
- „Crawl Budget“
= The amount of time the (Google-)Bot spends on your domain
- Search Engines have to allocate their own resources
-> A search engine's primary aim: providing the best results for its own users
-> Will focus on pages with original content
-> Trying not to spend too much time/money on spam pages
-> Brand + Unique Content is important
20. Advanced: The Crawl Budget
- Conclusion
-> Most times: „Less is more!“
-> Focus on a rather small number of pages which provide
outstanding content
nytimes.com jameda.de moo.com
21. Advanced: The Crawl Budget
- Conclusion
-> If you own tons of pages (10k+) and all (!) of them provide good
original content that also receives updates from time to time
(user reviews, for instance), make sure you provide a good
site structure
-> Every page should be reachable within 4-5 clicks from the
homepage. A sitemap.xml and navigable sitemaps will help.
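The click-depth rule can be checked with a breadth-first search over the internal link graph. The Python sketch below assumes the graph is already available as a dictionary mapping each page to the pages it links to (for example from a crawl). The function names `click_depths` and `too_deep` and the sample `site` data are hypothetical.

```python
from collections import deque


def click_depths(links, home):
    """Return the minimum number of clicks from `home` to each reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in depths:  # first visit is the shortest path in BFS
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def too_deep(links, home, max_clicks=5):
    """List pages that are unreachable or deeper than `max_clicks`."""
    depths = click_depths(links, home)
    all_pages = set(links) | {t for targets in links.values() for t in targets}
    # Unreachable pages are treated as deeper than the limit.
    return sorted(p for p in all_pages if depths.get(p, max_clicks + 1) > max_clicks)


# Made-up sample graph: a chain of six pages plus one page nobody links to.
site = {
    "/": ["/a"],
    "/a": ["/b"],
    "/b": ["/c"],
    "/c": ["/d"],
    "/d": ["/e"],
    "/e": ["/f"],
    "/orphan": [],
}
```

With this sample data, `too_deep(site, "/")` reports `/f` (six clicks away) and `/orphan` (not linked at all), which are the pages a better site structure or a sitemap would need to surface.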
23. Advanced: Blocking / Deindexing Content
In case you are not able to delete „low quality“ content
or duplicate content there are several techniques to help
Search Engines to better understand your Website:
-> „noindex“ Flag
-> „nofollow“ Flag
-> Canonical Tag
-> robots.txt
-> Redirects
-> Webmaster Tools
24. „noindex“ Flag
Can be set in:
Response-Header or Meta Section of Document
Will lead to:
Google will crawl this page and afterwards see that it is
not meant to be displayed in the search results.
25. „noindex“ Flag
Pro:
Block content you don't want to see in search results
Contra:
Your crawl budget gets consumed, as the Search Engine still
needs to crawl the page before it can see it is „noindexed“.
All links on this page will be crawled as well (!).
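Because the flag can sit in either the response header or the document head, a check has to look in both places. The Python sketch below collects robots directives from an `X-Robots-Tag` header and from `<meta name="robots">` / `<meta name="googlebot">` tags of an already fetched page. The function names are hypothetical, and the header parsing is deliberately simple: it does not handle user-agent prefixes such as `googlebot: noindex`.

```python
from html.parser import HTMLParser


class _RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots|googlebot" content="...">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {k: (v or "") for k, v in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for token in attr.get("content", "").split(","):
                self.directives.add(token.strip().lower())


def robots_directives(headers, html):
    """Merge directives from the X-Robots-Tag header and the meta tags."""
    directives = set()
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            directives.update(t.strip().lower() for t in value.split(","))
    parser = _RobotsMetaParser()
    parser.feed(html)
    directives |= parser.directives
    directives.discard("")
    return directives


def is_noindexed(headers, html):
    """True if either the header or the document asks not to be indexed."""
    return "noindex" in robots_directives(headers, html)
```

Note that this check only works on a page that has already been downloaded, which mirrors the drawback named above: the bot has to spend crawl budget before it can see the flag.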
26. „nofollow“ Flag
Can be set in:
Response-Header, Meta Section of Document or as
Attribute of Hyperlinks
Will lead to:
The links marked as nofollow won’t pass any link juice
(but will still be crawled!)
27. „nofollow“ Flag
Pro:
Used to remove the „recommendation“ character of a
link (-> no page rank / link juice is passed)
Contra:
Widely misused as many people think it tells the Search
Engine Bot not to crawl the linked page.
When used on internal links it can harm your own pages,
as link juice is thrown away needlessly.
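To spot the internal-link case, one can scan a page for links that carry `rel="nofollow"` and point to the same host. The Python sketch below is one way to do that with the standard library. The function name `internal_nofollow_links` is hypothetical, and "internal" is simplified to "same hostname as the page".

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class _LinkParser(HTMLParser):
    """Collect (href, rel tokens) for every <a> element with an href."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if href:
            rel = set((attr.get("rel") or "").lower().split())
            self.links.append((href, rel))


def internal_nofollow_links(page_url, html):
    """Return absolute URLs of same-host links that are marked nofollow."""
    host = urlsplit(page_url).netloc.lower()
    parser = _LinkParser()
    parser.feed(html)
    found = []
    for href, rel in parser.links:
        absolute = urljoin(page_url, href)  # resolve relative links
        if "nofollow" in rel and urlsplit(absolute).netloc.lower() == host:
            found.append(absolute)
    return found
```

Every URL this returns is a candidate for removing the attribute, since the link points at your own content.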
28. „Canonical“-Tag
Can be set in:
Response-Header or Meta Section of Document
Will lead to:
A similar result to the „noindex“ flag, but combined with
the information that there is another URL which is the
one supposed to rank in search engines.
<link rel="canonical" href="http://example.com/unterseite.html"/>
29. „Canonical“-Tag
Pro:
It helps the Search Engines to determine which URL is the
original content and which other URLs are just copies of that.
Good tool to handle wildcard subdomains and similar cases
if the IT team can’t fix the DC problem properly.
Contra:
Crawl budget is spent, as the bot will see this information
only after crawling the page. It is better suited to smaller DC problems.
<link rel="canonical" href="http://example.com/unterseite.html"/>
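To verify which canonical URL a page declares, both locations named on the previous slide can be read: an HTTP `Link` header with `rel="canonical"` and the `<link rel="canonical">` element in the document. The Python sketch below assumes the headers and HTML have already been fetched. The function name `canonical_url` is hypothetical, and the header pattern only covers the simple single-link form.

```python
import re
from html.parser import HTMLParser

# Matches e.g.  <http://example.com/page.html>; rel="canonical"
_LINK_HEADER = re.compile(r'<([^>]+)>\s*;\s*rel\s*=\s*"?canonical"?', re.IGNORECASE)


class _CanonicalParser(HTMLParser):
    """Remember the href of the first <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.href is not None:
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if "canonical" in rel and attr.get("href"):
            self.href = attr["href"]


def canonical_url(headers, html):
    """Return the declared canonical URL, or None if the page declares none."""
    for name, value in headers.items():
        if name.lower() == "link":
            match = _LINK_HEADER.search(value)
            if match:
                return match.group(1)
    parser = _CanonicalParser()
    parser.feed(html)
    return parser.href
```

Comparing the returned value with the URL that was actually requested shows whether a page considers itself the original or a copy.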
30. „unavailable_after“ Flag
What is it for:
Helps Google understand that a page has an expiration date.
Useful if you know that an item won’t be in stock again.
Tells the bot that this page will be irrelevant in the future and that the
crawl budget should rather be spent on other pages.
Could be used in combination with a canonical tag.
But: Use with care!
<META NAME="GOOGLEBOT" CONTENT="unavailable_after: 25-Aug-2014 15:00:00 EST"/>
31. robots.txt
What is it for:
Block pages based on URL patterns.
You can set up rules based on the User-Agent (=> bots).
User-agent: Googlebot
Disallow: /search/

User-agent: *
Disallow: /cache/
PS: Do not list admin interfaces in robots.txt (the file is publicly
readable and would reveal where they are)
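The effect of these two rule groups can be tried out with Python's standard `urllib.robotparser` module. The sketch below parses the rules from the slide as a string and asks which URLs each bot may fetch. The host `example.com`, the sample paths and the `Bingbot` user agent are made up for illustration, and the result reflects how Python's parser interprets the file, which may differ in details from a given search engine.

```python
from urllib.robotparser import RobotFileParser

# The rules from the slide, with a blank line separating the two groups.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /search/

User-agent: *
Disallow: /cache/
"""


def build_parser(robots_txt):
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


rules = build_parser(ROBOTS_TXT)

# A bot with its own group follows only that group;
# every other bot falls back to the "*" group.
googlebot_search = rules.can_fetch("Googlebot", "https://example.com/search/?q=bier")
googlebot_cache = rules.can_fetch("Googlebot", "https://example.com/cache/page.html")
other_search = rules.can_fetch("Bingbot", "https://example.com/search/?q=bier")
other_cache = rules.can_fetch("Bingbot", "https://example.com/cache/page.html")
```

One consequence worth noticing: in this parser the `Googlebot` group replaces the `*` group rather than adding to it, so `/cache/` is not blocked for Googlebot unless the rule is repeated in its group.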
32. robots.txt
Pro:
Easily mark complete folders / patterns as „disallowed“,
without editing code.
Contra:
Pages may still get listed in search results (because of external links).
Heavy blocking via robots.txt may result in „headless“ link-graphs.
34. Google Webmaster Tools
- In case your IT team cannot handle certain Duplicate Content
issues, the Webmaster Tools can be used to block certain pages from
crawling, so Search Engines can focus on your „real“ content.
- But keep in mind: It is better to fix those DC problems at their
source than to troubleshoot them afterwards.
35. Block pages based on URL parameters
36. Thanks!
Jan Hendrik Merlin Jacob
@jhmjacob
fb.me/jhmjacob
hjacob.com/blog/
(you can find the slides here!)
OnPage.org GmbH
@onpage_org
fb.me/onpage.org
de.onpage.org