Search Engine Friendly Design (Focus on Online Shops), SMX Munich 2014

Search-Engine Friendly Design
Focus on Online Shops
Jan Hendrik Merlin Jacob
Twitter: @jhmjacob
Facebook: fb.me/jhmjacob
Web: hjacob.com/

About me
Entrepreneur:
> 2002: Gamona.de
> 2006: JobAustralia Ltd.
> 2008: Evenity GmbH
> 2012: OnPage.org GmbH 
Sidenotes:
> 2003/2004: World travel
> 2013: Young Entrepreneur of the Year, Startup of the Year
(by eco, the Association of the Internet Industry)
Best Bavarian Startup
(by KfW Bank)
Studies:
> Business Informatics  
(University of Marburg / University of Hagen)
> Dialog- & Online-Marketing  
(Bavarian Academy of Marketing)

Agenda
- Basics
  - Title Tag
  - Good Architecture
  - Webmaster Tools
- Advanced
  - Duplicate Content
  - The Crawl Budget
  - Blocking / Deindexing Content

Basics: Title Tag
What is the „Title Tag“?
<!DOCTYPE HTML>
<html lang="de">
<head>
<title>Der offizielle Online-Shop des Hofbräuhaus München | Fanartikel online exklusiv shoppen</title>
...

Basics: Title Tag
- Myth Buster: The Title Tag is one of the primary ranking factors
- True story: The title has a high impact on click-through rates („CTR“)!
- Side effect: A good CTR and a low bounce rate lead to better rankings.
By the way: Those metrics are called „User Intent Data“
- Keyword Scoring: A keyword mention in the title probably carries a
weighting bonus (compared to a mention elsewhere in the body).

Basics: Title Tag
Bad Title vs. Better Title
Titles with Focus on User Intent
Rule of thumb:
The title should invite users to click, but don't
promise anything your site can't deliver.

Basics: Title Tag
Special Title for Social Networks
Why? 
Stronger focus on the call to action. Adopt the language of the audience.
<meta property="og:title" content="Zieh dir die neusten Board…" />

Basics: Good Architecture
Inspired by: spottedpanda.com

Basics: Google Webmaster Tools
What are the Google Webmaster Tools?
PS: Bing also offers nice Webmaster Tools!
webmasters.google.com

See the amount of indexed pages

Setting up the preferred Domain

Advanced: Duplicate Content
What is „Duplicate Content“?
hofbraeuhaus-shop.de = www.hofbraeuhaus-shop.de = www.hofbraeuhaus-shop.de/
Three different URLs, each serving 100% identical content

And even more …
www.hofbraeuhaus-shop.de/index.php
https://www.hofbraeuhaus-shop.de/index.php
https://hofbraeuhaus-shop.de/index.php
https://hofbraeuhaus-shop.de/
and so on … in total 8 different URLs for the same page (!)

- Myth Buster: Duplicate Content leads to ranking penalties
- True story: A lot of duplicate content will bore Search Engines, and
they will rather crawl other domains with more „original“ content
- This means: Duplicate Content does not cause a penalty,
but it will lower your „Crawl Budget“, which means that
fewer pages get the chance to be crawled and indexed.
- Side note: As soon as anyone on the internet links to the wrong
(sub-)domain or protocol (http/https) of your site and you do not
handle it correctly, you run into DC problems!
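The URL variants listed above can be collapsed onto a single address before issuing a redirect. The following is a minimal sketch in Python, not the speaker's implementation. The choice of `https` plus the `www` host as the preferred form, and the helper name `canonical_url`, are assumptions made for illustration only.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: https + www is the preferred form; the slides do not say which one to pick.
CANONICAL_SCHEME = "https"
CANONICAL_HOST = "www.hofbraeuhaus-shop.de"


def canonical_url(url):
    """Map any scheme/host/index.php variant of a URL to one preferred URL."""
    if "://" not in url:
        # Bare host names such as "hofbraeuhaus-shop.de" need a scheme to be parsed.
        url = "http://" + url
    parts = urlsplit(url)
    path = parts.path or "/"
    if path.endswith("/index.php"):
        # Treat ".../index.php" as the same page as the directory URL ".../".
        path = path[: -len("index.php")]
    return urlunsplit((CANONICAL_SCHEME, CANONICAL_HOST, path, parts.query, ""))


variants = [
    "hofbraeuhaus-shop.de",
    "www.hofbraeuhaus-shop.de/",
    "www.hofbraeuhaus-shop.de/index.php",
    "https://hofbraeuhaus-shop.de/index.php",
]
targets = {canonical_url(v) for v in variants}
```

A server would typically answer every non-preferred variant with a permanent redirect to the value this function returns, so that only one URL per page remains crawlable.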

Advanced: The Crawl Budget
- „Crawl Budget“
= The amount of time the (Google-)Bot spends on your domain
- Search Engines have to allocate their own resources
-> A Search Engine's primary aim: providing the best results for its own users
-> It will focus on pages with original content
-> It tries not to spend too much time/money on spam pages
-> Brand + Unique Content is important

- Conclusion 
-> Most times: „Less is more!“ 
-> Focus on a rather small amount of pages which provide 
outstanding content
nytimes.com jameda.de moo.com

- Conclusion
-> If you own tons of pages (10k+) and all (!) of them provide good
original content that also receives updates from time to time
(user reviews, for instance), make sure you provide a good
site structure
-> Every page should be reachable within 4-5 clicks from the
homepage. Sitemap.xml and navigable sitemaps will help.
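The 4-5 click rule can be checked with a breadth-first search over the internal link graph. The sketch below assumes the link graph has already been collected (for example by a crawler) as a dictionary from page path to linked page paths. The function names and the sample shop structure are hypothetical.

```python
from collections import deque


def click_depths(links, home="/"):
    """Return the minimum number of clicks needed to reach each page from `home`."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                # First visit in BFS order is the shortest click path.
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def too_deep(links, home="/", max_clicks=5):
    """List reachable pages that need more than `max_clicks` clicks."""
    depths = click_depths(links, home)
    return sorted(page for page, depth in depths.items() if depth > max_clicks)


# Hypothetical shop structure: homepage -> category -> subcategory -> product.
shop_links = {
    "/": ["/shoes", "/hats"],
    "/shoes": ["/shoes/boots"],
    "/shoes/boots": ["/shoes/boots/item-1"],
    "/hats": [],
}
depths = click_depths(shop_links)
```

Pages missing from the returned dictionary are not reachable from the homepage at all, which is usually worth investigating before looking at depth.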

Advanced: Blocking / Deindexing Content
In case you are not able to delete „low quality“ content  
or duplicate content there are several techniques to help 
Search Engines to better understand your Website: 
 
-> „noindex“ Flag 
-> „nofollow“ Flag
-> Canonical Tag 
-> robots.txt 
-> Redirects 
-> Webmaster Tools

„noindex“ Flag 
 
Can be set in:  
Response-Header or Meta Section of Document 
 
Will lead to:
Google will crawl this page and only afterwards see that it is
not meant to be displayed in the search results.

„noindex“ Flag 
 
Pro:
Block content you don't want to see in search results

Contra:
Your crawl budget gets consumed, as the Search Engine still
needs to crawl the page before it can see that it is „noindexed“.
All links on this page will be crawled as well (!).
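To audit which pages carry the flag, both places named above can be inspected: the `X-Robots-Tag` response header and the robots meta tag in the document head. This is a minimal sketch using only the Python standard library. It assumes the headers and HTML have already been fetched, and the helper names are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def robots_directives(headers, html):
    """Combine directives from the X-Robots-Tag header and the robots meta tag."""
    parser = RobotsMetaParser()
    parser.feed(html)
    header_value = headers.get("X-Robots-Tag", "")
    from_header = {d.strip().lower() for d in header_value.split(",") if d.strip()}
    return parser.directives | from_header


page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
is_noindexed = "noindex" in robots_directives({}, page)
```

Note that the check only works after the page has been downloaded, which mirrors the drawback above: the crawler has to spend budget on the page before it can learn that the page is excluded.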

„nofollow“ Flag 
 
Can be set in:  
Response-Header, Meta Section of Document or as 
Attribute of Hyperlinks 
 
Will lead to: 
Links marked as nofollow won’t pass any link juice
(but will still be crawled!)

„nofollow“ Flag 
 
Pro:
Used to remove the „recommendation“ character of a
link (-> no page rank / link juice is passed)

Contra:
Widely misused, as many people think it tells the Search
Engine Bot not to crawl the linked page.
When used on internal links it can harm your own pages,
as link juice is thrown away for no reason.
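Internal links that carry `rel="nofollow"` can be found with a small HTML scan. The sketch below is one possible approach using the Python standard library. It treats a link as internal when its host name equals the host name of the page, which is a simplifying assumption, and the class name and sample markup are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class NofollowLinkParser(HTMLParser):
    """Collects internal links that are marked with rel="nofollow"."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.internal_nofollow = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        rel = (attrs.get("rel") or "").lower().split()
        if not href or "nofollow" not in rel:
            return
        target = urljoin(self.page_url, href)
        # Assumption: same host name means "internal link".
        if urlsplit(target).hostname == urlsplit(self.page_url).hostname:
            self.internal_nofollow.append(target)


html = (
    '<a href="/cart" rel="nofollow">Cart</a>'
    '<a href="https://other.example.org/" rel="nofollow">Partner</a>'
    '<a href="/about">About</a>'
)
parser = NofollowLinkParser("https://www.example.com/")
parser.feed(html)
```

After `feed`, `parser.internal_nofollow` holds the absolute URLs of internal links that throw away link juice and may deserve a second look.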

„Canonical“-Tag 
 
Can be set in:  
Response-Header or Meta Section of Document 
 
Will lead to: 
A similar result to the „noindex“ flag, but combined with
the information that there is another URL which is the
one supposed to rank in search engines.
<link rel="canonical" href="http://example.com/unterseite.html"/>

„Canonical“-Tag 
 
Pro:
It helps the Search Engines to determine which URL holds the
original content and which other URLs are just copies of it.
A good tool to handle wildcard subdomains and similar cases
if the IT team can’t fix the DC problem properly.

Contra:
Crawl budget is spent, as the Bot will see this information
only after crawling the page. It mostly helps with smaller DC problems.
<link rel="canonical" href="http://example.com/unterseite.html"/>
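Both placements mentioned for the canonical hint can be generated from one URL: the `<link>` element for the document head and a `Link` response header with `rel="canonical"`. The following is a minimal sketch, and the two helper names are hypothetical.

```python
from html import escape


def canonical_link_tag(url):
    """Build the <link rel="canonical"> element for the document head."""
    return '<link rel="canonical" href="{}"/>'.format(escape(url, quote=True))


def canonical_link_header(url):
    """Build the equivalent HTTP response header as a (name, value) pair."""
    return ("Link", '<{}>; rel="canonical"'.format(url))


tag = canonical_link_tag("http://example.com/unterseite.html")
header = canonical_link_header("http://example.com/unterseite.html")
```

The header variant is mainly useful for non-HTML resources such as PDF files, where no head section is available to hold the element.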

„unavailable-after“ Flag
What is it for:
Helps Google to understand whether a page has an expiry date.
Useful if you know that an item won’t be in stock again.
Tells the Bot that this page will be irrelevant in the future and that
the crawl budget should rather be used on other pages.
Could be used in combination with a canonical tag.
But: Use with care!
<META NAME="GOOGLEBOT" CONTENT="unavailable_after: 25-Aug-2014 15:00:00 EST"/>

robots.txt
What is it for:
Block pages based on URL patterns.
You can set up rules based on the User-Agent (=> Bots).
User-agent: Googlebot
Disallow: /search/

User-agent: *
Disallow: /cache/
PS: Do not list admin interfaces in robots.txt (the file is publicly
readable and would reveal where they are)
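The effect of the rules above can be tested offline with Python's standard `urllib.robotparser` module before deploying the file. This is a small sketch, and the bot name `SomeOtherBot` and the example.com URLs are placeholders.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /search/

User-agent: *
Disallow: /cache/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Googlebot matches its own group, so only the /search/ rule applies to it.
googlebot_search = parser.can_fetch("Googlebot", "https://www.example.com/search/results")
googlebot_cache = parser.can_fetch("Googlebot", "https://www.example.com/cache/page")

# Any other bot falls back to the "*" group, which only blocks /cache/.
other_search = parser.can_fetch("SomeOtherBot", "https://www.example.com/search/results")
other_cache = parser.can_fetch("SomeOtherBot", "https://www.example.com/cache/page")
```

The result illustrates a common surprise: a bot with its own group ignores the generic `*` group entirely, so Googlebot may still crawl `/cache/` with this file.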

robots.txt 
 
Pro:  
Easily mark complete folders / patterns as „disallowed“,  
without editing code. 
 
Contra: 
Pages may still get listed in search results (because of external links). 
Heavy blocking via robots.txt may result in „headless“ link-graphs.

See blocked pages

Google Webmaster Tools
- In case your IT team cannot handle some kinds of Duplicate Content
issues, the Webmaster Tools can be used to block certain pages from
being crawled, so Search Engines can focus on your „real“ content.
- But keep in mind: It is better to handle those DC problems at their
source instead of „trouble-shooting“ …

Block pages based on URL parameters

Thanks!
Jan Hendrik Merlin Jacob
Twitter: @jhmjacob
Facebook: fb.me/jhmjacob
Web: hjacob.com/blog/
(you can find the slides here!)

OnPage.org GmbH
Twitter: @onpage_org
Facebook: fb.me/onpage.org
Web: de.onpage.org
