2. Agenda
• Introduction to SEO
• SEO Techniques
• Implementation of SEO in Endeca
• Implementation of SEO in ATG
3. SEO Introduction
• Search Engine Optimization (SEO) is a term used to
describe a variety of techniques for making pages
more accessible to web spiders (also known as web
crawlers or robots), the scripts used by Internet
search engines to crawl the Web to gather pages for
indexing. The goal of SEO is to increase the ranking
of the indexed pages in 'natural' search results.
5. SEO Introduction (contd)
• How Crawlers find pages:
• Web crawlers use sophisticated computer programs to decide which URLs to fetch, how many, and how often, from hundreds and thousands of web servers.
• A crawl begins with a list of URLs generated from past crawls and augmented with sitemap URLs. As the crawler visits pages, it detects new links (href and image src attributes) and adds them to the list of URLs to crawl further.
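The crawl loop described above can be sketched in a few lines. This is a minimal illustration, not how any real search engine crawler is built: the `fetch` callback and the in-memory `PAGES` dictionary are assumptions that stand in for real HTTP requests.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects href and src attribute values found on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def crawl(seed_urls, fetch, max_pages=100):
    """Breadth-first crawl. `fetch` is a caller-supplied function: url -> HTML text."""
    frontier = deque(seed_urls)   # URLs still to visit
    seen = set(seed_urls)         # URLs already queued, to avoid revisiting
    visited = []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(fetch(url))
        for link in parser.links:
            absolute = urljoin(url, link)
            if absolute not in seen:
                seen.add(absolute)
                frontier.append(absolute)
    return visited


# Hypothetical in-memory "web" used instead of real HTTP fetches.
PAGES = {
    "http://www.example.com/": '<a href="/phones">Phones</a><img src="/logo.png">',
    "http://www.example.com/phones": '<a href="/">Home</a>',
    "http://www.example.com/logo.png": "",
}

order = crawl(["http://www.example.com/"], lambda u: PAGES.get(u, ""))
print(order)
```

The seed list plays the role of the URLs from past crawls and sitemaps; every newly discovered link is appended to the same queue.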
6. SEO Introduction (contd)
• Indexing & Search
• As a web crawler visits pages, it gathers information from them. Keywords and their locations are indexed, which makes search and lookup faster.
• This works just like the index of a book, with keywords and page numbers.
• When you search using keywords, the search engine uses sophisticated ranking algorithms to determine the best possible matches and retrieves the search results.
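The book-index analogy corresponds to an inverted index: a map from each keyword to the pages that contain it. The sketch below is an assumption-laden toy (the page texts are made up, and it returns unranked matches, whereas real engines apply ranking algorithms on top).

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric keywords."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each keyword to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the pages containing every keyword of the query (no ranking)."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical page contents.
pages = {
    "/phones/iphone6-gold": "iPhone 6 gold smartphone",
    "/phones/iphone6-gray": "iPhone 6 gray smartphone",
    "/contact": "Contact us",
}
index = build_index(pages)
hits = search(index, "gold iphone")
print(hits)
```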
7. SEO Introduction (contd)
• Disallow crawlers from indexing your pages
• Using robots.txt:
• A file used to specify the URLs of the site that should not be crawled (e.g. service agreement, terms and conditions).
• Also used to specify the location of the sitemap XML files.
• Should be placed in the root folder of the web site.
8. SEO Introduction (contd)
• robots.txt format
• User-agent: *
• Disallow: /terms.html
• Sitemap: http://www.example.com/sitemap.xml
• Exclude individual pages or links from indexing:
• <meta name="robots" content="noindex"/>
• <a href="http://www.example.com" rel="nofollow">link text</a>
• <meta name="robots" content="nofollow"/>
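To see how a well-behaved crawler would read these rules, the robots.txt content from the slide can be fed to Python's standard `urllib.robotparser`. This is only a sketch of the consumer side; `site_maps()` requires Python 3.8 or later, and the `/phones` URL is an assumed example page.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content shown on the slide.
ROBOTS_TXT = """\
User-agent: *
Disallow: /terms.html
Sitemap: http://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A crawler asks before fetching each URL.
terms_allowed = parser.can_fetch("*", "http://www.example.com/terms.html")
phones_allowed = parser.can_fetch("*", "http://www.example.com/phones")
sitemaps = parser.site_maps()

print(terms_allowed, phones_allowed, sitemaps)
```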
10. • URL recoding is a way of improving page ranking in search engine results.
• Make dynamic URLs look more like static URLs.
• Use short, friendly URLs (under 2048 characters) with as few query parameters as possible.
• Include as much information as possible in the URL in the form of keywords, to improve the ranking of the indexed page.
URL Recoding
11. • Examples:
• (Bad SEO links for product pages – Rogers.com)
http://www.rogers.com/web/link/wirelessBuyFlow?forwardTo=PhoneThenPlan&productType=normal&productId_Detailed=IP6PL64GRY
http://www.rogers.com/web/link/wirelessBuyFlow?forwardTo=PhoneThenPlan&productType=normal&productId_Detailed=IP6PL64GLD
URL Recoding (contd)
13. • There is little to differentiate the Gold and Gray models, because both phones share the same dynamic URL and vary only in query parameters.
• Unfriendly URLs make it harder for customers to find what they are looking for.
URL Recoding (contd)
14. • Good examples: (to be recoded to below URLs)
http://www.rogers.com/wireless/phones/IPhone6-36GB-Gray
http://www.rogers.com/wireless/phones/IPhone6-36GB-Gold
• Benefits:
• Improved page ranking
• Customers get what they are looking for in the very first search result.
URL Recoding (contd)
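A keyword-based URL like the good examples above can be assembled from product attributes. The helper names `slugify` and `friendly_url` are hypothetical and belong to neither ATG nor Endeca; this is a generic sketch of the recoding idea only.

```python
import re


def slugify(text):
    """Replace every run of non-alphanumeric characters with a single hyphen."""
    return re.sub(r"[^A-Za-z0-9]+", "-", text).strip("-")


def friendly_url(base, category_path, *keywords):
    """Build a static-looking URL whose last segment carries the keywords."""
    slug = "-".join(slugify(k) for k in keywords if k)
    return f"{base}/{category_path.strip('/')}/{slug}"


gray = friendly_url("http://www.rogers.com", "wireless/phones", "IPhone6", "36GB", "Gray")
gold = friendly_url("http://www.rogers.com", "wireless/phones", "IPhone6", "36GB", "Gold")
print(gray)
print(gold)
```

Because the colour is part of the path rather than a query parameter, the two models get distinct, descriptive URLs.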
15. [Image: customers get what they are looking for in the very first search result.]
URL Recoding (contd)
16. Different URLs pointing to the same page reduce the ranking of that page.
E.g.: www.rogers.com
www.rogers.com/web/Rogers.portal
www.example.com/phones
www.example.com/phones.jsp
Using the link tag with proper URLs:
Add a link tag under the <head> tag in the HTML to declare one consistent URL as the canonical URL for the page.
<link rel="canonical" href="http://www.example.com/phones"/>
Canonical Links
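One way to emit the canonical tag consistently is to keep a mapping from URL variants to the single preferred URL. The `CANONICAL` table and the `canonical_link_tag` helper are assumptions made for this sketch, not part of any product API.

```python
from html import escape

# Hypothetical mapping from each URL variant to the one canonical URL.
CANONICAL = {
    "http://www.example.com/phones.jsp": "http://www.example.com/phones",
    "http://www.example.com/phones": "http://www.example.com/phones",
}


def canonical_link_tag(requested_url):
    """Return the <link> tag to place in <head> for the requested URL."""
    canonical = CANONICAL.get(requested_url, requested_url)
    return f'<link rel="canonical" href="{escape(canonical, quote=True)}"/>'


tag = canonical_link_tag("http://www.example.com/phones.jsp")
print(tag)
```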
17. • Use semantic HTML markup.
• Avoid Flash and JavaScript-generated output, as crawlers are best at parsing plain HTML.
• Provide a proper <title>.
• Use <meta> tags and the alt attribute for images.
• These help include keywords, which ultimately improves page ranking.
SEO Tagging
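A quick way to check a page against this list is to parse it and look for the tags in question. The `SeoTagAudit` class and the sample page are assumptions for illustration; the sketch only detects presence of the tags and does not judge their quality.

```python
from html.parser import HTMLParser


class SeoTagAudit(HTMLParser):
    """Records whether a page has a title, which meta names it declares,
    and how many images lack an alt attribute."""

    def __init__(self):
        super().__init__()
        self.has_title = False
        self.meta_names = set()
        self.images_missing_alt = 0

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self.has_title = True
        elif tag == "meta" and attributes.get("name"):
            self.meta_names.add(attributes["name"].lower())
        elif tag == "img" and not attributes.get("alt"):
            self.images_missing_alt += 1


# Hypothetical page used for the audit.
PAGE = (
    "<html><head><title>Phones</title>"
    '<meta name="description" content="Smartphones"/></head>'
    '<body><img src="a.png" alt="Gold iPhone"><img src="b.png"></body></html>'
)

audit = SeoTagAudit()
audit.feed(PAGE)
print(audit.has_title, audit.meta_names, audit.images_missing_alt)
```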
19. • A sitemap helps web crawlers reach the site's pages for indexing. It lists the URL paths of the pages in the application that should be indexed.
• Specified as an XML file that follows the schema defined at www.sitemaps.org.
• Can be split into multiple XML files linked from a sitemap index file.
• Usually stored in the root of the web application. Its location can be specified in robots.txt.
• Sitemap XML can be submitted to search engines for validation.
Example Sitemap XML file:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://www.example.com/</loc>
</url>
<url>
<loc>http://www.example.com/contact/</loc>
</url>
</urlset>
SiteMap
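A file like the example above can be produced from a plain list of URLs. This is a minimal sketch using Python's standard `xml.etree.ElementTree`; the `build_sitemap` helper is hypothetical, and optional elements of the sitemaps.org schema such as `lastmod` are left out.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


sitemap_xml = build_sitemap([
    "http://www.example.com/",
    "http://www.example.com/contact/",
])
print(sitemap_xml)
```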
20. • Used in category pages and faceted navigation pages; can also be used in product detail pages
• URL Recoding:
Non-SEO or traditional Endeca URLs (constructed using the BasicUrlFormatter Endeca Assembler API)
e.g. rogers.com (category page, faceted navigation page)
http://www.rogers.com/web/link/wirelessBuyFlow?forwardTo=PhoneThenPlan&productType=normal&N=11+52+4294948709&Nr=AND%28Language%3AEN%2CProvince%3AON%29
Implementing SEO techniques in
Endeca
21. • Optimized SEO-friendly Endeca URLs (using the SeoUrlFormatter Endeca Assembler API)
• Include keywords in the URLs to make them SEO friendly
http://www.rogers.com/wireless/smartphones/_/N-11+52+4294948709?Nr=AND%28Language%3AEN%2CProvince%3AON%29
http://www.rogers.com/wireless/smartphones/Android/_/N-11+52+4294948709+277?Nr=AND%28Language%3AEN%2CProvince%3AON%29
Implementing SEO techniques in
Endeca (contd)
22. Parts of the optimized Endeca Urls:
misc-path - /wireless/smartphones/Android
path-param-separator - _
path-params - N-11+52+4294948709
query string - ?Nr=AND%28Language%3AEN%2CProvince%3AON%29
Implementing SEO techniques in
Endeca (contd)
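The breakdown above can be reproduced by splitting an optimized URL at the separator segment. The `split_endeca_url` function is a hypothetical helper written for this illustration and is not part of the Endeca Assembler API.

```python
from urllib.parse import urlsplit


def split_endeca_url(url, separator="_"):
    """Split an SEO-formatted URL into misc-path, path-params and query string."""
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    if separator in segments:
        position = segments.index(separator)
        misc, params = segments[:position], segments[position + 1:]
    else:
        misc, params = segments, []
    return {
        "misc_path": "/" + "/".join(misc),
        "path_param_separator": separator,
        "path_params": "/".join(params),
        "query_string": parts.query,
    }


url_parts = split_endeca_url(
    "http://www.rogers.com/wireless/smartphones/Android/_/"
    "N-11+52+4294948709+277?Nr=AND%28Language%3AEN%2CProvince%3AON%29"
)
print(url_parts)
```

Only the path-params and the query string carry navigation state; the misc-path exists for keywords and readability.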
23. Configuring SEO friendly URLs in Endeca
• XML configuration file, e.g. endeca-seo-Config.xml
Easily configured using the Spring framework
Core API classes: the Endeca Assembler API SEO classes BasicQueryBuilder, SeoUrlFormatter, SeoNavStateFormatter and SeoNavStateCanonicalizer, used along with the Endeca Experience Manager cartridge handler core APIs.
• Every aspect of the URLs can be configured, such as formatting, canonicalization and encoding.
• Sample reference XML configuration in appendix.
Implementing SEO techniques in
Endeca (contd)
24. • Site Map:
• Sitemap XML files are generated using the standalone batch command RunSitemapGen.bat
• Configured using an XML file that specifies the MDEX host, port and URL format
• Uses the same URL format XML file that is used for configuring the application
• <URL_FORMAT_FILE>C:\Endeca\ToolsAndFrameworks\...\WEB-INF\endeca-seo-config.xml</URL_FORMAT_FILE>
* Sample configuration file included in appendix.
Implementing SEO techniques in
Endeca (contd)
25. ATG-driven pages (usually product detail pages)
URL Recoding:
• The ATG SEO module uses the User-Agent header of the request to detect whether the visitor is a human or a crawler, and generates the page links accordingly.
• The core API provided by ATG is in the atg.repository.seo.* packages.
• Core ATG APIs:
atg.repository.seo.ItemLink servlet bean: translates links to static URLs based on the user agent.
atg.repository.seo.JumpServlet: intercepts incoming static URLs (for example, if a user clicks a link returned by a Google search) and translates these URLs into their dynamic equivalents.
Implementing SEO techniques in
ATG
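The idea of choosing a link style from the User-Agent header can be sketched generically. Everything here is an assumption for illustration: the crawler tokens, the helper names and both URL shapes are made up, and this is not how the ATG classes are implemented.

```python
from urllib.parse import quote_plus

# Illustrative substrings only; a real deployment would maintain its own list.
CRAWLER_TOKENS = ("googlebot", "bingbot", "slurp")


def is_crawler(user_agent):
    """Guess whether the request comes from a web crawler."""
    agent = (user_agent or "").lower()
    return any(token in agent for token in CRAWLER_TOKENS)


def product_link(user_agent, product_id, display_name):
    """Return a static-looking link for crawlers, a dynamic link otherwise."""
    if is_crawler(user_agent):
        return f"/jump/{quote_plus(display_name)}/productDetail/{product_id}"
    return f"/productDetail.jsp?productId={product_id}"


robot_link = product_link(
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "xprod1001",
    "Dotted Repp Tie",
)
human_link = product_link("Mozilla/5.0 (Windows NT 10.0) Firefox/115.0", "xprod1001", "Dotted Repp Tie")
print(robot_link)
print(human_link)
```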
26. URL Configuration:
Done using the template classes provided by the core ATG SEO API.
URL Templates:
atg.repository.seo.DirectUrlTemplate
atg.repository.seo.IndirectUrlTemplate
# Url template format
urlTemplateFormat=/jump/{item.displayName}/productDetail/{item.parentCategory.displayName}/{item.id}/{item.parentCategory.id}/{locale}
# Forward Url template
forwardUrlTemplateFormat={item.template.url,encode=false}?productId={item.id}&categoryId={item.parentCategory.id}&locale={locale}&productPage=true
Implementing SEO techniques in
ATG (contd)
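To make the template format concrete, the sketch below fills the placeholders of the urlTemplateFormat shown above from a dictionary of values. The `expand` function and the sample values are assumptions; it mimics the effect of the template, not ATG's own implementation, and it does not handle options such as `encode=false`.

```python
import re
from urllib.parse import quote_plus

URL_TEMPLATE = (
    "/jump/{item.displayName}/productDetail/"
    "{item.parentCategory.displayName}/{item.id}/{item.parentCategory.id}/{locale}"
)


def expand(template, values):
    """Replace each {name} placeholder with its URL-encoded value."""
    def substitute(match):
        return quote_plus(str(values[match.group(1)]))
    return re.sub(r"\{([^{},]+)\}", substitute, template)


# Hypothetical property values for one product.
static_url = expand(URL_TEMPLATE, {
    "item.displayName": "Dotted Repp Tie",
    "item.parentCategory.displayName": "For Him",
    "item.id": "xprod1001",
    "item.parentCategory.id": "cat50067",
    "locale": "en_US",
})
print(static_url)
```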
27. /atg/repository/seo/CatalogItemLink
$class=atg.repository.seo.ItemLink
# Map of UrlTemplateMapper components by item descriptor name for this droplet
itemDescriptorNameToMapperMap=\
    product=/atg/repository/seo/ProductTemplateMapper
# Default parameter values
defaultRepository=/atg/commerce/catalog/ProductCatalog
defaultItemDescriptorName=product
Implementing SEO techniques in
ATG(contd)
28. /atg/repository/seo/ProductTemplateMapper
$class=atg.repository.seo.IndirectUrlTemplate
# Url template format
urlTemplateFormat=/jump/{item.displayName}/productDetail/{item.parentCategory.displayName}/{item.id}/{item.parentCategory.id}/{locale}
# Regex that matches above format
indirectRegex=.*/jump/[^/]*?/productDetail/[^/]*?/([^/].*?)/[^/]*?/([^/]*)(/.*?)*$
# Regex elements
regexElementList=\
    item | id | /atg/commerce/catalog/ProductCatalog:product,\
    locale | string
# Forward Url template
forwardUrlTemplateFormat={item.template.url,encode=false}?productId={item.id}&categoryId={item.parentCategory.id}&locale={locale}&productPage=true
# Supported Browser Types
supportedBrowserTypes=\
    robot
Implementing SEO techniques in
ATG(contd)
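The indirectRegex above is what lets an incoming static URL be mapped back to its item. The sketch below applies the same pattern with Python's `re` module to show which parts the two capture groups pick up, matching the order of regexElementList (item id first, then locale). The `parse_static_url` helper and the sample path are assumptions, and Java and Python regex engines are treated as equivalent for this pattern.

```python
import re

# Same pattern as the indirectRegex property above.
INDIRECT_REGEX = r".*/jump/[^/]*?/productDetail/[^/]*?/([^/].*?)/[^/]*?/([^/]*)(/.*?)*$"


def parse_static_url(path):
    """Extract the item id and locale from a static URL, or None if it does not match."""
    match = re.match(INDIRECT_REGEX, path)
    if match is None:
        return None
    return {"item_id": match.group(1), "locale": match.group(2)}


parsed = parse_static_url(
    "/jump/Dotted+Repp+Tie/productDetail/For+Him/xprod1001/cat50067/en_US"
)
print(parsed)
```

With the item id recovered, the item can be looked up in the repository and the forward URL template filled in.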
29. ATG – Endeca Integration:
• Used in Experience Manager cartridge handlers
• ATG Nucleus components access the Endeca SEO configuration Spring beans using the ATG Spring integration feature
• atg.nucleus.spring.NucleusPublisher
<bean name="/NucleusPublisher" class="atg.nucleus.spring.NucleusPublisher"
      scope="singleton">
  <property name="nucleusPath">
    <value>/atg/spring/FromSpring</value>
  </property>
</bean>

<import resource="endeca-seo-url-config.xml"/>
• You can now use /atg/spring/FromSpring/[spring Bean Id] in your Nucleus component
Implementing SEO techniques in
ATG(contd)
30. ATG – Endeca Integration:
A key bean in Endeca is
com.endeca.soleng.urlformatter.seo.SeoUrlFormatter
• Configure the /atg/endeca/assembler/cartridge/manager/NavigationStateBuilder component using the property
urlFormatter=/atg/spring/FromSpring/seoUrlFormatter
Implementing SEO techniques in
ATG(contd)
31. Canonical Links
Using the OOTB ATG component
/atg/repository/seo/CanonicalItemLink
<link rel="canonical" href="http://www.example.com:80/crs/storeus/jump/Dotted+Repp+Tie/productDetail/For+Him/xprod1001/cat50067" />
Implementing SEO techniques in
ATG(contd)
32. SiteMap Generation:
• Sitemap files are XML documents that contain URLs for the pages of your site. These can be generated either manually using the Dynamo Admin console or automatically on a schedule.
• Sitemap XML files are kept in the root of the web application
ATG uses the following OOTB components for sitemap generation:
The components are in the folder /atg/sitemap/*
Implementing SEO techniques in
ATG(contd)
33. SiteMap Generation:
StaticSitemapGenerator - generates sitemap XML files for static URLs
DynamicSitemapGenerator - generates sitemap XML files for dynamic URLs
SitemapIndexGenerator - generates index files for the various sitemap files generated by the sitemap generator components
SitemapGeneratorService - used for scheduled generation of sitemap XML files and for inserting entries into the SitemapRepository
SitemapWriterService - writes sitemap XML files using contents from the SitemapRepository; should be run on every ATG instance
Implementing SEO techniques in
ATG(contd)
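The relationship between sitemap files and an index file can be illustrated independently of ATG. In this sketch the URLs are split into several sitemap files and an index lists them; the sitemaps.org protocol caps a single sitemap at 50,000 URLs, while the tiny chunk size, the file names and the helper functions are assumptions for the example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50000  # per-file limit in the sitemaps.org protocol


def chunk(urls, size=MAX_URLS_PER_FILE):
    """Split the URL list into groups small enough for one sitemap file each."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_locations):
    """Return index XML with one <sitemap><loc> entry per sitemap file."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for location in sitemap_locations:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = location
    return ET.tostring(root, encoding="unicode")


# Hypothetical product URLs, chunked two per file to keep the example small.
urls = [f"http://www.example.com/product/{n}" for n in range(5)]
chunks = chunk(urls, size=2)
locations = [
    f"http://www.example.com/sitemap-{i}.xml" for i in range(1, len(chunks) + 1)
]
index_xml = build_sitemap_index(locations)
print(index_xml)
```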
34. SEO Tagging:
• ATG uses an SEO tag repository to store the content of the title tag and of the keywords and description meta tags.
• Register the SEO tag repository using the initialRepositories property of the
/atg/repository/ContentRepositories component
<dsp:droplet name="/atg/dynamo/droplet/RQLQueryRange">
<dsp:param name="repository" value="/atg/seo/SEORepository" />
<dsp:param name="itemDescriptor" value="SEOTags" />
<dsp:param name="howMany" value="1" />
<dsp:param name="mykey" value="featured" />
<dsp:param name="queryRQL" value="key = :mykey" />
<dsp:oparam name="output">
<title><dsp:valueof param="element.title"/></title>
<dsp:getvalueof var="description" param="element.description"/>
<dsp:getvalueof var="keywords" param="element.keywords"/>
<meta name="description" content="${description}" />
<meta name="keywords" content="${keywords}"/>
</dsp:oparam>
</dsp:droplet>
Implementing SEO techniques in
ATG(contd)