SlideShare ist ein Scribd-Unternehmen logo
1 von 21
SEO AND SMW      Why it
                 matters that
                 people find
          -OR-   your hard
                 work
THEN
                                      COMES THE
Low Hanging Fruit                    „PANDA
 Quick fixes for consistent search   SLAP‟ AND
  engine ranking                      YOU GET TO
                                      RESTART…

Framework and Community
 Intensive updates
 Easier to implement while
  building wiki


SMW Specific SEO
 Provided features and framework
  limitations
SCHEMA.ORG ARRIVED
  A N D T H OUSA N DS OF I N C OM P LE TE BUSI N E SS P LA N S DI E D


Schema.org has been called everything from
  an online land grab to an RDFa killer. In
   reality Schema.org is the agreed upon
 shared vocabulary for all major US search
                  engines.
 Adoption of competing RDFa formats was stagnant at best.
 Competing microformats created an engineering burden for
  businesses
 Winners were picked beforehand and newcomers couldn‟t
  openly compete. Ex: hReview – Yelp, hCard – LinkedIn, etc.
 Schema.org is compatible with RDFa 1 .1 .
 THEY FINALLY AGREED TO ALL DO SOMETHING THE SAME WAY.
PANDA CAME INTO BEING
  I T STOP P E D BE I N G P ROF I TA BLE TO DO N OT H I N G ON LI N E


  Panda was the first Google algorithm to affectively
change several widely understood ranking policies and
     remove long time interlinking monopolies.

 Added date considerations to content downgrading pages
  composed of „evergreen content‟
 Placed greater importance on link partners downgrading
  pages with large numbers of obscure in links and poor out
  links
 Placed greater consideration on keyword strategies and
  forced domains to establish keyword „brands‟ related to
  quality content
 Divided the long time cross linking metric to reduce page
  ranks and nullify the ef fects of link farming and aggressive
  out linking.
COHERENT KEYWORD STRATEGY
DON ‟T BOT H E R PAY I NG W H E N N OBODY I S E VE N C OM P E T I NG

 Survey : What is the state of
  competition for Semantic
  Web, Linked Data, and
  Semantic Web Application?
   Competition for all related
    keywords and terms is low.
   Around 50,000 searches for
    these terms are executed a
    month and ignored.
   Key terms including wiki,
    application, browser, and
    projects combined with
    „semantic‟ and „linked data‟
    have zero competition.
   Optimizing for any of these
    terms will meet with limited
    competition, and Adwords
    purchasing will bid in the lowest
    pricing tier.
CONSOLIDATE YOUR DOMAINS
  2 0 URLS GOI N G TO T H E SA M E W E BSI TE I S RE A LLY BA D

                                  RewriteEngine on
Collect all top level            # FIND AND REDIRECT SECONDARY DOMAINS /
                                  SUB-DOMAINS
 domains                          rewritecond %{http_host} ^www.second-
                                  domain.com [nc]
                                  rewriterule ^(.*)$ http://www.domain.com/$1
Collect all sub                  [r=301,nc]

                                  rewritecond %{http_host} ^second-
 domains                          subdomain.domain.com [nc]
                                  rewriterule ^(.*)$ http://www.domain.com/$1
                                  [r=301,nc]
200 Redirect from                # DIFFERENT ENTRANCE DEFAULTS FOR SERVER
                                  CONFIGS
 URL to MediaWiki                 redirect 301 /index.html /wiki/index.php
                                  redirect 301 /index.shtml /wiki/index.php

 index.php                        redirect 301 /index.htm /wiki/index.php
                                  redirect 301 /index.asp /wiki/index.php
                                  redirect 301 /index.aspx /wiki/index.php

301 Redirect all                 redirect 301 /index.cfm /wiki/index.php
                                  redirect 301 /index.pl /wiki/index.php
                                  redirect 301 /default.html /wiki/index.php
 remaining domains                redirect 301 /default.htm /wiki/index.php
                                  redirect 301 /default.asp /wiki/index.php

                                  ErrorDocument 403 /notfound.html
                                  ErrorDocument 404 /notfound.html
                                  ErrorDocument 500 /wiki/index.php
TITLES, DESCRIPTIONS, AND HEADERS
          A LI T T LE A PATH Y H URT S A LOT OF PAG E RA N K


                            The Rules
 Title
   65 characters, Unique per page
   <site or section name> - <key words in semi-legible phrase>
 Description
   156 characters, unique per page, sentences can be semi -legible
 Domain /Page URL
   160 characters max with key words included
   Strip session ids and non-media file extensions
 Header Tags
   H1 & H2 : Include keywords and use within primary body blocking
    elements.
   H3 – H6 : When used in conjunction with H1 & H2 their page
    weighting is significantly increased
TITLES, DESCRIPTIONS, AND HEADERS
      A LI T T LE A PATH Y H URT S A LOT OF PAG E RA N K


Reasoning




                               Provided By Optify.net, April 2011


 These elements are the top rated concerns and all major
  SEO services recommend correcting them first
ROBOTS, SPIDERS, AND BLACK HOLES
 M E DI A W I KI BY DE SI G N CA N DE ST ROY A SE A RC H RA N KI N G


 Due to the complexity of Media Wiki storing
  transactions and open editing a robots.txt and
  sitemap.xml are REQUIRED.

   A wiki is technically the size of human vocabulary and
    99% duplicate content create / edit pages
   Page transactions create thousands of history logs
   Special pages and custom functionality will provide
    inappropriate entrance points


   These facts left unaddressed will result in
 heavy search engine penalties applied to your
ROBOTS, SPIDERS, AND BLACK HOLES
M E DI A W I KI BY DE SI G N CA N DE ST ROY A SE A RC H RA N KI N G

                                   S i te m a p : h t t p : / / w w w. u r l . c o m / s i t e m a p . x m l
Example Robots.txt                User-agent: *
  Bug Tracking Systems            D i s a l l ow : / s m w b u g s /
                                   D i s a l l ow : / we b s v n /
  Code Repositories               D i s a l l ow : t i t l e = B u g z i l l a
                                   D i s a l l ow : t i t l e = S p e c i a l : L o g
                                   D i s a l l ow : a c t i o n = a n n o t a t e
  Wiki Logging                    D i s a l l ow : a c t i o n = e d i t
                                   D i s a l l ow : a c t i o n = f o r m e d i t
  Wiki Editing                    D i s a l l ow : r e d l i n k = 1
                                   D i s a l l ow : m o d e = w y s i w y g
     WYSIWIG panes and            D i s a l l ow : a c t i o n = h i s t o r y
                                   D i s a l l ow : / Ta l k :
      default create pages         D i s a l l ow : t i t l e = Ta l k :
                                   D i s a l l ow : / S p e c i a l : S e a r c h
  History Logging                 D i s a l l ow : / S p e c i a l : Ve r s i o n
                                   D i s a l l ow : t i t l e = S p e c i a l : S e a r c h
  Wiki Talk Pages                 D i s a l l ow : t i t l e = S p e c i a l : U s e r L o g i n

  Selected Special
   Pages
MEDIA, ALT, AND TITLE ATTRIBUTES
    T H E 10 % OF SE A RC H RA N KI N G E VE RY BODY I G N ORE S


 Because early websites discovered keyword cramming and
  cloaking search engines started putting weight on file names,
  alt, and title attribute tags.
 Search engines made the mistake of assuming the Internet
  cares about screen readers and text browsers.
 This mistake can be exploited in a positive way by actually
  caring about screen readers and text browsers.
   File names should describe the contents – Name before upload
   Alt tags should be the normalized file name
   Title tags should describe the section the file is within
CREATING ANCHORS WITH KEYWORDS
  W H E RE WA S T H E LA ST ST RE E T SI G N RE A DI N G „G O H E RE ‟?


 Anchors on your wiki between sections should be 2-3 word
  descriptions of that section
 Pages will be ranked higher if keywords used in anchors
  match their titles
    When creating new wiki articles take into account the article name
     becomes the title of the page

Example Anchor Text             Page Title
“Download Semantic MediaWiki”   SMW : Download Semantic MediaWiki
“SMW Community Forum”           SMW : Community and Development Forum
“Purchase SMW+ License”         SMW+ : Purchase Semantic MediaWiki Plus
“Business Semantic MediaWiki”   SMW : Small Business and Academic Portal
CONSISTENT IN LINKS TO YOUR WIKI
     YOUR E - RE P UTATI ON M AT T ERS TO SE A RC H E N G I N E S


 Same rules apply to in links as internal anchors between
  pages and sections
 Links with the attribute rel=“nofollow” are not followed or
  applied to your wikis search engine ranking
    Wikipedia.com
    Lesser known search portals
    News sites with millions of daily readers
 Google Page Rank heavily drivenwith qualityquality metricsyourto panda
                         driven by in links by obscure in links to due
 site


 Buying in links is risky! Many agencies selling
a number of links for a price are dumping them
   on low ranked blogs who cloak the links.
CONSISTENT OUT LINKS FROM YOUR
                WIKI
    N OT E VE RY W E BSI TE SH OULD BE T RE AT ED T H E SA M E

 Your page ranking and search engine results are also af fected
  by the sites you link and HOW they are linked
   Over 6 months old with page rank of zero remove the link
   Large corporations don‟t need your page rank status. Use
    rel=“nofollow” for their pages
   Most links on your wiki will have an automated rel=“nofollow” added
    to them
   There are custom solutions for removing the nofollow attributes that
    should be used when linking other SMW sites and MediaWiki
    resources.
   Linking to “Authoritative” sites still improves SEO. Using <cite> tags
    when quoting content is now approved, but should be used sparingly.
SCHEMA.ORG RESULTS
       LI KE I T OR N OT A DOP T I ON I S H A P P E N I NG


The RDFa format, with
a regex expression
scanning for schema
identifiers, has
surpassed all other
formats in 2012.




                                   Source : webdatacommons.org, 2012
TRACK AND IMPROVE IMPORTANT PAGES
YOU CA N ‟T A LWAYS C H OOSE W H I C H PAG E I S RA N KE D H I G H LY


 An example of unexpected pages with high ranking:




 This is a help manual page that is listed on the first page for a
  keyword term
   Current content should not be modified but extended
   Out links from this page should be removed or kept few and relevant
   Links to internal targeted sections should be added and made visible
    on this page
   Semantic markup should be added to the page body and refined for
    the keyword giving the page a high rank
DON‟T CHEAT THE SEARCH ENGINES
   W I T H RI SK C OM E S RE WA RD… UN T I L YOU A RE BA N N ED

 Link Farming (panda)
   Buy a dozen domains and skin off one framework
   All domains link to your new website in slightly different ways
    cramming your keywords
 Paying the Link Farmers (panda)
   No. They cannot get you linked on 500, Page Rank 4+, websites for
    $99 without owning the Link Farm or degrading the Page Rank of
    naïve blog owners.
 Cloaking
   Display one thing to the SE Spider and something else to users
   EVEN FOR MOBILE SPIDERS DO NOT DO THIS
 Of f Page Linking and Spamming
   Search engines understand if your div is positioned off browser or
    display is set to none
   If you must do this for menus use HTML5 and specify <nav>, or in lieu
    of that use an ID or Class name of “menu”, “nav”, or “navigation”
LIMITATIONS OF SEO ON MEDIA WIKI
I T WA S C RE AT E D TO BE F UN C T I ONAL, N OT A LWAYS P OP ULA R


 Customization of page titles and descriptions is time
  consuming and requires custom extensions
   Pages titles are bound to the article title, namespace, and URL
 No way by default to define meta descriptions and keyword
  listings within the article
 Nofollow attributes are either always on or always of f, and
  require specialty code to remove them from a single anchor
 Most of the wiki can be classified as duplicate content and
  requires extensive robots.txt and sitemap.xml files
 Irregular entrance points as certain articles gain in popularity
  with no consideration for the entire topic of the wiki
 Quality articles with high ranking can be edited, and even if
  correctly edited content the search engines ranked highly is
  removed
SEARCH ENGINE WEBMASTER ACCOUNTS
A N ALY TI CS T RAC K. T H E SE AC C OUN T S E X P E DI T E SUBM I SSI ON


 Google Webmaster Tools
   http://www.google.com/webmasters/tools/
 Bing / Yahoo Webmaster Tools
   http://www.bing.com/toolbox/webmasters /
 DMOZ Open Directory
   http://www.dmoz.org/docs/en/add.html
 Bulk Sitemap Submission (Google, Bing, Ask)
   http://www.sitemapwriter.com/notify.php
 Region Specific Search Engines
   Yandex – Russia
   Baidu – China
   Full listing : http://www.searchenginecolossus.com/
SEO TOOLS AND WIKI SUGGESTIONS
              I N FORMATION , TOOLS, A N D W I SH LI ST S

 http://www.seowarrior.net / - SEO Warrior, O‟Reilly
   Updated versions of PERL scripts for site wide statistics compilation
   Book usually updated every 18 months, site keeps e -version
 Analytics Tracking
   1 or 2 Online Accounts (Google Analytics / Bing)
   1 to 3 Server Side Systems (Splunk SEO App, Pwik, Awstats)
 MediaWiki Extensions
     http://www.mediawiki.org/wiki/Extension:Advanced_Meta
     http://www.mediawiki.org/wiki/Extension:Add_HTML_Meta_and_Title
     Somebody create an SEO centric skin or update existing skins?
     Administration tools for bulk renaming files and pages complete with
      301 Redirect tags created
 SEO Tracking Pay Services
   http://www.seomoz.org/
   http://www.optify.net
QUESTIONS?   Comments, R
             equests, Idea
             s, Complaints
             ?

Weitere ähnliche Inhalte

Mehr von William Smith

Streaming HYpothesis REasoning
Streaming HYpothesis REasoningStreaming HYpothesis REasoning
Streaming HYpothesis REasoningWilliam Smith
 
Applied semantic technology and linked data
Applied semantic technology and linked dataApplied semantic technology and linked data
Applied semantic technology and linked dataWilliam Smith
 
NLP Linked Open Data "Is a" Solution
NLP Linked Open Data "Is a" SolutionNLP Linked Open Data "Is a" Solution
NLP Linked Open Data "Is a" SolutionWilliam Smith
 
AURA Wiki - Knowledge Acquisition with a Semantic Wiki Application
AURA Wiki - Knowledge Acquisition with a Semantic Wiki ApplicationAURA Wiki - Knowledge Acquisition with a Semantic Wiki Application
AURA Wiki - Knowledge Acquisition with a Semantic Wiki ApplicationWilliam Smith
 
LDIF Lightening Talk
LDIF Lightening TalkLDIF Lightening Talk
LDIF Lightening TalkWilliam Smith
 
SMWCon 2012 Linked Data Visualizations
SMWCon 2012 Linked Data VisualizationsSMWCon 2012 Linked Data Visualizations
SMWCon 2012 Linked Data VisualizationsWilliam Smith
 
Allen Institute Neurowiki Presentation
Allen Institute Neurowiki PresentationAllen Institute Neurowiki Presentation
Allen Institute Neurowiki PresentationWilliam Smith
 

Mehr von William Smith (7)

Streaming HYpothesis REasoning
Streaming HYpothesis REasoningStreaming HYpothesis REasoning
Streaming HYpothesis REasoning
 
Applied semantic technology and linked data
Applied semantic technology and linked dataApplied semantic technology and linked data
Applied semantic technology and linked data
 
NLP Linked Open Data "Is a" Solution
NLP Linked Open Data "Is a" SolutionNLP Linked Open Data "Is a" Solution
NLP Linked Open Data "Is a" Solution
 
AURA Wiki - Knowledge Acquisition with a Semantic Wiki Application
AURA Wiki - Knowledge Acquisition with a Semantic Wiki ApplicationAURA Wiki - Knowledge Acquisition with a Semantic Wiki Application
AURA Wiki - Knowledge Acquisition with a Semantic Wiki Application
 
LDIF Lightening Talk
LDIF Lightening TalkLDIF Lightening Talk
LDIF Lightening Talk
 
SMWCon 2012 Linked Data Visualizations
SMWCon 2012 Linked Data VisualizationsSMWCon 2012 Linked Data Visualizations
SMWCon 2012 Linked Data Visualizations
 
Allen Institute Neurowiki Presentation
Allen Institute Neurowiki PresentationAllen Institute Neurowiki Presentation
Allen Institute Neurowiki Presentation
 

Kürzlich hochgeladen

Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 

Kürzlich hochgeladen (20)

Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 

SMWCon 2012 SEO Presentation

  • 1. SEO AND SMW Why it matters that people find -OR- your hard work
  • 2. THEN COMES THE Low Hanging Fruit „PANDA Quick fixes for consistent search SLAP‟ AND engine ranking YOU GET TO RESTART… Framework and Community Intensive updates Easier to implement while building wiki SMW Specific SEO Provided features and framework limitations
  • 3. SCHEMA.ORG ARRIVED A N D T H OUSA N DS OF I N C OM P LE TE BUSI N E SS P LA N S DI E D Schema.org has been called everything from an online land grab to an RDFa killer. In reality Schema.org is the agreed upon shared vocabulary for all major US search engines.  Adoption of competing RDFa formats was stagnant at best.  Competing microformats created an engineering burden for businesses  Winners were picked beforehand and newcomers couldn‟t openly compete. Ex: hReview – Yelp, hCard – LinkedIn, etc.  Schema.org is compatible with RDFa 1 .1 .  THEY FINALLY AGREED TO ALL DO SOMETHING THE SAME WAY.
  • 4. PANDA CAME INTO BEING I T STOP P E D BE I N G P ROF I TA BLE TO DO N OT H I N G ON LI N E Panda was the first Google algorithm to affectively change several widely understood ranking policies and remove long time interlinking monopolies.  Added date considerations to content downgrading pages composed of „evergreen content‟  Placed greater importance on link partners downgrading pages with large numbers of obscure in links and poor out links  Placed greater consideration on keyword strategies and forced domains to establish keyword „brands‟ related to quality content  Divided the long time cross linking metric to reduce page ranks and nullify the ef fects of link farming and aggressive out linking.
  • 5. COHERENT KEYWORD STRATEGY DON ‟T BOT H E R PAY I NG W H E N N OBODY I S E VE N C OM P E T I NG  Survey : What is the state of competition for Semantic Web, Linked Data, and Semantic Web Application?  Competition for all related keywords and terms is low.  Around 50,000 searches for these terms are executed a month and ignored.  Key terms including wiki, application, browser, and projects combined with „semantic‟ and „linked data‟ have zero competition.  Optimizing for any of these terms will meet with limited competition, and Adwords purchasing will bid in the lowest pricing tier.
  • 6. CONSOLIDATE YOUR DOMAINS 2 0 URLS GOI N G TO T H E SA M E W E BSI TE I S RE A LLY BA D RewriteEngine on Collect all top level # FIND AND REDIRECT SECONDARY DOMAINS / SUB-DOMAINS domains rewritecond %{http_host} ^www.second- domain.com [nc] rewriterule ^(.*)$ http://www.domain.com/$1 Collect all sub [r=301,nc] rewritecond %{http_host} ^second- domains subdomain.domain.com [nc] rewriterule ^(.*)$ http://www.domain.com/$1 [r=301,nc] 200 Redirect from # DIFFERENT ENTRANCE DEFAULTS FOR SERVER CONFIGS URL to MediaWiki redirect 301 /index.html /wiki/index.php redirect 301 /index.shtml /wiki/index.php index.php redirect 301 /index.htm /wiki/index.php redirect 301 /index.asp /wiki/index.php redirect 301 /index.aspx /wiki/index.php 301 Redirect all redirect 301 /index.cfm /wiki/index.php redirect 301 /index.pl /wiki/index.php redirect 301 /default.html /wiki/index.php remaining domains redirect 301 /default.htm /wiki/index.php redirect 301 /default.asp /wiki/index.php ErrorDocument 403 /notfound.html ErrorDocument 404 /notfound.html ErrorDocument 500 /wiki/index.php
  • 7. TITLES, DESCRIPTIONS, AND HEADERS A LI T T LE A PATH Y H URT S A LOT OF PAG E RA N K The Rules  Title  65 characters, Unique per page  <site or section name> - <key words in semi-legible phrase>  Description  156 characters, unique per page, sentences can be semi -legible  Domain /Page URL  160 characters max with key words included  Strip session ids and non-media file extensions  Header Tags  H1 & H2 : Include keywords and use within primary body blocking elements.  H3 – H6 : When used in conjunction with H1 & H2 their page weighting is significantly increased
  • 8. TITLES, DESCRIPTIONS, AND HEADERS A LI T T LE A PATH Y H URT S A LOT OF PAG E RA N K Reasoning Provided By Optify.net, April 2011  These elements are the top rated concerns and all major SEO services recommend correcting them first
  • 9. ROBOTS, SPIDERS, AND BLACK HOLES M E DI A W I KI BY DE SI G N CA N DE ST ROY A SE A RC H RA N KI N G  Due to the complexity of Media Wiki storing transactions and open editing a robots.txt and sitemap.xml are REQUIRED.  A wiki is technically the size of human vocabulary and 99% duplicate content create / edit pages  Page transactions create thousands of history logs  Special pages and custom functionality will provide inappropriate entrance points These facts left unaddressed will result in heavy search engine penalties applied to your
  • 10. ROBOTS, SPIDERS, AND BLACK HOLES M E DI A W I KI BY DE SI G N CA N DE ST ROY A SE A RC H RA N KI N G S i te m a p : h t t p : / / w w w. u r l . c o m / s i t e m a p . x m l Example Robots.txt User-agent: *  Bug Tracking Systems D i s a l l ow : / s m w b u g s / D i s a l l ow : / we b s v n /  Code Repositories D i s a l l ow : t i t l e = B u g z i l l a D i s a l l ow : t i t l e = S p e c i a l : L o g D i s a l l ow : a c t i o n = a n n o t a t e  Wiki Logging D i s a l l ow : a c t i o n = e d i t D i s a l l ow : a c t i o n = f o r m e d i t  Wiki Editing D i s a l l ow : r e d l i n k = 1 D i s a l l ow : m o d e = w y s i w y g  WYSIWIG panes and D i s a l l ow : a c t i o n = h i s t o r y D i s a l l ow : / Ta l k : default create pages D i s a l l ow : t i t l e = Ta l k : D i s a l l ow : / S p e c i a l : S e a r c h  History Logging D i s a l l ow : / S p e c i a l : Ve r s i o n D i s a l l ow : t i t l e = S p e c i a l : S e a r c h  Wiki Talk Pages D i s a l l ow : t i t l e = S p e c i a l : U s e r L o g i n  Selected Special Pages
  • 11. MEDIA, ALT, AND TITLE ATTRIBUTES T H E 10 % OF SE A RC H RA N KI N G E VE RY BODY I G N ORE S  Because early websites discovered keyword cramming and cloaking search engines started putting weight on file names, alt, and title attribute tags.  Search engines made the mistake of assuming the Internet cares about screen readers and text browsers.  This mistake can be exploited in a positive way by actually caring about screen readers and text browsers.  File names should describe the contents – Name before upload  Alt tags should be the normalized file name  Title tags should describe the section the file is within
  • 12. CREATING ANCHORS WITH KEYWORDS W H E RE WA S T H E LA ST ST RE E T SI G N RE A DI N G „G O H E RE ‟?  Anchors on your wiki between sections should be 2-3 word descriptions of that section  Pages will be ranked higher if keywords used in anchors match their titles  When creating new wiki articles take into account the article name becomes the title of the page Example Anchor Text Page Title “Download Semantic MediaWiki” SMW : Download Semantic MediaWiki “SMW Community Forum” SMW : Community and Development Forum “Purchase SMW+ License” SMW+ : Purchase Semantic MediaWiki Plus “Business Semantic MediaWiki” SMW : Small Business and Academic Portal
  • 13. CONSISTENT IN LINKS TO YOUR WIKI YOUR E - RE P UTATI ON M AT T ERS TO SE A RC H E N G I N E S  Same rules apply to in links as internal anchors between pages and sections  Links with the attribute rel=“nofollow” are not followed or applied to your wikis search engine ranking  Wikipedia.com  Lesser known search portals  News sites with millions of daily readers  Google Page Rank heavily drivenwith qualityquality metricsyourto panda driven by in links by obscure in links to due site Buying in links is risky! Many agencies selling a number of links for a price are dumping them on low ranked blogs who cloak the links.
  • 14. CONSISTENT OUT LINKS FROM YOUR WIKI N OT E VE RY W E BSI TE SH OULD BE T RE AT ED T H E SA M E  Your page ranking and search engine results are also af fected by the sites you link and HOW they are linked  Over 6 months old with page rank of zero remove the link  Large corporations don‟t need your page rank status. Use rel=“nofollow” for their pages  Most links on your wiki will have an automated rel=“nofollow” added to them  There are custom solutions for removing the nofollow attributes that should be used when linking other SMW sites and MediaWiki resources.  Linking to “Authoritative” sites still improves SEO. Using <cite> tags when quoting content is now approved, but should be used sparingly.
  • 15. SCHEMA.ORG RESULTS LI KE I T OR N OT A DOP T I ON I S H A P P E N I NG The RDFa format, with a regex expression scanning for schema identifiers, has surpassed all other formats in 2012. Source : webdatacommons.org, 2012
  • 16. TRACK AND IMPROVE IMPORTANT PAGES YOU CA N ‟T A LWAYS C H OOSE W H I C H PAG E I S RA N KE D H I G H LY  An example of unexpected pages with high ranking:  This is a help manual page that is listed on the first page for a keyword term  Current content should not be modified but extended  Out links from this page should be removed or kept few and relevant  Links to internal targeted sections should be added and made visible on this page  Semantic markup should be added to the page body and refined for the keyword giving the page a high rank
  • 17. DON‟T CHEAT THE SEARCH ENGINES W I T H RI SK C OM E S RE WA RD… UN T I L YOU A RE BA N N ED  Link Farming (panda)  Buy a dozen domains and skin off one framework  All domains link to your new website in slightly different ways cramming your keywords  Paying the Link Farmers (panda)  No. They cannot get you linked on 500, Page Rank 4+, websites for $99 without owning the Link Farm or degrading the Page Rank of naïve blog owners.  Cloaking  Display one thing to the SE Spider and something else to users  EVEN FOR MOBILE SPIDERS DO NOT DO THIS  Of f Page Linking and Spamming  Search engines understand if your div is positioned off browser or display is set to none  If you must do this for menus use HTML5 and specify <nav>, or in lieu of that use an ID or Class name of “menu”, “nav”, or “navigation”
  • 18. LIMITATIONS OF SEO ON MEDIA WIKI I T WA S C RE AT E D TO BE F UN C T I ONAL, N OT A LWAYS P OP ULA R  Customization of page titles and descriptions is time consuming and requires custom extensions  Pages titles are bound to the article title, namespace, and URL  No way by default to define meta descriptions and keyword listings within the article  Nofollow attributes are either always on or always of f, and require specialty code to remove them from a single anchor  Most of the wiki can be classified as duplicate content and requires extensive robots.txt and sitemap.xml files  Irregular entrance points as certain articles gain in popularity with no consideration for the entire topic of the wiki  Quality articles with high ranking can be edited, and even if correctly edited content the search engines ranked highly is removed
  • 19. SEARCH ENGINE WEBMASTER ACCOUNTS A N ALY TI CS T RAC K. T H E SE AC C OUN T S E X P E DI T E SUBM I SSI ON  Google Webmaster Tools  http://www.google.com/webmasters/tools/  Bing / Yahoo Webmaster Tools  http://www.bing.com/toolbox/webmasters /  DMOZ Open Directory  http://www.dmoz.org/docs/en/add.html  Bulk Sitemap Submission (Google, Bing, Ask)  http://www.sitemapwriter.com/notify.php  Region Specific Search Engines  Yandex – Russia  Baidu – China  Full listing : http://www.searchenginecolossus.com/
  • 20. SEO TOOLS AND WIKI SUGGESTIONS I N FORMATION , TOOLS, A N D W I SH LI ST S  http://www.seowarrior.net / - SEO Warrior, O‟Reilly  Updated versions of PERL scripts for site wide statistics compilation  Book usually updated every 18 months, site keeps e -version  Analytics Tracking  1 or 2 Online Accounts (Google Analytics / Bing)  1 to 3 Server Side Systems (Splunk SEO App, Pwik, Awstats)  MediaWiki Extensions  http://www.mediawiki.org/wiki/Extension:Advanced_Meta  http://www.mediawiki.org/wiki/Extension:Add_HTML_Meta_and_Title  Somebody create an SEO centric skin or update existing skins?  Administration tools for bulk renaming files and pages complete with 301 Redirect tags created  SEO Tracking Pay Services  http://www.seomoz.org/  http://www.optify.net
  • 21. QUESTIONS? Comments, R equests, Idea s, Complaints ?