Analysis of Websites as Graphs for SEO
Rubén Martínez – June 2015 – Open Analytics Madrid
Items (books, music, etc.) used to be arranged in tight silos by categories.
  
There is more to websites than meets the eye
Has a website ever been this boring?

We tend to think of websites as a homepage on the top followed by a second layer of child webpages (categories), a third level below (sub-categories) and pages of items (products, articles, etc.) at the bottom.

Happily, reality is not so simple!
  
First-ever website - 1990
Source: Tim Berners-Lee's web catalog at CERN.
A copy is available at http://www.w3.org/History/19921103-hypertext/hypertext/WWW/TheProject.html

Not even the first-ever website was a simple hierarchical tree of categories and sub-categories.
  
Websites are graphs
Graph theory
A graph is an ordered pair G = (V, E) comprising a set V of vertices or nodes together with a set E of edges or links.

Websites
Websites are graphs whose webpages are nodes and whose links are directed edges.

Actual websites are a more organic, messy business.
Figure: visualization of a 300-page ecommerce website.
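To make the definition concrete, here is a minimal Python sketch of a website as a pair (V, E). The page paths are invented for illustration, and `build_graph` is a hypothetical helper, not something from the talk.

```python
# Hypothetical internal links of a tiny site: (source page, destination page).
links = [
    ("/", "/books"),
    ("/", "/music"),
    ("/books", "/books/graph-theory"),
    ("/music", "/"),
    ("/books/graph-theory", "/music"),
]

def build_graph(edge_pairs):
    """Return (V, E): the set of pages and the set of directed links."""
    vertices = set()
    edges = set()
    for source, destination in edge_pairs:
        vertices.add(source)
        vertices.add(destination)
        edges.add((source, destination))
    return vertices, edges

V, E = build_graph(links)
print(len(V), "pages,", len(E), "links")
# Direction matters: "/" links to "/books", but not the other way round.
print(("/", "/books") in E, ("/books", "/") in E)
```

Storing edges as ordered pairs keeps the direction of each link, which is what separates a website graph from a plain undirected network.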
  
Link analysis in graph theory
PageRank is a link analysis algorithm. It outputs a probability distribution that represents the likelihood that a person clicking on links will arrive at any particular page.

It assigns a numerical weighting to each element of a hyperlinked set of documents, such as the World Wide Web, with the purpose of "measuring" its relative importance within the set.

Figure: Google's reasonable surfer model, which weights hyperlinks by their position on the page.
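The probability distribution described here can be approximated by power iteration. The Python sketch below is a textbook simplification, not Google's implementation. It assumes a damping factor of 0.85 (a commonly quoted value, not given in the slides), treats every link on a page as equally likely to be clicked (so it ignores the reasonable surfer weighting), and uses an invented four-page site.

```python
def pagerank(out_links, damping=0.85, iterations=100):
    """Approximate PageRank by power iteration.

    out_links maps every page to the list of pages it links to.
    Pages without outgoing links spread their score evenly over all pages.
    """
    pages = list(out_links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Score held by dead-end pages is redistributed to everyone.
        dangling = sum(rank[p] for p in pages if not out_links[p])
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for page in pages:
            targets = out_links[page]
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank

# Hypothetical site: every link target also appears as a key.
site = {
    "/": ["/books", "/music"],
    "/books": ["/", "/books/graph-theory"],
    "/music": ["/"],
    "/books/graph-theory": ["/"],
}
ranks = pagerank(site)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page:22s} {score:.3f}")
```

The scores sum to 1, so each one can be read as the share of a random surfer's visits that lands on that page.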
  	
  
Optimization of PageRank in websites
PageRank is diluted with every level down the structure of categories and sub-categories.

Figure (left): This is a waste of expensive PageRank.
Figure (right): Same information on a leaner, more efficient web architecture.

PageRank is not as important in SEO as it used to be. It is still useful to optimise web architectures.

On-page SEO is mostly about analysing graphs, measuring them and optimising them empirically and iteratively.
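One simple way to measure how many levels down a page sits is its click depth from the homepage. Click depth is not PageRank; it is swapped in here as a rough proxy for how far a page is from the top of the structure. The two site layouts below are invented for illustration.

```python
from collections import deque

def click_depth(out_links, home):
    """Breadth-first search: minimum number of clicks from home to each page."""
    depth = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in out_links.get(page, []):
            if target not in depth:
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth

# Hypothetical deep hierarchy: home -> category -> sub-category -> item.
deep_site = {
    "/": ["/category"],
    "/category": ["/category/sub"],
    "/category/sub": ["/category/sub/item"],
}
# Hypothetical leaner layout: the item is linked straight from the homepage.
lean_site = {
    "/": ["/category", "/category/sub/item"],
    "/category": ["/category/sub/item"],
}
print(click_depth(deep_site, "/")["/category/sub/item"])
print(click_depth(lean_site, "/")["/category/sub/item"])
```

Comparing the depth of the same item page before and after a restructuring gives a quick, measurable check on whether the architecture got leaner.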
  
Steps of the analysis of websites
•  Crawling a website
•  Cleaning the output of inlinks (csv file: Source,Destination)
•  Visualizing the graph
•  Analysing the relations of specific nodes
•  Parameterizing the whole graph

SEO experts are usually presented with inefficient websites that require rationalization and, more often than not, extensive re-indexation on Google.

Understanding and parameterizing the graph of a website before and after radical changes of its structure is key.

We build a comma-separated value file with pairs of URLs linking to other URLs.
The csv file contains the data of the connected graph that can be visualized, parameterized and analysed.
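As a sketch of how such a file can be turned into a graph structure, the Python below reads `Source,Destination` rows into a dictionary of outgoing links. The in-memory CSV text and its URLs are placeholders, and `load_edges` is a hypothetical helper name.

```python
import csv
import io
from collections import defaultdict

# Stand-in for the contents of the inlinks file; the URLs are made up.
CSV_TEXT = """Source,Destination
/,/books
/,/music
/books,/books/graph-theory
/music,/
"""

def load_edges(handle):
    """Read Source,Destination rows into a dict: page -> set of linked pages."""
    out_links = defaultdict(set)
    for row in csv.DictReader(handle):
        source = row["Source"].strip()
        destination = row["Destination"].strip()
        out_links[source].add(destination)
        out_links[destination]  # make sure link targets exist as nodes too
    return dict(out_links)

graph = load_edges(io.StringIO(CSV_TEXT))
print(len(graph), "pages")
print(sorted(graph["/"]))
```

With a real export, the same function can be given a file opened with `open(path, newline="")` instead of the `io.StringIO` stand-in.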
  
Crawling and exporting a csv file of inlinks
1st step: Crawl a significant sample of the webpages of a website

Desktop applications
•  Screaming Frog (fee per licence, all OS)
•  Xenu Link Sleuth (free, Windows)

Bash scripts using command-line tools. Beware: poorly written scripts might not be polite.
•  cURL
•  Wget

(2nd step: Scrape if you have to get specific snippets of text from the crawled pages)
Scrapy in Python
$ pip install scrapy

(3rd step: Extract data if you have to get specific URLs linked from the scraped text)
Beautiful Soup
A Python library for pulling data out of HTML and XML files.
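The extraction step can be sketched without any third-party package. The example below swaps in Python's standard `html.parser` module in place of Beautiful Soup, purely so that it runs with no installation; the page URL and markup are hypothetical. It turns one downloaded page into Source,Destination pairs.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against the page URL."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(urljoin(self.page_url, value))

def extract_edges(page_url, html):
    """Return (source, destination) pairs for one already-downloaded page."""
    collector = LinkCollector(page_url)
    collector.feed(html)
    return [(page_url, link) for link in collector.links]

# Hypothetical page and markup, for illustration only.
sample_html = '<a href="/books">Books</a> <a href="music.html">Music</a> <a name="top">Top</a>'
edges = extract_edges("http://example.com/shop/", sample_html)
for source, destination in edges:
    print(f"{source},{destination}")
```

Resolving relative links with `urljoin` matters here, because the same destination page can be written in several different ways across a site.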
  
	
  
Cleansing & grooming of the output .csv file
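Output: csv files with the crawled inlinks

Origin, Destination
URL 1, URL 2
URL 2, URL 3
URL 1, URL 3
…
URL n, URL m

Clean and filter: best with bash one-liners

#!/bin/bash

FILE=      # tab-separated crawl export (left blank in the original slide)
DOMAIN=    # domain of the crawled site (left blank in the original slide)

# Keep columns 2 and 3 (source and destination), strip the site's own
# scheme and host, turn tabs into commas (\t needs GNU sed), then drop
# assets, external links, mail links and parameterized URLs.
cut -f2,3 "$FILE" |
sed -e "s#http://$DOMAIN##g" -e "s#http://www\.$DOMAIN##g" -e 's/\t/,/g' |
grep -Evi '\.jpg|http:|\.css|\.js|\.gif|\.png|@|mailto|xml|http|\?|=' \
> filtered.csv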
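For readers who would rather do the cleaning step in Python, the sketch below follows approximately the same idea as the shell filter: strip the site's own host, then drop assets, external links, mail links and parameterized URLs. It is not a line-for-line port; the regular expression is slightly stricter, and `example.com` and the sample rows are placeholders.

```python
import re

# Same idea as the grep filter: drop assets, mail links and parameterized URLs.
UNWANTED = re.compile(r"\.(jpg|css|js|gif|png|xml)\b|@|mailto|\?|=", re.IGNORECASE)

def clean_edges(rows, domain):
    """Yield (source, destination) paths for internal page-to-page links only."""
    prefixes = (f"http://www.{domain}", f"http://{domain}")
    for source, destination in rows:
        cleaned = []
        for url in (source, destination):
            for prefix in prefixes:
                if url.startswith(prefix):
                    url = url[len(prefix):] or "/"
                    break
            cleaned.append(url)
        # Anything still carrying a scheme points outside the site.
        if any(url.startswith("http") or UNWANTED.search(url) for url in cleaned):
            continue
        yield tuple(cleaned)

# Hypothetical crawl rows; example.com is a placeholder domain.
rows = [
    ("http://example.com/", "http://example.com/books"),
    ("http://www.example.com/books", "http://example.com/logo.png"),
    ("http://example.com/books", "http://other.org/page"),
    ("http://example.com/books", "http://example.com/search?q=seo"),
]
cleaned_rows = list(clean_edges(rows, "example.com"))
print(cleaned_rows)
```

Only the first row survives: the other three point to an image, an external site and a search URL with parameters.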
  
Visualization of a website or part of it
Gephi is an interactive visualization and exploration platform for all kinds of networks and complex systems, dynamic and hierarchical graphs.

It performs poorly with large graphs (tens of thousands of nodes and hundreds of thousands of inlinks).

Other promising tools:
KeyLines http://keylines.com/neo4j
Tulip http://tulip.labri.fr/TulipDrupal/
  
Example 1 - Graph of the website of an annual conference
The home (dark green node in the center) links down to categories (light green or light orange) like the program page, which in turn links down to item pages (dark orange) with the description of each talk, the bio of the speaker, etc.

This web architecture seems efficient, but item pages might be better connected to the whole graph.

The cluster on the right is the 1st edition of the event (few talks).
The cluster on the left is the 2nd edition of the event (more talks).
  
Example 2 - Graph of the website of a shopping website
The orange dots are products and the green balls are categories. Why do they ALL connect to each other? Aren't some products more relevant to users and to the business than others?

Some products get more traffic but yield less margin.

The optimal web architecture overweights the internal linking to the most popular products with the highest revenue or margin.

This looks like a programmatic linking scheme.

Ecommerce is usually more complex than it is represented here.
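A first check on whether internal linking favours the right products is to count how many pages link to each one. The Python sketch below does that for an invented edge list; `inlink_counts` is a hypothetical helper, and any comparison with revenue or margin would need the site's own business data, which is not shown here.

```python
from collections import Counter

def inlink_counts(edges):
    """Count how many distinct pages link to each destination."""
    return Counter(destination for _source, destination in set(edges))

# Hypothetical edge list: a homepage, two categories and two products.
edges = [
    ("/cat-a", "/product-1"),
    ("/cat-a", "/product-2"),
    ("/cat-b", "/product-1"),
    ("/", "/product-1"),
    ("/", "/cat-a"),
    ("/", "/cat-b"),
]
counts = inlink_counts(edges)
for page, count in counts.most_common(3):
    print(page, count)
```

If every product ends up with the same count, the site is probably linking everything to everything, which is the pattern questioned above.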
  
	
  	
  
Example 3 - Graphs of 2 directly competing websites
This looks like an organic network of clusters connecting other clusters and distant nodes with thin links.

This is a dense pack of many webpages connecting to many other webpages without discernible patterns or clusters.

These graphs are small samples of 2 large websites competing for the same keywords on Google.

Both websites are successful SEO propositions with radically different approaches. Why?
  
The power of weak links

Thin connections tend to link the clusters, allowing information to move between them.
Source: Giles, Jim. Making the links. Nature, Aug 23rd 2012

These networks are usually efficient enough in terms of SEO.
  
Analysis of the whole graph
igraph is a collection of network analysis tools.
It is available in R.

library(igraph)
# Choose an edge list in .csv file format (one source,destination pair per row)
dat <- read.csv(file.choose(), header = TRUE)
summary(dat)
# Build a directed graph from the edge list
g <- graph.data.frame(dat, directed = TRUE)
vcount(g)                                 # 200637 vertices
ecount(g)                                 # 4174400 edges

centralization.degree(g)$centralization   # 0.4998589
  
Analysis of the whole graph - parameters
transitivity(g)     # 0.001666909
graph.density(g)    # 0.0001036989

igraph calculates metrics of whole graphs with built-in functions.

Transitivity, or clustering coefficient, measures the probability that the adjacent vertices of a vertex are connected. This metric, along with graph density, is a useful reference for comparing websites with each other, or one website before and after changes in its web architecture.

graph      vertices   edges     diameter   transitivity   graph density
website1   8305       34185     30         0.007959       0.000499
website2   10852      88732     16         0.004671       0.000721
website3   11272      71035     20         0.004017       0.000639
website4   11593      47380     32         0.003730       0.001088
website5   200637     4174400   n/a        0.001667       0.000104

website5 has the lowest values of transitivity and density: increasing them would result in improved SEO.
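To show what these two metrics compute, here is a small Python sketch. The density function uses the usual formula for a directed graph without self-links, edges divided by V × (V − 1); applied to the vertex and edge counts of the largest graph it gives approximately 0.000104, in line with the reported density. The transitivity function assumes the global definition (closed triples over connected triples, link direction ignored), which to my knowledge is igraph's default; the toy graph is made up.

```python
from itertools import combinations

def directed_density(vertex_count, edge_count):
    """Edges present divided by edges possible, ignoring self-links."""
    return edge_count / (vertex_count * (vertex_count - 1))

def global_transitivity(edges):
    """Closed triples / connected triples, with link direction ignored."""
    neighbours = {}
    for a, b in edges:
        if a != b:
            neighbours.setdefault(a, set()).add(b)
            neighbours.setdefault(b, set()).add(a)
    closed = triples = 0
    for adjacent in neighbours.values():
        for u, w in combinations(adjacent, 2):
            triples += 1
            if w in neighbours[u]:
                closed += 1
    return closed / triples if triples else float("nan")

# Vertex and edge counts of the largest graph quoted in the slides.
density = directed_density(200637, 4174400)
print(density)
# Toy graph (made up): a triangle a-b-c plus one extra page d hanging off a.
toy_transitivity = global_transitivity([("a", "b"), ("b", "c"), ("c", "a"), ("a", "d")])
print(toy_transitivity)
```

In the toy graph, three of the five connected triples are closed by the triangle, so the transitivity is 0.6. A real website with a few hundred thousand pages is expected to be far sparser.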
Analysis of specific nodes
	
  
http://console.neo4j.org/

MATCH (n:Crew)-[r:LOVES*]-(m)
WHERE n.name='Neo'
RETURN n,m

n                        m
(0:Crew {name:"Neo"})    (2:Crew {name:"Trinity"})
  
	
  
Count the number of nodes connected to one node

MATCH (n { name: 'Neo' })-->(x)
RETURN n, count(*)

n                        count(*)
(0:Crew {name:"Neo"})    2

MATCH (n { name: 'Neo' })-->(x)
RETURN x

(2:Crew {name:"Trinity"})
(1:Crew {name:"Morpheus"})
MATCH (n:Crew)-[r:KNOWS*]-(m:Matrix)
WHERE n.name='Neo'
RETURN m

(3:Crew:Matrix {name:"Cypher"})
(4:Matrix {name:"Agent Smith"})

Find the shortest path between n and m of type :LOVES

MATCH p = shortestPath((n:Crew)-[:LOVES]->(m:Matrix))
WHERE n.name='Neo'
RETURN p AS Neo,m
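The same two questions, how many pages a node links to and what the shortest path between two nodes is, can be asked of a website graph outside a graph database. The Python sketch below uses a plain breadth-first search over an invented site structure; it illustrates the idea and is not a replacement for the Cypher queries.

```python
from collections import deque

def shortest_path(out_links, start, goal):
    """Return the shortest list of pages from start to goal, or None."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if page == goal:
            path = []
            while page is not None:
                path.append(page)
                page = previous[page]
            return path[::-1]
        for target in out_links.get(page, []):
            if target not in previous:
                previous[target] = page
                queue.append(target)
    return None

# Hypothetical site structure.
site = {
    "/": ["/books", "/music"],
    "/books": ["/books/graph-theory"],
    "/music": ["/music/jazz"],
    "/books/graph-theory": ["/music/jazz"],
}
print(len(site["/"]))  # number of pages linked directly from "/"
print(shortest_path(site, "/", "/music/jazz"))
```

Because links are directed, a path from the homepage to an item page does not imply a path back, so the function returns None when the goal cannot be reached.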
  
That’s all Folks!
Thank you.
Rubén Martínez
@ruben_at_it
rmartinez@paradigmatecnologico.com
  

Weitere ähnliche Inhalte

Andere mochten auch

Use Groovy&Grails in your spring boot projects
Use Groovy&Grails in your spring boot projectsUse Groovy&Grails in your spring boot projects
Use Groovy&Grails in your spring boot projectsParadigma Digital
 
Manuel Hurtado. Couchbase paradigma4oct
Manuel Hurtado. Couchbase paradigma4octManuel Hurtado. Couchbase paradigma4oct
Manuel Hurtado. Couchbase paradigma4octParadigma Digital
 
Programación Reactiva con RxJava
Programación Reactiva con RxJavaProgramación Reactiva con RxJava
Programación Reactiva con RxJavaParadigma Digital
 
Google Analytics for Developers
Google Analytics for DevelopersGoogle Analytics for Developers
Google Analytics for DevelopersParadigma Digital
 
¿Cómo vencer a los dragones digitales?
¿Cómo vencer a los dragones digitales?¿Cómo vencer a los dragones digitales?
¿Cómo vencer a los dragones digitales?Paradigma Digital
 
¿Cómo se despliega y autoescala Couchbase en Cloud? ¡Aprende de manera práctica!
¿Cómo se despliega y autoescala Couchbase en Cloud? ¡Aprende de manera práctica!¿Cómo se despliega y autoescala Couchbase en Cloud? ¡Aprende de manera práctica!
¿Cómo se despliega y autoescala Couchbase en Cloud? ¡Aprende de manera práctica!Paradigma Digital
 

Andere mochten auch (16)

ECMAScript 6
ECMAScript 6ECMAScript 6
ECMAScript 6
 
Use Groovy&Grails in your spring boot projects
Use Groovy&Grails in your spring boot projectsUse Groovy&Grails in your spring boot projects
Use Groovy&Grails in your spring boot projects
 
Kafka y python
Kafka y pythonKafka y python
Kafka y python
 
Cómo usar google analytics
Cómo usar google analyticsCómo usar google analytics
Cómo usar google analytics
 
Manuel Hurtado. Couchbase paradigma4oct
Manuel Hurtado. Couchbase paradigma4octManuel Hurtado. Couchbase paradigma4oct
Manuel Hurtado. Couchbase paradigma4oct
 
Overview atlas (1)
Overview atlas (1)Overview atlas (1)
Overview atlas (1)
 
Programación Reactiva con RxJava
Programación Reactiva con RxJavaProgramación Reactiva con RxJava
Programación Reactiva con RxJava
 
Google Analytics for Developers
Google Analytics for DevelopersGoogle Analytics for Developers
Google Analytics for Developers
 
Transformación Digital
Transformación DigitalTransformación Digital
Transformación Digital
 
Python y Flink
Python y FlinkPython y Flink
Python y Flink
 
¿Cómo vencer a los dragones digitales?
¿Cómo vencer a los dragones digitales?¿Cómo vencer a los dragones digitales?
¿Cómo vencer a los dragones digitales?
 
HTML5 Web Components
HTML5 Web ComponentsHTML5 Web Components
HTML5 Web Components
 
Introducción a Kubernetes
Introducción a KubernetesIntroducción a Kubernetes
Introducción a Kubernetes
 
Introducción a Django
Introducción a DjangoIntroducción a Django
Introducción a Django
 
¿Cómo se despliega y autoescala Couchbase en Cloud? ¡Aprende de manera práctica!
¿Cómo se despliega y autoescala Couchbase en Cloud? ¡Aprende de manera práctica!¿Cómo se despliega y autoescala Couchbase en Cloud? ¡Aprende de manera práctica!
¿Cómo se despliega y autoescala Couchbase en Cloud? ¡Aprende de manera práctica!
 
Cultura Digital Paradigma
Cultura Digital ParadigmaCultura Digital Paradigma
Cultura Digital Paradigma
 

Ähnlich wie Analysis of Websites as Graphs for SEO

Search engine page rank demystification
Search engine page rank demystificationSearch engine page rank demystification
Search engine page rank demystificationRaja R
 
IRJET- Page Ranking Algorithms – A Comparison
IRJET- Page Ranking Algorithms – A ComparisonIRJET- Page Ranking Algorithms – A Comparison
IRJET- Page Ranking Algorithms – A ComparisonIRJET Journal
 
Web2.0.2012 - lesson 8 - Google world
Web2.0.2012 - lesson 8 - Google worldWeb2.0.2012 - lesson 8 - Google world
Web2.0.2012 - lesson 8 - Google worldCarlo Vaccari
 
Smart Crawler: A Two Stage Crawler for Concept Based Semantic Search Engine.
Smart Crawler: A Two Stage Crawler for Concept Based Semantic Search Engine.Smart Crawler: A Two Stage Crawler for Concept Based Semantic Search Engine.
Smart Crawler: A Two Stage Crawler for Concept Based Semantic Search Engine.iosrjce
 
Web Search - Lecture 10 - Web Information Systems (4011474FNR)
Web Search - Lecture 10 - Web Information Systems (4011474FNR)Web Search - Lecture 10 - Web Information Systems (4011474FNR)
Web Search - Lecture 10 - Web Information Systems (4011474FNR)Beat Signer
 
Page rank by university of michagain.ppt
Page rank by university of michagain.pptPage rank by university of michagain.ppt
Page rank by university of michagain.pptrayyverma
 
Ranking algorithms
Ranking algorithmsRanking algorithms
Ranking algorithmsAnkit Raj
 
Data Structure Graph DMZ #DMZone
Data Structure Graph DMZ #DMZoneData Structure Graph DMZ #DMZone
Data Structure Graph DMZ #DMZoneDoug Needham
 

Ähnlich wie Analysis of Websites as Graphs for SEO (20)

I04015559
I04015559I04015559
I04015559
 
Page Rank Link Farm Detection
Page Rank Link Farm DetectionPage Rank Link Farm Detection
Page Rank Link Farm Detection
 
Search engine page rank demystification
Search engine page rank demystificationSearch engine page rank demystification
Search engine page rank demystification
 
IRJET- Page Ranking Algorithms – A Comparison
IRJET- Page Ranking Algorithms – A ComparisonIRJET- Page Ranking Algorithms – A Comparison
IRJET- Page Ranking Algorithms – A Comparison
 
Pagerank
PagerankPagerank
Pagerank
 
Macran
MacranMacran
Macran
 
TrustRank.PDF
TrustRank.PDFTrustRank.PDF
TrustRank.PDF
 
CSE509 Lecture 3
CSE509 Lecture 3CSE509 Lecture 3
CSE509 Lecture 3
 
Web2.0.2012 - lesson 8 - Google world
Web2.0.2012 - lesson 8 - Google worldWeb2.0.2012 - lesson 8 - Google world
Web2.0.2012 - lesson 8 - Google world
 
Sree saranya
Sree saranyaSree saranya
Sree saranya
 
Sree saranya
Sree saranyaSree saranya
Sree saranya
 
Smart Crawler: A Two Stage Crawler for Concept Based Semantic Search Engine.
Smart Crawler: A Two Stage Crawler for Concept Based Semantic Search Engine.Smart Crawler: A Two Stage Crawler for Concept Based Semantic Search Engine.
Smart Crawler: A Two Stage Crawler for Concept Based Semantic Search Engine.
 
E017624043
E017624043E017624043
E017624043
 
Web mining
Web miningWeb mining
Web mining
 
Pratical Deep Dive into the Semantic Web - #smconnect
Pratical Deep Dive into the Semantic Web - #smconnectPratical Deep Dive into the Semantic Web - #smconnect
Pratical Deep Dive into the Semantic Web - #smconnect
 
Ranking Web Pages
Ranking Web PagesRanking Web Pages
Ranking Web Pages
 
Web Search - Lecture 10 - Web Information Systems (4011474FNR)
Web Search - Lecture 10 - Web Information Systems (4011474FNR)Web Search - Lecture 10 - Web Information Systems (4011474FNR)
Web Search - Lecture 10 - Web Information Systems (4011474FNR)
 
Page rank by university of michagain.ppt
Page rank by university of michagain.pptPage rank by university of michagain.ppt
Page rank by university of michagain.ppt
 
Ranking algorithms
Ranking algorithmsRanking algorithms
Ranking algorithms
 
Data Structure Graph DMZ #DMZone
Data Structure Graph DMZ #DMZoneData Structure Graph DMZ #DMZone
Data Structure Graph DMZ #DMZone
 

Mehr von Paradigma Digital

Bots 3.0: Dejando atrás los bots conversacionales con Dialogflow.
Bots 3.0: Dejando atrás los bots conversacionales con Dialogflow.Bots 3.0: Dejando atrás los bots conversacionales con Dialogflow.
Bots 3.0: Dejando atrás los bots conversacionales con Dialogflow.Paradigma Digital
 
Java 8 time to join the future
Java 8  time to join the futureJava 8  time to join the future
Java 8 time to join the futureParadigma Digital
 
Programación Reactiva con Spring WebFlux
Programación Reactiva con Spring WebFluxProgramación Reactiva con Spring WebFlux
Programación Reactiva con Spring WebFluxParadigma Digital
 
Orquestando microservicios como lo hace Netflix
Orquestando microservicios como lo hace NetflixOrquestando microservicios como lo hace Netflix
Orquestando microservicios como lo hace NetflixParadigma Digital
 
Meetup microservicios: API Management
Meetup microservicios: API ManagementMeetup microservicios: API Management
Meetup microservicios: API ManagementParadigma Digital
 
Meetup de kubernetes, conceptos básicos.
Meetup  de kubernetes, conceptos básicos.Meetup  de kubernetes, conceptos básicos.
Meetup de kubernetes, conceptos básicos.Paradigma Digital
 
Docker, kubernetes, openshift y openstack, para mi abuela. techfest 2017.pptx
Docker, kubernetes, openshift y openstack, para mi abuela. techfest 2017.pptxDocker, kubernetes, openshift y openstack, para mi abuela. techfest 2017.pptx
Docker, kubernetes, openshift y openstack, para mi abuela. techfest 2017.pptxParadigma Digital
 
Implementando microservicios
Implementando microserviciosImplementando microservicios
Implementando microserviciosParadigma Digital
 
Equipo de Marketing de Paradigma Digital
Equipo de Marketing de Paradigma DigitalEquipo de Marketing de Paradigma Digital
Equipo de Marketing de Paradigma DigitalParadigma Digital
 

Mehr von Paradigma Digital (14)

Ddd + ah + microservicios
Ddd + ah + microserviciosDdd + ah + microservicios
Ddd + ah + microservicios
 
Bots 3.0: Dejando atrás los bots conversacionales con Dialogflow.
Bots 3.0: Dejando atrás los bots conversacionales con Dialogflow.Bots 3.0: Dejando atrás los bots conversacionales con Dialogflow.
Bots 3.0: Dejando atrás los bots conversacionales con Dialogflow.
 
Have you met Istio?
Have you met Istio?Have you met Istio?
Have you met Istio?
 
Linkerd a fondo
Linkerd a fondoLinkerd a fondo
Linkerd a fondo
 
Horneando apis
Horneando apisHorneando apis
Horneando apis
 
Java 8 time to join the future
Java 8  time to join the futureJava 8  time to join the future
Java 8 time to join the future
 
Programación Reactiva con Spring WebFlux
Programación Reactiva con Spring WebFluxProgramación Reactiva con Spring WebFlux
Programación Reactiva con Spring WebFlux
 
Orquestando microservicios como lo hace Netflix
Orquestando microservicios como lo hace NetflixOrquestando microservicios como lo hace Netflix
Orquestando microservicios como lo hace Netflix
 
Meetup microservicios: API Management
Meetup microservicios: API ManagementMeetup microservicios: API Management
Meetup microservicios: API Management
 
Meetup de kubernetes, conceptos básicos.
Meetup  de kubernetes, conceptos básicos.Meetup  de kubernetes, conceptos básicos.
Meetup de kubernetes, conceptos básicos.
 
Docker, kubernetes, openshift y openstack, para mi abuela. techfest 2017.pptx
Docker, kubernetes, openshift y openstack, para mi abuela. techfest 2017.pptxDocker, kubernetes, openshift y openstack, para mi abuela. techfest 2017.pptx
Docker, kubernetes, openshift y openstack, para mi abuela. techfest 2017.pptx
 
Implementando microservicios
Implementando microserviciosImplementando microservicios
Implementando microservicios
 
Equipo de Marketing de Paradigma Digital
Equipo de Marketing de Paradigma DigitalEquipo de Marketing de Paradigma Digital
Equipo de Marketing de Paradigma Digital
 
Seminario Apache Solr
Seminario Apache SolrSeminario Apache Solr
Seminario Apache Solr
 

Kürzlich hochgeladen

Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubaihf8803863
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiSuhani Kapoor
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts ServiceSapana Sha
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystSamantha Rae Coolbeth
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingNeil Barnes
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptSonatrach
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...Suhani Kapoor
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSAishani27
 

Kürzlich hochgeladen (20)

Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts Service
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data Analyst
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
Brighton SEO | April 2024 | Data Storytelling
Analysis of Websites as Graphs for SEO

  • 1. Analysis of Websites as Graphs for SEO. Rubén Martínez – June 2015 – Open Analytics Madrid
  • 2. Items (books, music, etc.) used to be arranged in tight silos by categories.
  • 3. There is more to websites than meets the eye. Has a website ever been this boring? We tend to think of websites as a homepage at the top followed by a second layer of child webpages (categories), a third level below (sub-categories) and pages of items (products, articles, etc.) at the bottom. Happily, reality is not so simple!
  • 4. First-ever website - 1990. Source: Tim Berners-Lee's web catalog at CERN. A copy is available at http://www.w3.org/History/19921103-hypertext/hypertext/WWW/TheProject.html Not even the first-ever website was a simple hierarchical tree of categories and sub-categories.
  • 5. Websites are graphs. Graph theory: a graph is an ordered pair G = (V, E) comprising a set V of vertices or nodes together with a set E of edges or links. Websites: websites are graphs whose webpages are nodes and whose links are directed edges. Actual websites are a more organic, messy business. [Figure: visualization of a 300-page ecommerce website]
  • 6. Link analysis in graph theory. PageRank is a link analysis algorithm. It outputs a probability distribution that represents the likelihood that a person clicking on links will arrive at any particular page. It assigns a numerical weighting to each element of a hyperlinked set of documents, such as the World Wide Web, with the purpose of "measuring" its relative importance within the set. [Figure: Google's reasonable surfer model, which weights hyperlinks by their position on the page]
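
A minimal sketch of the idea on this slide, assuming the textbook power-iteration form of PageRank with a damping factor of 0.85 (the slides do not say which variant or parameters were used). The function name `pagerank` and the toy edge list are hypothetical and only serve the illustration.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over (source, destination) link pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    # Start from a uniform probability distribution over all pages.
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Rank held by pages without outlinks is spread over every page.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1 - damping + damping * dangling) / len(nodes)
        new_rank = {node: base for node in nodes}
        # Each page shares its rank equally among the pages it links to.
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Hypothetical four-link toy site, not taken from the slides.
edges = [("home", "category"), ("category", "item"),
         ("item", "home"), ("home", "item")]
ranks = pagerank(edges)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{page}: {score:.3f}")
```

The scores sum to 1, which is what makes them readable as the probability of a random clicker landing on each page.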
  • 7. Optimization of PageRank in websites. PageRank is diluted with every level down the structure of categories and sub-categories. [Figure labels: "This is a waste of expensive PageRank" and "Same information on a leaner, more efficient web architecture"] PageRank is not as important in SEO as it used to be, but it is still useful for optimising web architectures. On-page SEO is mostly about analysing graphs, measuring them and optimising them empirically and iteratively.
  • 8. Steps of the analysis of websites: crawling a website; cleaning the output csv file of inlinks (Source,Destination); visualizing the graph; analysing the relations of specific nodes; parameterizing the whole graph. SEO experts are usually presented with inefficient websites that require rationalization and, more often than not, extensive re-indexation on Google. Understanding and parameterizing the graph of a website before and after radical changes of its structure is key. We build a comma-separated value file with pairs of URLs linking to other URLs. The csv file contains the data of the connected graph that can be visualized, parameterized and analysed.
  • 9. Crawling and exporting a csv file of inlinks. 1st step: crawl a significant sample of the webpages of a website. Desktop applications: Screaming Frog (fee per licence, all OS); Xenu Link Sleuth (free, Windows). Bash scripts using command-line tools (beware: poorly written scripts might not be polite): cURL; Wget. 2nd step (only if you have to get specific snippets of text from the crawled pages): scrape with Scrapy in Python, installed with "$ pip install scrapy". 3rd step (only if you have to get specific URLs linked from the scraped text): extract data with Beautiful Soup, a Python library for pulling data out of HTML and XML files.
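
As a rough illustration of the extraction step, the sketch below turns the HTML of one crawled page into (source, destination) pairs. It deliberately uses Python's standard-library `html.parser` rather than Beautiful Soup or Scrapy so that it runs without extra installs; the class name `LinkExtractor`, the helper `inlink_pairs` and the example.com URLs are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the absolute URL of every <a href="..."> in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, href))


def inlink_pairs(page_url, html):
    """Return one (source, destination) pair per link found in the page."""
    parser = LinkExtractor(page_url)
    parser.feed(html)
    return [(page_url, link) for link in parser.links]


# Hypothetical page content, standing in for a crawled response body.
html = '<a href="/books">Books</a> <a href="http://example.com/music">Music</a>'
pairs = inlink_pairs("http://example.com/", html)
for source, destination in pairs:
    print(f"{source},{destination}")
```

Fetching the pages themselves is left out on purpose: a polite crawler also needs rate limiting and robots.txt handling, which the desktop tools listed on the slide already take care of.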
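  • 10. Cleansing & grooming of the output .csv file. Output: csv files with the crawled inlinks, one pair per line:
        Origin, Destination
        URL 1, URL 2
        URL 2, URL 3
        URL 1, URL 3
        …
        URL n, URL m
    Clean and filter: best with bash one-liners.
        #!/bin/bash
        # FILE and DOMAIN are left blank on the slide; set them before running.
        FILE=
        DOMAIN=
        # Keep columns 2 and 3 of the tab-separated export, strip the site's own
        # domain, turn tabs into commas, then drop assets, external links,
        # e-mail links and URLs with query strings.
        cut -f2,3 "$FILE" |
          sed -e "s#http://$DOMAIN##g" -e "s#http://www\.$DOMAIN##g" -e 's/\t/,/g' |
          grep -viE '\.jpg|http:|\.css|\.js|\.gif|\.png|@|mailto|xml|http|\?|=' > filtered.csv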
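
The same cleaning logic can be sketched in Python for readers who prefer it to a shell pipeline. This is an approximation of the filter on the slide, not the author's script: the function name `clean_inlinks`, the `EXCLUDE` pattern name and the example.com rows are assumptions made for the illustration.

```python
import re

# Same exclusions as the grep filter: assets, e-mail links, XML files,
# anything still carrying "http" (external links) and query strings.
EXCLUDE = re.compile(r"\.jpg|\.css|\.js|\.gif|\.png|@|mailto|xml|http|\?|=",
                     re.IGNORECASE)


def clean_inlinks(pairs, domain):
    """Strip the site's own domain and drop pairs matching EXCLUDE."""
    prefix = re.compile(r"http://(www\.)?" + re.escape(domain))
    cleaned = []
    for source, destination in pairs:
        source = prefix.sub("", source)
        destination = prefix.sub("", destination)
        if EXCLUDE.search(source) or EXCLUDE.search(destination):
            continue
        cleaned.append((source, destination))
    return cleaned


# Hypothetical crawler output.
pairs = [
    ("http://www.example.com/", "http://www.example.com/books"),
    ("http://example.com/books", "http://example.com/logo.png"),
    ("http://www.example.com/books", "http://other.org/page"),
    ("http://www.example.com/books", "http://www.example.com/search?q=seo"),
]
cleaned = clean_inlinks(pairs, "example.com")
print(cleaned)
```

Only the first pair survives here: the other three point to an image, an external site and a URL with a query string.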
  • 11. Visualization of a website or part of it. Gephi is an interactive visualization and exploration platform for all kinds of networks and complex systems, dynamic and hierarchical graphs. It performs poorly with large graphs (tens of thousands of nodes and hundreds of thousands of inlinks). Other promising tools: KeyLines http://keylines.com/neo4j and Tulip http://tulip.labri.fr/TulipDrupal/
  • 12. Example 1 - Graph of the website of an annual conference. The home page (dark green node in the center) links down to categories (light green or light orange) such as the program page, which in turn links down to item pages (dark orange) with a description of each talk, the speaker's bio, etc. This web architecture seems efficient, but item pages might be better connected to the whole graph. The cluster on the right is the 1st edition of the event (few talks). The cluster on the left is the 2nd edition of the event (more talks).
  • 13. Example 2 - Graph of the website of a shopping website. The orange dots are products and the green balls are categories. Why do they ALL connect to each other? Aren't there products more relevant to users and to the business than others? Some products get more traffic but yield less margin. The optimal web architecture overweights the internal linking to the most popular products with the highest revenue or margin. This looks like a programmatic linking scheme. Ecommerce is usually more complex than it is represented here.
  • 14. Example 3 - Graphs of 2 directly competing websites. One graph looks like an organic network of clusters connecting other clusters and distant nodes with thin links. The other is a dense pack of many webpages connecting to many other webpages without discernible patterns or clusters. These graphs are small samples of 2 large websites competing for the same keywords on Google. Both websites are successful SEO propositions with radically different approaches. Why?
  • 15. The power of weak links. Thin connections tend to link the clusters, allowing information to move between them. These networks are usually efficient enough in terms of SEO. Source: Giles, Jim. Making the links. Nature - Aug 23rd 2012.
  • 16. Analysis of the whole graph. igraph is a collection of network analysis tools. It is available in R.
        library(igraph)
        dat = read.csv(file.choose(), header = TRUE)  # choose an edge list in .csv file format
        summary(dat)
        g = graph.data.frame(dat, directed = TRUE)
        vcount(g)                 # 200637
        ecount(g)                 # 4174400
        centralization.degree(g)  # centralization: 0.4998589
  • 17. Analysis of the whole graph - parameters. igraph calculates metrics of whole graphs with built-in functions, for example transitivity(g), which returns 0.001666909, and graph.density(g), which returns 0.0001036989. Transitivity, or clustering coefficient, measures the probability that the adjacent vertices of a vertex are connected. This metric, along with graph density, is a useful reference for comparing websites with each other, or one website before and after changes in its web architecture. website5 has the lowest values of transitivity and density: increasing them would result in improved SEO.
        graph      vertices  edges    diameter  transitivity  graph density
        website1   8305      34185    30        0.007959      0.000499
        website2   10852     88732    16        0.004671      0.000721
        website3   11272     71035    20        0.004017      0.000639
        website4   11593     47380    32        0.003730      0.001088
        website5   200637    4174400  n/a       0.001667      0.000104
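
To make the two metrics concrete, here is a small pure-Python sketch of how they can be computed from an edge list. It assumes the usual definitions: density of a directed graph as edges divided by n(n-1), and global transitivity as closed triples over connected triples with edge direction ignored. Duplicate edges and self-loops are ignored, so results on real crawls may differ slightly from igraph's. The function names and the toy edge list are hypothetical.

```python
from itertools import combinations


def graph_density(edges):
    """Directed density: distinct edges over n * (n - 1) possible edges."""
    nodes = {node for edge in edges for node in edge}
    n = len(nodes)
    if n < 2:
        return 0.0
    distinct = {(a, b) for a, b in edges if a != b}
    return len(distinct) / (n * (n - 1))


def transitivity(edges):
    """Global clustering coefficient, treating links as undirected."""
    neighbours = {}
    for a, b in edges:
        if a == b:
            continue
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    closed = triples = 0
    for adjacent in neighbours.values():
        # Every pair of neighbours of a vertex forms a connected triple;
        # the triple is closed when the two neighbours are also linked.
        for u, v in combinations(adjacent, 2):
            triples += 1
            if v in neighbours[u]:
                closed += 1
    return closed / triples if triples else 0.0


# Hypothetical toy graph: one triangle (a, b, c) plus a pendant page d.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
print(round(graph_density(edges), 4))
print(transitivity(edges))
```

In this toy graph 4 of the 12 possible directed links exist, and 3 of the 5 connected triples are closed by the single triangle.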
  • 18. Analysis of specific nodes. Console: http://console.neo4j.org/
        MATCH (n:Crew)-[r:LOVES*]-(m)
        WHERE n.name='Neo'
        RETURN n,m
    Result: n = (0:Crew {name:"Neo"}), m = (2:Crew {name:"Trinity"})
  • 19. Analysis of specific nodes. Count the number of nodes connected to one node:
        MATCH (n { name: 'Neo' })-->(x)
        RETURN n, count(*)
    Result: n = (0:Crew {name:"Neo"}), count(*) = 2
        MATCH (n { name: 'Neo' })-->(x)
        RETURN x
    Result: (2:Crew {name:"Trinity"}), (1:Crew {name:"Morpheus"})
  • 20. Analysis of specific nodes.
        MATCH (n:Crew)-[r:KNOWS*]-(m:Matrix) WHERE n.name='Neo' RETURN m
    Result: (3:Crew:Matrix {name:"Cypher"}), (4:Matrix {name:"Agent Smith"})
    Find the shortest path between n and m of type :LOVES:
        MATCH p = shortestPath((n:Crew)-[:LOVES]->(m:Matrix))
        WHERE n.name='Neo'
        RETURN p AS Neo,m
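
For a website's inlinks file, the shortest-path question becomes "how many clicks from page A to page B?". The sketch below answers it with a plain breadth-first search over (source, destination) pairs; it is an illustration of the concept and not how Neo4j implements shortestPath. The function name `shortest_path` and the URLs are hypothetical.

```python
from collections import deque


def shortest_path(edges, start, goal):
    """Breadth-first search following links in their own direction."""
    out_links = {}
    for src, dst in edges:
        out_links.setdefault(src, []).append(dst)

    previous = {start: None}  # page -> page it was first reached from
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if page == goal:
            # Walk back to the start to rebuild the click path.
            path = []
            while page is not None:
                path.append(page)
                page = previous[page]
            return path[::-1]
        for linked in out_links.get(page, []):
            if linked not in previous:
                previous[linked] = page
                queue.append(linked)
    return None  # goal is not reachable from start


# Hypothetical inlinks of a small shop.
edges = [
    ("/", "/books"),
    ("/", "/music"),
    ("/books", "/books/graph-theory"),
    ("/music", "/books/graph-theory"),
    ("/books/graph-theory", "/checkout"),
]
path = shortest_path(edges, "/", "/checkout")
print(" -> ".join(path))
```

The number of clicks is the path length minus one, which gives a per-page view of the depth problem described in the PageRank slides.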
  • 21. That's all Folks! Thank you. Rubén Martínez, @ruben_at_it, rmartinez@paradigmatecnologico.com