The PageRank Citation Ranking: Bringing Order to the Web
Larry Page et al., Stanford University
Presented by Guoqiang Su & Wei Li
Contents
– Motivation
– Related work
– PageRank & Random Surfer Model
– Implementation
– Application
– Conclusion
Motivation
– No quality control on the web
– Commercial interest in manipulating rankings
Backlink
– Link structure of the web
– Approximation of importance / quality
PageRank
– Pages with lots of backlinks are important
– Backlinks coming from important pages convey more importance to a page
– Problem: rank sink
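The two observations above correspond to the simplified ranking in the Page et al. paper. The notation below is taken from that paper rather than from the slide text, so treat it as an assumption about what the slide displayed: $B_u$ is the set of pages linking to $u$, $N_v$ is the number of links out of $v$, and $c$ is a normalization constant.

```latex
R(u) = c \sum_{v \in B_u} \frac{R(v)}{N_v}
```

Each page $v$ splits its rank evenly over its $N_v$ outgoing links, so a backlink from a highly ranked page with few links is worth more than one from a low-ranked page with many.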
Rank Sink
– A cycle of pages pointed to by some incoming link
– Problem: the loop accumulates rank but never distributes any rank outside
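A minimal sketch of the rank sink effect, assuming the simplified update in which every page splits its rank evenly over its out-links. The 4-page graph and the function name `simple_step` are hypothetical and chosen only for illustration.

```python
def simple_step(ranks, out_links):
    """One simplified update: each page splits its rank evenly over its out-links."""
    new_ranks = [0.0] * len(ranks)
    for page, targets in out_links.items():
        share = ranks[page] / len(targets)
        for target in targets:
            new_ranks[target] += share
    return new_ranks

# Hypothetical graph: pages 0 and 1 link to each other,
# page 1 also links into the loop formed by pages 2 and 3.
out_links = {0: [1], 1: [0, 2], 2: [3], 3: [2]}

ranks = [0.25] * 4
for _ in range(50):
    ranks = simple_step(ranks, out_links)

# Rank that has drained into the loop {2, 3} and cannot leave it.
loop_mass = ranks[2] + ranks[3]
print(f"rank held by the loop: {loop_mass:.6f}")
```

Total rank is conserved here because every page has at least one out-link, so whatever the loop gains is taken from pages 0 and 1.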
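Escape Term
– Solution: rank source
– $R(u) = c \sum_{v \in B_u} \frac{R(v)}{N_v} + c\,E(u)$, where $B_u$ is the set of pages linking to u and $N_v$ is the number of links out of v
– c is maximized and $\|R\|_1 = 1$
– E(u) is some vector over the web pages: uniform, favorite page, etc.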
Matrix Notation
– R is the dominant eigenvector and c is the dominant eigenvalue of $(A + E \times \mathbf{1})$, where $\mathbf{1}$ is the all-ones vector, because c is maximized
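The step from the per-page formula to the eigenvector statement can be written out as follows. This is a sketch under assumptions not stated on the slide: $A$ is the matrix with $(AR)(u) = \sum_{v \in B_u} R(v)/N_v$, $R$ is non-negative with $\|R\|_1 = 1$, and $E\,\mathbf{1}^{\top}$ is the rank-one matrix the paper writes as $E \times \mathbf{1}$.

```latex
\begin{align*}
R &= c\,(A R + E) \\
  &= c\,(A R + E\,\mathbf{1}^{\top} R)
     && \text{since } \mathbf{1}^{\top} R = \|R\|_1 = 1 \\
  &= c\,(A + E\,\mathbf{1}^{\top})\,R
\end{align*}
```

Read literally, the last line gives $(A + E\,\mathbf{1}^{\top})R = \tfrac{1}{c}R$, so the eigenvalue attached to $R$ is $1/c$; the slide, like the paper, speaks loosely of $c$ as the eigenvalue.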
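Computing PageRank
$R_0 \leftarrow S$ (initialize vector over web pages)
loop:
    $R_{i+1} \leftarrow A R_i$ (new ranks: sum of normalized backlink ranks)
    $d \leftarrow \|R_i\|_1 - \|R_{i+1}\|_1$ (compute normalizing factor)
    $R_{i+1} \leftarrow R_{i+1} + d E$ (add escape term)
    $\delta \leftarrow \|R_{i+1} - R_i\|_1$ (control parameter)
while $\delta > \epsilon$ (stop when converged)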
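The annotated loop above can be sketched in Python roughly as follows. Several things here are assumptions rather than part of the slide: the `damping` factor of 0.85 applied to link-following (without it, a graph with no dangling pages loses no rank and the escape term would add nothing), the choice S = E, the normalization of E to sum to 1, and the small example graph.

```python
def pagerank(out_links, n, source=None, damping=0.85, epsilon=1e-12, max_iter=1000):
    """Power iteration following the annotated loop, with an assumed damping factor.

    out_links maps a page ID to the list of page IDs it links to.
    source is the rank source E (uniform if omitted); it is normalized to sum to 1.
    """
    if source is None:
        source = [1.0] * n
    total = sum(source)
    source = [e / total for e in source]
    ranks = list(source)                                  # R_0 <- S (assumed S = E)
    for _ in range(max_iter):
        new_ranks = [0.0] * n
        for page, targets in out_links.items():           # R_{i+1} <- A R_i, damped
            if targets:
                share = damping * ranks[page] / len(targets)
                for target in targets:
                    new_ranks[target] += share
        d = sum(ranks) - sum(new_ranks)                    # rank lost in this step
        new_ranks = [r + d * e for r, e in zip(new_ranks, source)]  # add escape term
        delta = sum(abs(a - b) for a, b in zip(new_ranks, ranks))   # control parameter
        ranks = new_ranks
        if delta <= epsilon:                               # stop when converged
            break
    return ranks

# Hypothetical 4-page graph with a rank sink (pages 2 and 3 link only to each other).
graph = {0: [1], 1: [0, 2], 2: [3], 3: [2]}
ranks = pagerank(graph, n=4)
print([round(r, 4) for r in ranks])
```

With the escape term in place, pages 0 and 1 keep a positive rank even though the loop formed by pages 2 and 3 still ranks highest.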
Random Surfer Model
– PageRank corresponds to the probability distribution of a random walk on the web graph
– E(u) can be rephrased as follows: the random surfer periodically gets bored and jumps to a different page, so the surfer is not kept in a loop forever
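A rough simulation of the random surfer, to illustrate that visit frequencies approximate a rank distribution. The `boredom` probability of 0.15, the uniform jump target, the fixed seed, and the example graph are all assumptions made for this sketch.

```python
import random

def random_surfer(out_links, n, steps=100_000, boredom=0.15, seed=0):
    """Estimate page visit frequencies for a surfer who sometimes jumps at random."""
    rng = random.Random(seed)
    visits = [0] * n
    page = rng.randrange(n)
    for _ in range(steps):
        visits[page] += 1
        targets = out_links.get(page, [])
        if not targets or rng.random() < boredom:
            page = rng.randrange(n)        # bored (or stuck): jump to a random page
        else:
            page = rng.choice(targets)     # otherwise follow a random out-link
    return [v / steps for v in visits]

# Hypothetical graph: pages 2 and 3 form a loop reachable from page 1.
graph = {0: [1], 1: [0, 2], 2: [3], 3: [2]}
freq = random_surfer(graph, n=4)
print([round(f, 3) for f in freq])
```

Because the surfer occasionally jumps away, the loop between pages 2 and 3 collects most of the visits but not all of them.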
Implementation
– Computing resources
  – 24 million pages
  – 75 million URLs
– Memory and disk storage
  – Weight vector (4-byte float)
  – Matrix A (linear access)
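As a back-of-the-envelope check of the storage figures above, one 4-byte weight per URL gives the size of a single rank vector. The use of Python's `array` type code `"f"` (a C float, 4 bytes on common platforms) is an illustrative assumption, not the authors' storage format.

```python
from array import array

NUM_URLS = 75_000_000                        # URL count quoted on the slide

# 'f' is a C float; a small sample is enough to read off the item size.
weights_sample = array("f", [0.0] * 1000)
bytes_per_weight = weights_sample.itemsize

vector_megabytes = NUM_URLS * bytes_per_weight / 10**6
print(f"{vector_megabytes:.0f} MB for one weight vector")
```

A vector of this size is small enough to hold in memory, while the much larger link matrix A only needs to be read sequentially from disk on each iteration.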
Implementation (cont'd)
– Unique integer ID for each URL
– Sort and remove dangling links
– Initial rank assignment
– Iteration until convergence
– Add back dangling links and recompute
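The first two preprocessing steps can be sketched as below. The function names `assign_ids` and `remove_dangling`, the pass limit, and the example URLs are hypothetical; here a dangling link is taken to mean a link whose target has no outgoing links.

```python
def assign_ids(edges):
    """Map each URL to a unique integer ID in order of first appearance."""
    ids = {}
    for src, dst in edges:
        for url in (src, dst):
            ids.setdefault(url, len(ids))
    return ids

def remove_dangling(edges, passes=5):
    """Repeatedly drop links whose target has no outgoing links."""
    edges = set(edges)
    for _ in range(passes):
        has_out = {src for src, _ in edges}
        kept = {(s, d) for s, d in edges if d in has_out}
        if kept == edges:          # nothing left to remove
            break
        edges = kept
    return edges

# Hypothetical URLs for illustration only.
links = [("a.example", "b.example"), ("b.example", "a.example"),
         ("b.example", "c.example"), ("c.example", "d.example")]
url_ids = assign_ids(links)
core = remove_dangling([(url_ids[s], url_ids[d]) for s, d in links])
print(sorted(core))  # sorted by source ID, then target ID
```

Removing one dangling link can expose another (here the link to c.example dangles only after the link to d.example is gone), which is why the removal runs for several passes.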
Convergence Properties
– Graph (V, E) is an expander with factor $\alpha$ if for all (not too large) subsets S: $|A_S| \ge \alpha |S|$, where $A_S$ is the neighborhood of S
– Eigenvalue separation: the largest eigenvalue is sufficiently larger than the second-largest eigenvalue
– A random walk converges fast to a limiting probability distribution on the set of nodes in the graph
Convergence Properties (cont'd)
– PageRank computation converges in O(log |V|) iterations because the web graph G is rapidly mixing
Personalized PageRank
Rank source E can be initialized:
– uniformly over all pages: e.g. copyright warnings, disclaimers, mailing list archives → result in overly high ranking
– total weight on a single page, e.g. Netscape, McCarthy → great variation of ranks under different single pages as rank source
– and everything in between, e.g. server root pages → allow manipulation by commercial interests
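To make the effect of the rank source concrete, the sketch below compares a uniform E with an E that puts all weight on one page. The damping factor of 0.85, the fixed iteration count, and the 4-page graph are assumptions for illustration.

```python
def pagerank(out_links, source, damping=0.85, iterations=100):
    """Damped power iteration; source is the rank source E, normalized to sum to 1."""
    total = sum(source)
    e = [x / total for x in source]
    ranks = list(e)
    for _ in range(iterations):
        new = [0.0] * len(e)
        for page, targets in out_links.items():
            for target in targets:
                new[target] += damping * ranks[page] / len(targets)
        d = sum(ranks) - sum(new)                  # rank to hand back through E
        ranks = [r + d * x for r, x in zip(new, e)]
    return ranks

# Hypothetical graph: page 3 has no backlinks at all.
graph = {0: [1, 2], 1: [2], 2: [0], 3: [2]}

uniform = pagerank(graph, [1, 1, 1, 1])            # E spread over all pages
personal = pagerank(graph, [0, 0, 0, 1])           # all of E on page 3

print("uniform :", [round(r, 4) for r in uniform])
print("personal:", [round(r, 4) for r in personal])
```

Page 3 receives rank only through E, so concentrating E on it raises its rank fourfold and shifts the ranks of the pages it links to.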
Applications I
– Estimate web traffic
  – Server/page aliases
  – Link/traffic disparity, e.g. porn sites, free web-mail
– Backlink predictor
  – Citation counts have been used to predict future citations
  – Very difficult to map the citation structure of the web completely
  – Avoids the local maxima that citation counts get stuck in and gives better performance
Applications II - Ranking Proxy
– Surfer's navigation aid
– Annotating links by PageRank (bar graph)
– Not query dependent
Issues
– Users are not random walkers
  – Content-based methods
– Starting point distribution
  – Actual usage data as starting vector
– Reinforcing effects / bias towards main pages
– How about traffic to ranking pages?
– No query-specific rank
– Linkage spam
  – PageRank favors pages that managed to get other pages to link to them
  – Linkage is not necessarily a sign of relevancy, only of promotion (advertisement…)
Evaluation I
Evaluation II

Conclusion
– PageRank is a global ranking based on the web's graph structure
– PageRank uses backlink information to bring order to the web
– PageRank can separate out representative pages as cluster centers
– A great variety of applications