page ranking algorithm

“PAGE RANKING”
ALGORITHM
INTRODUCTION
• Finding useful information on the World Wide Web is something many of us take for
granted. According to the Internet research firm Netcraft, there are nearly 150,000,000
active Web sites on the Internet today.
• Google's algorithm does the work for you by searching out Web pages that contain
the keywords you used to search, then assigning a rank to each page based on several
factors, including how many times the keywords appear on the page. Higher-ranked
pages appear further up in Google's search engine results page (SERP), meaning that
the best links relating to your search query are theoretically the first ones Google lists.
• Automated programs called spiders or crawlers travel the Web, moving from link to link
and building up an index page that includes certain keywords. Google references this
index when a user enters a search query. The search engine lists the pages that contain
the same keywords that were in the user's search terms.
• Like other search engines, Google has a large index of keywords and where those words can be found.
What sets Google apart is how it ranks search results, which in turn determines the order in which Google displays results
on its search engine results page (SERP). Google uses a trademarked algorithm called PageRank, which assigns
each Web page a relevancy score.
• Keyword placement plays a part in how Google finds sites. Google looks for keywords throughout each Web
page, but some sections are more important than others. Including the keyword in the Web page's title is a
good idea, for example. Google also searches for keywords in headings.
How does Google decide which pages are selected and which are left out?
It does this by asking questions, about 200 of them. A few important ones are:
i. How many times does the keyword appear in the page, i.e. what is the
frequency of the word in the page?
ii. Do the words appear in the title, the URL or a meta tag, and are they directly adjacent to each other?
iii. Does the page include synonyms of the keywords?
iv. Is the page from a high-quality or a low-quality website?
v. What is the page's PageRank?
PAGERANKING ALGORITHM
• Google’s PageRank algorithm has become one of the most famous in
computer science. It was originally designed to rank websites according
to their importance by assuming that a site is important if it is linked to by
other important sites. It follows a real-life philosophy:
“How does a product or an individual get popular? When people other
than the individual know about that individual or product.”
In the same way, a web page gains rank when other web pages
link to it.
• The algorithm works by counting the links to a website and the
importance of the sites these come from. It then uses this to work out the
importance of the original site. Through a process of iteration, the
algorithm comes up with a ranking.
• PageRank assigns a rank or score to every search result. The higher the page's
score, the further up the search results list it will appear.
• Scores are partially determined by the number of other Web pages that link to
the target page. Each link is counted as a vote for the target. The logic behind
this is that pages with high-quality content will be linked to more often than
mediocre pages.
• Not all votes are equal. Votes from a high-ranking Web page count more than
votes from low-ranking sites. You can't really boost one Web page's rank by
making a bunch of empty Web sites linking back to the target page.
• The more links a Web page sends out, the more diluted its voting power
becomes. In other words, if a high-ranking page links to hundreds of other pages,
each individual vote won't count as much as it would if the page only linked to a
few sites.
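The dilution of voting power can be sketched in a few lines of Python. This is only an illustration, assuming a page's vote is split evenly across its outgoing links (the PR(T)/C(T) term of the PageRank formula); the function name `vote_value` and the sample numbers are made up for the example.

```python
def vote_value(page_rank: float, outgoing_links: int) -> float:
    """Share of a page's PageRank passed along each outgoing link.

    Assumption for this sketch: the vote is split evenly across links.
    """
    if outgoing_links <= 0:
        raise ValueError("a page needs at least one outgoing link to vote")
    return page_rank / outgoing_links


# The same (hypothetical) rank of 6.0 spread over few or many links.
few_links = vote_value(6.0, 3)
many_links = vote_value(6.0, 300)
print(few_links, many_links)
```

The page with 3 links passes on far more per link than the page with 300 links, even though both start with the same rank.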
• Other factors that might affect scoring include how long the site has been
around, the strength of the domain name, how and where the keywords appear
on the site, and the age of the links going to and from the site. Google tends to
place more value on sites that have been around for a while.
A Web page's PageRank depends on a few factors:
• The frequency and location of keywords within the Web page: If the
keyword only appears once within the body of a page, it will receive
a low score for that keyword.
• How long the Web page has existed: People create new Web pages
every day, and not all of them stick around for long. Google places
more value on pages with an established history.
• The number of other Web pages that link to the page in question:
Google looks at how many Web pages link to a particular site to
determine its relevance.
• Out of these three factors, the third is the most important. It's easier to
understand it with an example.
• Let's look at a search for the terms "Planet Earth".
• As more Web pages link to Discovery's Planet Earth page, the Discovery page's
rank increases. When Discovery's page ranks higher than other pages, it shows
up at the top of the Google search results page.
PageRank description
We assume page A has pages T1...Tn which point to it (i.e. link to it).
C(A) is defined as the number of links going out of page A.
The parameter d is a damping factor which can be set between 0 and 1. It is usually set
to 0.85.
The PageRank theory holds that an imaginary surfer who is randomly clicking on links will
eventually stop clicking.
The probability, at any step, that the surfer will continue clicking is the damping factor d.
Various studies have tested different damping factors, but it is generally assumed that the
damping factor will be set around 0.85.
The PageRank of a page A is given as follows:
PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
The Google paper states that the PageRanks form a probability distribution over web pages,
“so the sum of all web pages' PageRanks will be one”. With the formula exactly as written
above, and every page having at least one outgoing link, the values instead sum to the number
of pages N, so the average PageRank is 1.0. Dividing the (1-d) term by N gives values that sum to one.
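As a minimal sketch, the formula for a single page can be written as a small Python function. The name `page_rank_of` and the representation of backlinks as (PR(T), C(T)) pairs are assumptions made for this illustration, not part of the original description.

```python
def page_rank_of(backlinks, d=0.85):
    """Apply PR(A) = (1-d) + d * (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)).

    backlinks: list of (PR(T), C(T)) pairs, one per page T linking to A.
    d: damping factor, 0.85 by default as in the text.
    """
    return (1 - d) + d * sum(pr / c for pr, c in backlinks)


# Hypothetical page A with two backlinks: T1 has PR 1.0 and 2 outgoing
# links, T2 has PR 1.0 and 4 outgoing links.
print(page_rank_of([(1.0, 2), (1.0, 4)]))

# A page with no backlinks is left with only the (1-d) term.
print(page_rank_of([]))
```

The function only evaluates the formula once; it assumes the PageRanks of the linking pages are already known.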
How is PageRank Calculated?
• The PR of each page depends on the PR of the pages pointing to it. But
we won’t know what PR those pages have until the pages pointing
to them have their PR calculated, and so on… And when you consider that
page links can form circles, it seems impossible to do this calculation!
• The Google paper says:
PageRank or PR(A) can be calculated using a simple iterative algorithm,
and corresponds to the principal eigenvector of the normalized link matrix of
the web.
What that means to us is that we can just go ahead and calculate a page’s
PR without knowing the final value of the PR of the other pages. That seems
strange but, basically, each time we run the calculation we’re getting a
closer estimate of the final value. So all we need to do is remember
each value we calculate and repeat the calculations lots of times until the
numbers stop changing much.
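The "repeat until the numbers stop changing much" idea can be sketched as follows. This is an illustrative implementation, not Google's: the function name `pagerank`, the dictionary-based graph, the starting guess of 1.0 and the stopping tolerance are all assumptions. It also assumes every linked page appears as a key in the dictionary.

```python
def pagerank(links, d=0.85, tol=1e-9, max_iter=1000):
    """Iterate PR(A) = (1-d) + d * sum(PR(T)/C(T)) until values settle.

    links: dict mapping each page to the list of pages it links to.
    """
    pages = list(links)
    pr = {p: 1.0 for p in pages}          # initial guess for every page
    for _ in range(max_iter):
        new = {}
        for p in pages:
            # Sum PR(T)/C(T) over every page T that links to p.
            incoming = sum(pr[q] / len(links[q])
                           for q in pages if p in links[q])
            new[p] = (1 - d) + d * incoming
        # Stop once no page changed by more than the tolerance.
        if max(abs(new[p] - pr[p]) for p in pages) < tol:
            return new
        pr = new
    return pr


# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
ranks = pagerank({"A": ["B", "C"], "B": ["C"], "C": ["A"]})
for page, value in ranks.items():
    print(page, round(value, 4))
```

This version recomputes all pages from the previous round's values before moving on. Updating pages one at a time with the newest values also works and settles on the same numbers.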
Let's take the simplest example network: two pages, A and B, each pointing to the
other.
Each page has one outgoing link (the outgoing count is 1, i.e. C(A) = 1 and
C(B) = 1). With d = 0.85 the formula becomes:
PR(A) = (1 - d) + d(PR(B)/1)
PR(B) = (1 - d) + d(PR(A)/1)
1. GUESS 1
We don’t know what their PR should be to begin with, so let’s take a guess at 1.0 and do some calculations:
PR(A) = 0.15 + 0.85 * 1
= 1
PR(B) = 0.15 + 0.85 * 1
= 1
The numbers do not change, i.e. a guess of 1.0 is already the settled value.
2. GUESS 2
Ok, let’s start the guess at 0 instead and re-calculate:
PR(A) = 0.15 + 0.85 * 0
= 0.15
PR(B) = 0.15 + 0.85 * 0.15
= 0.2775
And again:
PR(A) = 0.15 + 0.85 * 0.2775
= 0.385875
PR(B) = 0.15 + 0.85 * 0.385875
= 0.47799375
And again:
PR(A) = 0.15 + 0.85 * 0.47799375
= 0.5562946875
PR(B) = 0.15 + 0.85 * 0.5562946875
= 0.622850484375
and so on. The numbers just keep going up. But will the numbers stop increasing when they get to 1.0? What if a calculation
over-shoots and goes above 1.0?
3. GUESS 3
Let’s start the guess at 40 each and do a few cycles:
PR(A) = 40
PR(B) = 40
First calculation:
PR(A) = 0.15 + 0.85 * 40
= 34.15
PR(B) = 0.15 + 0.85 * 34.15
= 29.1775
And again:
PR(A) = 0.15 + 0.85 * 29.1775
= 24.950875
PR(B) = 0.15 + 0.85 * 24.950875
= 21.35824375
The numbers are heading down towards 1.0.
• Principle: it doesn’t matter where you start your guess, once the PageRank calculations
have settled down, the “normalized probability distribution” (the average PageRank for
all pages) will be 1.0
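The three guesses can be replayed with a short Python sketch. It follows the same order as the hand calculation (PR(A) first, then PR(B) using the fresh PR(A)); the function name `two_page_cycles` and the choice of 100 cycles are assumptions made for illustration.

```python
def two_page_cycles(start, cycles, d=0.85):
    """Replay the two-page example: A and B link only to each other.

    Returns a list of (PR(A), PR(B)) pairs, one per calculation cycle.
    """
    pr_a = pr_b = start
    history = []
    for _ in range(cycles):
        pr_a = (1 - d) + d * pr_b   # A's only backlink is B, and C(B) = 1
        pr_b = (1 - d) + d * pr_a   # B uses the freshly updated PR(A)
        history.append((pr_a, pr_b))
    return history


# Try the three starting guesses from the text and look at the last cycle.
for guess in (1.0, 0.0, 40.0):
    final_a, final_b = two_page_cycles(guess, 100)[-1]
    print(guess, round(final_a, 6), round(final_b, 6))
```

Each cycle shrinks the distance to 1.0 by a fixed factor, so a guess below 1.0 rises towards it without over-shooting and a guess above 1.0 falls towards it.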
For a page D with no backlinks, the equation looks like this:
PR(D) = (1-d) + d * (0)
= 0.15
no matter what else is going on or how many times you do it.
Observation: every page has at least a PR of 0.15 to share out.
• Our home page has 2 and a half times as much PR as the child pages! Excellent!
• This is what we’d expect. All the pages have the same number of incoming links,
all pages are of equal importance to each other, all pages get the same PR of 1.0
(i.e. the “average” probability).
EXAMPLES
• Because Google looks at links to a Web page as a vote, it's not easy to cheat the system. The best way to make sure
your Web page is high up on Google's search results is to provide great content so that people will link back to your
page. The more links your page gets, the higher its PageRank score will be. If you attract the attention of sites with a
high PageRank score, your score will grow faster.
• Mega-sites like http://news.bbc.co.uk have tens or hundreds of editors writing new content (i.e. new pages) all day
long! Each one of those pages has rich, worthwhile content of its own and a link back to its parent or the home page!
That’s why the home page Toolbar PR of these sites is 9/10 and the rest of us just get pushed lower and lower by
comparison…
• Principle: Content Is King! There really is no substitute for lots of good content…
Steps to enhance your PAGERANK
1. Give visitors the information they're looking for
• Provide high-quality content on your pages, especially your homepage. This is the single most
important thing to do. If your pages contain useful information, their content will attract many
visitors and entice webmasters to link to your site. Think about the words users would type to
find your pages and include those words on your site.
2. Make sure that other sites link to yours
• Links help Google's crawlers find your site and can give your site greater visibility in its search results.
When returning results for a search, Google uses sophisticated text-matching techniques to
display pages that are both important and relevant to each search. Google interprets a link
from page A to page B as a vote by page A for page B.
3. Make your site easily accessible
• Build your site with a logical link structure. Every page should be reachable from at least one
static text link.
BIBLIOGRAPHY
• http://www.google.com/googlebot
• www.wikipedia.org
• http://infolab.stanford.edu/~backrub/google.html
THANK YOU