Let's Unleash the Secret Behind
the Search Engine Giant
Presented by:
Prakhar Gethe
(CEO and Co-Founder, Team Zenith)
4/29/2014 PSIT CS SOCIETY 2
TOPICS TO BE COVERED
• Facts About Google
• How a Search Engine Works
   - Types of Search Engines
• How Google Works
   - Google Architecture
   - Google Web Crawler
   - Google Indexer
   - Google Query Processor
• Google Working Infographic
• What Is SEO
   - SEO Techniques
• What Is Google Digging
   - Methods of Google Digging
• Technology Requirements of Creating a Search Engine
FACTS ABOUT GOOGLE
• Google was founded by Larry Page and Sergey Brin while they were Ph.D.
students at Stanford University.
• It was founded on 4 September 1998.
• Google processes approximately 20 petabytes of user-generated data every
day. (One petabyte is 10 to the 15th power bytes.)
• In June 2006, the Oxford English Dictionary (OED) added “Google” as a
verb.
• A Google employee is called a “Googler”, while a new team member is
called a “Noogler”.
• The name “Google” was an accident: a misspelling by the founders, who
were aiming for “googol”.
• The Google home page is said to be so bare because the founders didn’t
know HTML and just wanted a quick interface. The submit button reportedly
came later; at first, hitting the RETURN key was the only way to run a
search.
• Google has the largest network of translators in the world.
• On average, Google has acquired more than one company every week
since 2010.
• Google might be the only company with the
explicit goal to REDUCE the amount of time
people spend on its site.
• The world watches 450,000 years of YouTube
videos each month, over twice as long as
modern humans have existed.
• Google has photographed 5 million miles of
road for its Street View maps
• Google.com, home to arguably the world's
most important internet company, contains 23
markup errors in its code.
HOW A SEARCH ENGINE WORKS
A program that searches for and identifies items in a database that correspond
to keywords or characters specified by the user, used especially for finding
particular sites on the Internet.
Or simply
A search engine is a database system designed to index and categorize internet
addresses, otherwise known as URLs.
FACTS ABOUT SEARCH ENGINES
Search Engine Popularity
The most popular search engines on the web:
Google 55.2%
Yahoo 21.7%
MSN Search 9.6%
AOL Search 3.8%
Terra Lycos 2.6%
AltaVista 2.2%
AskJeeves 1.5%
Number of Words Used in Search Phrases
2-word phrases 32.58%
3-word phrases 25.61%
1-word phrases 19.02%
4-word phrases 12.83%
5-word phrases 5.64%
6-word phrases 2.32%
7-word phrases 0.98%
When People Search
The breakdown of surfer traffic by day of the week:
Monday 15.31%
Tuesday 15.23%
Thursday 14.73%
Wednesday 14.62%
Friday 14.48%
Saturday 13.08%
Sunday 12.55%
Screen Resolutions
The most popular screen resolutions on the web:
1024 x 768 48.3%
800 x 600 31.7%
1280 x 1024 13.6%
1152 x 864 4.0%
640 x 480 1.0%
1600 x 1200 1.0%
1152 x 870 0.2%
TYPES OF SEARCH ENGINES
Automatic:
These search engines are based on information that is collected,
sorted and analyzed by software programs, commonly referred to as
"robots", "spiders", or "crawlers". These spiders crawl through web
pages collecting information which is then analyzed and categorized
into an "index". When you conduct a search using one of these search
engines, you are really searching the index. The results of the search
will depend on the contents of that index and its relevancy to your
query.
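The crawl, index, and search steps described above can be sketched in a few lines. This is a minimal illustration, not how any real engine is built: the `TOY_WEB` dictionary, its URLs, and the function names are all hypothetical, and the "web" is an in-memory dictionary so that no network access is needed.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "a.example/": ("google search engine basics", ["a.example/crawl", "b.example/"]),
    "a.example/crawl": ("spiders crawl web pages", ["a.example/"]),
    "b.example/": ("index of web pages", []),
}

def crawl(seed):
    """Follow links breadth-first from the seed and return the pages visited."""
    seen, queue, pages = {seed}, deque([seed]), {}
    while queue:
        url = queue.popleft()
        text, links = TOY_WEB[url]
        pages[url] = text
        for link in links:
            if link in TOY_WEB and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages

def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = {}
    for url, text in pages.items():
        for word in text.split():
            index.setdefault(word, set()).add(url)
    return index

def search(index, query):
    """Return URLs containing every query word, consulting only the index."""
    sets = [index.get(word, set()) for word in query.lower().split()]
    return sorted(set.intersection(*sets)) if sets else []

index = build_index(crawl("a.example/"))
print(search(index, "web pages"))
```

Note that `search` never touches `TOY_WEB`: once the index is built, a query is answered from the index alone, which is the point the slide makes.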
Directories:
A directory is a searchable subject guide of Web sites that have been
reviewed and compiled by human editors. These editors decide which
sites to list and in which categories.
Meta:
Meta search engines use automated technology to send a query to several
other search engines and then deliver a summary of the combined results
to the end user.
Pay-per-click (PPC):
A search engine that determines ranking according to the dollar
amount you pay for each click from that search engine to your site.
Examples of PPC search engines are Overture.com and FindWhat.com.
The highest ranking goes to the highest bidder.
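As a small sketch of the "highest bidder first" rule, the listing below sorts advertisers by their bid per click. The `Listing` class, the site names, and the bid amounts are made up for illustration; real PPC systems may weigh other signals that the slides do not describe.

```python
from dataclasses import dataclass

@dataclass
class Listing:
    site: str
    bid_per_click: float  # hypothetical bid in dollars

def rank_by_bid(listings):
    """Order listings so that the highest bid per click comes first."""
    return sorted(listings, key=lambda item: item.bid_per_click, reverse=True)

listings = [
    Listing("shoes-a.example", 0.40),
    Listing("shoes-b.example", 1.25),
    Listing("shoes-c.example", 0.75),
]
for position, listing in enumerate(rank_by_bid(listings), start=1):
    print(position, listing.site, listing.bid_per_click)
```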
How Do Search Engines Rank Web Pages?
When ranking Web pages, search engines follow specific criteria, which
may vary from one search engine to another. Naturally, they want to
place the most popular (or relevant) pages at the top of their list.
Search engines will look at keywords and phrases, content, HTML meta
tags and link popularity -- just to name a few -- to determine the value of
the Web page.
How Do Search Engines Work?
Search engines compile their databases with the aid of spiders (a.k.a.
robots). These search engine spiders crawl the Internet from link to link,
identifying Web pages. Once search engine spiders find a Web site, they
index the content on its pages, making the URLs available to Internet
users.
In turn, owners of Web sites submit their URLs to search engines
for crawling and, ultimately, inclusion in their databases. This is known
as search engine submission.
When you use search engines to find something on the Internet, you're
basically asking the search engine to scan its database and match your
keywords and phrases with the content of the URLs it has on file
at that time. Spiders regularly return to the URLs they index to look for
changes. When changes occur, the index is updated to reflect the new
information.
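One simple way a spider could notice that a page has changed on a return visit is to compare a hash of the new content with the hash stored at the previous visit. The sketch below assumes that approach; the slides do not say how real spiders detect changes, and the `recrawl` function and the index layout are hypothetical.

```python
import hashlib

def fingerprint(content):
    """Return a SHA-256 digest of the page content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()

def recrawl(index, fetched):
    """Update index entries whose content changed; return the changed URLs.

    index:   dict mapping URL -> {"hash": str, "text": str}
    fetched: dict mapping URL -> freshly downloaded text
    """
    changed = []
    for url, text in fetched.items():
        digest = fingerprint(text)
        entry = index.get(url)
        if entry is None or entry["hash"] != digest:
            index[url] = {"hash": digest, "text": text}
            changed.append(url)
    return changed

index = {}
print(recrawl(index, {"a.example/": "old text", "b.example/": "same text"}))
print(recrawl(index, {"a.example/": "new text", "b.example/": "same text"}))
```

On the first visit both pages are new, so both are indexed; on the second visit only the page whose text differs is updated.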
HOW GOOGLE WORKS
Google runs on a distributed network of thousands of low-cost computers
and can therefore carry out fast parallel processing. Parallel processing is a
method of computation in which many calculations can be performed
simultaneously, significantly speeding up data processing. Google has three
distinct parts:
• Googlebot, a web crawler that finds and fetches web pages.
• The indexer, which sorts every word on every page and stores the resulting
index of words in a huge database.
• The query processor, which compares your search query to the index and
recommends the documents that it considers most relevant.
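The idea of spreading one search across many machines can be illustrated on a single computer by splitting the documents into shards and searching them concurrently. This is only a sketch: the `SHARDS` data and function names are hypothetical, and Python threads stand in for separate low-cost computers (in CPython they do not give true CPU parallelism).

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shards: each holds a slice of the document collection.
SHARDS = [
    {1: "google web crawler", 2: "search engine index"},
    {3: "query processor and index", 4: "parallel processing"},
    {5: "index of words", 6: "web server"},
]

def search_shard(shard, term):
    """Return the IDs of documents in one shard that contain the term."""
    return [doc_id for doc_id, text in shard.items() if term in text.split()]

def parallel_search(term):
    """Search all shards concurrently and merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        partials = list(pool.map(lambda shard: search_shard(shard, term), SHARDS))
    return sorted(doc_id for partial in partials for doc_id in partial)

print(parallel_search("index"))
```

Each shard is searched independently, so adding more shards (and more workers) keeps the work per worker small as the collection grows.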
Google Architecture
Various Data Structures Used in Google
• Repository
• Lexicon
• Document Index
• Hit Lists
• Forward Index
• Inverted Index
Googlebot, Google’s Web Crawler
Googlebot is Google’s web crawling robot, which finds and retrieves pages on the web and hands them off
to the Google indexer. It’s easy to imagine Googlebot as a little spider scurrying across the strands of
cyberspace, but in reality Googlebot doesn’t traverse the web at all. It functions much like your web
browser, by sending a request to a web server for a web page, downloading the entire page, then handing
it off to Google’s indexer.
Googlebot consists of many computers requesting and fetching pages much more quickly than you can
with your web browser. In fact, Googlebot can request thousands of different pages simultaneously. To
avoid overwhelming web servers, or crowding out requests from human users, Googlebot deliberately
makes requests of each individual web server more slowly than it’s capable of doing.
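The deliberate slowing-down described above is often called crawl politeness. A minimal sketch, assuming a fixed minimum delay per host, is shown below. The `PolitenessGate` class and the 0.2-second delay are hypothetical choices for illustration; the slides do not state what delay Googlebot actually uses.

```python
import time
from urllib.parse import urlparse

class PolitenessGate:
    """Enforce a minimum delay between requests to the same host."""

    def __init__(self, min_delay_seconds, clock=time.monotonic, sleep=time.sleep):
        self.min_delay = min_delay_seconds
        self.clock = clock
        self.sleep = sleep
        self.last_request = {}  # host -> time of the most recent request

    def wait_for(self, url):
        """Block until the host of this URL may be contacted again.

        Returns the number of seconds spent waiting.
        """
        host = urlparse(url).netloc
        now = self.clock()
        last = self.last_request.get(host)
        waited = 0.0
        if last is not None:
            waited = max(0.0, self.min_delay - (now - last))
            if waited:
                self.sleep(waited)
        self.last_request[host] = now + waited
        return waited

gate = PolitenessGate(min_delay_seconds=0.2)
for url in ["http://a.example/1", "http://b.example/1", "http://a.example/2"]:
    delay = gate.wait_for(url)
    print(f"{url}: waited {delay:.2f}s")  # the actual fetch would happen here
```

Requests to different hosts are not delayed, so the crawler as a whole stays fast while each individual server sees only a slow trickle of requests.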
Google’s Indexer
Googlebot gives the indexer the full text of the pages it finds. These
pages are stored in Google’s index database. This index is sorted
alphabetically by search term, with each index entry storing a list of
documents in which the term appears and the location within the text
where it occurs. This data structure allows rapid access to documents
that contain user query terms.
To improve search performance, Google ignores (doesn’t index)
common words called stop words (such as the, is, on, or, of, how, why, as
well as certain single digits and single letters). Stop words are so
common that they do little to narrow a search, and therefore they can
safely be discarded. The indexer also ignores some punctuation and
multiple spaces, as well as converting all letters to lowercase, to improve
Google’s performance.
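The indexing steps above (lowercasing, dropping punctuation and extra spaces, skipping stop words, and recording where each term occurs) can be sketched as follows. The stop-word list here is a small illustrative set, not Google's actual list, and the function names are hypothetical.

```python
import re
from collections import defaultdict

# A small illustrative stop-word list; the real list is not given in the slides.
STOP_WORDS = {"the", "is", "on", "or", "of", "how", "why", "a", "in"}

def tokenize(text):
    """Lowercase the text, drop punctuation, and return (position, term) pairs
    for every term that is not a stop word."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [(pos, word) for pos, word in enumerate(words) if word not in STOP_WORDS]

def build_inverted_index(docs):
    """Map each term to {doc_id: [positions]}, with terms in alphabetical order."""
    postings = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in tokenize(text):
            postings[term].setdefault(doc_id, []).append(pos)
    return dict(sorted(postings.items()))

docs = {
    1: "How the Googlebot crawls the Web.",
    2: "The index of   the web is sorted!",
}
index = build_inverted_index(docs)
for term, entry in index.items():
    print(term, entry)
```

Positions are counted before stop words are removed, so the stored locations still refer to places in the original text.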
[Figure: traditional method vs. Google Caffeine]
Google’s Query Processor
The query processor has several parts, including the user
interface (search box), the “engine” that evaluates queries and
matches them to relevant documents, and the results formatter.
PageRank is Google’s system for ranking web pages. A page with a
higher PageRank is deemed more important and is more likely to
be listed above a page with a lower PageRank.
Google considers over a hundred factors in determining which
documents are most relevant to a query, including the popularity
of the page (its PageRank), the position and size of the
search terms within the page, and the proximity of the search
terms to one another on the page. A patent application discusses
other factors that Google considers when ranking a page.
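To give a feel for the link-popularity part of ranking, here is a minimal sketch of the textbook PageRank computation by power iteration. It is a simplified illustration, not Google's production ranking: the four-page link graph is made up, the damping factor of 0.85 is the commonly quoted value, and the code assumes every linked page also appears as a key in `links`.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Compute PageRank by power iteration.

    links maps each page to the list of pages it links to.
    The returned ranks sum to 1.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page shares its rank equally among the pages it links to.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its rank evenly over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical four-page web.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
for page, score in sorted(pagerank(links).items(), key=lambda kv: -kv[1]):
    print(page, round(score, 3))
```

In this toy graph page C collects links from three other pages and ends up with the highest score, while page D, which nobody links to, ends up with the lowest.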
Let’s see how Google processes a query.
SEO: Search Engine Optimization
Search Engine Optimization is the process of improving the
visibility of a website on organic ("natural" or un-paid)
search engine result pages (SERPs), by incorporating search
engine friendly elements into a website. A successful search
engine optimization campaign will include, as part of the
improvements, carefully selected, relevant keywords that
the on-page optimization is designed to make
prominent for search engine algorithms.
Search engine optimization is broken down into two basic
areas: on-page and off-page optimization.
• On-page optimization refers to the elements that
make up a web page, such as HTML code, textual
content, and images.
• Off-page optimization refers, predominantly, to
backlinks (links pointing to the site being
optimized from other relevant websites).
SEO cont.
Various SEO techniques:
• Optimize your title tags
• Create compelling meta descriptions
• Utilize keyword-rich headings
• Add ALT text to your images
• Create a sitemap
• Build internal links between pages
• Update your site regularly
• Image Optimization
• URL Optimization
• Directory Submission
• Commenting
• Social Networking
• Guest Posting
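Several of the on-page items in this list (title tags, meta descriptions, headings, and ALT text on images) can be checked automatically. The sketch below uses Python's standard `html.parser` module to collect those signals from a page. The `OnPageAudit` class, the `audit` helper, and the sample HTML are hypothetical, and the check is deliberately shallow: it only reports what is present, not whether it is good.

```python
from html.parser import HTMLParser

class OnPageAudit(HTMLParser):
    """Collect a few on-page SEO signals from an HTML document."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.meta_description = None
        self.headings = 0
        self.images_missing_alt = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.meta_description = attrs.get("content")
        elif tag in ("h1", "h2", "h3"):
            self.headings += 1
        elif tag == "img" and not attrs.get("alt"):
            self.images_missing_alt += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data

def audit(html):
    """Parse the HTML and return the collected signals as a dictionary."""
    parser = OnPageAudit()
    parser.feed(html)
    return {
        "title": parser.title.strip(),
        "meta_description": parser.meta_description,
        "headings": parser.headings,
        "images_missing_alt": parser.images_missing_alt,
    }

SAMPLE = """<html><head><title>Blue Widgets</title>
<meta name="description" content="Hand-made blue widgets."></head>
<body><h1>Blue Widgets</h1>
<img src="a.png" alt="A blue widget"><img src="b.png"></body></html>"""

print(audit(SAMPLE))
```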
GOOGLE DIGGING
The art of searching for any content
using Google is called Google digging,
the art of googling, or sometimes
even Google hacking.
Google dorks are search techniques that can
be used to refine a search:
1) intitle:
2) filetype:
3) site:
4) related:
5) inurl:
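Each of these operators is written as `operator:value` inside the search box: `intitle:` matches words in the page title, `filetype:` restricts results to a file extension, `site:` restricts them to one domain, `related:` asks for sites similar to a given one, and `inurl:` matches words in the URL. The small helper below only assembles such a query string; the `build_query` function is hypothetical, and it does not contact Google.

```python
def build_query(terms="", **operators):
    """Combine free-text terms with operator:value pairs into one query string."""
    allowed = ("intitle", "filetype", "site", "related", "inurl")
    parts = [terms] if terms else []
    for name, value in operators.items():
        if name not in allowed:
            raise ValueError(f"unsupported operator: {name}")
        # Quote values that contain spaces so they stay attached to the operator.
        if " " in value:
            value = f'"{value}"'
        parts.append(f"{name}:{value}")
    return " ".join(parts)

print(build_query("annual report", site="example.com", filetype="pdf"))
print(build_query(intitle="search engine", inurl="tutorial"))
```

The resulting strings can be pasted into the search box as they are.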
GOOGLE DIGGING cont….
Technology Requirements Of Creating
Search Engine
There are various technologies that can be used to create a search engine,
web crawlers, bots, and a query indexer.
For the back end:
• ASP.NET
• PHP
• Python
• Perl
• Or your own customized language
For the database:
• MySQL
• Oracle technology
• Any NoSQL database
• Or any customized database
For the front end:
• JavaScript
• XML
• JSON
• Dart, etc.
Source : Wikipedia
Cont……
Let's thank Google for such a
wonderful technology and search
engine.
Questions, comments, and feedback are
welcome.

How Google Works and Functions: A Complete Approach

  • 1. Let's Unleash the Secret Behind the Search Engine Giant. Presented by: Prakhar Gethe (CEO and Co-Founder, Team Zenith)
  • 2. TOPICS TO BE COVERED
    4/29/2014 PSIT CS SOCIETY
    • Facts About Google
    • How a Search Engine Works
      ** Types of Search Engines
    • How Google Works
      ** Google Architecture
      ** Google Web Crawler
      ** Google Indexer
      ** Google Query Processor
    • Google Working Infographic
    • What Is SEO
      ** SEO Techniques
    • What Is Google Digging
      ** Methods of Google Digging
    • Technology Requirements of Creating a Search Engine
  • 3. FACTS ABOUT GOOGLE
    • Google was founded by Larry Page and Sergey Brin while they were Ph.D. students at Stanford University.
    • It was founded on 4 September 1998.
    • Google processes approximately 20 petabytes of user-generated data every day. (A petabyte is 10^15 bytes.)
    • In June 2006, the Oxford English Dictionary (OED) added “Google” as a verb.
    • A Google employee is called a “Googler”, while a new team member is called a “Noogler”.
    • The name “Google” was an accident: a spelling mistake by the founders, who thought they were going for “Googol”.
    • The Google home page is so bare mainly because the founders didn’t know HTML and just wanted a quick interface. The submit button was a long time coming, and until then hitting the RETURN key was the only way to run a search.
    • Google has the largest network of translators in the world.
    • On average, Google has acquired more than one company every week since 2010.
  • 4.
    • Google might be the only company with the explicit goal to REDUCE the amount of time people spend on its site.
    • The world watches 450,000 years of YouTube videos each month, over twice as long as modern humans have existed.
    • Google has photographed 5 million miles of road for its Street View maps.
    • Google.com, home to arguably the world's most important internet company, contains 23 markup errors in its code.
  • 5. HOW A SEARCH ENGINE WORKS
    A search engine is a program that searches for and identifies items in a database that correspond to keywords or characters specified by the user. It is used especially for finding particular sites on the Internet. Put simply, a search engine is a database system designed to index and categorize Internet addresses, otherwise known as URLs.
    FACTS ABOUT SEARCH ENGINES
    Search engine popularity (the most popular search engines on the web):
    • Google: 55.2%
    • Yahoo: 21.7%
    • MSN Search: 9.6%
    • AOL Search: 3.8%
    • Terra Lycos: 2.6%
    • AltaVista: 2.2%
    • AskJeeves: 1.5%
  • 6.
    Number of words used in search phrases:
    • 2-word phrases: 32.58%
    • 3-word phrases: 25.61%
    • 1-word phrases: 19.02%
    • 4-word phrases: 12.83%
    • 5-word phrases: 5.64%
    • 6-word phrases: 2.32%
    • 7-word phrases: 0.98%
    When people search (breakdown of surfer traffic by day of the week):
    • Monday: 15.31%
    • Tuesday: 15.23%
    • Thursday: 14.73%
    • Wednesday: 14.62%
    • Friday: 14.48%
    • Saturday: 13.08%
    • Sunday: 12.55%
    Screen resolutions (the most popular screen resolutions on the web):
    • 1024 x 768: 48.3%
    • 800 x 600: 31.7%
    • 1280 x 1024: 13.6%
    • 1152 x 864: 4.0%
    • 640 x 480: 1.0%
    • 1600 x 1200: 1.0%
    • 1152 x 870: 0.2%
  • 7. TYPES OF SEARCH ENGINES
    Automatic: These search engines are based on information that is collected, sorted and analyzed by software programs, commonly referred to as "robots", "spiders", or "crawlers". These spiders crawl through web pages collecting information, which is then analyzed and categorized into an "index". When you conduct a search using one of these search engines, you are really searching the index. The results of the search depend on the contents of that index and their relevance to your query.
  • 8.
    Directories: A directory is a searchable subject guide of Web sites that have been reviewed and compiled by human editors. These editors decide which sites to list and in which categories.
    Meta: Meta search engines use automated technology to gather information from the spiders of other search engines and then deliver a summary of that information to the end user as the search results.
    Pay-per-click (PPC): A search engine that determines ranking according to the dollar amount you pay for each click from that search engine to your site. The highest ranking goes to the highest bidder. Examples of PPC search engines are Overture.com and FindWhat.com.
  • 9.
    How Do Search Engines Rank Web Pages?
    When ranking Web pages, search engines follow specific criteria, which may vary from one search engine to another. Naturally, they want to place the most popular (or relevant) pages at the top of their list. To determine the value of a Web page, search engines look at keywords and phrases, content, HTML meta tags and link popularity, to name just a few.
    How Do Search Engines Work?
    Search engines compile their databases with the aid of spiders (a.k.a. robots). These spiders crawl the Internet from link to link, identifying Web pages. Once a spider finds a Web site, it indexes the content of its pages, making the URLs available to Internet users. Website owners can also submit their URLs to search engines for crawling and, ultimately, inclusion in their databases. This is known as search engine submission.
    When you use a search engine to find something on the Internet, you're basically asking it to scan its database and match your keywords and phrases with the content of the URLs it has on file at that time. Spiders regularly return to the URLs they index to look for changes. When changes occur, the index is updated to reflect the new information.
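The link-to-link crawl described on this slide can be sketched as a breadth-first traversal. This is a minimal sketch, not how any real spider is implemented: the `TOY_WEB` dictionary, its URLs, and the `crawl` function are hypothetical stand-ins for real HTTP fetching.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "a.example/": ("welcome to site a", ["a.example/about", "b.example/"]),
    "a.example/about": ("about site a", ["a.example/"]),
    "b.example/": ("site b sells widgets", ["a.example/", "c.example/"]),
    "c.example/": ("widgets and gadgets", []),
}


def crawl(seed, fetch):
    """Breadth-first crawl from `seed`; returns {url: text} in visit order."""
    seen = {seed}
    queue = deque([seed])
    pages = {}
    while queue:
        url = queue.popleft()
        result = fetch(url)
        if result is None:  # unknown or unreachable page
            continue
        text, links = result
        pages[url] = text
        for link in links:
            if link not in seen:  # follow each link only once
                seen.add(link)
                queue.append(link)
    return pages


pages = crawl("a.example/", TOY_WEB.get)
print(list(pages))
```

The `seen` set is what keeps the spider from looping forever on pages that link back to each other.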
  • 10. HOW GOOGLE WORKS
    Google runs on a distributed network of thousands of low-cost computers and can therefore carry out fast parallel processing. Parallel processing is a method of computation in which many calculations are performed simultaneously, significantly speeding up data processing.
    Google has three distinct parts:
    • Googlebot, a web crawler that finds and fetches web pages.
    • The indexer, which sorts every word on every page and stores the resulting index of words in a huge database.
    • The query processor, which compares your search query to the index and recommends the documents that it considers most relevant.
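To illustrate the parallel-processing idea, the sketch below splits an index into shards and queries all of them at once. It is an assumption-laden toy: the `SHARDS` data is invented, and threads in one process stand in for the separate low-cost machines the slide mentions.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical index shards, each imagined to live on a separate machine.
SHARDS = [
    {"google": {1, 2}, "crawler": {2}},
    {"google": {7}, "index": {8, 9}},
    {"crawler": {12}, "google": {11}},
]


def search_shard(shard, term):
    """Look up one term in one shard; returns a set of document IDs."""
    return shard.get(term, set())


def parallel_search(term, shards=SHARDS):
    """Query every shard at the same time and merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = pool.map(lambda shard: search_shard(shard, term), shards)
        return sorted(set().union(*partials))


print(parallel_search("google"))
```

Each shard answers independently, so the total lookup time is bounded by the slowest shard rather than the sum of all of them.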
  • 11. Google Architecture
    Various data structures used in it:
    • Repository
    • Lexicon
    • Document Index
    • Hit Lists
    • Forward Index
    • Inverted Index
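Three of the structures named on this slide (lexicon, forward index, inverted index) can be shown on toy data. This is a simplified sketch that assumes plain Python dictionaries; the sample documents are invented and the layout is not Google's actual storage format.

```python
from collections import defaultdict

# Forward index: document ID -> words in that document (hypothetical data).
forward_index = {
    1: ["google", "web", "crawler"],
    2: ["web", "index"],
    3: ["google", "index", "query"],
}

# Lexicon: every distinct word mapped to an integer word ID.
lexicon = {}
for words in forward_index.values():
    for word in words:
        lexicon.setdefault(word, len(lexicon))


def invert(forward):
    """Turn doc -> words into word -> sorted list of docs."""
    inverted = defaultdict(set)
    for doc_id, words in forward.items():
        for word in words:
            inverted[word].add(doc_id)
    return {word: sorted(docs) for word, docs in inverted.items()}


inverted_index = invert(forward_index)
print(inverted_index["google"])
```

The forward index answers "which words are in this document?", while the inverted index answers the question a query needs: "which documents contain this word?".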
  • 12. Googlebot, Google’s Web Crawler
    Googlebot is Google’s web crawling robot, which finds and retrieves pages on the web and hands them off to the Google indexer. It’s easy to imagine Googlebot as a little spider scurrying across the strands of cyberspace, but in reality Googlebot doesn’t traverse the web at all. It functions much like your web browser, by sending a request to a web server for a web page, downloading the entire page, then handing it off to Google’s indexer.
    Googlebot consists of many computers requesting and fetching pages much more quickly than you can with your web browser. In fact, Googlebot can request thousands of different pages simultaneously. To avoid overwhelming web servers, or crowding out requests from human users, Googlebot deliberately makes requests of each individual web server more slowly than it’s capable of doing.
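The "deliberately slower per server" behaviour can be sketched as a scheduler that spaces out requests to the same host while letting different hosts proceed at once. The `schedule` function, the 2-second `per_host_delay`, and the URLs are assumptions for illustration only; Googlebot's real rate limits are not stated in the slides.

```python
from urllib.parse import urlsplit


def schedule(urls, per_host_delay=2.0):
    """Assign each URL the earliest start time that keeps requests to the
    same host at least `per_host_delay` seconds apart."""
    next_free = {}  # host -> earliest time the next request may start
    plan = []
    for url in urls:
        host = urlsplit(url).netloc
        start = next_free.get(host, 0.0)
        plan.append((start, url))
        next_free[host] = start + per_host_delay
    return sorted(plan)


plan = schedule([
    "http://a.example/1",
    "http://a.example/2",
    "http://b.example/1",
    "http://a.example/3",
])
for start, url in plan:
    print(f"t={start:>4.1f}s  {url}")
```

Requests to `a.example` are spread over time, while the request to `b.example` starts immediately because that server has not been contacted yet.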
  • 13. Google’s Indexer
    Googlebot gives the indexer the full text of the pages it finds. These pages are stored in Google’s index database. This index is sorted alphabetically by search term, with each index entry storing a list of documents in which the term appears and the location within the text where it occurs. This data structure allows rapid access to documents that contain user query terms.
    To improve search performance, Google ignores (doesn’t index) common words called stop words (such as the, is, on, or, of, how, why, as well as certain single digits and single letters). Stop words are so common that they do little to narrow a search, and therefore they can safely be discarded. To improve performance further, the indexer also ignores some punctuation and multiple spaces, and converts all letters to lowercase.
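The indexing steps on this slide (lowercasing, dropping punctuation and extra spaces, skipping stop words, recording positions, sorting terms alphabetically) can be sketched as follows. The stop-word set is taken from the slide's examples only, the sample documents are invented, and the single-digit and single-letter rule is left out for brevity.

```python
import re
from collections import defaultdict

# A small assumed stop-word list, taken from the examples on the slide.
STOP_WORDS = {"the", "is", "on", "or", "of", "how", "why"}


def tokenize(text):
    """Lowercase the text and split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each non-stop word to {doc_id: [word positions]}, terms sorted A-Z."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, word in enumerate(tokenize(text)):
            if word in STOP_WORDS:
                continue
            index[word].setdefault(doc_id, []).append(position)
    return dict(sorted(index.items()))


docs = {
    1: "How the Crawler works",
    2: "The index of  the WEB; the crawler is fast.",
}
index = build_index(docs)
print(index["crawler"])
```

Positions are counted before stop words are dropped, so they still point at the word's place in the original text.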
  • 14. [Figure: Traditional method vs. Google Caffeine]
  • 15. Google’s Query Processor
    The query processor has several parts, including the user interface (search box), the “engine” that evaluates queries and matches them to relevant documents, and the results formatter.
    PageRank is Google’s system for ranking web pages. A page with a higher PageRank is deemed more important and is more likely to be listed above a page with a lower PageRank.
    Google considers over a hundred factors in computing a PageRank and determining which documents are most relevant to a query, including the popularity of the page, the position and size of the search terms within the page, and the proximity of the search terms to one another on the page. A patent application discusses other factors that Google considers when ranking a page.
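The link-popularity part of this ranking can be sketched with the classic PageRank power iteration as it has been publicly described. This sketch covers only that part, not the many other factors the slide mentions; the toy link graph, the damping factor of 0.85, and the iteration count are assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over {page: [pages it links to]}.

    Assumes every link target also appears as a key in `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            # A page with no outgoing links spreads its rank over all pages.
            targets = outgoing or pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


links = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

Page `c` collects links from three other pages and ends up ranked highest, while `d`, which nothing links to, keeps only the baseline share.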
  • 16. Let’s see how Google processes a query.
  • 18. SEO: Search Engine Optimization
    Search engine optimization is the process of improving the visibility of a website on organic ("natural" or unpaid) search engine result pages (SERPs) by incorporating search-engine-friendly elements into the website. A successful search engine optimization campaign includes carefully selected, relevant keywords, which the on-page optimization is designed to make prominent for search engine algorithms.
    Search engine optimization is broken down into two basic areas: on-page and off-page optimization.
    • On-page optimization refers to the elements that make up a web page, such as HTML code, textual content, and images.
    • Off-page optimization refers predominantly to backlinks (links pointing to the site being optimized from other relevant websites).
  • 19. SEO cont.
    Various SEO techniques:
    • Optimize your title tags
    • Create compelling meta descriptions
    • Utilize keyword-rich headings
    • Add ALT tags to your images
    • Create a sitemap
    • Build internal links between pages
    • Update your site regularly
    • Image Optimization
    • URL Optimization
    • Directory Submission
    • Commenting
    • Social Networking
    • Guest Posting
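A few of the on-page items in this list (title tag, meta description, image ALT text) can be checked automatically. The sketch below uses Python's standard `html.parser`; the `SeoAudit` class, the `audit` function, and the sample page are hypothetical, and the checks are far simpler than a real SEO tool.

```python
from html.parser import HTMLParser


class SeoAudit(HTMLParser):
    """Collects on-page signals: title, meta description, images without alt."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta_description = None
        self.images_missing_alt = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.meta_description = attrs.get("content")
        elif tag == "img" and not attrs.get("alt"):
            self.images_missing_alt += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def audit(html):
    """Return a list of human-readable on-page issues found in `html`."""
    parser = SeoAudit()
    parser.feed(html)
    parser.close()
    issues = []
    if not parser.title.strip():
        issues.append("missing <title>")
    if not parser.meta_description:
        issues.append("missing meta description")
    if parser.images_missing_alt:
        issues.append(f"{parser.images_missing_alt} image(s) without alt text")
    return issues


page = """<html><head><title>Widgets</title></head>
<body><img src="a.png"><img src="b.png" alt="Blue widget"></body></html>"""
print(audit(page))
```

The sample page has a title but no meta description and one image without ALT text, so two issues are reported.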
  • 20. GOOGLE DIGGING
    The art of searching for any content using Google is called Google digging, the art of googling, or sometimes even Google hacking.
    Google dorks, or search techniques that can be used to refine a search:
    1) intitle:
    2) filetype:
    3) site:
    4) related:
    5) inurl:
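The five operators listed above are combined with ordinary search terms in a single query string. The helper below is a hypothetical convenience function, not part of any Google API; it only builds the text you would type into the search box.

```python
def build_query(terms="", **operators):
    """Compose a search string from free-text terms and operator:value pairs."""
    allowed = ("intitle", "filetype", "site", "related", "inurl")
    unknown = set(operators) - set(allowed)
    if unknown:
        raise ValueError(f"unsupported operators: {sorted(unknown)}")
    parts = [terms] if terms else []
    for name in allowed:
        value = operators.get(name)
        if value:
            # Quote values containing spaces so they stay attached to the operator.
            if " " in value:
                value = f'"{value}"'
            parts.append(f"{name}:{value}")
    return " ".join(parts)


print(build_query("search engine", site="example.com", filetype="pdf"))
```

For example, combining `site:` with `filetype:` narrows results to documents of one type hosted on one domain.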
  • 21. GOOGLE DIGGING cont.
  • 22. Technology Requirements Of Creating Search Engine
    Various technologies can be used to create a search engine, web crawlers, bots, and a query indexer.
    For back-end:
    • ASP.NET
    • PHP
    • Python
    • Perl
    • Or your customized language
    For database:
    • MySQL
    • Oracle technology
    • Any NoSQL databases
    • Or any customized database
    For front-end:
    • JavaScript
    • XML
    • JSON
    • Dart, etc.
  • 23. Cont. Source: Wikipedia
  • 24. Let's thank Google for such a wonderful technology and search engine.
  • 25. Questions, comments, and feedback are welcome.