Helping people find what they’re looking for
   Starts with an “information need”
   Converts it to a query
   Gets results
In the materials available
   Web pages
   Other formats
   Deep Web
 Search can’t find what’s not there
    The content is hugely important
 Information Architecture is vital
 Usable sites have good navigation and structure
Index ahead of time
  • Find files or records
  • Open each one and read it
  • Store each word in a searchable index
Provide search forms
  • Match the query terms with words in the index
  • Sort documents by relevance
Display results
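The three stages above (index ahead of time, match query terms, sort and display) can be sketched in a few lines. This is a minimal illustration, not how any particular engine does it: the function names, the sample documents, and the "count of matching terms" ranking are all assumptions made for the example.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Index ahead of time: map each word to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index

def search(index, query):
    """Match query terms against the index, then sort by number of distinct terms matched."""
    scores = defaultdict(int)
    for term in set(tokenize(query)):
        for doc_id in index.get(term, ()):
            scores[doc_id] += 1
    # Highest score first; ties broken by document ID for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical sample content.
docs = {
    "a.html": "Cruise ships and cruise amenities",
    "b.html": "Ships in the harbor",
    "c.html": "Corporate information",
}
index = build_index(docs)
results = search(index, "cruise ships")
```

Here `results` lists `a.html` before `b.html`, because the first page matches both query words and the second only one.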
Like an iceberg, 2/3 below water
[Figure: iceberg diagram with "user interface" at the top, and "content" and "search functionality" beneath it]
•   Text search works for structured content
•   Keyword search vs. SQL queries
•   Approximate vs. exact match
•   Multiple sources of content
•   Response time and database resources
•   Relevance ranking, very important
•   Works in the real world (e.g. eBay)
Users blame the search engine
   Even when the content is unavailable
Understand the scope of site or intranet
     Kinds of information
     Divided sites: products / corporate info
     Dates
     Languages
     Sources and data silos: CMSs, databases...
     Update processes
Store text to search it later
Many ways to gather text
     Crawl (spider) via HTTP
     Read files on file servers
     Access databases (HTTP or API)
     Data silos via local APIs
     Applications, CMSs, via Web Services
Security and Access Control
 Basic information for document or record
   • File name / URL / record ID
   • Title or equivalent
   • Size, date, MIME type
 Full text of item
 More metadata
   • Product name, picture ID
   • Category, topic, or subject
   • Other attributes, for relevance ranking and display
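One way to picture what the index stores per document or record is a small data class. The field names below are assumptions chosen to mirror the list above; a real engine would define its own schema.

```python
from dataclasses import dataclass, field

@dataclass
class IndexRecord:
    """Hypothetical per-item record kept by the indexer."""
    url: str                 # file name, URL, or record ID
    title: str               # title or equivalent
    size_bytes: int
    modified: str            # date as an ISO 8601 string
    mime_type: str
    full_text: str           # full text of the item
    metadata: dict = field(default_factory=dict)  # category, product name, etc.

# Illustrative values only.
record = IndexRecord(
    url="https://example.com/products/42",
    title="Example product",
    size_bytes=2048,
    modified="2024-01-15",
    mime_type="text/html",
    full_text="Example product description text",
    metadata={"category": "products", "picture_id": "img-42"},
)
```

Keeping the extra attributes in a separate `metadata` mapping lets ranking and display code use them without changing the basic record.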
Stop words
Stemming
Metadata
   Explicit (tags)
   Implicit (context)
Semantics
   CMS and Database fields
   XML tags and attributes
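Stop words and stemming are easiest to see on a short sentence. The sketch below uses a tiny stop list and a deliberately naive suffix stripper; both are assumptions for illustration, and production systems typically use a proper stemmer (such as the Porter algorithm) and a language-specific stop list.

```python
import re

# Hypothetical, very small stop list.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to", "is"}
# Suffixes tried in this order; the first one that fits is removed.
SUFFIXES = ("ing", "es", "s")

def naive_stem(word):
    """Strip one common suffix, keeping at least three letters of the word."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def index_terms(text):
    """Lowercase, drop stop words, and stem what remains."""
    words = re.findall(r"[a-z]+", text.lower())
    return [naive_stem(w) for w in words if w not in STOP_WORDS]

terms = index_terms("The ships are sailing to the islands")
```

With this stop list and stemmer, `terms` is `["ship", "are", "sail", "island"]`, so a query for "ship" can match a page that only says "ships".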
What happens after you click the search button and
 before retrieval starts.
Usually in this order
     Handle character set, maybe language
     Look for operators and organize the query
     Look for field names or metadata
     Extract words (just like the indexer)
     Deal with letter casing
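The query-handling steps above might look roughly like this. It is a sketch under assumptions: the operator set, the `field:value` syntax, and the output dictionary are invented for the example, and Unicode NFKC normalization stands in for "handle character set".

```python
import re
import unicodedata

# Assumed operator keywords; recognized only in upper case.
OPERATORS = {"AND", "OR", "NOT"}

def parse_query(raw):
    """Split a raw query into operators, field filters, and lowercase terms."""
    text = unicodedata.normalize("NFKC", raw)      # character set handling
    parsed = {"operators": [], "fields": {}, "terms": []}
    for token in text.split():
        if token in OPERATORS:                     # look for operators
            parsed["operators"].append(token)
        elif ":" in token and not token.startswith(":") and not token.endswith(":"):
            name, _, value = token.partition(":")  # look for field names
            parsed["fields"][name.lower()] = value.lower()
        else:
            # Extract words the same way an indexer would, folding case.
            parsed["terms"].extend(re.findall(r"\w+", token.lower()))
    return parsed

parsed = parse_query("title:Cruise AND Alaska")
```

For this input, `parsed` holds one operator (`AND`), one field filter (`title` = `cruise`), and one free-text term (`alaska`).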
• Retrieval: find files with query terms
• Not the same as relevance ranking
  Recall: find all relevant items
  Precision: find only relevant items
  Increasing one usually decreases the other
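A small calculation makes the trade-off concrete. Recall is the share of relevant items that were retrieved, and precision is the share of retrieved items that are relevant. The document IDs below are made up for the example.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a result set against a known relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical collection in which four documents are relevant.
relevant = {"d1", "d2", "d3", "d4"}

# A narrow search returns few items, all of them relevant.
narrow = precision_recall(["d1", "d2"], relevant)

# A broad search finds every relevant item but also returns noise.
broad = precision_recall(["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"], relevant)
```

The narrow search scores precision 1.0 and recall 0.5; the broad search reverses that, with precision 0.5 and recall 1.0.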
Single-word queries
   Find items containing that word
Multi-word queries: combine lists
   Any: every item with any query word
   All: only items with every word
   Phrases: find only items with all words in order
Boolean and complex queries
  – Use algorithm to combine lists
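Combining lists can be sketched with set operations over a positional index: "any" is a union, "all" is an intersection, and a phrase also checks that the word positions are consecutive. The index layout and function names are assumptions for this example.

```python
from collections import defaultdict

def build_positional_index(docs):
    """Map word -> {doc_id: [positions]} so phrases can be checked later."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, word in enumerate(text.lower().split()):
            index[word][doc_id].append(pos)
    return index

def match_any(index, words):
    """Any: every item containing at least one query word (union)."""
    return set().union(*(index[w].keys() for w in words if w in index))

def match_all(index, words):
    """All: only items containing every query word (intersection)."""
    lists = [set(index[w]) if w in index else set() for w in words]
    return set.intersection(*lists) if lists else set()

def match_phrase(index, words):
    """Phrase: items where the words occur consecutively, in order."""
    hits = set()
    for doc_id in match_all(index, words):
        starts = index[words[0]][doc_id]
        if any(all(p + i in index[w][doc_id] for i, w in enumerate(words))
               for p in starts):
            hits.add(doc_id)
    return hits

# Hypothetical documents keyed by numeric ID.
docs = {1: "new york city", 2: "york is a new city", 3: "old york"}
index = build_positional_index(docs)
any_hits = match_any(index, ["new", "york"])
all_hits = match_all(index, ["new", "york"])
phrase_hits = match_phrase(index, ["new", "york"])
```

For the query "new york", "any" returns all three documents, "all" returns documents 1 and 2, and the phrase search returns only document 1.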
•   Empty search
•   Nothing on the site on that topic (scope)
•   Misspelling or typing mistakes
•   Vocabulary differences
•   Restrictive search defaults
•   Restrictive search choices
•   Software failure
Theory: sort the matching items, so the most relevant ones appear first
Can't really know what the user wants
Relevance is hard to define and situational
Short queries tend to be deeply ambiguous
  What do people mean when they type “bank”?
First 10 results are the most important
The more transparent, the better
 Sorting documents on various criteria
 Start with words matching query terms
 Citation and link analysis
   Like old library Citation Indexes
   Ted Nelson - not only hypertext, but the links
   Google PageRank
      Incoming links
      Authority of linkers
 Taxonomies and external metadata
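The link-analysis idea (incoming links, weighted by the authority of the linking pages) can be illustrated with a simplified, textbook-style PageRank iteration. This is not Google's production algorithm; the damping factor of 0.85, the iteration count, and the three-page graph are assumptions for the sketch.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank. `links` maps each page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split evenly across its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical link graph: a and b both link to c, and c links back to a.
links = {"a": ["c"], "b": ["c"], "c": ["a"]}
rank = pagerank(links)
```

Page `c` ends up ranked highest because it has two incoming links; `a` comes next because its single incoming link is from the high-ranking `c`; `b` has no incoming links and ranks lowest.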
• Term frequency in the item
• Inverse document frequency of term
   Rare words are likely to be more important
   w_ij = tf_ij × log(N / n)
   w_ij = weight of term T_j in document D_i
   tf_ij = frequency of term T_j in document D_i
   N = number of documents in the collection
   n = number of documents in which term T_j occurs at least once

   From Salton 1989
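A worked example helps show why rare words carry more weight. The sketch assumes the common form of the weighting, w = tf × log(N / n), with a natural logarithm; the choice of log base, the whitespace tokenizer, and the sample documents are assumptions.

```python
import math
from collections import Counter

def tf_idf_weights(docs):
    """Return {doc_id: {term: weight}} with weight = tf * log(N / n)."""
    N = len(docs)
    term_counts = {d: Counter(text.lower().split()) for d, text in docs.items()}
    # n for each term: the number of documents in which it occurs at least once.
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())
    return {
        d: {t: tf * math.log(N / doc_freq[t]) for t, tf in counts.items()}
        for d, counts in term_counts.items()
    }

# Hypothetical three-document collection.
docs = {"d1": "bank river bank", "d2": "bank loan", "d3": "river boat"}
weights = tf_idf_weights(docs)
```

In document `d2`, "loan" occurs in only one document of three and so outweighs "bank", which occurs in two. A term that appears in every document would get log(1) = 0 and contribute nothing to ranking.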
•   Vector space
•   Probabilistic (binary independence)
•   Fuzzy set theory
•   Bayesian statistical analysis
•   Latent semantic indexing
•   Neural networks
•   Machine learning
•   All require sophisticated queries
•   See Modern Information Retrieval (MIR), chapter 2
Heuristics are rules of thumb
  • Not algorithms, not math
Search Relevance Ranking Heuristics
  •   Documents containing all search words
  •   Search words as a phrase
  •   Matches in title tag
  •   Matches in other metadata
Based on real-world user behavior
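The ranking heuristics listed above can be expressed as simple additive boosts. The weights here are invented for illustration; in practice they would be tuned against search logs and user behavior.

```python
# Hypothetical boost values for each rule of thumb.
DEFAULT_WEIGHTS = {"all_words": 2.0, "phrase": 3.0, "title": 1.5}

def heuristic_score(query, title, body, weights=DEFAULT_WEIGHTS):
    """Score one document with rule-of-thumb boosts rather than a formal model."""
    words = query.lower().split()
    title_words = set(title.lower().split())
    body_text = " ".join(body.lower().split())   # collapse whitespace
    body_words = set(body_text.split())
    score = 0.0
    if words and all(w in body_words for w in words):
        score += weights["all_words"]            # document contains all search words
    if len(words) > 1 and " ".join(words) in body_text:
        score += weights["phrase"]               # search words appear as a phrase
    # Each query word that also appears in the title adds a boost.
    score += weights["title"] * sum(w in title_words for w in words)
    return score

score_a = heuristic_score(
    "cruise amenities",
    title="Cruise amenities guide",
    body="Our cruise amenities include pools",
)
score_b = heuristic_score(
    "cruise amenities",
    title="About us",
    body="Amenities on every cruise",
)
```

The first page earns all three boosts (8.0 in total), while the second contains both words but neither the phrase nor a title match, and scores 2.0.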
What users see after they click the Search button
The most visible part of search
Elements of the results page
     Page layout and navigation
     Results header
     List of results items
     Results footer
Human judgment beats algorithms
Great for frequent, ambiguous searches
   Use search log to identify best candidates
Recommend good starting pages
      Product information, FAQs, etc.
Requires human resources
   That means money and time
More static than algorithmic search
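Hand-picked recommendations are often just a lookup table consulted before the algorithmic results. The table contents, URLs, and function names below are hypothetical; the stand-in `fake_engine` represents whatever search engine is in use.

```python
# Hypothetical editor-maintained recommendations, keyed by normalized query.
RECOMMENDED = {
    "returns": ["/help/returns-policy"],
    "bank": ["/locations/branches", "/help/online-banking"],
}

def normalize(query):
    """Lowercase the query and collapse whitespace so variants share one key."""
    return " ".join(query.lower().split())

def search_with_recommendations(query, algorithmic_search):
    """Put human-chosen pages first, then the engine's results without duplicates."""
    picks = RECOMMENDED.get(normalize(query), [])
    ranked = [url for url in algorithmic_search(query) if url not in picks]
    return picks + ranked

def fake_engine(query):
    """Stand-in for a real engine; returns fixed results for the example."""
    return ["/blog/river-bank-cleanup", "/locations/branches"]

results = search_with_recommendations("  Bank ", fake_engine)
```

Because the table is maintained by people, it stays correct only as long as someone reviews it, which is the cost the slide refers to.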
 Leverage content structure
    database fields (e.g. cruise amenities)
    document metadata (news article bylines)
 Provide both search and browse
      Support information foraging
      Integrate navigation with results
      Not just subject taxonomies
      Display only fruitful paths, no dead ends
 Supported by academic research
    Marti Hearst, UCB SIMS, flamenco.berkeley.edu
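Faceted navigation can be sketched as counting metadata values over the current result set. Counting only what is in the results is what avoids dead ends: a value with no matches never gets offered. The records and field names are made up for the example.

```python
from collections import Counter

def facet_counts(items, facet):
    """Count how many current results carry each value of a metadata field."""
    return Counter(item[facet] for item in items if facet in item)

def refine(items, facet, value):
    """Narrow the result set to items with the chosen facet value."""
    return [item for item in items if item.get(facet) == value]

# Hypothetical search results with structured metadata.
results = [
    {"title": "Cruise 1", "region": "Alaska", "amenity": "pool"},
    {"title": "Cruise 2", "region": "Alaska", "amenity": "spa"},
    {"title": "Cruise 3", "region": "Caribbean", "amenity": "pool"},
]

regions = facet_counts(results, "region")
caribbean = refine(results, "region", "Caribbean")
caribbean_amenities = facet_counts(caribbean, "amenity")
```

After choosing "Caribbean", only "pool" is offered as an amenity; "spa" is not shown because selecting it would return nothing.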
Metrics
     Number of searches
     Number of no-match searches
     Traffic from search to high-value pages
     Relate search changes to other metrics
Search Log Analysis
   Top 5% searches: phrases and words
   Top no-match searches
        Use as market research
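These metrics are straightforward to compute from a search log. The sketch assumes each log entry is a `(query, result_count)` pair, which is an invented format; real logs vary by product.

```python
from collections import Counter

def analyze_log(entries, top=3):
    """Summarize a search log given as (query, result_count) pairs."""
    total = 0
    queries = Counter()
    no_matches = Counter()
    for query, result_count in entries:
        key = " ".join(query.lower().split())   # fold case and extra whitespace
        total += 1
        queries[key] += 1
        if result_count == 0:
            no_matches[key] += 1
    return {
        "total": total,
        "no_match_total": sum(no_matches.values()),
        "top_queries": queries.most_common(top),
        "top_no_matches": no_matches.most_common(top),
    }

# Hypothetical log entries.
log = [
    ("Bank", 12), ("bank", 12), ("mortgage rates", 4),
    ("morgage", 0), ("morgage", 0), ("bank ", 12),
]
report = analyze_log(log)
```

In this sample, the repeated no-match query "morgage" points to a misspelling worth adding to a synonym or suggestion list.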
Search engines can’t read minds
   User queries are short and ambiguous
Some things will help
     Design a usable interface
     Show match words in context
     Keep index current and complete
     Adjust heuristic weighting
     Maintain suggestions and synonyms
     Consider faceted metadata search
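Of the items above, "show match words in context" is the simplest to illustrate: cut a window of text around the first occurrence of the query term. The window width and the "..." markers are assumptions for the sketch.

```python
def snippet(text, term, width=30):
    """Return the text around the first match of `term`, `width` characters each side."""
    pos = text.lower().find(term.lower())
    if pos == -1:
        # No match in the text: fall back to the opening characters.
        return text[: 2 * width]
    start = max(0, pos - width)
    end = min(len(text), pos + len(term) + width)
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + text[start:end] + suffix

text = "Our ships offer many amenities including pools and spas"
preview = snippet(text, "amenities", width=5)
```

With a width of 5, `preview` is `...many amenities incl...`, which lets a user judge the result without opening the page.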
Join us
Address: WZ-30-a, Bhagwan Das Nagar
East Punjabi Bagh, Delhi-110026
Tel.: 011 28316148, 3203571, 30538061
Mobile: +91-8010 298 388, 8010 198 388
E-mail: info@seocertification.org.in

How search engines work Anand Saini

  • 1.
  • 2. Helping people find what they’re looking for  Starts with an “information need”  Convert to a query  Gets results In the materials available  Web pages  Other formats  Deep Web
  • 3.  Search can’t find what’s not there  The content is hugely important  Information Architecture is vital  Usable sites have good navigation and structure
  • 4.
  • 5. Index ahead of time • Find files or records • Open each one and read it • Store each word in a searchable index Provide search forms • Match the query terms with words in the index • Sort documents by relevance Display results
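The index-ahead-of-time flow on slide 5 can be sketched as a small inverted index. This is a minimal illustration, not how any particular engine does it: the names `tokenize`, `build_index`, and `search`, and the three sample documents, are assumptions made up for the example, and relevance sorting is left out.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Index ahead of time: map each word to the IDs of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():      # find files or records
        for word in tokenize(text):        # open each one and read it
            index[word].add(doc_id)        # store each word in the index
    return index

def search(index, query):
    """Match query terms against the index; keep documents containing every term."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

# Hypothetical sample content.
docs = {
    "a.html": "Cruise ships and cruise amenities",
    "b.html": "Bank accounts and savings",
    "c.html": "River bank cruise",
}
index = build_index(docs)
print(sorted(search(index, "cruise bank")))  # ['c.html']
```

Because the index is built once, each search only looks up word lists instead of rereading every document.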
  • 6.
  • 7. Like an iceberg, 2/3 below water user interface search content functionality
  • 8. Text search works for structured content • Keyword search vs. SQL queries • Approximate vs. exact match • Multiple sources of content • Response time and database resources • Relevance ranking, very important • Works in the real world (e.g. eBay)
  • 9. Users blame the search engine  Even when the content is unavailable Understand the scope of site or intranet  Kinds of information  Divided sites: products / corporate info  Dates  Languages  Sources and data silos: CMSs, databases...  Update processes
  • 10. Store text to search it later Many ways to gather text  Crawl (spider) via HTTP  Read files on file servers  Access databases (HTTP or API)  Data silos via local APIs  Applications, CMSs, via Web Services Security and Access Control
  • 11.
  • 12.  Basic information for document or record • File name / URL / record ID • Title or equivalent • Size, date, MIME type  Full text of item  More metadata • Product name, picture ID • Category, topic, or subject • Other attributes, for relevance ranking and display
  • 13.
  • 14.
  • 15. Stop words Stemming Metadata  Explicit (tags)  Implicit (context) Semantics  CMS and Database fields  XML tags and attributes
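Stop-word removal and stemming from slide 15 might look like the sketch below. The stop-word list is a short hypothetical one, and `crude_stem` is a deliberately naive suffix stripper written for this example; a production indexer would more likely use an established stemmer such as Porter's.

```python
import re

# Hypothetical, very short stop-word list; real lists are longer and language-specific.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to", "for"}

def crude_stem(word):
    """Strip a few common English suffixes, keeping at least three letters."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def index_terms(text):
    """Lowercase, split into words, drop stop words, then stem what is left."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [crude_stem(w) for w in words if w not in STOP_WORDS]

print(index_terms("Searching the indexes for banks"))  # ['search', 'index', 'bank']
```

The same function has to be applied to queries, otherwise a search for "banks" would never meet the indexed term "bank".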
  • 16. What happens after you click the search button and before retrieval starts. Usually in this order  Handle character set, maybe language  Look for operators and organize the query  Look for field names or metadata  Extract words (just like the indexer)  Deal with letter casing
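The query-processing order on slide 16 can be illustrated with a small parser. The query syntax here (double-quoted phrases, `field:value`, and a leading `-` for exclusion) is an assumption chosen for the example, as is the name `parse_query`; real engines each define their own operators.

```python
import re
import unicodedata

def parse_query(raw):
    """Split a raw query into phrases, field filters, excluded words, and plain terms."""
    # 1. Handle the character set: normalize Unicode to one canonical form.
    text = unicodedata.normalize("NFKC", raw)
    # 2. Look for operators: pull out quoted phrases before splitting on spaces.
    phrases = [p.lower() for p in re.findall(r'"([^"]+)"', text)]
    text = re.sub(r'"[^"]*"', " ", text)
    fields, excluded, terms = {}, [], []
    for token in text.split():
        name, sep, value = token.partition(":")
        if sep and name and value:
            # 3. Look for field names or metadata, e.g. title:opera.
            fields[name.lower()] = value.lower()
        elif token.startswith("-") and len(token) > 1:
            excluded.extend(re.findall(r"\w+", token.lower()))
        else:
            # 4 and 5. Extract words as the indexer would, folding case.
            terms.extend(re.findall(r"\w+", token.lower()))
    return {"phrases": phrases, "fields": fields, "excluded": excluded, "terms": terms}

print(parse_query('title:Opera "Romantic era" Verdi -Wagner'))
```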
  • 17. • Retrieval: find files with query terms • Not the same as relevance ranking • Recall: find all relevant items • Precision: find only relevant items • Increasing one usually decreases the other
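Recall and precision from slide 17 have standard definitions: recall is the share of relevant items that were retrieved, and precision is the share of retrieved items that are relevant. A minimal sketch, with made-up document IDs, shows the trade-off between a narrow and a broad result set.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a result list against a set of relevant items."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical judgments: four documents are truly relevant.
relevant = {"d1", "d2", "d3", "d4"}
narrow = ["d1", "d2"]                                      # strict matching
broad = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"]   # loose matching

print(precision_recall(narrow, relevant))  # (1.0, 0.5)
print(precision_recall(broad, relevant))   # (0.5, 1.0)
```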
  • 18. Single-word queries  Find items containing that word Multi-word queries: combine lists  Any: every item with any query word  All: only items with every word  Phrases: find only items with all words in order Boolean and complex queries – Use algorithm to combine lists
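The Any, All, and Phrase modes on slide 18 can be sketched as operations on per-word document lists. This example assumes a positional index (each word maps to the positions where it occurs in each document) so that phrases can be checked; the function names and sample documents are hypothetical.

```python
import re
from collections import defaultdict

def build_positional_index(docs):
    """Map word -> {doc_id: [positions of the word in that document]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, word in enumerate(re.findall(r"[a-z0-9]+", text.lower())):
            index[word].setdefault(doc_id, []).append(pos)
    return index

def match_any(index, words):
    """Any: every item containing at least one query word (union of lists)."""
    result = set()
    for w in words:
        result |= set(index.get(w, {}))
    return result

def match_all(index, words):
    """All: only items containing every query word (intersection of lists)."""
    lists = [set(index.get(w, {})) for w in words]
    return set.intersection(*lists) if lists else set()

def match_phrase(index, words):
    """Phrase: items where the words occur next to each other, in order."""
    hits = set()
    for doc_id in match_all(index, words):
        for start in index[words[0]][doc_id]:
            if all(start + i in index[w][doc_id] for i, w in enumerate(words)):
                hits.add(doc_id)
                break
    return hits

docs = {1: "new york bank", 2: "york has a new bank", 3: "river bank"}
index = build_positional_index(docs)
print(match_any(index, ["river", "york"]))   # {1, 2, 3}
print(match_all(index, ["new", "york"]))     # {1, 2}
print(match_phrase(index, ["new", "york"]))  # {1}
```

Boolean and more complex queries combine the same lists with nested unions, intersections, and differences.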
  • 19. Empty search • Nothing on the site on that topic (scope) • Misspelling or typing mistakes • Vocabulary differences • Restrictive search defaults • Restrictive search choices • Software failure
  • 20.
  • 21. Theory: sort the matching items, so the most relevant ones appear first Can't really know what the user wants Relevance is hard to define and situational Short queries tend to be deeply ambiguous What do people mean when they type “bank”? First 10 results are the most important The more transparent, the better
  • 22.  Sorting documents on various criteria  Start with words matching query terms  Citation and link analysis  Like old library Citation Indexes  Ted Nelson - not only hypertext, but the links  Google PageRank  Incoming links  Authority of linkers  Taxonomies and external metadata
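The link-analysis idea on slide 22 (incoming links, weighted by the authority of the linkers) can be illustrated with a simplified PageRank-style power iteration. This is a textbook sketch, not Google's production algorithm; the damping factor of 0.85, the iteration count, and the small link graph are assumptions for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Return {page: rank}, where links maps each page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share regardless of its links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                for t in pages:
                    new_rank[t] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical site: everything links to "home"; "blog" has no incoming links.
links = {
    "home": ["products", "about"],
    "products": ["home"],
    "about": ["home"],
    "blog": ["home", "products"],
}
ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this graph "home" ranks highest because every other page links to it, and "products" edges out "about" thanks to the extra link from "blog".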
  • 23. • Term frequency in the item • Inverse document frequency of term: rare words are likely to be more important • w_ij = weight of term T_j in document D_i • tf_ij = frequency of term T_j in document D_i • N = number of documents in collection • n = number of documents where term T_j occurs at least once • From Salton 1989
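The symbols on slide 23 fit the usual tf-idf weighting, w_ij = tf_ij × log(N / n). The slide text does not spell the formula out, so that form is assumed here, as is the natural logarithm (a different base only rescales the weights). A minimal sketch with hypothetical documents:

```python
import math
import re
from collections import Counter

def tf_idf_weights(docs):
    """Return {doc_id: {term: tf * log(N / n)}} for every term in every document."""
    tokenized = {d: re.findall(r"[a-z0-9]+", t.lower()) for d, t in docs.items()}
    N = len(tokenized)                      # number of documents in the collection
    doc_freq = Counter()                    # n: documents containing each term
    for words in tokenized.values():
        doc_freq.update(set(words))
    weights = {}
    for doc_id, words in tokenized.items():
        tf = Counter(words)                 # tf_ij: term frequency in this document
        weights[doc_id] = {t: tf[t] * math.log(N / doc_freq[t]) for t in tf}
    return weights

docs = {
    "d1": "bank bank river",
    "d2": "bank loan",
    "d3": "bank savings loan",
}
weights = tf_idf_weights(docs)
print(weights["d1"])
```

"bank" appears in all three documents, so its weight is zero everywhere even though it is frequent in d1, while the rarer "river" gets the highest weight.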
  • 24. Vector space • Probabilistic (binary independence) • Fuzzy set theory • Bayesian statistical analysis • Latent semantic indexing • Neural networks • Machine learning • All require sophisticated queries • See MIR, chapter 2
  • 25. Heuristics are rules of thumb • Not algorithms, not math Search Relevance Ranking Heuristics • Documents containing all search words • Search words as a phrase • Matches in title tag • Matches in other metadata Based on real-world user behavior
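The ranking heuristics on slide 25 can be sketched as a simple additive score. The weights below are hypothetical placeholders that would need tuning against real search logs, and matches in other metadata are omitted to keep the example short.

```python
import re

def words_of(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

# Hypothetical weights, chosen only for illustration.
WEIGHTS = {"all_words": 3.0, "phrase": 2.0, "title": 1.5}

def heuristic_score(query, title, body):
    """Score one document using rules of thumb rather than a statistical model."""
    q = words_of(query)
    if not q:
        return 0.0
    body_words = words_of(body)
    title_words = words_of(title)
    score = 0.0
    # Rule 1: the document contains all search words.
    if all(w in body_words for w in q):
        score += WEIGHTS["all_words"]
    # Rule 2: the search words appear together as a phrase.
    n = len(q)
    if any(body_words[i:i + n] == q for i in range(len(body_words) - n + 1)):
        score += WEIGHTS["phrase"]
    # Rule 3: each query word that matches the title adds a bonus.
    score += WEIGHTS["title"] * sum(1 for w in q if w in title_words)
    return score

pages = [
    ("Opening hours", "the river bank is closed"),
    ("River bank walks", "walk along the river bank"),
    ("Loans", "the bank by the river offers loans"),
]
ranked = sorted(pages, key=lambda p: heuristic_score("river bank", *p), reverse=True)
print([title for title, _ in ranked])  # ['River bank walks', 'Opening hours', 'Loans']
```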
  • 26. What users see after they click the Search button The most visible part of search Elements of the results page  Page layout and navigation  Results header  List of results items  Results footer
  • 27.
  • 28.
  • 29. Human judgment beats algorithms Great for frequent, ambiguous searches  Use search log to identify best candidates Recommend good starting pages  Product information, FAQs, etc. Requires human resources  That means money and time More static than algorithmic search
  • 30.
  • 31.
  • 32.
  • 33.
  • 34. Leverage content structure • database fields (e.g. cruise amenities) • document metadata (e.g. news article bylines) Provide both search and browse Support information foraging • Integrate navigation with results • Not just subject taxonomies • Display only fruitful paths, no dead ends Supported by academic research • Marti Hearst, UCB SIMS, flamenco.berkeley.edu
  • 35.
  • 36.
  • 37. Metrics  Number of searches  Number of no-matches searches  Traffic from search to high-value pages  Relate search changes to other metrics Search Log Analysis  Top 5% searches: phrases and words  Top no-matches searches  Use as market research
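The metrics on slide 37 can be computed from a simple search log. The sketch below assumes each log entry is a `(query, result_count)` pair, which is a made-up format; real logs would need parsing first.

```python
from collections import Counter

def analyze_log(entries):
    """Summarize a search log given as (query, result_count) pairs."""
    # Fold case and trim spaces so "Bank" and "bank " count as the same query.
    entries = [(q.strip().lower(), n) for q, n in entries]
    total = len(entries)
    no_match = [q for q, n in entries if n == 0]
    return {
        "searches": total,
        "no_match_searches": len(no_match),
        "no_match_rate": len(no_match) / total if total else 0.0,
        "top_queries": Counter(q for q, _ in entries).most_common(3),
        "top_no_match": Counter(no_match).most_common(3),
    }

# Hypothetical log entries.
log = [
    ("bank", 12), ("Bank", 12), ("cruise deals", 4),
    ("cruize", 0), ("cruize", 0), ("parking", 0),
]
report = analyze_log(log)
for key, value in report.items():
    print(key, value)
```

The top no-match queries are where the market-research value lies: here the repeated misspelling "cruize" suggests adding a synonym or spelling suggestion.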
  • 38. Search engines can’t read minds  User queries are short and ambiguous Some things will help  Design a usable interface  Show match words in context  Keep index current and complete  Adjust heuristic weighting  Maintain suggestions and synonyms  Consider faceted metadata search
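"Show match words in context" from slide 38 is often done with a short excerpt around the first match. A minimal sketch follows; the `**` markers, the `width` parameter, and the function name `snippet` are choices made for this example, and a real results page would use HTML highlighting instead.

```python
import re

def snippet(text, query, width=30):
    """Return an excerpt around the first query-word match, with matches marked by **."""
    words = [re.escape(w) for w in query.lower().split()]
    if not words:
        return text[: 2 * width]
    pattern = re.compile(r"\b(" + "|".join(words) + r")\b", re.IGNORECASE)
    match = pattern.search(text)
    if match is None:
        # No match in the text: fall back to the start of the document.
        return text[: 2 * width]
    start = max(0, match.start() - width)
    end = min(len(text), match.end() + width)
    excerpt = pattern.sub(lambda m: "**" + m.group(0) + "**", text[start:end])
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + excerpt + suffix

text = "Our cruise ships offer many amenities on board."
print(snippet(text, "amenities", width=10))
```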
  • 39. Join us Address: WZ-30-a, Bhagwan Das Nagar, East Punjabi Bagh, Delhi-110026 Tel.: 011 28316148, 3203571, 30538061 Mobile: +91-8010 298 388, 8010 198 388 E-mail: info@seocertification.org.in