SEARCH
A PRACTICAL GUIDE TO THE FUTURE

INFORMATION THAT’S HARD TO FIND WILL
REMAIN INFORMATION THAT’S HARDLY
FOUND.


                           Copyleft
“Even a blind squirrel finds a nut,
occasionally.” But few of us are determined
enough to search through millions, or
billions, of pages of information to find our
“nut.” So, to reduce the problem to a more
or less manageable one, web “search
engines” were introduced a few years ago.
Finding key information
on the gigantic World Wide
Web is like finding a
needle lost in a haystack. For
this purpose we would use a
special magnet that would
automatically, quickly and
effortlessly attract that
needle for us.
 In this scenario the magnet is
the “Search Engine”.
Search
  COMPUTING to examine a computer file, disk,
  database, or network for particular information.
Engine
  Something that supplies the driving force or energy
  to a movement, system, or trend.
Search Engine
  A computer program that searches for particular
  keywords and returns a list of documents in which
  they were found, especially a commercial service
  that scans documents on the Internet.
Search is a Wicked Problem
• No definitive formulation.
• Considerable uncertainty. Complex interdependencies.
• Incomplete, contradictory, and changing requirements.
• Stakeholders have radically different world views and
  different frames for understanding the project or process.
• The problem is never solved.

 Roles   Language     Input                 Index                  Metadata                Design
 Goals   Vocabulary   Interaction           Algorithms             Controlled Vocabulary   Interaction
 Tasks   Syntax       Feedback              Linguistics            Knowledge Management    Behavior




[Figure: search loop with the elements User, Query, Search Interface, Search Engine, Content, and Results; feedback label: “Ask, Browse, or Search Again”]
[Figure: Search at the center, surrounded by Interaction Design, Information Architecture, Discovery, Futures Studies, Knowledge Management, Patterns, and Wayfinding]
   1st Generation (ca 1994):
     • AltaVista, Excite, Infoseek…
     • Ranking based on Content:
        Pure Information Retrieval

   2nd Generation (ca 1996):
     • Lycos
     • Ranking based on Content + Structure
        Site Popularity

   3rd Generation (ca 1998):
     • Google, Teoma, Yahoo
     • Ranking based on Content + Structure + Value
        Page Reputation

   In the Works
     • Ranking based on “the need behind the query”
   Content Similarity Ranking:
    The more rare words two documents share,
    the more similar they are

   Documents are treated as “bags of words”
    (no effort to “understand” the contents)

   Similarity is measured by vector angles
   Query results are ranked by sorting the angles
    between the query and the documents

   [Figure: vectors d1 and d2 in a space with term axes t1, t2, t3, with the angle θ marked]
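
The content-similarity ranking above can be sketched in a few lines. This is a minimal illustration, not any engine's actual method: the function names (`tfidf_vectors`, `cosine`, `rank`), the smoothed weighting formula, and the sample documents are all assumptions made for the example. Rare words get a higher weight, documents are plain bags of words, and results are sorted by the cosine of the angle between query and document (a larger cosine means a smaller angle).

```python
import math
from collections import Counter

def tfidf_vectors(texts):
    """Turn each text into a bag-of-words vector in which rare words weigh more."""
    bags = [Counter(text.lower().split()) for text in texts]
    n = len(bags)
    # Document frequency: in how many texts does each word occur?
    df = Counter(word for bag in bags for word in bag)
    # Smoothed inverse document frequency (one common variant, assumed here).
    idf = {word: math.log((1 + n) / (1 + df[word])) + 1 for word in df}
    return [{w: tf * idf[w] for w, tf in bag.items()} for bag in bags]

def cosine(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rank(query, documents):
    """Return (score, document) pairs, best match first."""
    # Simplification: the query is weighted together with the documents.
    vectors = tfidf_vectors(documents + [query])
    query_vec = vectors[-1]
    scored = [(cosine(query_vec, vec), doc) for vec, doc in zip(vectors, documents)]
    return sorted(scored, reverse=True)

docs = [
    "the cat sat on the mat",
    "dogs chase the cat",
    "stock markets fell today",
]
for score, doc in rank("cat mat", docs):
    print(f"{score:.2f}  {doc}")
```

Note that no attempt is made to “understand” the text: a document that shares no words with the query scores zero, whatever it is about.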
   A hyperlink
    from a page in site A to some page in site B
    is considered a popularity vote from site A to site B

   Rank similar documents according to popularity

   [Figure: link graph of five sites, each labeled with a number: www.aa.com 1, www.bb.com 2, www.cc.com 1, www.dd.com 2, www.zz.com 0]
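
A rough sketch of the popularity vote, under stated assumptions: the function name `site_popularity` and the sample links are hypothetical, a site is identified by its host name, links inside one site are ignored, and several links from the same site A to the same site B count as a single vote. The slide does not settle those last two points, so treat them as choices made for the example.

```python
from collections import Counter
from urllib.parse import urlparse

def site_popularity(links):
    """Count, for each site, how many distinct other sites link to it."""
    votes = set()
    for source_url, target_url in links:
        source = urlparse(source_url).netloc
        target = urlparse(target_url).netloc
        if source != target:               # links within one site are not votes
            votes.add((source, target))    # one vote per (site A, site B) pair
    return Counter(target for _, target in votes)

# Hypothetical page-to-page links, not the graph from the figure.
links = [
    ("http://www.aa.com/index.html", "http://www.bb.com/news.html"),
    ("http://www.aa.com/x.html", "http://www.bb.com/other.html"),
    ("http://www.cc.com/a.html", "http://www.bb.com/"),
    ("http://www.bb.com/", "http://www.dd.com/"),
    ("http://www.dd.com/", "http://www.dd.com/about"),
]
popularity = site_popularity(links)
print(popularity.most_common())
```

Documents that score about the same on content could then be ordered by the popularity of the site they belong to.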
   The reputation “PageRank” of a page Pi =
    the sum of a fraction of the reputations of all
    pages Pj that point to Pi

   Idea similar to academic co-citations

   Beautiful Math behind it
     • PR = principal eigenvector
       of the web's link matrix
     • PR is equivalent to the chance
       of randomly surfing to the page
   HITS algorithm tries to recognize
        “authorities” and “hubs”
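
The PageRank definition above can be approximated by repeated redistribution of reputation, which converges toward the principal eigenvector. The sketch below is illustrative only: the function name `pagerank`, the toy link graph, the damping value of 0.85, and the treatment of pages without outgoing links are assumptions, not details given on the slide.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it points to."""
    pages = set(links) | {p for targets in links.values() for p in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page keeps a small baseline (the random surfer jumping anywhere).
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes an equal fraction of its reputation to each target.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outgoing links spreads its rank evenly.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical four-page web.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

The ranks always sum to 1, which is what lets them be read as the chance of a random surfer landing on each page.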
[Figure: search engine architecture. Crawl the web; check for duplicates and store the documents (DocIds); create an inverted index; the search engine servers take the user query, consult the inverted index, and show results to the user]
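
The indexing half of this architecture can be sketched as follows. Everything named here (`tokenize`, `build_index`, `search`, the example URLs and texts) is hypothetical, duplicates are detected only when the text is exactly identical, and a query matches a document only if every query word occurs in it. Real engines are far more elaborate on all three points.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """pages: iterable of (url, text). Returns (DocId -> url, word -> DocIds)."""
    documents = {}              # DocId -> url
    seen_texts = set()          # used to skip exact duplicate content
    index = defaultdict(set)    # inverted index: word -> set of DocIds
    for url, text in pages:
        if text in seen_texts:
            continue
        seen_texts.add(text)
        doc_id = len(documents)
        documents[doc_id] = url
        for word in tokenize(text):
            index[word].add(doc_id)
    return documents, index

def search(index, documents, query):
    """Return the URLs of documents that contain every query word."""
    postings = [index.get(word, set()) for word in tokenize(query)]
    if not postings:
        return []
    hits = set.intersection(*postings)
    return [documents[doc_id] for doc_id in sorted(hits)]

pages = [
    ("http://example.com/a", "Search engines crawl the web"),
    ("http://example.com/b", "An inverted index maps words to documents"),
    ("http://example.com/copy-of-a", "Search engines crawl the web"),
    ("http://example.com/c", "Engines rank documents in the index"),
]
documents, index = build_index(pages)
print(search(index, documents, "index documents"))
```

The point of inverting the index is that a query never scans the documents themselves; it only looks up the short list of DocIds stored under each query word.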
Crawling
     Follow links to find information

               Indexing
    Record what words appear where

                Ranking
What information is a good match to a user
 query? What information is inherently good?

                Displaying
  Find a good format for the information
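
As an illustration of the crawling stage only, here is a breadth-first sketch that follows links from a starting page. To keep it runnable without network access it walks a small in-memory “web”; the names `crawl`, `get_links`, and `TOY_WEB` are hypothetical, and a real crawler would also need fetching, parsing, politeness rules, and error handling.

```python
from collections import deque

def crawl(seed, get_links, max_pages=100):
    """Visit the seed, then every page reachable by following links."""
    seen = {seed}
    queue = deque([seed])
    order = []
    while queue and len(order) < max_pages:
        url = queue.popleft()
        order.append(url)                 # "fetch" the page
        for link in get_links(url):
            if link not in seen:          # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return order

# Hypothetical link structure: page -> pages it links to.
TOY_WEB = {"a": ["b", "c"], "b": ["c", "d"], "c": ["a"], "d": [], "e": ["a"]}
visited = crawl("a", lambda url: TOY_WEB.get(url, []))
print(visited)   # ['a', 'b', 'c', 'd']
```

Page "e" is never found because nothing links to it, which is one reason some content stays out of search engines.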
50% of emails received are spam!
But Google is usually so good in finding info…
Why does it do that?
• I try another search engine.

• I try different keywords but if I still can't find
an answer, I just think real hard for an
answer.

• I focus on the encyclopedia.
I punch the
screen.
 Just kidding, LOL.
don’t know how to form a sound search
query;
don’t have a strategy for dealing with poor
results;
can’t articulate how they know content is
credible;
don’t check the author or date of an article.
• Step 1 – define the data you want
• Step 2 – figure out where it's likely to be
  found
• Step 3 – select the search tool most likely
  to provide it
• Step 4 – learn how to interpret your results
• The most commonly used search tools are
  • Search Engines
  • Subject Directories
• Other search tools include
  • Targeted directories
  • Focused Crawlers
  • Portals
  • Vortals
  • Meta-tools
  • Value-added search services
• Search engines are the preferred tool
  when you:
  • Are looking for something very specific
  • Need to pin down a quick fact or two
  • Need to know if any information exists at all on a
    subject
  • Want mass quantities of links, but are not
    concerned about quality control.
• A subject directory is a database of titles,
  citations, and websites organized by
  category.
• Advantage – Most directories are created,
  edited, and maintained by people.
  • For this reason, their entries are usually carefully
    evaluated and annotated.
• Disadvantage – They typically include a smaller
  number of sites than a search engine due
  to the great amount of human effort
  involved.
• Open Directory Project - The largest, most
  comprehensive human-edited directory of the
  Web. It is constructed and maintained by a
  vast, global community of volunteer editors.
• Closed model directories such as Yahoo! and
  LookSmart are pulled together by professional
  editors who select the links and set up the
  categories. The user generally gets high
  quality results.
• Subject directories are organized and
  selective.
• They are useful when you want to know
  more about broad-based subjects, such as
  • General topics
  • Popular topics
  • Targeted directories
  • Current events
  • Product information
• Many search engines are now hybrids:
  search tools that have an engine as well
  as a directory.
• Sometimes targeted directories are
  matched with focused crawlers to produce
  a very powerful hybrid search tool (e.g.
  http://www.FirstGov.gov).
• Metasearches use multiple engines to look for
  your keywords.
• Advantage – You have many search engines all
  looking for what you need. Great when you are
  looking for something that is hard to find.
• Disadvantage – It's hard to fine-tune your search
  and narrow things down. Also, metasearches
  can sometimes give you more information than
  you need.
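
One way a metasearch tool might combine several engines is sketched below. The two “engines” are stubs returning made-up result lists, and the merging rule (each result earns 1/position from every engine that returns it) is just one simple choice for the example, not a description of how any real metasearch site works.

```python
from collections import defaultdict

def metasearch(query, engines):
    """Send the query to every engine and merge the ranked result lists."""
    scores = defaultdict(float)
    for engine in engines:
        for position, url in enumerate(engine(query), start=1):
            scores[url] += 1.0 / position   # higher-placed results earn more
    # Best score first; ties are broken alphabetically for a stable order.
    return sorted(scores, key=lambda url: (-scores[url], url))

# Hypothetical stand-ins for real search engines.
def engine_one(query):
    return ["x.example/1", "x.example/2", "x.example/3"]

def engine_two(query):
    return ["x.example/2", "x.example/4"]

merged = metasearch("hard to find topic", [engine_one, engine_two])
print(merged)   # ['x.example/2', 'x.example/1', 'x.example/4', 'x.example/3']
```

The sketch also shows the disadvantage: the merged list is longer than either engine's own list, and the query has to stay simple enough for every engine to accept.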
• Beaucoup! – www.beaucoup.com
• Clusty – http://clusty.com
• Mamma, “the mother of all search
  engines” – www.mamma.com
• Ixquick – www.ixquick.com
• Yahooligans – Made for ages 7-12; pages are
  hand-picked to be appropriate for children. Not
  only is the content on these pages
  monitored, but so are the ads that are displayed.
• Froogle – Made for the frugal shopper, this
  offshoot of Google has engines that catalog
  products and find you the cheapest price for a
  given item on the internet. It's in its “beta”
  version, so they are still working out some kinks.
   Boolean Operators (AND, OR, and
    NOT)
    • AND:
       Limits the number of ‘hits’ (results) you receive
       In many search sites, this is implied (if you type
        two or more words, it assumes you want x AND y
        AND z, etc.)
    • OR:
       Increases the number of ‘hits’ you receive
       Synonyms for words can be used
    • NOT:
       Limits the number of ‘hits’ you receive
       Useful for getting rid of words that have more than
        one meaning
       Ex: Sun NOT Microsystems
       Some sites use a minus sign (-) instead (Google, for example)
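
The three operators map directly onto set operations, which a small sketch can make concrete. The three documents and the helper name `docs_with` are made up for the example.

```python
# Hypothetical mini-collection: document id -> text.
DOCS = {
    1: "sun microsystems releases java",
    2: "the sun is a star",
    3: "java island travel guide",
}

def docs_with(term):
    """Ids of the documents that contain the term as a whole word."""
    return {doc_id for doc_id, text in DOCS.items() if term.lower() in text.split()}

sun_and_java = docs_with("sun") & docs_with("java")           # AND: intersection
sun_or_star = docs_with("sun") | docs_with("star")            # OR: union
sun_not_micro = docs_with("sun") - docs_with("microsystems")  # NOT: difference

print(sun_and_java)    # {1}
print(sun_or_star)     # {1, 2}
print(sun_not_micro)   # {2}
```

AND and NOT can only shrink the result set, while OR can only grow it, which matches the “limits” and “increases” wording above.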
   Phrase Search
       Usually quotation marks are used: “ ”
       Useful for a specific search (song lyrics, part of a poem, etc.)
       Ex: “fly me to the moon”
   Truncation and Wildcards
       Used as placeholders for additional characters - usually (*)
       Truncation = finds any characters that come after the placeholder
        • Ex: Red* --> red, reds, redwood, redding, etc.
       Wildcards = finds different characters within a word
        • Ex: Wom*n --> woman, women
   Stop Words
       Small words that are used often
       Some stop words include: and, the, a, not, to, be, etc.
        • Ex: “Give me a cookie” and “Give me cookie” would yield similar results
       Most search engines and databases ignore these
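
The examples on this slide can be reproduced with a few helper functions. This is only a sketch: the function names and the small vocabulary are hypothetical, the stop-word list is the one given above, and Python's standard `fnmatch` module stands in for whatever pattern matching a given search site uses.

```python
import fnmatch

STOP_WORDS = {"and", "the", "a", "not", "to", "be"}   # list taken from the slide

def remove_stop_words(query):
    """Drop the small, frequent words that most engines ignore."""
    return [word for word in query.lower().split() if word not in STOP_WORDS]

def expand_pattern(pattern, vocabulary):
    """Expand a truncation (red*) or wildcard (wom*n) against known words."""
    return sorted(w for w in vocabulary if fnmatch.fnmatchcase(w, pattern.lower()))

def phrase_match(phrase, text):
    """True if the words of the phrase occur next to each other, in order."""
    words, target = text.lower().split(), phrase.lower().split()
    return any(words[i:i + len(target)] == target
               for i in range(len(words) - len(target) + 1))

vocabulary = ["red", "reds", "redwood", "redding", "read", "woman", "women", "wombat"]
print(expand_pattern("Red*", vocabulary))    # ['red', 'redding', 'reds', 'redwood']
print(expand_pattern("Wom*n", vocabulary))   # ['woman', 'women']
print(remove_stop_words("Give me a cookie") == remove_stop_words("Give me cookie"))  # True
print(phrase_match("fly me to the moon", "Fly me to the moon and let me play"))       # True
```

Note that `*` here matches any number of characters, including none, so `wom*n` would also match longer words that start with "wom" and end with "n".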
   Limiters
        Most search engines and databases provide other ways to narrow your search
        Often found under Advanced Search
        Varies greatly!

    • Search limiters
          Keyword (usually default)
          Title
          Author
          Subject
          Multiple search boxes

    • Other limiters
       Date
       Language
       Type ( book, dvd, magazine, etc.) OR (web: .gov, .edu, .org)
    • Google Advanced Search
    • Wilson Select Plus
• Power searching also uses math, the
  universal language.
• It uses the symbols +, –, and “”.
• Example: “Clinton – Lewinsky” on Yahoo!
• Use these commands in the search
  window.
  •   intitle: Find sites with one search term in the title.
  •   allintitle: Find sites with all search terms in the title.
  •   inurl: Find sites with one search term in the URL.
  •   allinurl: Find sites with all search terms in the URL.
  •   site: Limit your search to a specific web site.
  •   filetype: Specify a type of document to search.
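
A small helper can assemble a query string from these commands. The function name `build_query` and the example domain are hypothetical; the operator spellings are the ones listed above, and whether a given search site honors each of them is something to check on that site.

```python
def build_query(terms, intitle=None, inurl=None, site=None, filetype=None):
    """Join plain search terms with optional operator:value restrictions."""
    parts = list(terms)
    restrictions = {"intitle": intitle, "inurl": inurl,
                    "site": site, "filetype": filetype}
    for operator, value in restrictions.items():
        if value:
            parts.append(f"{operator}:{value}")   # no space after the colon
    return " ".join(parts)

query = build_query(["budget", "report"], site="example.gov", filetype="pdf")
print(query)   # budget report site:example.gov filetype:pdf
```

The resulting string is what would be typed into the search window.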




                                                 8/2/2007
• Find pages containing the term in the title:
            intitle:[search term]
• Find pages with terms in the text:
            allintext:[search terms]
• Find pages similar to a certain website:
            related:[insert URL]
• Find pages with the term in the URL:
            inurl:[insert search term]
Try it out!
• Find pages containing the term in the title:
            title:[search term]

• Find pages with the term in the URL:
            url.all:[search term]
• Also called the “deep web”; consists of
  materials that search engines will not or cannot
  index.
• Usually consists of web-based databases
  or PDF files.
• Example: American Memory Project:
  Jackie Robinson.
• Google – The only traditional search
  engine that can recognize .pdf and .doc
  files.
• Profusion – a metasearch tool that lets you
  search .pdf files.
    Google
              By far the most used search site (76% of searches on the Internet are done using Google).
              Simple one line search box
              Phrase completion function
              Did you mean function
              I'm Feeling Lucky!
              Other search options
               • Images, Videos, Maps, News, Shopping (limiters)
        • Search strategies

TYPE                           INCLUDED?     HOW
Boolean operators              Yes           AND = [default]     OR = OR(capitalized)     NOT = [-]
(AND, OR, NOT)
Phrase Search                  Yes           Quotation marks [“ “]


Wildcards / Truncation         Some          No truncation (Google automatically searches other endings)
                                             Wildcards = [*]
Advanced search                Yes           Limit by Language, File type, Domain, etc.
Bing
   Bing (Microsoft's latest search engine)
           Starts out with a simple one box search, but becomes more complex
           Phrase completion function
           Web site review function
           Related searches
           Other search options
            • Images, Videos, Maps (localized), News, Shopping, History (limiters)
     • Search strategies

    TYPE                        INCLUDED?    HOW
    Boolean operators           Yes          AND = [default]     OR = OR(capitalized)   NOT = NOT (capitalized)
    (AND, OR, NOT)
    Phrase Search               Yes          Quotation marks [“ “]


    Wildcards / Truncation      No           No truncation or wildcard options

    Advanced search             Yes          Limit by Terms. [under Preferences] Domain, Country/Region, Language,
                                             Filter
Yahoo! Search
   Yahoo! Search
          Much more than a search engine (search.yahoo.com for ONLY search)
          Search Assist / Also try:
          Sponsored results
          Related searches
          Other search options
           • Images, video, local, shopping, jobs, news, sports, weather, etc. (limiters)
    • Search strategies

    TYPE                         INCLUDED?     HOW
    Boolean operators            Yes           AND = [default]     OR = OR(capitalized)    NOT = [-]
    (AND, OR, NOT)
    Phrase Search                Yes           Quotation marks [“ “]


    Wildcards / Truncation       No            No truncation or wildcard options

    Advanced search              Yes           Limit by Terms, Last updated, Domain, Country, Language, Filter
Meta Search Engines
   Dogpile
           Meta search engines search multiple other search sites
           Simple one line search box
           Phrase complete function
           Did you mean function
           Other search options
            • Images, video, news, white and yellow pages (limiters)
     • Search strategies
    TYPE                        INCLUDED?     HOW
    Boolean operators           No            * Advanced search terms function in a similar way
    (AND, OR, NOT)
    Phrase Search               No            * Advanced search terms function in a similar way


    Wildcards / Truncation      No            No truncation or wildcard options

    Advanced search             Yes           Limit by Terms, Domain. [under preferences] Filter, Bold search terms, #
                                              displays
Meta Search Engines
   Clusty
           Simple one line search box
           Clusters function (groups results into subjects)
           Sources and Sites function
           Did you mean function
           Other search options
            • News, Images, Wikipedia, Blogs, Jobs (limiters)
     • Search strategies

    TYPE                       INCLUDED?     HOW
    Boolean operators          Yes           AND = [default]     OR = OR(capitalized)      NOT = [-]
    (AND, OR, NOT)
    Phrase Search              Yes           Quotation marks [“ “]


    Wildcards / Truncation     No            No truncation or wildcard options

    Advanced search            Yes           Limit by Host (domain), Language, Type, # Results in a Cluster, Filter
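
The “HOW” columns of the strategy tables can be turned into a small lookup so that one logical query is written out in each site's syntax. This is a sketch under assumptions: the function name `render_query` and its parameters are hypothetical, only the sites whose tables list Boolean support are included, and the syntax is taken from the tables as written rather than checked against the live sites.

```python
# How each site writes NOT, per the strategy tables above.
NOT_SYNTAX = {
    "google": "-{}",
    "bing": "NOT {}",
    "yahoo": "-{}",
    "clusty": "-{}",
}

def render_query(engine, all_of=(), any_of=(), none_of=(), phrase=None):
    """Write one logical query in the syntax of the given site."""
    if engine not in NOT_SYNTAX:
        raise ValueError(f"no Boolean syntax listed for {engine!r}")
    parts = list(all_of)                      # AND is the default: just list the words
    if phrase:
        parts.append(f'"{phrase}"')           # phrase search uses quotation marks
    if any_of:
        parts.append(" OR ".join(any_of))     # OR must be capitalized
    parts.extend(NOT_SYNTAX[engine].format(word) for word in none_of)
    return " ".join(parts)

print(render_query("google", all_of=["sun"], none_of=["microsystems"]))  # sun -microsystems
print(render_query("bing", all_of=["sun"], none_of=["microsystems"]))    # sun NOT microsystems
```

Dogpile is deliberately left out because its table lists no Boolean operators, so asking for it raises an error instead of guessing.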
• Surfwax (meta search engine)
    • Can view contents of your search in a sidebar (Snap)
    • Is very cluttered / complex
    • Can broaden or narrow your search (Focus)
    • Sort by and results functions
    • Useful if you are ‘browsing’ the Web without a clear topic
• Wikipedia (online encyclopedia)
    • Encyclopedia in which anyone can edit content
       • Vast amount of information on practically any subject
       • Reliability somewhat in question
       • List of references
    • Best if you are looking for specific information or as a place to start a search
    • Useful if you are ‘browsing’ the Web without a clear topic
• YouTube (videos posted by anyone)
    • Video of practically anything you can think of
    • Anyone can post a video clip
    • Difficult to find information. Cluttered.
• Many others
    • Just search the words “search engines” in your favorite search
apophenia
the spontaneous
perception of connections
and meaningfulness
in unrelated things
1.     Most search engines have vanished.
2.     Google is a big player.
3.    63% of Internet users use a search engine in a
      given session.
4.    Approximately 94 million adults use the internet
      on an average day.
5.    This means approximately 59.22 MILLION people
      use search engines in an average day.
6.    Microsoft realized the Internet is here to stay
     i. Dominates the browser market.
     ii. Realizes search is critical.
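
The estimate in item 5 follows from items 3 and 4 by a single multiplication, shown here as a quick check. The variable names are made up, and the calculation assumes, as the slide does, that the per-session share applies to everyone online on a given day.

```python
share_using_search = 0.63            # item 3: share of users who use a search engine
daily_adult_users_millions = 94      # item 4: adults online on an average day

daily_searchers_millions = share_using_search * daily_adult_users_millions
print(f"{daily_searchers_millions:.2f} million")   # 59.22 million
```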

Search engines

  • 1. SEARCH: A PRACTICAL GUIDE TO THE FUTURE. INFORMATION THAT’S HARD TO FIND WILL REMAIN INFORMATION THAT’S HARDLY FOUND. Copyleft
  • 2. “Even a blind squirrel finds a nut, occasionally.” But few of us are determined enough to search through millions, or billions, of pages of information to find our “nut.” So, to reduce the problem to a more or less manageable one, web “search engines” were introduced a few years ago.
  • 3. Finding key information on the gigantic World Wide Web is like finding a needle lost in a haystack. For this purpose we would use a special magnet that automatically, quickly and effortlessly attracts that needle for us. In this scenario the magnet is the “search engine.”
  • 4.
  • 5. Search (computing): to examine a computer file, disk, database, or network for particular information.
      Engine: something that supplies the driving force or energy to a movement, system, or trend.
      Search engine: a computer program that searches for particular keywords and returns a list of documents in which they were found, especially a commercial service that scans documents on the Internet.
  • 6. Search is a Wicked Problem
      • No definitive formulation.
      • Considerable uncertainty. Complex interdependencies.
      • Incomplete, contradictory, and changing requirements.
      • Stakeholders have radically different world views and different frames for understanding the project or process.
      • The problem is never solved.
      [Diagram labels: User, Query, Search Interface, Search Engine, Content, Results, “Ask, Browse, or Search Again”. Columns: Roles, Goals, Tasks; Language, Vocabulary, Syntax; Input, Interaction, Feedback; Index, Algorithms, Linguistics; Metadata, Controlled Vocabulary, Knowledge Management; Design, Interaction, Behavior.]
  • 7. [Diagram labels: Interaction Design, Information Architecture, Discovery, Search Patterns, Futures Studies, Knowledge Management, Wayfinding]
  • 8.
  • 9. 1st Generation (ca. 1994):
          • AltaVista, Excite, Infoseek…
          • Ranking based on Content: Pure Information Retrieval
      2nd Generation (ca. 1996):
          • Lycos
          • Ranking based on Content + Structure: Site Popularity
      3rd Generation (ca. 1998):
          • Google, Teoma, Yahoo
          • Ranking based on Content + Structure + Value: Page Reputation
      In the Works:
          • Ranking based on “the need behind the query”
  • 10. Content Similarity Ranking: the more rare words two documents share, the more similar they are.
      • Documents are treated as “bags of words” (no effort to “understand” the contents).
      • Similarity is measured by vector angles.
      • Query results are ranked by sorting the angles between query and documents.
      [Figure: document vectors d1 and d2 separated by angle θ, drawn on term axes t1, t2, t3.]
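The content-similarity idea on slide 10 can be sketched in a few lines of Python. This is only an illustration, not any particular engine's method: the whitespace tokenizer, the `log(N / df) + 1` weighting that favors rare words, and the sample documents are all assumptions made for the example.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace: a plain bag of words."""
    return text.lower().split()

def idf_weights(docs):
    """Weight each word by rarity: log(N / document frequency) + 1 (an assumed scheme)."""
    n = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(tokenize(doc)))
    return {word: math.log(n / df) + 1.0 for word, df in doc_freq.items()}

def vectorize(text, idf):
    """Turn text into a sparse vector {word: count * rarity weight}."""
    counts = Counter(tokenize(text))
    return {word: count * idf.get(word, 0.0) for word, count in counts.items()}

def cosine(u, v):
    """Cosine of the angle between two sparse vectors (1.0 = same direction)."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank(query, docs):
    """Return (score, document) pairs, smallest angle (largest cosine) first."""
    idf = idf_weights(docs)
    q = vectorize(query, idf)
    scored = [(cosine(q, vectorize(doc, idf)), doc) for doc in docs]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

# Hypothetical mini-collection for illustration.
DOCS = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "search engines rank web pages",
]

for score, doc in rank("cat mat", DOCS):
    print(f"{score:.3f}  {doc}")
```

Sorting by cosine in descending order is the same as sorting by angle in ascending order, which is why the code never computes θ itself.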
  • 11. A hyperlink from a page in site A to some page in site B is considered a popularity vote from site A to site B.
      • Rank similar documents according to popularity.
      [Figure: example sites with vote counts: www.aa.com 1, www.bb.com 2, www.cc.com 1, www.dd.com 2, www.zz.com 0.]
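A minimal sketch of the site-popularity vote on slide 11, in Python. The link list below is invented purely to reproduce the vote counts shown in the slide's figure, and counting each voting site only once per target is an assumption, since the slide does not say how repeated links are treated.

```python
from collections import defaultdict
from urllib.parse import urlsplit

def site_of(url):
    """Reduce a page URL to its site (host name)."""
    return urlsplit(url).netloc

def popularity(links, sites):
    """Count, for each site, how many *other* sites link to it."""
    voters = defaultdict(set)
    for source, target in links:
        a, b = site_of(source), site_of(target)
        if a != b:                 # links inside one site are not votes
            voters[b].add(a)       # assumption: one vote per voting site
    return {site: len(voters[site]) for site in sites}

# Hypothetical links (source page, target page).
LINKS = [
    ("http://www.bb.com/news", "http://www.aa.com/"),
    ("http://www.aa.com/links", "http://www.bb.com/"),
    ("http://www.cc.com/", "http://www.bb.com/about"),
    ("http://www.dd.com/", "http://www.cc.com/"),
    ("http://www.aa.com/", "http://www.dd.com/"),
    ("http://www.zz.com/", "http://www.dd.com/"),
    ("http://www.aa.com/", "http://www.aa.com/contact"),  # internal, ignored
]
SITES = ["www.aa.com", "www.bb.com", "www.cc.com", "www.dd.com", "www.zz.com"]

votes = popularity(LINKS, SITES)
for site in sorted(SITES, key=lambda s: votes[s], reverse=True):
    print(site, votes[site])
```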
  • 12. The reputation (“PageRank”) of a page Pi = the sum of a fraction of the reputations of all pages Pj that point to Pi.
      • Idea similar to academic co-citations
      • Beautiful math behind it
          • PR = principal eigenvector of the web’s link matrix
          • PR equivalent to the chance of randomly surfing to the page
      • HITS algorithm tries to recognize “authorities” and “hubs”
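The "sum of a fraction of the reputations" on slide 12 can be approximated by repeated passes over the link graph (power iteration). The sketch below is a simplified illustration, not Google's implementation: the damping factor of 0.85, the fixed iteration count, the even spreading of rank from pages without outgoing links, and the four-page graph are all assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it points to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small base share (the "random jump" of the surfer).
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes an equal fraction of its reputation to each target.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Page with no outgoing links: spread its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical link graph.
LINKS = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}

ranks = pagerank(LINKS)
for page, score in sorted(ranks.items(), key=lambda item: item[1], reverse=True):
    print(page, round(score, 4))
```

The scores sum to 1, which matches the slide's reading of PageRank as the chance of a random surfer landing on the page.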
  • 13.
  • 14. [Diagram: crawl the web → check for duplicates, store the documents (DocIds) → create an inverted index → inverted index → search engine servers. User query → search engine servers → show results to user.]
  • 15. Crawling: follow links to find information.
      Indexing: record what words appear where.
      Ranking: what information is a good match to a user query? What information is inherently good?
      Displaying: find a good format for the information.
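The pipeline on slide 14 (check for duplicates, store documents under DocIds, build an inverted index, answer user queries from it) might look roughly like this in Python. It is a toy sketch: the crawl is replaced by a hard-coded list of hypothetical pages, duplicates are detected only when the text is byte-for-byte identical, and a query returns pages containing all of its words.

```python
import hashlib
from collections import defaultdict

def build_index(documents):
    """documents: iterable of (url, text). Returns (doc_store, inverted_index)."""
    seen_hashes = set()
    doc_store = {}                  # DocId -> (url, text)
    index = defaultdict(set)        # word -> set of DocIds containing it
    for url, text in documents:
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in seen_hashes:   # exact duplicate of a stored page: skip it
            continue
        seen_hashes.add(digest)
        doc_id = len(doc_store)
        doc_store[doc_id] = (url, text)
        for word in text.lower().split():
            index[word].add(doc_id)
    return doc_store, index

def search(query, doc_store, index):
    """Return URLs of documents that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return [doc_store[doc_id][0] for doc_id in sorted(matches)]

# Hypothetical crawl output.
PAGES = [
    ("http://example.com/a", "search engines crawl the web"),
    ("http://example.com/b", "engines index the web"),
    ("http://example.com/a-copy", "search engines crawl the web"),
]

doc_store, index = build_index(PAGES)
print(search("engines web", doc_store, index))
print(search("crawl", doc_store, index))
```

The point of the inverted index is that a query only touches the entries for its own words instead of scanning every stored document.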
  • 16.
  • 17. 50% of emails received are spam!
  • 18.
  • 19.
  • 20. But Google is usually so good in finding info… Why does it do that?
  • 21. • I try another search engine. • I try different keywords but if I still can't find an answer, I just think real hard for an answer. • I focus on the encyclopedia.
  • 22. I punch the screen. Just kidding, LOL.
  • 23. don’t know how to form a sound search query; don’t have a strategy for dealing with poor results; can’t articulate how they know content is credible; don’t check the author or date of an article.
  • 24. Step 1 – define the data you want
      Step 2 – figure out where it’s likely to be found
      Step 3 – select the search tool most likely to provide it
      Step 4 – learn how to interpret your results
  • 25. The most commonly used search tools are
          • Search Engines
          • Subject Directories
      Other search tools include
          • Targeted directories
          • Focused Crawlers
          • Portals
          • Vortals
          • Meta-tools
          • Value-added search services
  • 26. Search engines are the preferred tool when you:
      • Are looking for something very specific
      • Need to pin down a quick fact or two
      • Need to know if any information exists at all on a subject
      • Want mass quantities of links, but are not concerned about quality control.
  • 27. A subject directory is a database of titles, citations, and websites organized by category.
      • Advantage – Most directories are created, edited, and maintained by people, so their entries are usually carefully evaluated and annotated.
      • Disadvantage – They typically include fewer sites than a search engine because of the great amount of human effort involved.
  • 28. • Open Directory Project - The largest, most comprehensive human-edited directory of the Web. It is constructed and maintained by a vast, global community of volunteer editors.
      • Closed-model directories such as Yahoo! and LookSmart are pulled together by professional editors who select the links and set up the categories. The user generally gets high-quality results.
  • 29. • Subject directories are organized and selective.
      • They are useful when you want to know more about broad-based subjects, such as
          • General topics
          • Popular topics
          • Targeted directories
          • Current events
          • Product information
  • 30. • Many search engines are now hybrids: search tools that have an engine as well as a directory.
      • Sometimes targeted directories are matched with focused crawlers to produce a very powerful hybrid search tool (e.g. http://www.FirstGov.gov).
  • 31. • Metasearches use multiple engines to look for your keywords.
      • Advantage – You have many search engines all looking for what you need. Great when you are looking for something that is hard to find.
      • Disadvantage – It’s hard to fine-tune your search and narrow things down. Also, metasearches can sometimes give you more information than you need.
  • 32. • Beaucoup! – www.beaucoup.com
      • Clusty – http://clusty.com
      • Mamma, “the mother of all search engines” – www.mamma.com
      • Ixquick – www.ixquick.com
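How a metasearch tool might combine answers from several engines can be illustrated with a small Python sketch. The merging rule here (take the first result of each engine, then the second of each, and so on, dropping repeats) is an assumption chosen for simplicity; the slides do not describe how the tools listed above actually merge results, and the result lists are made up.

```python
from itertools import zip_longest

def merge_results(*result_lists):
    """Interleave ranked result lists from several engines, removing duplicates."""
    merged, seen = [], set()
    for tier in zip_longest(*result_lists):   # tier = the n-th result of every engine
        for url in tier:
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged

# Hypothetical ranked results from three engines.
engine_one = ["a.example", "b.example", "c.example"]
engine_two = ["b.example", "d.example"]
engine_three = ["a.example", "e.example", "f.example", "g.example"]

combined = merge_results(engine_one, engine_two, engine_three)
print(combined)
```

The sketch also shows the disadvantage named on slide 31: the merged list is longer than any single engine's list, and there is no single place to fine-tune the query.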
  • 33. • Yahooligans – Made for ages 7-12; pages are hand-picked to be appropriate for children. Not only is the content on these pages monitored, but so are the ads that are displayed.
      • Froogle – Made for the frugal shopper, this offshoot of Google has engines that catalog products and find you the cheapest price for a given item on the internet. It’s in its “beta” version, so they are still working out some kinks.
  • 34. Boolean Operators (AND, OR, and NOT)
      • AND:
          • Limits the number of ‘hits’ (results) you receive
          • In many search sites, this is implied (if you type two or more words, it assumes you want x AND y AND z, etc.)
      • OR:
          • Increases the number of ‘hits’ you receive
          • Synonyms for words can be used
      • NOT:
          • Limits the number of ‘hits’ you receive
          • Useful for excluding unwanted meanings of words that have more than one meaning
          • Ex: Sun NOT Microsystems
          • Sometimes written as a (-) sign (as in Google)
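The three Boolean operators on slide 34 correspond directly to set operations on the documents that contain each term. A small Python sketch, with three made-up documents standing in for a real index:

```python
# Hypothetical mini-collection: DocId -> text.
DOCS = {
    1: "sun microsystems releases java",
    2: "the sun is a star",
    3: "a star chart for the night sky",
}

def postings(term):
    """Set of DocIds whose text contains the term as a whole word."""
    return {doc_id for doc_id, text in DOCS.items() if term.lower() in text.split()}

sun_and_star = postings("sun") & postings("star")           # AND: intersection, fewer hits
sun_or_star = postings("sun") | postings("star")            # OR: union, more hits
sun_not_micro = postings("sun") - postings("microsystems")  # NOT: difference, fewer hits

print("sun AND star:", sorted(sun_and_star))
print("sun OR star:", sorted(sun_or_star))
print("sun NOT microsystems:", sorted(sun_not_micro))
```

The last query is the slide's own example: it keeps pages about the star and drops the page about the company.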
  • 35. Phrase Search
          • Usually quotation marks are used: “ ”
          • Useful for a specific search (song lyrics, part of a poem, etc.)
          • Ex: “fly me to the moon”
      Truncation and Wildcards
          • Used as placeholders for additional characters, usually (*)
          • Truncation = finds any characters that come after the placeholder
              • Ex: Red* --> red, reds, redwood, redding, etc.
          • Wildcards = finds different characters within a word
              • Ex: Wom*n --> woman, women
      Stop Words
          • Small words that are used often
          • Some stop words include: and, the, a, not, to, be, etc.
              • Ex: “Give me a cookie” and “Give me cookie” would yield similar results
          • Most search engines and databases ignore these
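The techniques on slide 35 can be tried out with Python's standard library. This is a rough sketch rather than how any search site implements them: `fnmatch` treats `*` as "any run of characters, including none", the stop-word list is just the one from the slide, and the vocabulary is invented for the example.

```python
from fnmatch import fnmatchcase

STOP_WORDS = {"and", "the", "a", "not", "to", "be"}   # the slide's examples

def strip_stop_words(query):
    """Drop stop words, keeping the remaining words in order."""
    return [word for word in query.lower().split() if word not in STOP_WORDS]

def expand(pattern, vocabulary):
    """Expand a truncation or wildcard pattern against a list of known words."""
    return sorted(word for word in vocabulary if fnmatchcase(word, pattern.lower()))

def contains_phrase(text, phrase):
    """True if the words of phrase occur in text consecutively and in order."""
    words, target = text.lower().split(), phrase.lower().split()
    return any(words[i:i + len(target)] == target
               for i in range(len(words) - len(target) + 1))

# Hypothetical vocabulary taken from an index.
VOCABULARY = ["red", "reds", "redwood", "redding", "bread", "woman", "women", "wombat"]

print(expand("Red*", VOCABULARY))      # truncation
print(expand("Wom*n", VOCABULARY))     # wildcard inside a word
print(strip_stop_words("Give me a cookie") == strip_stop_words("Give me cookie"))
print(contains_phrase("Fly me to the moon and let me play", "fly me to the moon"))
```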
  • 36. Limiters
      • Most search engines and databases provide other ways to narrow your search
      • Often found under Advanced Search
      • Varies greatly!
          • Search limiters: Keyword (usually default), Title, Author, Subject, Multiple search boxes
          • Other limiters: Date, Language, Type (book, dvd, magazine, etc.) OR (web: .gov, .edu, .org)
          • Google Advanced Search
          • Wilson Select Plus
  • 37. • Power searching also uses math, the universal language.
      • Uses the symbols +, –, and “ ”.
      • Example: “Clinton – Lewinsky” on Yahoo!
  • 38. Use these commands in the search window.
      • intitle: Find sites with one search term in the title.
      • allintitle: Find sites with all search terms in the title.
      • inurl: Find sites with one search term in the URL.
      • allinurl: Find sites with all search terms in the URL.
      • site: Limit your search to a specific web site.
      • filetype: Specify a type of document to search.
      8/2/2007
  • 39. • Find pages containing the term in the title: intitle:[search term]
      • Find pages with terms in the text: allintext:[search terms]
      • Find similar pages to a certain website: related:[insert URL]
      • Find pages with the term in the URL: inurl:[insert search term]
      Try it out!
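To "try it out" from a script, the operators from slides 38 and 39 can be assembled into a query string and URL-encoded. The helper names below are hypothetical, and the base URL with its `q` parameter is an assumption about the search page being targeted; only `urllib.parse.urlencode` is a real library call.

```python
from urllib.parse import urlencode

def build_query(terms="", intitle=None, inurl=None, site=None, filetype=None):
    """Join plain terms and operator:value pairs into one query string."""
    parts = []
    if terms:
        parts.append(terms)
    for operator, value in (("intitle", intitle), ("inurl", inurl),
                            ("site", site), ("filetype", filetype)):
        if value:
            parts.append(f"{operator}:{value}")
    return " ".join(parts)

def search_url(query, base="https://www.google.com/search"):
    """URL-encode the query so it can be pasted into a browser (assumed base URL)."""
    return base + "?" + urlencode({"q": query})

query = build_query("search engines", site="example.org", filetype="pdf")
print(query)
print(search_url(query))
```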
  • 40. • Find pages containing the term in the title: title:[search term]
      • Find pages with the term in the URL: url.all:[search term]
  • 41. • Also called the “deep web”; consists of materials that search engines will not or cannot index.
      • Usually consists of web-based databases or PDF files.
      • Example: American Memory Project: Jackie Robinson.
  • 42. • Google – The only traditional search engine that can recognize .pdf and .doc files.
      • Profusion – a metasearch tool that lets you search .pdf files.
  • 43. Google
      • By far the most used search site (76% of searches on the Internet are done using Google).
      • Simple one line search box
      • Phrase completion function
      • Did you mean function
      • I’m Feeling Lucky!
      • Other search options
          • Images, Videos, Maps, News, Shopping (limiters)
          • Search strategies
      TYPE                              INCLUDED?  HOW
      Boolean operators (AND, OR, NOT)  Yes        AND = [default]; OR = OR (capitalized); NOT = [-]
      Phrase Search                     Yes        Quotation marks [“ ”]
      Wildcards / Truncation            Some       No truncation (Google automatically searches other endings); Wildcards = [*]
      Advanced search                   Yes        Limit by Language, File type, Domain, etc.
  • 44. Bing
  • 45. Bing (Microsoft’s latest search engine)
      • Starts out with a simple one box search, but becomes more complex
      • Phrase completion function
      • Web site review function
      • Related searches
      • Other search options
          • Images, Videos, Maps (localized), News, Shopping, History (limiters)
          • Search strategies
      TYPE                                 INCLUDED?  HOW
      Boolean operators (AND, OR, NOT)     Yes        AND = [default]; OR = OR (capitalized); NOT = NOT (capitalized)
      Phrase Search                        Yes        Quotation marks [“ ”]
      Wildcards / Truncation               No         No truncation or wildcard options
      Advanced search [under Preferences]  Yes        Limit by Terms, Domain, Country/Region, Language, Filter
  • 47. Yahoo! Search
      • Much more than a search engine (search.yahoo.com for ONLY search)
      • Search Assist / Also try:
      • Sponsored results
      • Related searches
      • Other search options
          • Images, video, local, shopping, jobs, news, sports, weather, etc. (limiters)
          • Search strategies
      TYPE                              INCLUDED?  HOW
      Boolean operators (AND, OR, NOT)  Yes        AND = [default]; OR = OR (capitalized); NOT = [-]
      Phrase Search                     Yes        Quotation marks [“ ”]
      Wildcards / Truncation            No         No truncation or wildcard options
      Advanced search                   Yes        Limit by Terms, Last updated, Domain, Country, Language, Filter
  • 49. Dogpile
      • Meta search engines search multiple other search sites
      • Simple one line search box
      • Phrase complete function
      • Did you mean function
      • Other search options
          • Images, video, news, white and yellow pages (limiters)
          • Search strategies
      TYPE                                 INCLUDED?  HOW
      Boolean operators (AND, OR, NOT)     No         * Advanced search terms function in a similar way
      Phrase Search                        No         * Advanced search terms function in a similar way
      Wildcards / Truncation               No         No truncation or wildcard options
      Advanced search [under preferences]  Yes        Limit by Terms, Domain, Filter, Bold search terms, # displays
  • 51. Clusty
      • Simple one line search box
      • Clusters function (groups results into subjects)
      • Sources and Sites function
      • Did you mean function
      • Other search options
          • News, Images, Wikipedia, Blogs, Jobs (limiters)
          • Search strategies
      TYPE                              INCLUDED?  HOW
      Boolean operators (AND, OR, NOT)  Yes        AND = [default]; OR = OR (capitalized); NOT = [-]
      Phrase Search                     Yes        Quotation marks [“ ”]
      Wildcards / Truncation            No         No truncation or wildcard options
      Advanced search                   Yes        Limit by Host (domain), Language, Type, # Results in a Cluster, Filter
  • 52. • Surfwax (meta search engine)
          • Can view contents of your search in a sidebar (Snap)
          • Is very cluttered / complex
          • Can broaden or narrow your search (Focus)
          • Sort by and results functions
          • Useful if you are ‘browsing’ the Web without a clear topic
      • Wikipedia (online encyclopedia)
          • Encyclopedia in which anyone can edit content
              • Vast amount of information on practically any subject
              • Reliability somewhat in question
              • List of references
          • Best if you are looking for specific information or as a place to start a search
          • Useful if you are ‘browsing’ the Web without a clear topic
      • YouTube (videos posted by anyone)
          • Video of practically anything you can think of
          • Anyone can post a video clip
          • Difficult to find information. Cluttered.
      • Many others
          • Just search the words “search engines” in your favorite search
  • 53. apophenia: the spontaneous perception of connections and meaningfulness in unrelated things
  • 54. 1. Most search engines have vanished.
      2. Google is a big player.
      3. 63% of Internet users use a search engine in a given session.
      4. Approximately 94 million adults use the internet on an average day.
      5. This means approximately 59.22 MILLION people use search engines in an average day.
      6. Microsoft realized the Internet is here to stay:
          i. Dominates the browser market.
          ii. Realizes search is critical.

Editor's notes

  1. Neha
  2. Neha
  3. Monu
  4. ADD BUBBLES. AND PICTURE OF STUDENT