Building Local/Geo Search
with Apache Lucene and Solr
Agenda



   Grant Ingersoll, Lucid Imagination
      Introduction
      Basics of geo-spatial search
      Tools available in Lucene and Solr
   Ryan McKinley, Voyager GIS
      Spatial search in Action:
   Sameer Maggon, AT&T Interactive
      How Solr powers local search at YP.com


                              Lucid Imagination, Inc.
Introductions
   Grant Ingersoll
         Lucene/Solr committer
         Co-author of upcoming “Taming Text”


   Ryan McKinley
         Lucene/Solr committer
         Co-founder of Voyager GIS


   Sameer Maggon
         Search Eng. Team lead at AT&T Interactive
         Active user of Lucene since 2001

Use Cases



      Asset Management
        “Dude, where’s my map?”
      Social Networking
        Find all friends near me
      Targeted, local search results and ads
        “restaurants in Austin Texas”
        “Starbucks, 55313”
      Business Intelligence
        Restrict doc set for analysis by location

Spatial Search Concepts



      Spatial Data Types
        Points (latitude/longitude)
        Lines
        Shapes


      Maps and overlays
        Streets, POI
                                         http://www.openstreetmap.org/?lat=44.9744&lon=-93.2484&zoom=14&layers=B000FTFT

      Integration with unstructured text
        Metadata, descriptions, user reviews, etc.

Application Needs



      Query Parsing
      Efficient distance calculations
        Euclidean, Great Circle (Haversine), Vincenty’s
      Filtering
        Bounding Box
      Sort by Distance
      Relevance Enhancement
      Faceting
      Advanced: shape intersections, routes
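
The great-circle (Haversine) distance named above can be sketched in plain Java as follows. This is a standalone illustration, not the Lucene spatial contrib API; the class name, method name, and Earth-radius constant are assumptions chosen for the example.

```java
final class GreatCircle {
    // Mean Earth radius in kilometers (a common approximation; an assumption of this sketch).
    static final double EARTH_RADIUS_KM = 6371.0088;

    /** Haversine great-circle distance in km between two lat/lon points given in degrees. */
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        // Clamp to 1.0 to guard against rounding error before taking asin.
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    public static void main(String[] args) {
        // Minneapolis to Austin, using approximate city-center coordinates.
        System.out.printf("%.0f km%n", haversineKm(44.9778, -93.2650, 30.2672, -97.7431));
    }
}
```

Sorting by distance then amounts to computing this value per candidate document and ordering on it, which is why a cheap filter is usually applied first to keep the candidate set small.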

Lucene 2.9/Solr 1.4 Features for Spatial Search



      Lucene/Solr are excellent for dealing with unstructured text


      2.9/1.4 adds:
        Better Numeric handling for range searches


        Spatial contrib module (Lucene 2.9 only; coming to Solr in 1.5) with features for:
        • Creating Cartesian Tiers (Grids)
        • Geohashes
        • Calculating distances
        • Filter implementations
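
To make the geohash idea concrete, here is a minimal standalone encoder. It follows the publicly documented geohash scheme (interleaved longitude/latitude bits, base-32 alphabet) and is not the Lucene contrib implementation; the class and method names are hypothetical.

```java
final class GeohashSketch {
    // Standard geohash base-32 alphabet (digits plus lowercase letters, omitting a, i, l, o).
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    /** Encodes a lat/lon pair (degrees) into a geohash of the requested length. */
    static String encode(double lat, double lon, int length) {
        double latMin = -90, latMax = 90, lonMin = -180, lonMax = 180;
        StringBuilder hash = new StringBuilder(length);
        boolean lonBit = true; // bits alternate, starting with longitude
        int bits = 0, ch = 0;
        while (hash.length() < length) {
            if (lonBit) {
                double mid = (lonMin + lonMax) / 2;
                if (lon >= mid) { ch = (ch << 1) | 1; lonMin = mid; }
                else            { ch = ch << 1;       lonMax = mid; }
            } else {
                double mid = (latMin + latMax) / 2;
                if (lat >= mid) { ch = (ch << 1) | 1; latMin = mid; }
                else            { ch = ch << 1;       latMax = mid; }
            }
            lonBit = !lonBit;
            if (++bits == 5) { // every 5 bits become one base-32 character
                hash.append(BASE32.charAt(ch));
                bits = 0;
                ch = 0;
            }
        }
        return hash.toString();
    }
}
```

Because each additional character narrows the cell, nearby points tend to share a prefix, so a prefix match on an indexed geohash can act as a coarse proximity filter.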
Query Parsing



      Query parsing is often the most difficult part to get right
        User error, ambiguity in names
        Mixture of topic and location: bars in Minneapolis MN
      Geocoding translates addresses, POIs into lat/lon or other
        Several publicly available services: geonames.org, Google Maps
        Often have built-in throttles, so may not be suitable for production use


      Query logs are invaluable for developing an effective parser
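
As a rough illustration of separating topic from location, the sketch below handles only two query shapes taken from this deck ("bars in Minneapolis MN" and "Starbucks, 55313"). It is deliberately naive: the class name, the patterns, and the two-element return convention are assumptions for the example, and a real parser would be driven by query logs and a geocoder.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LocalQuerySketch {
    // "<topic> in <place>", e.g. "bars in Minneapolis MN"
    private static final Pattern TOPIC_IN_PLACE =
            Pattern.compile("^(.+?)\\s+in\\s+(.+)$", Pattern.CASE_INSENSITIVE);
    // "<topic>, <5-digit ZIP>", e.g. "Starbucks, 55313"
    private static final Pattern TOPIC_ZIP = Pattern.compile("^(.+?),\\s*(\\d{5})$");

    /** Returns {topic, location}; location is empty when none is recognized. */
    static String[] split(String query) {
        String q = query.trim();
        Matcher m = TOPIC_ZIP.matcher(q);
        if (m.matches()) {
            return new String[] {m.group(1).trim(), m.group(2)};
        }
        m = TOPIC_IN_PLACE.matcher(q);
        if (m.matches()) {
            return new String[] {m.group(1).trim(), m.group(2).trim()};
        }
        return new String[] {q, ""};
    }
}
```

The location part would still need geocoding into coordinates, and ambiguous queries (a business name that contains " in ", for instance) are exactly the cases where this kind of rule breaks down.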



Filtering



       Range queries can significantly slow down search if done improperly
       Goal: reduce the number of terms to evaluate
       Solution 1:
            New Trie-based numeric capabilities
       Solution 2:
            Cartesian Tiers
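
A bounding-box filter reduces to two numeric ranges, one on latitude and one on longitude, which is the kind of constraint the numeric range support is meant to evaluate cheaply. The sketch below only computes those ranges for a center point and radius; it is plain Java rather than Lucene code, the names are hypothetical, and it assumes the box does not touch a pole or cross the antimeridian.

```java
final class BoundingBoxSketch {
    // Mean Earth radius in kilometers (an approximation assumed for this sketch).
    static final double EARTH_RADIUS_KM = 6371.0088;

    /** Returns {minLat, minLon, maxLat, maxLon} in degrees for a circle of radiusKm around the point. */
    static double[] around(double lat, double lon, double radiusKm) {
        double angular = radiusKm / EARTH_RADIUS_KM;            // radius as an angle in radians
        double latDelta = Math.toDegrees(angular);
        // Longitude lines converge toward the poles, so the longitude span widens with latitude.
        double lonDelta = Math.toDegrees(
                Math.asin(Math.sin(angular) / Math.cos(Math.toRadians(lat))));
        return new double[] {lat - latDelta, lon - lonDelta, lat + latDelta, lon + lonDelta};
    }

    /** True when the point falls inside the box (edges inclusive). */
    static boolean contains(double[] box, double lat, double lon) {
        return lat >= box[0] && lat <= box[2] && lon >= box[1] && lon <= box[3];
    }
}
```

The box is a superset of the circle, so documents that pass it may still need an exact distance check if a strict radius is required.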




Cartesian Tiers



     Divide the space into grid cells and assign each cell an id
       Each tier breaks the space down into 2^tier grids
       Sample code using Lucene spatial contrib:
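   import org.apache.lucene.spatial.tier.projections.CartesianTierPlotter;
   import org.apache.lucene.spatial.tier.projections.SinusoidalProjector;

   // Plotter for tier 10, using a sinusoidal projection and the field prefix "spatial"
   CartesianTierPlotter pl =
       new CartesianTierPlotter(10, new SinusoidalProjector(), "spatial");
   // Id of the grid box that contains the point at this tier
   double boxId = pl.getTierBoxId(latitude, longitude);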
      See http://www.nsshutdown.com/projects/lucene/whitepaper/locallucene_v2.html
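
For intuition only, the grid idea can be sketched without Lucene. The version below is a simplification: it grids raw latitude/longitude rather than a sinusoidal projection, assumes 2^tier cells along each axis, and uses its own row-major id scheme, so its ids will not match what CartesianTierPlotter produces. All names are hypothetical.

```java
final class TierGridSketch {
    /** Returns a cell id for the point at the given tier (row-major numbering, an assumed scheme). */
    static long boxId(int tier, double lat, double lon) {
        long cellsPerSide = 1L << tier;          // 2^tier cells along each axis (assumption)
        // Normalize coordinates to the range [0, 1].
        double x = (lon + 180.0) / 360.0;
        double y = (lat + 90.0) / 180.0;
        // Truncate to a cell index, clamping the upper edge into the last cell.
        long col = Math.min(cellsPerSide - 1, (long) (x * cellsPerSide));
        long row = Math.min(cellsPerSide - 1, (long) (y * cellsPerSide));
        return row * cellsPerSide + col;
    }
}
```

Indexing one id per tier for each document lets a query pick the tier whose cell size best fits the search radius and then match a handful of ids instead of scanning a numeric range.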

What’s next?



      Tighter integration in Solr
        Work already under way
        Native field types, query parsing support, faceting support


      Resources
        java-user@lucene.apache.org, solr-user@lucene.apache.org
        https://issues.apache.org/jira/browse/SOLR-773
        http://lucene.apache.org/java/2_9_1/api/contrib-spatial/index.html
        Many, many more general resources on the web
Voyager Spatial Data Search
                       Ryan McKinley
               Co-founder, Voyager GIS
Where is my Data?
• Files stored across the network – desktop,
  external drives, databases etc.
• Many distinct data formats
• Massive datasets keep getting bigger.
• Poor cataloging tools
• Limited metadata
Voyager Solution
Voyager is a search engine for your geographic data.

• Find data with simple text search and
  geographic constraints
• Keep data in its existing location (no need to
  import to a new system)
• Tools to work with search results
Implementation
• Data Discovery / Extraction
• Solr search
• Wicket UI
Data Extraction
• For each result, we extract basic information:




- ESRI ArcObjects
- GDAL
- PDFBox
- Geotools
- Tika
- etc
Geographic Search in Solr
• Need to search by ‘extent’ not point
• Works well with a standard RTree
• Built a custom Lucene Filter to
  intersect/search within a given extent.
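
The two predicates such a filter needs, "intersects" and "within", can be sketched for axis-aligned extents as below. This is not Voyager's filter and involves no RTree; the class is a hypothetical illustration that ignores extents crossing the antimeridian.

```java
final class Extent {
    final double minX, minY, maxX, maxY;   // west, south, east, north in degrees

    Extent(double minX, double minY, double maxX, double maxY) {
        this.minX = minX;
        this.minY = minY;
        this.maxX = maxX;
        this.maxY = maxY;
    }

    /** True when the two extents share at least one point (touching edges count). */
    boolean intersects(Extent other) {
        return minX <= other.maxX && maxX >= other.minX
            && minY <= other.maxY && maxY >= other.minY;
    }

    /** True when this extent lies entirely inside the other. */
    boolean within(Extent other) {
        return minX >= other.minX && maxX <= other.maxX
            && minY >= other.minY && maxY <= other.maxY;
    }
}
```

A spatial index such as an RTree serves to avoid running these checks against every document; the predicates themselves stay this simple.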
Work in Progress
• Custom Gazetteer
  – “Building 12” > ‘-96.X 30.X -96.X 30.X’


• Named Entity Extraction
  – Geographic words that appear in titles / text get
    indexed with geographic properties
Geographic Search in Solr 1.5+
• Standard API, pluggable implementation.
  – Standard Qparser, pluggable indexing
• Single input ‘field’ could index multiple Lucene
  fields.
• Share objects between different parts of the
  request cycle (only calculate distance once)
• Augment results with calculated value
  – Manual or from function query
How Solr powers local search at YP.com



           Sameer Maggon
           November 18, 2009




© 2008 AT&T Intellectual Property. All rights reserved.
AT&T and the AT&T logo are trademarks of AT&T Intellectual Property.
YP.com
        Technical Challenges
        Custom Relevance Model
        Scalability / Architecture
        Conclusion




YP.com (beta)


Local Search Site


Focused on providing relevant results


Uses Solr for search




AT&T Proprietary (Restricted) Only for use by authorized individuals or any above-designated team(s) within the AT&T companies and not for general distribution
Technical Challenges



   Relevancy
      Topically relevant results
      Constrained by contextual geographical search
      Local relevancy is not just keyword and location: ratings, brands, etc.
   Scalability
      10s of millions of records
      Response time less than 200ms
      Fault resistant
      More than 150 million searches per month
Custom Relevance Model

   Topical + Geographical + Social

   Topical
      Complex handling of multiword queries
      Field boosts for certain fields
      Dismax to handle complex queries
   Geographical
      Distance modulation based on business density
      LocalSolr as a geographic filter
      Ability to modulate score based on business density
   Social
      A business with 4.5 stars and 200 reviews is more relevant than one with 5.0 stars and 1 review
      CustomScoreQuery to tie all the different scores together
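
The slides do not say how the rating signal is computed. One common way to get the behavior described (many good reviews outrank a single perfect one) is a Bayesian average, sketched below as an assumption rather than as YP.com's method; the prior mean and prior weight are hypothetical tuning values.

```java
final class RatingScore {
    /**
     * Bayesian average: pulls the observed rating toward priorMean,
     * with the pull fading as reviewCount grows relative to priorWeight.
     */
    static double bayesianAverage(double avgStars, int reviewCount,
                                  double priorMean, double priorWeight) {
        return (priorWeight * priorMean + reviewCount * avgStars)
                / (priorWeight + reviewCount);
    }

    public static void main(String[] args) {
        // Hypothetical prior: an "average" business has 3.5 stars, weighted like 10 reviews.
        double established = bayesianAverage(4.5, 200, 3.5, 10);
        double newcomer = bayesianAverage(5.0, 1, 3.5, 10);
        System.out.println(established > newcomer);
    }
}
```

A value like this could then be one of the inputs combined with the topical and geographic scores.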
Geographic Sharding


                                                           Score Combinations

                                                           Performance was better


                                                           Provisioning is a bit complex




Search Architecture

   [Architecture diagram. Labels: API Layer; Search Slaves; shards; rows; replication; Masters; Feeder / Document Pipeline]
Bottom Line



Solr has enabled us to innovate faster
   • Quick iterations of relevancy model and functionality
   • Open Platform with much more flexibility
   • Scalable Architecture to meet our business needs




Thus, delivering value to our consumers
Resources




       http://bit.ly/lucid-local




Q&A


Lucid Imagination, Inc.
http://bit.ly/lucid-local

Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024The Digital Insurer
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 

Kürzlich hochgeladen (20)

Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 

Building Local/Geo Search with Apache Lucene and Solr

  • 1. Building Local/Geo Search with Apache Lucene and Solr
  • 2. Agenda. Grant Ingersoll, Lucid Imagination: Introduction; Basics of geo-spatial search; Tools available in Lucene and Solr. Ryan McKinley, Voyager GIS: Spatial search in action. Sameer Maggon, AT&T Interactive: How Solr powers local search at YP.com. Lucid Imagination, Inc.
  • 3. Introductions. Grant Ingersoll: Lucene/Solr committer; co-author of upcoming “Taming Text”. Ryan McKinley: Lucene/Solr committer; co-founder of Voyager GIS. Sameer Maggon: Search Eng. team lead at AT&T Interactive; active user of Lucene since 2001.
  • 4. Use Cases. Asset management: “Dude, where’s my map?”. Social networking: find all friends near me. Targeted, local search results and ads: “restaurants in Austin Texas”, “Starbucks, 55313”. Business intelligence: restrict doc set for analysis by location.
  • 5. Spatial Search Concepts. Spatial data types: points (latitude/longitude), lines, shapes. Maps and overlays: streets, POI (http://www.openstreetmap.org/?lat=44.9744&lon=-93.2484&zoom=14&layers=B000FTFT). Integration with unstructured text: metadata, descriptions, user reviews, etc.
  • 6. Application Needs. Query parsing. Efficient distance calculations: Euclidean, Great Circle (Haversine), Vincenty’s. Filtering: bounding box. Sort by distance. Relevance enhancement. Faceting. Advanced: shape intersections, routes.
  • 7. Lucene 2.9/Solr 1.4 Features for Spatial Search. Lucene/Solr are excellent for dealing with unstructured text. 2.9/1.4 adds: better numeric handling for range searches; a spatial contribution (2.9 only, coming in 1.5) with features for creating Cartesian tiers (grids), geohashes, calculating distances, and filter implementations.
  • 8. Query Parsing. Query parsing is often the most difficult part to get right: user error, ambiguity in names, mixture of topic and location (“bars in Minneapolis MN”). Geocoding translates addresses and POIs into lat/lon or other forms. Several publicly available services exist (geonames.org, Google Maps), but they often have built-in throttles, so they may not be effective for production. Query logs are invaluable for developing an effective parser.
  • 9. Filtering. Range queries can significantly slow down search if done improperly. Goal: reduce the number of terms to evaluate. Solution 1: new Trie-based numeric capabilities. Solution 2: Cartesian tiers.
  • 10. Cartesian Tiers. Divide the space into grids and assign each grid box an id. Each tier breaks the space down into 2^tier grids. Sample code using Lucene spatial contrib: CartesianTierPlotter pl = new CartesianTierPlotter(10, new SinusoidalProjector(), "spatial"); pl.getTierBoxId(latitude, longitude); See http://www.nsshutdown.com/projects/lucene/whitepaper/locallucene_v2.html
  • 11. What’s next? Tighter integration in Solr: work already under way on native field types, query parsing support, faceting support. Resources: java-user@lucene.apache.org, solr-user@lucene.apache.org; https://issues.apache.org/jira/browse/SOLR-773; http://lucene.apache.org/java/2_9_1/api/contrib-spatial/index.html; many, many more general resources on the web.
  • 12. Voyager Spatial Data Search Ryan McKinley Co-founder, Voyager GIS
  • 13. Where is my Data? • Files stored across the network – desktop, external drives, databases etc. • Many distinct data formats • Massive datasets keep getting bigger. • Poor cataloging tools • Limited metadata
  • 14. Voyager Solution Voyager is a search engine for your geographic data. • Find data with simple text search and geographic constraints • Keep data in its existing location (no need to import to a new system) • Tools to work with search results
  • 15.
  • 16.
  • 17.
  • 18. Implementation • Data Discovery / Extraction • Solr search • Wicket UI
  • 19. Data Extraction • For each result, we extract basic information: - ESRI ArcObjects - GDAL - PDFBox - Geotools - Tika - etc
  • 20. Geographic Search in Solr • Need to search by ‘extent’ not point • Works well with a standard RTree • Built a custom Lucene Filter to intersect/search within a given extent.
  • 21. Work in Progress • Custom Gazetteer – “Building 12” > ‘-96.X 30.X -96.X 30.X’ • Named Entity Extraction – Geographic words that appear in titles / text get indexed with geographic properties
  • 22. Geographic Search in Solr 1.5+ • Standard API, pluggable implementation. – Standard Qparser, pluggable indexing • Single input ‘field’ could index multiple lucene fields. • Share objects between different parts of the request cycle (only calculate distance once) • Augment results with calculated value – Manual or from function query
  • 23. How Solr powers local search at YP.com Sameer Maggon November 18, 2009 © 2008 AT&T Intellectual Property. All rights reserved. AT&T and the AT&T logo are trademarks of AT&T Intellectual Property.
  • 24. YP.com; Technical Challenges; Custom Relevance Model; Scalability / Architecture; Conclusion
  • 25. YP.com (beta). Local search site. Focused on providing relevant results. Uses Solr for search. AT&T Proprietary (Restricted): only for use by authorized individuals or any above-designated team(s) within the AT&T companies and not for general distribution.
  • 26. Technical Challenges. Relevancy: topically relevant results; constrained by contextual geographical search; local relevancy is not just keyword and location (ratings, brands, etc.). Scalability: 10s of millions of records; response time less than 200ms; fault resistant; more than 150 million searches per month.
  • 27. Custom Relevance Model: Topical + Geographical + Social. Topical: complex handling of multiword queries. Geographical: distance modulation based on business density. Social: a business with 4.5 stars and 200 reviews is more relevant than one with 5.0 stars and 1 review.
  • 28. Custom Relevance Model: Topical + Geographical + Social. Topical: complex handling of multiword queries; field boosts for certain fields; Dismax to handle complex queries. Geographical: distance modulation based on business density; LocalSolr as a geographic filter; ability to modulate score based on business density. Social: a business with 4.5 stars and 200 reviews is more relevant than one with 5.0 stars and 1 review; CustomScoreQuery to tie all different scores together.
  • 29. Geographic Sharding. Score Combinations. Performance was better. Provisioning is a bit complex.
  • 30. Search Architecture (diagram labels: Search Slaves, Masters, shards, API Layer, replication, Feeder / Document Pipeline, rows)
  • 31. Bottom Line Solr has enabled us to innovate faster • Quick iterations of relevancy model and functionality • Open Platform with much more flexibility • Scalable Architecture to meet our business needs
  • 32. Bottom Line Solr has enabled us to innovate faster • Quick iterations of relevancy model and functionality • Open Platform with much more flexibility • Scalable Architecture to meet our business needs Thus, delivering value to our consumers
  • 33. Resources: http://bit.ly/lucid-local
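
Slides 6 and 9 name Great Circle (Haversine) distance and bounding-box filtering as core needs. The following is a minimal Java sketch of how the two can be combined: a cheap box test first, then the exact distance check only for points that pass. It is not the Lucene spatial contrib or LocalSolr code. The class name GeoSketch, its method names, and the 6371 km mean Earth radius are assumptions made for illustration; the Minneapolis coordinates come from the OpenStreetMap link on slide 5, and the Austin coordinates are approximate values supplied here.

```java
/** Illustrative sketch only: none of these names are Lucene or Solr API. */
final class GeoSketch {
    /** Mean Earth radius in kilometers (spherical approximation, an assumption). */
    static final double EARTH_RADIUS_KM = 6371.0;

    /** Great-circle (Haversine) distance in km between two points given in degrees. */
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double sinHalfDLat = Math.sin(Math.toRadians(lat2 - lat1) / 2);
        double sinHalfDLon = Math.sin(Math.toRadians(lon2 - lon1) / 2);
        double a = sinHalfDLat * sinHalfDLat
                 + Math.cos(phi1) * Math.cos(phi2) * sinHalfDLon * sinHalfDLon;
        // Clamp guards against rounding pushing sqrt(a) slightly above 1.
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    /**
     * Bounding box {minLat, minLon, maxLat, maxLon} in degrees that encloses the
     * circle of radiusKm around (lat, lon). Assumes the circle neither touches a
     * pole nor crosses the +/-180 degree meridian; those cases need extra handling.
     */
    static double[] boundingBox(double lat, double lon, double radiusKm) {
        double angular = radiusKm / EARTH_RADIUS_KM; // radius as an angle in radians
        double dLat = Math.toDegrees(angular);
        double dLon = Math.toDegrees(
                Math.asin(Math.sin(angular) / Math.cos(Math.toRadians(lat))));
        return new double[] { lat - dLat, lon - dLon, lat + dLat, lon + dLon };
    }

    /** Cheap test: is the point inside the box returned by boundingBox? */
    static boolean inBox(double[] box, double lat, double lon) {
        return lat >= box[0] && lat <= box[2] && lon >= box[1] && lon <= box[3];
    }

    /** Box test first, exact Haversine check only for points that survive it. */
    static boolean withinRadius(double centerLat, double centerLon, double radiusKm,
                                double lat, double lon) {
        double[] box = boundingBox(centerLat, centerLon, radiusKm);
        return inBox(box, lat, lon)
            && haversineKm(centerLat, centerLon, lat, lon) <= radiusKm;
    }

    public static void main(String[] args) {
        double minneapolisLat = 44.9744, minneapolisLon = -93.2484; // from slide 5
        double austinLat = 30.2672, austinLon = -97.7431;           // approximate
        System.out.println("Minneapolis to Austin, km: "
                + haversineKm(minneapolisLat, minneapolisLon, austinLat, austinLon));
        System.out.println("Austin within 10 km of Minneapolis: "
                + withinRadius(minneapolisLat, minneapolisLon, 10.0, austinLat, austinLon));
    }
}
```

The box test only compares four numbers, so in an index it maps naturally onto the numeric range filtering that slide 9 describes, leaving the trigonometry for the much smaller candidate set.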