Small wins in a small time with Apache Solr
Who am I?


    My (Buddhist) name is Upayavira

    Consultant with Sourcesense, specialising in
    search and operational technologies

    A member of the Apache Software Foundation
Who are Sourcesense?


    Open Source integrator, specialising in:
    
        Search
    
        Business Intelligence
    
        Content Management
    
        Application Lifecycle Management

    Offices in London, Amsterdam, Milan and Rome
Committers and Contributors

     Search:
     
            Lucene/Solr – contributor
     
            Hibernate Search – committer
     
            Lucene Infinispan integration – lead developer
     
            Apache UIMA – committer

     CMS:
     
            Apache Chemistry – contributor
     
            Apache Jackrabbit – contributor
     
            JBoss GateIn Portal – committer
     
            OpenSSO-Alfresco - contributor
What is Lucene?


    Lucene is a Java information retrieval library

    Provides free text search facilities

    Started in 2000, by Doug Cutting

    A project of the Apache Software Foundation

    It is designed to be embedded in Java apps
What is Solr?


    Solr is an enterprise search server based on
    Lucene

    Wraps Lucene with a RESTful web interface

    Provides configurable schema

    Provides replication functionality
Solr Design
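     [Diagram: a Solr instance containing a SearchHandler, a Lucene index and an
     UpdateRequestHandler. User queries arrive at the SearchHandler; the content
     application feeds the UpdateRequestHandler.]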
Prerequisites



    Java, preferably Java 6

    Apache Solr 1.4.1

    http://www.sourcesense.com/dev8d-solr.zip
Prerequisites

    Extract your Solr distribution

    At a command prompt:
    – cd into the unzipped distribution directory
    – cd into the example directory
    – Enter: java -jar start.jar

    Visit http://localhost:8983/solr/ in a browser. If you see a
    welcome message, your Solr works

    Unpack your dev8d-solr.zip file

    At another command prompt, cd into your dev8d-solr
    directory
Checking Solr Works


    Visit http://localhost:8983/solr/admin/

    You should see the Solr admin page.

    Click the Statistics link

    You'll see NumDocs: 0

    There's nothing in the index, so searches won't show
    much

    So we need to index some sample content
Indexing Sample Content



    In your dev8d-solr directory (extracted from the zip), at
    a command prompt:

    java -jar post.jar wikipedia-basic.xml
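
    As a rough sketch of what post.jar does for you: it sends the XML file to Solr's update handler over HTTP, then sends a commit so the new documents become searchable. The Python below only builds those two requests. It assumes the example server's default update URL (http://localhost:8983/solr/update); the helper names are illustrative and not part of Solr.

```python
import urllib.request

# Assumption: the example Solr server's default update handler URL.
UPDATE_URL = "http://localhost:8983/solr/update"


def build_update_request(xml_body: bytes, url: str = UPDATE_URL) -> urllib.request.Request:
    """Build (but do not send) an HTTP POST carrying Solr update XML."""
    return urllib.request.Request(
        url,
        data=xml_body,
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


def build_commit_request(url: str = UPDATE_URL) -> urllib.request.Request:
    """Build the follow-up request that commits the pending documents."""
    return build_update_request(b"<commit/>", url)
```

    To index for real, read wikipedia-basic.xml as bytes, pass it to build_update_request, and hand each request to urllib.request.urlopen while Solr is running.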
Searching




    http://localhost:8983/solr/select?q=*:*
Searching




    http://localhost:8983/solr/select?q=computers
Searching




    http://localhost:8983/solr/select?q=computer systems
Searching




     http://localhost:8983/solr/select?q=computers OR systems
Searching




     http://localhost:8983/solr/select?q=computers AND systems
Searching




     http://localhost:8983/solr/select?q="computer systems"
Searching




     http://localhost:8983/solr/select?q="computer systems"~10
Searching




     http://localhost:8983/solr/select?q=computers NOT data
Searching




     http://localhost:8983/solr/select?q=computers -data
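
     The queries above contain spaces and quotes, which a browser encodes for you. When calling Solr from a program, the query has to be percent-encoded explicitly. A minimal sketch, assuming the example server's default select URL; select_url is a hypothetical helper name:

```python
from urllib.parse import urlencode

# Assumption: the example server's default select handler URL.
SELECT_URL = "http://localhost:8983/solr/select"


def select_url(query: str, **params: str) -> str:
    """Return a select URL with the query and any extra parameters encoded."""
    return SELECT_URL + "?" + urlencode({"q": query, **params})


print(select_url("computer systems"))        # implicit default operator
print(select_url("computers AND systems"))   # both terms required
print(select_url('"computer systems"'))      # exact phrase
print(select_url("computers -data"))         # exclude documents containing "data"
```

     Extra keyword arguments become further request parameters, for example select_url("computers", fl="title").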
Searching




     http://localhost:8983/solr/select/?q=computers&fl=title
Searching




     http://localhost:8983/solr/select/?q=computers&fq=author:yobot
Searching



     http://localhost:8983/solr/select/?q=computers&fq=author:yobot&fl=title,author
Searching



     http://localhost:8983/solr/select/?q=computers&rows=10&start=10&fl=title
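
     rows is the page size and start is the zero-based offset of the first result, so rows=10&start=10 asks for the second page of ten. A small sketch of that arithmetic; page_params is a hypothetical helper, not a Solr API:

```python
def page_params(page: int, rows: int = 10) -> dict:
    """Map a 1-based page number to Solr's rows/start parameters."""
    if page < 1 or rows < 1:
        raise ValueError("page and rows must be positive")
    return {"rows": rows, "start": (page - 1) * rows}


print(page_params(1))  # {'rows': 10, 'start': 0}
print(page_params(2))  # {'rows': 10, 'start': 10}
```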
Searching




     http://localhost:8983/solr/select/?q=title:system&fl=title
Searching



     http://localhost:8983/solr/select/?q=computers&fl=title,author&sort=author+desc
Searching



     http://localhost:8983/solr/select/?q=computers&facet=true&facet.field=author
Searching



     http://localhost:8983/solr/select/?q=computers&facet=true&facet.field=author&rows=0&facet.sort=lex
Searching



     http://localhost:8983/solr/select/?q=computers&facet=true&facet.field=author&rows=0&facet.sort=count
Searching



     http://localhost:8983/solr/select/?q=computers&facet=true&facet.field=author&rows=0&facet.sort=count&facet.mincount=2
Searching



     http://localhost:8983/solr/select/?q=computers&facet=true&facet.field=author&rows=0&facet.sort=count&facet.limit=3
Searching



     http://localhost:8983/solr/select/?q=computers&facet=true&facet.field=author&rows=0&facet.sort=count&facet.limit=3&debugQuery=true
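
     To make the facet parameters concrete, here is a conceptual model in Python of what they compute over the author values of the matching documents. It is not how Solr implements faceting; the author names are made up, and breaking count ties alphabetically is an assumption.

```python
from collections import Counter


def facet_counts(values, sort="count", mincount=0, limit=None):
    """Count each value, then apply the mincount, sort and limit options."""
    counts = Counter(values)
    items = [(term, n) for term, n in counts.items() if n >= mincount]
    if sort == "count":
        # Highest count first; ties broken alphabetically (assumption).
        items.sort(key=lambda item: (-item[1], item[0]))
    else:
        # "lex": alphabetical by term.
        items.sort(key=lambda item: item[0])
    return items if limit is None else items[:limit]


# Hypothetical author field values from the documents matching q=computers.
authors = ["yobot", "alice", "yobot", "bob", "alice", "yobot"]

print(facet_counts(authors, sort="lex"))
print(facet_counts(authors, sort="count", mincount=2))
print(facet_counts(authors, sort="count", limit=1))
```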
Searching




     http://localhost:8983/solr/select?q=computer&wt=json
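
     With wt=json the result is easy to consume from a script. The sketch below parses a hand-written, abbreviated body in the shape Solr returns (a responseHeader plus a response holding numFound, start and docs); the documents themselves are invented for illustration.

```python
import json

# Hypothetical, abbreviated response body in the shape Solr returns for wt=json.
SAMPLE = """
{"responseHeader": {"status": 0, "QTime": 1},
 "response": {"numFound": 2, "start": 0,
              "docs": [{"title": "Computer"}, {"title": "Computer science"}]}}
"""


def titles(body: str):
    """Return the total hit count and the title of each returned document."""
    result = json.loads(body)["response"]
    return result["numFound"], [doc.get("title") for doc in result["docs"]]


print(titles(SAMPLE))  # (2, ['Computer', 'Computer science'])
```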
Searching




     http://localhost:8983/solr/select?q=computer&wt=javabin
Indexing
Indexing



     Load wikipedia-basic.xml into a text editor or web browser

     Load wikipedia-enhanced.xml into a text editor or browser

     Load example/solr/conf/schema.xml into a text editor
Indexing



     schema.xml defines field types and fields used in Solr

     Equivalent to your database schema in a RDBMS
Indexing


     Change the "links" and "category" fields in schema.xml to be of type "string"
     and add multiValued="true" to each:
      <field name="links" type="string" indexed="true" stored="true" multiValued="true"/>
      <field name="category" type="string" indexed="true" stored="true" multiValued="true"/>
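
     A multiValued field simply appears more than once inside a document in Solr's XML update format. The sketch below builds such a document with Python's standard library; the "id" field and the category values are assumptions for illustration, not taken from the sample files.

```python
import xml.etree.ElementTree as ET


def solr_add_xml(doc: dict) -> str:
    """Serialise one document as <add><doc>...</doc></add>.

    A list value is written as one <field> element per item, which is how
    a multiValued field is expressed in the update XML.
    """
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        values = value if isinstance(value, list) else [value]
        for v in values:
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(v)
    return ET.tostring(add, encoding="unicode")


# Hypothetical document with two categories.
print(solr_add_xml({"id": "1", "category": ["Computing", "History"]}))
```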
Indexing


     Now add this to the <fields> section of schema.xml:

     <field name="source" type="string" indexed="true" stored="true" multiValued="false"/>

     <field name="textgen" type="textgen" indexed="true" stored="true" multiValued="true"/>

     Now search for the "textgen" field type definition, further up
     in the file.
Indexing



     At the bottom of schema.xml, before the closing </schema> tag, add the following:
     <copyField source="text" dest="textgen"/>
Indexing



     At your command prompt, in the dev8d-solr directory, execute:

     java -jar post.jar wikipedia-enhanced.xml
More Advanced Searching



     http://localhost:8983/solr/select?q=computers%20AND%20babbage&facet=true&facet.field=category&facet.mincount=1
More Advanced Searching



     http://localhost:8983/solr/terms?terms.fl=text&terms=true&terms.limit=20
More Advanced Searching



     http://localhost:8983/solr/terms?terms.fl=textgen&terms=true&terms.limit=20
More Advanced Searching



     http://localhost:8983/solr/terms?terms.fl=textgen&terms=true&terms.limit=20&terms.prefix=at
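
     Conceptually, the terms request lists the indexed terms of one field with their document counts, optionally restricted to a prefix and capped at a limit. The Python below models that over a made-up term dictionary; it is not Solr's implementation, and ordering by descending count is an assumption about the default sort.

```python
def terms(term_counts: dict, prefix: str = "", limit: int = 10):
    """Return up to `limit` (term, count) pairs whose term starts with `prefix`."""
    matching = [(t, n) for t, n in term_counts.items() if t.startswith(prefix)]
    # Highest count first; ties broken alphabetically (assumption).
    matching.sort(key=lambda item: (-item[1], item[0]))
    return matching[:limit]


# Hypothetical indexed terms and document counts for one field.
sample = {"at": 40, "atlas": 3, "atom": 7, "babbage": 5}

print(terms(sample, prefix="at", limit=2))  # [('at', 40), ('atom', 7)]
```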
thank you
upayavira@sourcesense.com
Solr Host Configuration

       [Diagram: three shards (shard 1, shard 2, shard 3) receiving searches.]
Solr Host Configuration

        [Diagram: three shards (shard 1, shard 2, shard 3) with a co-ordinator.]
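
      The slides do not show how the co-ordinator is set up. One way Solr supports this is the shards request parameter: the co-ordinator receives an ordinary query plus a comma-separated list of shard addresses, fans the query out and merges the results. A sketch of building such a request; every host name here is hypothetical.

```python
from urllib.parse import urlencode


def distributed_select_url(coordinator: str, shards: list, query: str) -> str:
    """Build a select URL asking the co-ordinator to search all listed shards."""
    params = {"q": query, "shards": ",".join(shards)}
    # Keep ':', '/' and ',' unescaped so the shard list stays readable.
    return "http://" + coordinator + "/select?" + urlencode(params, safe=":/,")


# Hypothetical hosts, one Solr instance per shard.
shard_hosts = ["shard1:8983/solr", "shard2:8983/solr", "shard3:8983/solr"]

print(distributed_select_url("coordinator:8983/solr", shard_hosts, "computers"))
```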
Solr Host Configuration

        [Diagram: three shards with a co-ordinator, placed behind a load balancer.]
Solr Host Configuration

        [Diagram: two identical groups, each with three shards and a co-ordinator,
        both behind a single load balancer.]

Small wins in a small time with Apache Solr

  • 1. Small wins in a small time with Apache Solr
  • 2. Who am I? My (Buddhist) name is Upayavira. Consultant with Sourcesense, specialising in search and operational technologies. A member of the Apache Software Foundation.
  • 3. Who are Sourcesense? Open Source integrator, specialising in: Search; Business Intelligence; Content Management; Application Lifecycle Management. Offices in London, Amsterdam, Milan and Rome.
  • 4. Committers and Contributors. Search: Lucene/Solr (contributor); Hibernate Search (committer); Lucene Infinispan integration (lead developer); Apache UIMA (committer). CMS: Apache Chemistry (contributor); Apache Jackrabbit (contributor); JBoss GateIn Portal (committer); OpenSSO-Alfresco (contributor).
  • 5. What is Lucene? Lucene is a Java information retrieval library. Provides free text search facilities. Started in 2000, by Doug Cutting. A project of the Apache Software Foundation. It is designed to be embedded in Java apps.
  • 6. What is Solr? Solr is an enterprise search server based on Lucene. Wraps Lucene with a RESTful web interface. Provides configurable schema. Provides replication functionality.
  • 7. Solr Design [diagram labels: user queries; a Solr instance containing a SearchHandler, a Lucene index and an UpdateRequestHandler; content application]
  • 8. Prerequisites: Java, preferably Java 6; Apache Solr 1.4.1; http://www.sourcesense.com/dev8d-solr.zip
  • 9. Prerequisites: Extract your Solr distribution. At a command prompt: cd into the unzipped distribution directory; cd into the example directory; enter: java -jar start.jar. Visit http://localhost:8983/solr/ in a browser. If you see a welcome message, your Solr works. Unpack your dev8d-solr.zip file. At another command prompt, cd into your dev8d-solr directory.
  • 10. Checking Solr Works: Visit http://localhost:8983/solr/admin/ and you should see the Solr admin page. Click the statistics link. You'll see NumDocs: 0. There's nothing in the index, so searches won't show much, so we need to index some sample content.
  • 11. Indexing Sample Content: In your dev8d-solr directory (extracted from the zip), at a command prompt, enter: java -jar post.jar wikipedia-basic.xml
  • 12. Searching  http://localhost:8983/solr/select?q=*:*
  • 13. Searching  http://localhost:8983/solr/select?q=computers
  • 14. Searching  http://localhost:8983/solr/select?q=computer systems
  • 15. Searching  http://localhost:8983/solr/select?q=computers OR systems
  • 16. Searching  http://localhost:8983/solr/select?q=computers AND systems
  • 17. Searching  http://localhost:8983/solr/select?q="computer systems"
  • 18. Searching  http://localhost:8983/solr/select?q="computer systems"~10
  • 19. Searching  http://localhost:8983/solr/select?q=computers NOT data
  • 20. Searching  http://localhost:8983/solr/select?q=computers -data
  • 21. Searching  http://localhost:8983/solr/select/?q=computers&fl=title
  • 22. Searching  http://localhost:8983/solr/select/?q=computers&fq=author:yobot
  • 23. Searching  http://localhost:8983/solr/select/?q=computers&fq=author:yobot&fl=title,author
  • 24. Searching  http://localhost:8983/solr/select/?q=computers&rows=10&start=10&fl=title
  • 25. Searching  http://localhost:8983/solr/select/?q=title:system&fl=title
  • 26. Searching  http://localhost:8983/solr/select/?q=computers&fl=title,author&sort=author+desc
  • 27. Searching  http://localhost:8983/solr/select/?q=computers&facet=true&facet.field=author
  • 28. Searching  http://localhost:8983/solr/select/?q=computers&facet=true&facet.field=author&rows=0&facet.sort=lex
  • 29. Searching  http://localhost:8983/solr/select/?q=computers&facet=true&facet.field=author&rows=0&facet.sort=count
  • 30. Searching  http://localhost:8983/solr/select/?q=computers&facet=true&facet.field=author&rows=0&facet.sort=count&facet.mincount=2
  • 31. Searching  http://localhost:8983/solr/select/?q=computers&facet=true&facet.field=author&rows=0&facet.sort=count&facet.limit=3
  • 32. Searching  http://localhost:8983/solr/select/?q=computers&facet=true&facet.field=author&rows=0&facet.sort=count&facet.limit=3&debugQuery=true
  • 33. Searching  http://localhost:8983/solr/select?q=computer&wt=json
  • 34. Searching  http://localhost:8983/solr/select?q=computer&wt=javabin
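
The search URLs on slides 12 to 34 can also be assembled in code rather than typed by hand. The following is a minimal Python sketch, assuming the local example Solr from the slides; the helper name `select_url` and the convention of writing `facet_field` for `facet.field` are illustrative choices of this sketch, not part of Solr. It only builds the URL and does not contact the server.

```python
from urllib.parse import urlencode

# Assumption: Solr runs locally on the example port used throughout the slides.
SOLR_SELECT = "http://localhost:8983/solr/select"


def select_url(q, **params):
    """Build a /select URL for query q.

    Keyword names use '_' where Solr uses '.', so facet_field becomes
    facet.field. This is a convenience of this sketch only; it would
    mangle any Solr parameter that really contains an underscore.
    """
    query = [("q", q)]
    for name, value in params.items():
        query.append((name.replace("_", "."), value))
    # urlencode percent-encodes reserved characters and turns spaces into '+'.
    return SOLR_SELECT + "?" + urlencode(query)


if __name__ == "__main__":
    # Phrase query with a proximity ("slop") of 10, as on slide 18.
    print(select_url('"computer systems"~10'))
    # Filter query plus field list, as on slide 23.
    print(select_url("computers", fq="author:yobot", fl="title,author"))
    # Second page of ten results, as on slide 24.
    print(select_url("computers", rows=10, start=10, fl="title"))
    # Facet counts only (rows=0), as on slide 30.
    print(select_url("computers", facet="true", facet_field="author",
                     rows=0, facet_sort="count", facet_mincount=2))
```

Encoding the query string in code avoids the stray spaces that creep into long URLs when they are copied from slides.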
  • 36. Indexing: Load wikipedia-basic.xml into a text editor or web browser. Load wikipedia-enhanced.xml into a text editor or browser. Load example/solr/conf/schema.xml into a text editor.
  • 37. Indexing: schema.xml defines the field types and fields used in Solr. It is equivalent to your database schema in an RDBMS.
  • 38. Indexing: Change these two fields in schema.xml to be of type "string" and add multiValued="true" to each: <field name="links" type="string" indexed="true" stored="true" multiValued="true"/> <field name="category" type="string" indexed="true" stored="true" multiValued="true"/>
  • 39. Indexing: Now add these to the <fields> section of schema.xml: <field name="source" type="string" indexed="true" stored="true" multiValued="false"/> <field name="textgen" type="textgen" indexed="true" stored="true" multiValued="true"/> Now search for the "textgen" field type definition, further up in the file.
  • 40. Indexing: At the bottom of schema.xml (before the closing </schema> tag), add the following: <copyField source="text" dest="textgen"/>
  • 41. Indexing: At your command prompt, in the dev8d-solr directory, execute: java -jar post.jar wikipedia-enhanced.xml
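
The files passed to post.jar are Solr XML update messages: an `<add>` element wrapping one `<doc>` per document, with one `<field name="...">` element per value. The following is a minimal Python sketch of generating such a message, assuming the field names from slides 38 and 39 plus an `id` field as in Solr's stock example schema; the sample values are invented for illustration and are not taken from wikipedia-enhanced.xml.

```python
import xml.etree.ElementTree as ET


def solr_add_xml(docs):
    """Serialise a list of dicts as a Solr <add> update message.

    A list or tuple value is written as repeated <field> elements,
    which is how a multiValued field such as "category" is supplied.
    """
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            values = value if isinstance(value, (list, tuple)) else [value]
            for v in values:
                field = ET.SubElement(doc_el, "field", name=name)
                field.text = str(v)  # ElementTree escapes &, < and > on output
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical document; values are placeholders for illustration only.
    sample = {
        "id": "1",
        "title": "Example article",
        "source": "wikipedia",
        "category": ["Computing", "History"],
    }
    print(solr_add_xml([sample]))
```

Saved to a file, the output could be posted the same way as the sample files, provided every field name exists in schema.xml.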
  • 42. More Advanced Searching  http://localhost:8983/solr/select?q=computers%20AND%20babbage&facet=true&facet.field=category&facet.mincount=1
  • 43. More Advanced Searching  http://localhost:8983/solr/terms?terms.fl=text&terms=true&terms.limit=20
  • 44. More Advanced Searching  http://localhost:8983/solr/terms?terms.fl=textgen&terms=true&terms.limit=20
  • 45. More Advanced Searching  http://localhost:8983/solr/terms?terms.fl=textgen&terms=true&terms.limit=20&terms.prefix=at
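
The /terms requests on slides 43 to 45 list indexed terms from one field, optionally restricted to a prefix and capped at a limit. The following Python sketch is an in-memory stand-in for that idea, not Solr's implementation: it assumes terms are returned most frequent first (as I understand the default to be), breaks ties alphabetically as a choice of this sketch, and uses invented term counts.

```python
def top_terms(term_counts, prefix="", limit=20):
    """Return up to `limit` (term, count) pairs whose term starts with prefix.

    Pairs are ordered by descending count, then alphabetically, which
    loosely mirrors terms.prefix and terms.limit on a count-sorted listing.
    """
    matching = [(t, c) for t, c in term_counts.items() if t.startswith(prefix)]
    matching.sort(key=lambda pair: (-pair[1], pair[0]))
    return matching[:limit]


if __name__ == "__main__":
    # Hypothetical document frequencies for a handful of terms.
    counts = {"at": 5, "atlas": 2, "atom": 7, "bat": 9}
    print(top_terms(counts, prefix="at", limit=2))
```

A prefix-restricted term listing like this is a common basis for search-box suggestions.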
  • 47. Solr Host Configuration [diagram labels: searches; shard 1, shard 2, shard 3]
  • 48. Solr Host Configuration [diagram labels: co-ordinator; shard 1, shard 2, shard 3]
  • 49. Solr Host Configuration [diagram labels: load balancer; co-ordinator; shard 1, shard 2, shard 3]
  • 50. Solr Host Configuration [diagram labels: load balancer; two co-ordinators; two copies each of shard 1, shard 2 and shard 3]
  • 51. Solr Host Configuration [diagram labels: load balancer; two co-ordinators; two copies each of shard 1, shard 2 and shard 3]
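
In Solr's distributed search, a query sent to one instance can name the shards to consult through the `shards` request parameter, a comma-separated list of host:port/path entries; that instance then plays the co-ordinator role shown in the diagrams. The following is a minimal Python sketch of building such a request. The host names are hypothetical, and the helper `distributed_select_url` is an illustration, not a Solr API.

```python
from urllib.parse import urlencode


def distributed_select_url(coordinator, shards, q, **params):
    """Build a /select URL that asks `coordinator` to query every shard.

    coordinator and each shard are given as host:port/path with no scheme,
    which is the form the shards parameter expects.
    """
    query = [("q", q), ("shards", ",".join(shards))]
    query.extend(params.items())
    # Leave ':', '/' and ',' unescaped so the shard list stays readable.
    return "http://" + coordinator + "/select?" + urlencode(query, safe=":/,")


if __name__ == "__main__":
    # Hypothetical hosts standing in for the three shards in the diagrams.
    shards = [
        "shard1.example:8983/solr",
        "shard2.example:8983/solr",
        "shard3.example:8983/solr",
    ]
    print(distributed_select_url("coord.example:8983/solr", shards,
                                 "computers", rows=10))
```

With a load balancer in front, as on slides 49 to 51, the co-ordinator address would be the balancer's, and each shard entry could likewise point at a balanced pair of replicas.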