SlideShare ist ein Scribd-Unternehmen logo
1 von 20
First steps with Solr 4.7
Achraf KRID
Plan
•Definition
•Installation (with tomcat 7)
•Querying, faceting
•Indexation
•Configuration
•Symfony Integration
•Comparaison
Solr ?
SolrTM is the popular, blazing fast open source enterprise search platform from
the Apache LuceneTM project. Its major features include powerful full-text
search, hit highlighting, faceted search, near real-time indexing, dynamic
clustering, database integration, rich document (e.g., Word, PDF) handling, and
geospatial search.
Install Tomcat 7 + Solr 4.7
● Sudo apt-get install tomcat7 tomcat7-admin
● /etc/tomcat7/tomcat-users.xml
● http://localhost:8080/manager/html
● curl http://archive.apache.org/dist/lucene/solr/4.7.2/solr-4.7.2.tgz | tar xvz
● cp ~/solr-4.7.2/example/lib/ext/* /usr/share/tomcat7/lib/
● cp ~/solr-4.7.2/dist/solr-4.7.2.war /var/lib/tomcat7/webapps/solr.war
● cp -R ~/solr-4.6.1/example/solr /var/lib/tomcat
● chown -R tomcat7:tomcat7 /var/lib/tomcat7/solr
<tomcat-users>
<role rolename="manager-gui"/>
<user username="root" password="root" roles="manager-gui,admin-gui"/>
</tomcat-users>
Querying Data
java -jar start.jar
java -jar post.jar *.xml
http://solr/select?q=electronics
http://solr/select?q=electronics&sort=price+desc
http://solr/select?q=electronics&rows=50&start=50
http://solr/select?q=electronics&fl=name+price
http://solr/select?q=electronics&fq=inStock:true
Facets
&facet=true&facet.field=cat&facet.field=inStock
&facet.query=price:[0 TO 10]&facet.query=price:[10 TO *]
Schema.xml
●
01:<?xml version="1.0" encoding="UTF-8" ?>
●
02:<schema name="example" version="1.1">
●
03: <types>
●
04: <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
●
05: </types>
●
06:
●
07:
●
08: <fields>
●
09: <field name="id" type="string" indexed="true" stored="true" required="true" />
●
10: <field name="category" type="string" indexed="true" stored="true" multiValued="true" omitNorms="true"/>
●
11: <field name="size" type="string" indexed="true" stored="true"/>
●
12: <field name="text" type="string" indexed="true" stored="false" multiValued="true"/>
●
13: </fields>
●
14:
●
15: <uniqueKey>id</uniqueKey>
●
16: <defaultSearchField>text</defaultSearchField>
●
17: <solrQueryParser defaultOperator="AND"/>
●
18:
●
19: <copyField source="id" dest="text"/>
●
20: <copyField source="category" dest="text"/>
●
21: <copyField source="size" dest="text"/>
●
22:
●
23:</schema>
Where You Describe Your Data
Schema.xml
●<field> Describes How You Deal With Specific Named Fields
●<dynamicField> Describes How To Deal With Fields That Match A Glob
(Unless There Is A Specific <field> For Them)
●<copyField> Describes How To Construct Fields From Other Fields
<field name="title" type="text" stored=”false” />
<dynamicField name="price*" type="sfloat" indexed="true" />
<copyField source="*" dest="catchall" />
Schema.xml
Analyzer
<fieldType name="text_greek" class="solr.TextField">
<analyzer class="org.apache.lucene.analysis.el.GreekAnalyzer"/>
</fieldType>
Schema.xml
Tokenizers : A Tokenizer splits a stream of characters (from each individual
field value) into a series of tokens.There can be only one Tokenizer in each
Analyzer.
Token Filters :Tokens produced by the Tokenizer are passed through a series
of Token Filters that add, change, or remove tokens. The field is then indexed by
the resulting token stream.
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt" enablePositionIncrements="true" />
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
AnalysisTool: Output
Solrconfig.xml
solrconfig.xml is where you configure options for how this Solr instance should
behave.
Solrconfig.xml
 DataImportHandler
● Create lib/ directory in /var/lib/tomcat/solr
● Add the data import handler jars to lib/ :
cp ~/solr-4.7.2/dist/solr-dataimporthandler-*.jar /var/lib/tomcat7/solr/lib
● Add DBMS Driver (for mysql → mysql-connector-java-bin.jar)
● Add requestHndler to solr/conf/solrconfig.xml
<requestHandler name="/dataimport"
class="org.apache.solr.handler.dataimport.DataImportHandler">
<lst name="defaults">
<str name="config">data-config.xml</str>
</lst>
</requestHandler>
Data-config.xml
<dataConfig>
<dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
url="jdbc:mysql://127.0.0.1/restofraisweb" user="root" password="root" />
<document name="restofrais">
<entity name="Restaurant" query="SELECT id, nomRestaurant, ville_restaurant_id,code_postal_id,
type_cuisine_id FROM Restaurant">
<field column="id" name="id" />
<field column="nomRestaurant" name="nomRestaurant" />
<entity name="Ville" query="SELECT designation FROM Ville where id=$
{Restaurant.ville_restaurant_id}" >
<field column="designation" name="ville" />
</entity>
<entity name="Boisson" query="SELECT designation FROM restaurants_boissons inner join
Boisson on Boisson.id = restaurants_boissons.boisson_id where restaurants_boissons.restaurant_id=$
{Restaurant.id}" >
<field column="designation" name="boissons" />
</entity>
</entity>
</document>
</dataConfig>
Symfony Integration
•Solr->Solarium->nelmio_solarium
•Composer.json
•AppKernel.php
•Config/config.yml
{
"require": {
"nelmio/solarium-bundle": "2.*"
}
}
public function registerBundles()
{
$bundles = array(
...
new NelmioSolariumBundleNelmioSolariumBundle(),
...
);
...
}
nelmio_solarium:
endpoints:
default:
host: 172.16.0.219
port: 8080
path: /solr/restofrais
# core: solr
timeout: 5
clients:
default:
client_class:
SolariumCoreClientClient
adapter_class:
SolariumCoreClientAdapterHttp
Symfony Integration
• /**
• * @Route("/hello/{name}", name="_demo_hello")
• * @Template()
• */
• public function helloAction($name)
• {
•
• $client = $this->get('solarium.client');
•
• $select = $client->createSelect();
•
• $select->setQuery("ville:".$name);
•
• $results = $client->select($select);
• return array(
• 'search_results' => $results,
• );
Apache Solr vs ElasticSearch
Solr & ElasticSearch
•Lucene Apache Based
•Faceting
•Boosting
Solr :
•Pivot Facets
•One set of fields per schema, one schema per core
ElasticSearch :
•REST API
•Structured Query DSL
•Percolation
Apache Solr vs ElasticSearch
Apache Solr vs ElasticSearch
Links
●Solr wiki
http://wiki.apache.org/solr/
●Install Solr 4.6 with Tomcat 7 on Debian 7 :
http://pacoup.com/2014/02/05/install-solr-4-6-with-tomcat-7-on-debian-7
●solr-vs-elasticsearch.com :
http://solr-vs-elasticsearch.com
●Nelmio Solarium Bundle :
https://github.com/nelmio/NelmioSolariumBundle
●The Many Facets of Apache Solr, Yonik Seeley
http://www.youtube.com/watch?v=LyjiLYN-qIk

Weitere ähnliche Inhalte

Was ist angesagt?

Recon2013 alex ionescu-i got 99 problems but a kernel pointer ain't one
Recon2013 alex ionescu-i got 99 problems but a kernel pointer ain't oneRecon2013 alex ionescu-i got 99 problems but a kernel pointer ain't one
Recon2013 alex ionescu-i got 99 problems but a kernel pointer ain't oneArtem I. Baranov
 
CNIT 127: Ch 2: Stack Overflows in Linux
CNIT 127: Ch 2: Stack Overflows in LinuxCNIT 127: Ch 2: Stack Overflows in Linux
CNIT 127: Ch 2: Stack Overflows in LinuxSam Bowne
 
10x Performance Improvements
10x Performance Improvements10x Performance Improvements
10x Performance ImprovementsRonald Bradford
 
Operating Systems - A Primer
Operating Systems - A PrimerOperating Systems - A Primer
Operating Systems - A PrimerSaumil Shah
 
127 Ch 2: Stack overflows on Linux
127 Ch 2: Stack overflows on Linux127 Ch 2: Stack overflows on Linux
127 Ch 2: Stack overflows on LinuxSam Bowne
 
Hernan Ochoa - WCE Internals [RootedCON 2011]
Hernan Ochoa - WCE Internals [RootedCON 2011]Hernan Ochoa - WCE Internals [RootedCON 2011]
Hernan Ochoa - WCE Internals [RootedCON 2011]RootedCON
 
CNIT 127 Ch 3: Shellcode
CNIT 127 Ch 3: ShellcodeCNIT 127 Ch 3: Shellcode
CNIT 127 Ch 3: ShellcodeSam Bowne
 
2009-08-24 The Linux Audit Subsystem Deep Dive
2009-08-24 The Linux Audit Subsystem Deep Dive2009-08-24 The Linux Audit Subsystem Deep Dive
2009-08-24 The Linux Audit Subsystem Deep DiveShawn Wells
 
Memory profiler and garbage collector in C#
Memory profiler and garbage collector in C#Memory profiler and garbage collector in C#
Memory profiler and garbage collector in C#Wipro
 
ROP 輕鬆談
ROP 輕鬆談ROP 輕鬆談
ROP 輕鬆談hackstuff
 
CNIT 127: Ch 3: Shellcode
CNIT 127: Ch 3: ShellcodeCNIT 127: Ch 3: Shellcode
CNIT 127: Ch 3: ShellcodeSam Bowne
 
Wait for your fortune without Blocking!
Wait for your fortune without Blocking!Wait for your fortune without Blocking!
Wait for your fortune without Blocking!Roman Elizarov
 
One Shellcode to Rule Them All: Cross-Platform Exploitation
One Shellcode to Rule Them All: Cross-Platform ExploitationOne Shellcode to Rule Them All: Cross-Platform Exploitation
One Shellcode to Rule Them All: Cross-Platform ExploitationQuinn Wilton
 
Percona Live 2017 ­- Sharded cluster tutorial
Percona Live 2017 ­- Sharded cluster tutorialPercona Live 2017 ­- Sharded cluster tutorial
Percona Live 2017 ­- Sharded cluster tutorialAntonios Giannopoulos
 
CNIT 127 Ch 2: Stack overflows on Linux
CNIT 127 Ch 2: Stack overflows on LinuxCNIT 127 Ch 2: Stack overflows on Linux
CNIT 127 Ch 2: Stack overflows on LinuxSam Bowne
 
Orchestrated Functional Testing with Puppet-spec and Mspectator
Orchestrated Functional Testing with Puppet-spec and MspectatorOrchestrated Functional Testing with Puppet-spec and Mspectator
Orchestrated Functional Testing with Puppet-spec and MspectatorRaphaël PINSON
 
Course lecture - An introduction to the Return Oriented Programming
Course lecture - An introduction to the Return Oriented ProgrammingCourse lecture - An introduction to the Return Oriented Programming
Course lecture - An introduction to the Return Oriented ProgrammingJonathan Salwan
 

Was ist angesagt? (20)

Recon2013 alex ionescu-i got 99 problems but a kernel pointer ain't one
Recon2013 alex ionescu-i got 99 problems but a kernel pointer ain't oneRecon2013 alex ionescu-i got 99 problems but a kernel pointer ain't one
Recon2013 alex ionescu-i got 99 problems but a kernel pointer ain't one
 
Network programming
Network programmingNetwork programming
Network programming
 
CNIT 127: Ch 2: Stack Overflows in Linux
CNIT 127: Ch 2: Stack Overflows in LinuxCNIT 127: Ch 2: Stack Overflows in Linux
CNIT 127: Ch 2: Stack Overflows in Linux
 
Linux audit framework
Linux audit frameworkLinux audit framework
Linux audit framework
 
10x Performance Improvements
10x Performance Improvements10x Performance Improvements
10x Performance Improvements
 
Operating Systems - A Primer
Operating Systems - A PrimerOperating Systems - A Primer
Operating Systems - A Primer
 
127 Ch 2: Stack overflows on Linux
127 Ch 2: Stack overflows on Linux127 Ch 2: Stack overflows on Linux
127 Ch 2: Stack overflows on Linux
 
Hernan Ochoa - WCE Internals [RootedCON 2011]
Hernan Ochoa - WCE Internals [RootedCON 2011]Hernan Ochoa - WCE Internals [RootedCON 2011]
Hernan Ochoa - WCE Internals [RootedCON 2011]
 
CNIT 127 Ch 3: Shellcode
CNIT 127 Ch 3: ShellcodeCNIT 127 Ch 3: Shellcode
CNIT 127 Ch 3: Shellcode
 
2009-08-24 The Linux Audit Subsystem Deep Dive
2009-08-24 The Linux Audit Subsystem Deep Dive2009-08-24 The Linux Audit Subsystem Deep Dive
2009-08-24 The Linux Audit Subsystem Deep Dive
 
Memory profiler and garbage collector in C#
Memory profiler and garbage collector in C#Memory profiler and garbage collector in C#
Memory profiler and garbage collector in C#
 
ROP 輕鬆談
ROP 輕鬆談ROP 輕鬆談
ROP 輕鬆談
 
CNIT 127: Ch 3: Shellcode
CNIT 127: Ch 3: ShellcodeCNIT 127: Ch 3: Shellcode
CNIT 127: Ch 3: Shellcode
 
Inheritance
InheritanceInheritance
Inheritance
 
Wait for your fortune without Blocking!
Wait for your fortune without Blocking!Wait for your fortune without Blocking!
Wait for your fortune without Blocking!
 
One Shellcode to Rule Them All: Cross-Platform Exploitation
One Shellcode to Rule Them All: Cross-Platform ExploitationOne Shellcode to Rule Them All: Cross-Platform Exploitation
One Shellcode to Rule Them All: Cross-Platform Exploitation
 
Percona Live 2017 ­- Sharded cluster tutorial
Percona Live 2017 ­- Sharded cluster tutorialPercona Live 2017 ­- Sharded cluster tutorial
Percona Live 2017 ­- Sharded cluster tutorial
 
CNIT 127 Ch 2: Stack overflows on Linux
CNIT 127 Ch 2: Stack overflows on LinuxCNIT 127 Ch 2: Stack overflows on Linux
CNIT 127 Ch 2: Stack overflows on Linux
 
Orchestrated Functional Testing with Puppet-spec and Mspectator
Orchestrated Functional Testing with Puppet-spec and MspectatorOrchestrated Functional Testing with Puppet-spec and Mspectator
Orchestrated Functional Testing with Puppet-spec and Mspectator
 
Course lecture - An introduction to the Return Oriented Programming
Course lecture - An introduction to the Return Oriented ProgrammingCourse lecture - An introduction to the Return Oriented Programming
Course lecture - An introduction to the Return Oriented Programming
 

Andere mochten auch

งานนำเสนอ1
งานนำเสนอ1งานนำเสนอ1
งานนำเสนอ1See Tester
 
M6 thai-2551
M6 thai-2551M6 thai-2551
M6 thai-2551Walk4Fun
 
Fake book mary
Fake book maryFake book mary
Fake book mary110374
 
Redes socales 1
Redes socales   1Redes socales   1
Redes socales 1pipemaho
 
เอกสารประกอบวิชาการอ่าน
เอกสารประกอบวิชาการอ่านเอกสารประกอบวิชาการอ่าน
เอกสารประกอบวิชาการอ่านnaaikawaii
 
Fake book mary
Fake book maryFake book mary
Fake book mary110374
 
โครงงานคอมพิวเตอร์ เรื่อง-ประเพณีไทยสี่ภาค
โครงงานคอมพิวเตอร์ เรื่อง-ประเพณีไทยสี่ภาคโครงงานคอมพิวเตอร์ เรื่อง-ประเพณีไทยสี่ภาค
โครงงานคอมพิวเตอร์ เรื่อง-ประเพณีไทยสี่ภาคWalk4Fun
 
Dronhub.net presentation
Dronhub.net presentationDronhub.net presentation
Dronhub.net presentationVadym Melnyk
 
Introduction to Xamarin
Introduction to XamarinIntroduction to Xamarin
Introduction to XamarinVadym Melnyk
 
แบบร่างโครงงานคอมพิมเตอร์
แบบร่างโครงงานคอมพิมเตอร์แบบร่างโครงงานคอมพิมเตอร์
แบบร่างโครงงานคอมพิมเตอร์Walk4Fun
 
Viviana Cristiglio (May 27th 2014)
Viviana Cristiglio (May 27th 2014)Viviana Cristiglio (May 27th 2014)
Viviana Cristiglio (May 27th 2014)Roadshow2014
 
Pengumuman seleksi casn_ggd_2016
Pengumuman seleksi casn_ggd_2016Pengumuman seleksi casn_ggd_2016
Pengumuman seleksi casn_ggd_2016Zainal Arifin
 
Economics objectivequestionbank-100528201411-phpapp02
Economics objectivequestionbank-100528201411-phpapp02Economics objectivequestionbank-100528201411-phpapp02
Economics objectivequestionbank-100528201411-phpapp02kilar
 

Andere mochten auch (20)

Engish day
Engish dayEngish day
Engish day
 
งานนำเสนอ1
งานนำเสนอ1งานนำเสนอ1
งานนำเสนอ1
 
M6 thai-2551
M6 thai-2551M6 thai-2551
M6 thai-2551
 
Fake book mary
Fake book maryFake book mary
Fake book mary
 
E-waste
E-wasteE-waste
E-waste
 
Redes socales 1
Redes socales   1Redes socales   1
Redes socales 1
 
เอกสารประกอบวิชาการอ่าน
เอกสารประกอบวิชาการอ่านเอกสารประกอบวิชาการอ่าน
เอกสารประกอบวิชาการอ่าน
 
Fake book mary
Fake book maryFake book mary
Fake book mary
 
โครงงานคอมพิวเตอร์ เรื่อง-ประเพณีไทยสี่ภาค
โครงงานคอมพิวเตอร์ เรื่อง-ประเพณีไทยสี่ภาคโครงงานคอมพิวเตอร์ เรื่อง-ประเพณีไทยสี่ภาค
โครงงานคอมพิวเตอร์ เรื่อง-ประเพณีไทยสี่ภาค
 
future of gmos
  future of gmos  future of gmos
future of gmos
 
Dronhub.net presentation
Dronhub.net presentationDronhub.net presentation
Dronhub.net presentation
 
Our people tab 3
Our people tab 3Our people tab 3
Our people tab 3
 
Introduction to Xamarin
Introduction to XamarinIntroduction to Xamarin
Introduction to Xamarin
 
แบบร่างโครงงานคอมพิมเตอร์
แบบร่างโครงงานคอมพิมเตอร์แบบร่างโครงงานคอมพิมเตอร์
แบบร่างโครงงานคอมพิมเตอร์
 
BG holidays
BG holidaysBG holidays
BG holidays
 
REC76_profile
REC76_profileREC76_profile
REC76_profile
 
Viviana Cristiglio (May 27th 2014)
Viviana Cristiglio (May 27th 2014)Viviana Cristiglio (May 27th 2014)
Viviana Cristiglio (May 27th 2014)
 
Pengumuman seleksi casn_ggd_2016
Pengumuman seleksi casn_ggd_2016Pengumuman seleksi casn_ggd_2016
Pengumuman seleksi casn_ggd_2016
 
икона
иконаикона
икона
 
Economics objectivequestionbank-100528201411-phpapp02
Economics objectivequestionbank-100528201411-phpapp02Economics objectivequestionbank-100528201411-phpapp02
Economics objectivequestionbank-100528201411-phpapp02
 

Ähnlich wie Solr a.b-ab

Apache Solr + ajax solr
Apache Solr + ajax solrApache Solr + ajax solr
Apache Solr + ajax solrNet7
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr DevelopersErik Hatcher
 
Apache Solr Workshop
Apache Solr WorkshopApache Solr Workshop
Apache Solr WorkshopJSGB
 
Solr Powered Lucene
Solr Powered LuceneSolr Powered Lucene
Solr Powered LuceneErik Hatcher
 
Apache Solr 1.4 – Faster, Easier, and More Versatile than Ever
Apache Solr 1.4 – Faster, Easier, and More Versatile than EverApache Solr 1.4 – Faster, Easier, and More Versatile than Ever
Apache Solr 1.4 – Faster, Easier, and More Versatile than EverLucidworks (Archived)
 
Solr search engine with multiple table relation
Solr search engine with multiple table relationSolr search engine with multiple table relation
Solr search engine with multiple table relationJay Bharat
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to SolrJayesh Bhoyar
 
Coffee at DBG- Solr introduction
Coffee at DBG- Solr introduction Coffee at DBG- Solr introduction
Coffee at DBG- Solr introduction Sajindbg Dbg
 
20150210 solr introdution
20150210 solr introdution20150210 solr introdution
20150210 solr introdutionXuan-Chao Huang
 
Solr at zvents 6 years later & still going strong
Solr at zvents   6 years later & still going strongSolr at zvents   6 years later & still going strong
Solr at zvents 6 years later & still going stronglucenerevolution
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to SolrErik Hatcher
 
Solr Recipes Workshop
Solr Recipes WorkshopSolr Recipes Workshop
Solr Recipes WorkshopErik Hatcher
 
(Re)Indexing Large Repositories in Alfresco
(Re)Indexing Large Repositories in Alfresco(Re)Indexing Large Repositories in Alfresco
(Re)Indexing Large Repositories in AlfrescoAngel Borroy López
 
Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with SolrErik Hatcher
 
Introduction to Laravel Framework (5.2)
Introduction to Laravel Framework (5.2)Introduction to Laravel Framework (5.2)
Introduction to Laravel Framework (5.2)Viral Solani
 
What's New in Solr 3.x / 4.0
What's New in Solr 3.x / 4.0What's New in Solr 3.x / 4.0
What's New in Solr 3.x / 4.0Erik Hatcher
 

Ähnlich wie Solr a.b-ab (20)

Apache solr liferay
Apache solr liferayApache solr liferay
Apache solr liferay
 
Apache Solr + ajax solr
Apache Solr + ajax solrApache Solr + ajax solr
Apache Solr + ajax solr
 
Apache Solr Workshop
Apache Solr WorkshopApache Solr Workshop
Apache Solr Workshop
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr Developers
 
Apache Solr Workshop
Apache Solr WorkshopApache Solr Workshop
Apache Solr Workshop
 
Solr Powered Lucene
Solr Powered LuceneSolr Powered Lucene
Solr Powered Lucene
 
Apache Solr 1.4 – Faster, Easier, and More Versatile than Ever
Apache Solr 1.4 – Faster, Easier, and More Versatile than EverApache Solr 1.4 – Faster, Easier, and More Versatile than Ever
Apache Solr 1.4 – Faster, Easier, and More Versatile than Ever
 
Solr search engine with multiple table relation
Solr search engine with multiple table relationSolr search engine with multiple table relation
Solr search engine with multiple table relation
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to Solr
 
Coffee at DBG- Solr introduction
Coffee at DBG- Solr introduction Coffee at DBG- Solr introduction
Coffee at DBG- Solr introduction
 
20150210 solr introdution
20150210 solr introdution20150210 solr introdution
20150210 solr introdution
 
Solr at zvents 6 years later & still going strong
Solr at zvents   6 years later & still going strongSolr at zvents   6 years later & still going strong
Solr at zvents 6 years later & still going strong
 
NIO-Grizly.pdf
NIO-Grizly.pdfNIO-Grizly.pdf
NIO-Grizly.pdf
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to Solr
 
Solr Recipes
Solr RecipesSolr Recipes
Solr Recipes
 
Solr Recipes Workshop
Solr Recipes WorkshopSolr Recipes Workshop
Solr Recipes Workshop
 
(Re)Indexing Large Repositories in Alfresco
(Re)Indexing Large Repositories in Alfresco(Re)Indexing Large Repositories in Alfresco
(Re)Indexing Large Repositories in Alfresco
 
Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with Solr
 
Introduction to Laravel Framework (5.2)
Introduction to Laravel Framework (5.2)Introduction to Laravel Framework (5.2)
Introduction to Laravel Framework (5.2)
 
What's New in Solr 3.x / 4.0
What's New in Solr 3.x / 4.0What's New in Solr 3.x / 4.0
What's New in Solr 3.x / 4.0
 

Kürzlich hochgeladen

How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 

Kürzlich hochgeladen (20)

How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 

Solr a.b-ab

  • 1. First steps with Solr 4.7 Achraf KRID
  • 2. Plan •Definition •Installation (with tomcat 7) •Querying, faceting •Indexation •Configuration •Symfony Integration •Comparaison
  • 3. Solr ? SolrTM is the popular, blazing fast open source enterprise search platform from the Apache LuceneTM project. Its major features include powerful full-text search, hit highlighting, faceted search, near real-time indexing, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search.
  • 4. Install Tomcat 7 + Solr 4.7 ● Sudo apt-get install tomcat7 tomcat7-admin ● /etc/tomcat7/tomcat-users.xml ● http://localhost:8080/manager/html ● curl http://archive.apache.org/dist/lucene/solr/4.7.2/solr-4.7.2.tgz | tar xvz ● cp ~/solr-4.7.2/example/lib/ext/* /usr/share/tomcat7/lib/ ● cp ~/solr-4.7.2/dist/solr-4.7.2.war /var/lib/tomcat7/webapps/solr.war ● cp -R ~/solr-4.6.1/example/solr /var/lib/tomcat ● chown -R tomcat7:tomcat7 /var/lib/tomcat7/solr <tomcat-users> <role rolename="manager-gui"/> <user username="root" password="root" roles="manager-gui,admin-gui"/> </tomcat-users>
  • 5. Querying Data java -jar start.jar java -jar post.jar *.xml http://solr/select?q=electronics http://solr/select?q=electronics&sort=price+desc http://solr/select?q=electronics&rows=50&start=50 http://solr/select?q=electronics&fl=name+price http://solr/select?q=electronics&fq=inStock:true
  • 7. Schema.xml ● 01:<?xml version="1.0" encoding="UTF-8" ?> ● 02:<schema name="example" version="1.1"> ● 03: <types> ● 04: <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/> ● 05: </types> ● 06: ● 07: ● 08: <fields> ● 09: <field name="id" type="string" indexed="true" stored="true" required="true" /> ● 10: <field name="category" type="string" indexed="true" stored="true" multiValued="true" omitNorms="true"/> ● 11: <field name="size" type="string" indexed="true" stored="true"/> ● 12: <field name="text" type="string" indexed="true" stored="false" multiValued="true"/> ● 13: </fields> ● 14: ● 15: <uniqueKey>id</uniqueKey> ● 16: <defaultSearchField>text</defaultSearchField> ● 17: <solrQueryParser defaultOperator="AND"/> ● 18: ● 19: <copyField source="id" dest="text"/> ● 20: <copyField source="category" dest="text"/> ● 21: <copyField source="size" dest="text"/> ● 22: ● 23:</schema> Where You Describe Your Data
  • 8. Schema.xml ●<field> Describes How You Deal With Specific Named Fields ●<dynamicField> Describes How To Deal With Fields That Match A Glob (Unless There Is A Specific <field> For Them) ●<copyField> Describes How To Construct Fields From Other Fields <field name="title" type="text" stored=”false” /> <dynamicField name="price*" type="sfloat" indexed="true" /> <copyField source="*" dest="catchall" />
  • 9. Schema.xml Analyzer <fieldType name="text_greek" class="solr.TextField"> <analyzer class="org.apache.lucene.analysis.el.GreekAnalyzer"/> </fieldType>
  • 10. Schema.xml Tokenizers : A Tokenizer splits a stream of characters (from each individual field value) into a series of tokens.There can be only one Tokenizer in each Analyzer. Token Filters :Tokens produced by the Tokenizer are passed through a series of Token Filters that add, change, or remove tokens. The field is then indexed by the resulting token stream. <analyzer type="index"> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" /> <filter class="solr.LowerCaseFilterFactory"/> </analyzer>
  • 12. Solrconfig.xml solrconfig.xml is where you configure options for how this Solr instance should behave.
  • 13. Solrconfig.xml  DataImportHandler ● Create lib/ directory in /var/lib/tomcat/solr ● Add the data import handler jars to lib/ : cp ~/solr-4.7.2/dist/solr-dataimporthandler-*.jar /var/lib/tomcat7/solr/lib ● Add DBMS Driver (for mysql → mysql-connector-java-bin.jar) ● Add requestHndler to solr/conf/solrconfig.xml <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler"> <lst name="defaults"> <str name="config">data-config.xml</str> </lst> </requestHandler>
  • 14. Data-config.xml <dataConfig> <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://127.0.0.1/restofraisweb" user="root" password="root" /> <document name="restofrais"> <entity name="Restaurant" query="SELECT id, nomRestaurant, ville_restaurant_id,code_postal_id, type_cuisine_id FROM Restaurant"> <field column="id" name="id" /> <field column="nomRestaurant" name="nomRestaurant" /> <entity name="Ville" query="SELECT designation FROM Ville where id=$ {Restaurant.ville_restaurant_id}" > <field column="designation" name="ville" /> </entity> <entity name="Boisson" query="SELECT designation FROM restaurants_boissons inner join Boisson on Boisson.id = restaurants_boissons.boisson_id where restaurants_boissons.restaurant_id=$ {Restaurant.id}" > <field column="designation" name="boissons" /> </entity> </entity> </document> </dataConfig>
  • 15. Symfony Integration •Solr->Solarium->nelmio_solarium •Composer.json •AppKernel.php •Config/config.yml { "require": { "nelmio/solarium-bundle": "2.*" } } public function registerBundles() { $bundles = array( ... new NelmioSolariumBundleNelmioSolariumBundle(), ... ); ... } nelmio_solarium: endpoints: default: host: 172.16.0.219 port: 8080 path: /solr/restofrais # core: solr timeout: 5 clients: default: client_class: SolariumCoreClientClient adapter_class: SolariumCoreClientAdapterHttp
  • 16. Symfony Integration • /** • * @Route("/hello/{name}", name="_demo_hello") • * @Template() • */ • public function helloAction($name) • { • • $client = $this->get('solarium.client'); • • $select = $client->createSelect(); • • $select->setQuery("ville:".$name); • • $results = $client->select($select); • return array( • 'search_results' => $results, • );
  • 17. Apache Solr vs ElasticSearch Solr & ElasticSearch •Lucene Apache Based •Faceting •Boosting Solr : •Pivot Facets •One set of fields per schema, one schema per core ElasticSearch : •REST API •Structured Query DSL •Percolation
  • 18. Apache Solr vs ElasticSearch
  • 19. Apache Solr vs ElasticSearch
  • 20. Links ●Solr wiki http://wiki.apache.org/solr/ ●Install Solr 4.6 with Tomcat 7 on Debian 7 : http://pacoup.com/2014/02/05/install-solr-4-6-with-tomcat-7-on-debian-7 ●solr-vs-elasticsearch.com : http://solr-vs-elasticsearch.com ●Nelmio Solarium Bundle : https://github.com/nelmio/NelmioSolariumBundle ●The Many Facets of Apache Solr, Yonik Seeley http://www.youtube.com/watch?v=LyjiLYN-qIk