Inside Solr 5 - Bangalore Solr/Lucene Meetup

October 13-15, 2015 • Austin, TX
http://lucenerevolution.org

Apache Solr + Lucidworks
Community • Customers • Products

Search is more than just a box.

Search makes data personal, contextual, actionable.

Search can be smarter.
location search history query security context
Personal, contextual, relevant results: consumer-like simplicity and power in the enterprise.

Product Offering
Environment: Production, Development

Solr Enterprise
• Features: Security, Log Analysis / SiLK Support, Dashboards & Reporting, Enhanced Admin UI
• Support Level / Additional Support: Dev Support (4 Contacts), Operational Support, Regular Health Checks
• Availability: 24x7
• Response Time: SLA-Backed
• Number of Incidents: Unlimited Incidents
• Pricing Model: Per Node

Fusion
• Features: Security, Crawlers & Connectors, Log Analysis / SiLK Support, Enhanced Admin UI, Data Enrichment, Machine Learning, Recommendations, Advanced Relevancy Tuning
• Support Level / Additional Support: Dev Support (4 Contacts), Operational Support, Regular Health Checks
• Availability: 24x7
• Response Time: SLA-Backed
• Number of Incidents: Unlimited Incidents
• Pricing Model: Per Node

Developer Support
• Support Level / Additional Support: How-To Support, Knowledge Base, Fusion Support
• Availability: 9x5
• Response Time: SLA-Backed
• Number of Incidents: Unlimited Incidents
• Pricing Model: Per Named Developer

Inside Apache Solr 5
• Get Started
• Dig In
• Go Big
• Get Finished
• Sneak Peek

Get Started
• Easy to start/stop:
./bin/solr {start|stop}
• Create collections:
./bin/solr create -c <COLL_NAME>
• No more WAR! The web container (Jetty) is now an implementation detail
• Scripts to support installing and running Solr as a service on Linux
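The two commands on this slide can be wrapped for scripting. The following is a minimal sketch, not part of the original talk: the helper name `solr_command` and the `/opt/solr` install path are assumptions for illustration, and only the `start`, `stop`, and `create -c` forms shown on the slide are used.

```python
import shlex
from pathlib import Path


def solr_command(action, collection=None, solr_home="."):
    """Build the argv list for one of the bin/solr actions shown on the slide.

    solr_home is a hypothetical install directory; adjust it to your setup.
    """
    script = str(Path(solr_home) / "bin" / "solr")
    if action in ("start", "stop"):
        return [script, action]
    if action == "create":
        if not collection:
            raise ValueError("create needs a collection name")
        return [script, "create", "-c", collection]
    raise ValueError(f"unsupported action: {action}")


if __name__ == "__main__":
    # Print the commands instead of running them; pass the list to
    # subprocess.run(...) on a machine where Solr is actually installed.
    for argv in (solr_command("start", solr_home="/opt/solr"),
                 solr_command("create", "demo", solr_home="/opt/solr")):
        print(shlex.join(argv))
```

Building the argument list rather than a shell string avoids quoting problems if a collection name ever contains unusual characters.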

Your Content, Your Way
JSON’s great:
• Solr 5 “does the right thing” for JSON out of the box
Except when it isn’t:
• Most data isn’t JSON
• Solr handles CSV, XML, and rich content out of the box without having to install plugins
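As a rough illustration of sending different formats to the same update endpoint, the sketch below builds (but does not send) HTTP requests for JSON and CSV bodies. It assumes a local node at `http://localhost:8983/solr` and a made-up collection named `demo`; the helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: a local Solr node on the default port.
SOLR_URL = "http://localhost:8983/solr"


def build_update_request(collection, body, content_type):
    """Build a POST to the collection's /update handler with the given body."""
    url = f"{SOLR_URL}/{collection}/update?commit=true"
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": content_type},
        method="POST",
    )


def json_update(collection, docs):
    """JSON documents: a list of dicts, one per document."""
    return build_update_request(collection, json.dumps(docs), "application/json")


def csv_update(collection, csv_text):
    """CSV documents: the first line is the header row of field names."""
    return build_update_request(collection, csv_text, "text/csv")


if __name__ == "__main__":
    req = json_update("demo", [{"id": "1", "title": "Inside Solr 5"}])
    print(req.get_method(), req.full_url, req.get_header("Content-type"))
    # To actually index (needs a running Solr): urllib.request.urlopen(req)
```

The only difference between the two calls is the `Content-Type` header, which is what lets one handler accept several formats.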

Your Content, Your Way
• Solr 5 will ship Tika 1.7, adding:
• OCR support
• PST and Matlab
• Better Date Handling
• More flexibility with spatial units

Pivots and Stats
• Stats and Pivot faceting now work together
• Focused on accuracy of results
• First few steps in unification of all facet types with stats and aggregations
• http://lucidworks.com/blog/you-got-stats-in-my-facets/
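Combining the two features comes down to tagging a `stats.field` and referencing that tag from `facet.pivot` through local params. Here is a small sketch that builds such a query string; the field names `price`, `cat`, and `inStock` and the tag `piv1` are illustrative assumptions, not taken from the talk.

```python
from urllib.parse import urlencode


def pivot_stats_params(stats_field, pivot_fields, tag="piv1"):
    """Return query parameters that compute stats inside each pivot bucket."""
    pivot = ",".join(pivot_fields)
    return [
        ("q", "*:*"),
        ("rows", "0"),
        ("stats", "true"),
        # Tag the stats field so the pivot facet can refer to it.
        ("stats.field", f"{{!tag={tag}}}{stats_field}"),
        ("facet", "true"),
        # Ask for the tagged stats to be computed per pivot bucket.
        ("facet.pivot", f"{{!stats={tag}}}{pivot}"),
    ]


if __name__ == "__main__":
    print(urlencode(pivot_stats_params("price", ["cat", "inStock"])))
```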

API Goodness
• Schema API: REST API for adding field types and dynamic fields
• Managing Request Handlers through API
• Implicit registration of replication, Real Time Get and Administration Handlers
• Improved APIs for managing collections
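To give a feel for the Schema API, the sketch below builds the JSON command bodies that would be POSTed to a collection's `/schema` endpoint. It assumes the collection uses a managed (mutable) schema; the collection name, field names, and the field types `text_general` and `string` are illustrative assumptions.

```python
import json


def add_field_command(name, field_type, stored=True, indexed=True):
    """Body for adding one explicit field."""
    return {"add-field": {"name": name, "type": field_type,
                          "stored": stored, "indexed": indexed}}


def add_dynamic_field_command(pattern, field_type, stored=True):
    """Body for adding a dynamic field rule such as '*_s'."""
    return {"add-dynamic-field": {"name": pattern, "type": field_type,
                                  "stored": stored}}


def schema_endpoint(collection, base="http://localhost:8983/solr"):
    """URL the command bodies are POSTed to (assumes a local node)."""
    return f"{base}/{collection}/schema"


if __name__ == "__main__":
    print(schema_endpoint("demo"))
    print(json.dumps(add_field_command("title", "text_general")))
    print(json.dumps(add_dynamic_field_command("*_s", "string")))
```

Each body would be sent with `Content-Type: application/json`; nothing here contacts a server.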

Lucene 5 Highlights
• Stronger index safety guarantees
• Reduced memory usage in a number of areas
• No more FieldCache (replaced w/ UninvertingReader)
• Multi-valued sorting and suggesters
• Better IO defaults when using SSDs
• More efficient handling of merging stored fields

Go Big
• Many scaling improvements focused on interactions with ZooKeeper:
  • Split cluster state management reduces chattiness in large multi-tenant implementations
  • Improved performance for Overseer operations (more than 40%)
  • Better timeout defaults based on real-world testing
• See my Lucene Revolution Keynote for more details:
http://bit.ly/shalinRevKeynote

Distributed IDF
• IDF = Inverse Document Frequency = A measure of the relative importance of a word in a collection
• 4 implementations:
  • LocalStatsCache: Local Stats
  • ExactStatsCache: One time use aggregation
  • ExactSharedStatsCache: Stats shared across requests
  • LRUStatsCache: Stats shared in an LRU cache across requests
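Why distributed IDF matters can be seen with a little arithmetic. The sketch below assumes the classic Lucene TF-IDF form, idf = 1 + ln(N / (df + 1)), and two hypothetical shards with invented document counts; it compares per-shard (local) IDF with IDF computed from aggregated statistics.

```python
import math


def idf(doc_freq, num_docs):
    """Classic TF-IDF inverse document frequency (assumed formula)."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Hypothetical shards: the same term is rare on one and common on the other.
shards = [
    {"num_docs": 1000, "doc_freq": 9},
    {"num_docs": 1000, "doc_freq": 499},
]

# Local stats: each shard scores with only its own counts.
local_idf = [idf(s["doc_freq"], s["num_docs"]) for s in shards]

# Aggregated stats: counts are summed across shards before computing IDF.
total_docs = sum(s["num_docs"] for s in shards)
total_df = sum(s["doc_freq"] for s in shards)
global_idf = idf(total_df, total_docs)

if __name__ == "__main__":
    print("local :", [round(v, 3) for v in local_idf])
    print("global:", round(global_idf, 3))
```

With local stats the same term gets a very different weight on each shard, so scores from different shards are not directly comparable when results are merged; aggregated stats give every shard the same weight for the term.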

Get Finished
• Ease of getting started means nothing if you can’t stay running in production
• Jepsen tests simulate network partitions, data loss, i.e. “The Real World”
• https://github.com/LucidWorks/jepsen/tree/solr-jepsen
• http://bit.ly/solr-jepsen

Stability Improvements
• Protection of ZK content
• ReplicationHandler now has an option to throttle the speed of replication
• More control over terminating long running queries
• Finite default timeouts for select and update requests
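One way clients take advantage of query termination is the `timeAllowed` request parameter (in milliseconds), combined with checking the `partialResults` flag in the response header. The sketch below only builds the URL and inspects an already-parsed JSON response; the base URL, collection name, and helper names are assumptions for illustration.

```python
from urllib.parse import urlencode


def select_url(collection, query, time_allowed_ms,
               base="http://localhost:8983/solr"):
    """Build a /select URL that asks Solr to give up after time_allowed_ms."""
    params = {"q": query, "wt": "json", "timeAllowed": str(time_allowed_ms)}
    return f"{base}/{collection}/select?{urlencode(params)}"


def is_partial(response):
    """True if the parsed JSON response is flagged as partial results."""
    header = response.get("responseHeader", {})
    return bool(header.get("partialResults", False))


if __name__ == "__main__":
    print(select_url("demo", "title:solr", 500))
    # A hand-written response dict standing in for a real Solr reply.
    print(is_partial({"responseHeader": {"status": 0, "partialResults": True}}))
```

Callers that see a partial response can decide whether to show it, retry with a larger budget, or report a timeout.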

Near Term Road Map
• Facets and Analytics:
  • Mix and match all facet types and stats (SOLR-6352, SOLR-6353, SOLR-4212)
  • Percentiles via t-digest (SOLR-6350)
• Replication performance (SOLR-6816)
• Finish off Config APIs (various)
• Data location aware ValueSource implementation for fast changing distributed data
• First class support for more languages OOTB

Resources
Release Notes:
• Solr: http://wiki.apache.org/solr/ReleaseNote50
• Lucene: https://wiki.apache.org/lucene-java/ReleaseNote50
Lucidworks: http://www.lucidworks.com
Shalin Shekhar Mangar
• shalin@apache.org
• Twitter: https://twitter.com/shalinmangar

Credits
What’s new in Solr 5.0 - Anshum Gupta
• http://www.slideshare.net/anshumg/solr-50
Lucidworks webinar “Inside Solr 5” - Grant Ingersoll
• http://www.slideshare.net/lucidworks/webinar-inside-apache-solr-5
