This document provides an overview of the origins and development of Solr from its beginnings at CNET in 2004 to its current state and future plans. Some key points:
- Solr originated from CNET's need for an alternative to a discontinued commercial enterprise search product and was first called "Fusion", then "SOLAR" (Search On Lucene And Resin), before being contributed to the Apache Software Foundation.
- Major milestones included distributed searching capabilities, faceting, spellchecking, and the introduction of SolrCloud in version 4 for distributed indexing with no single points of failure.
- Recent enhancements include document routing for improved distribution, collection aliases, and a REST API for managing the Solr schema.
- Future plans include greater scal
2. Origins of Solr
• CNET driven to find alternatives to discontinued commercial enterprise search product
• Plan A: ATOMICS (Apache TO MySQL In CNET Search)
– Standalone server speaking XML over HTTP
– Meet majority of “search” needs
– http://conferences.oreillynet.com/cs/mysqluc2005/view/e_sess/7066
• Plan B: “Something based on Lucene”
– Started Summer 2004
– First prototype called “Fusion”, later renamed SOLAR (Search On Lucene And Resin)
10. Seamless Online Shard Splitting
[Diagram: Shard1, Shard2, and Shard3, each with a leader and a replica; sub-shards Shard2_0 and Shard2_1; an incoming update]
1. New sub-shards created in “construction” state
2. Leader starts forwarding applicable updates, which are buffered by the sub-shards
3. Leader index is split and installed on the sub-shards
4. Sub-shards apply buffered updates then become “active” leaders and old shard becomes “inactive”
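The four splitting steps can be sketched as a toy in-memory simulation. This is only an illustration of the sequence on the slide, not Solr's implementation: the `Shard` class, the `split_shard` function, the MD5-based `doc_hash`, and the choice to split the hash range at its midpoint are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass, field


def doc_hash(doc_id: str) -> int:
    """Map a document id to a stable 32-bit value (illustrative, not Solr's hash)."""
    return int.from_bytes(hashlib.md5(doc_id.encode("utf-8")).digest()[:4], "big")


@dataclass
class Shard:
    name: str
    lo: int                      # inclusive lower bound of the hash range
    hi: int                      # inclusive upper bound of the hash range
    state: str = "active"
    index: dict = field(default_factory=dict)
    buffer: list = field(default_factory=list)

    def owns(self, doc_id: str) -> bool:
        return self.lo <= doc_hash(doc_id) <= self.hi


def split_shard(parent: Shard, updates_during_split: list) -> list:
    """Split `parent` into two sub-shards, following the four steps on the slide."""
    mid = (parent.lo + parent.hi) // 2

    # Step 1: new sub-shards are created in the "construction" state.
    subs = [
        Shard(f"{parent.name}_0", parent.lo, mid, state="construction"),
        Shard(f"{parent.name}_1", mid + 1, parent.hi, state="construction"),
    ]

    # Step 2: the leader keeps indexing and forwards each applicable update;
    # the sub-shards only buffer it for now.
    for doc_id, doc in updates_during_split:
        parent.index[doc_id] = doc
        for sub in subs:
            if sub.owns(doc_id):
                sub.buffer.append((doc_id, doc))

    # Step 3: the leader's index is split by hash range and installed on the sub-shards.
    for doc_id, doc in parent.index.items():
        for sub in subs:
            if sub.owns(doc_id):
                sub.index[doc_id] = doc

    # Step 4: sub-shards replay buffered updates (an idempotent overwrite here),
    # become active, and the old shard becomes inactive.
    for sub in subs:
        for doc_id, doc in sub.buffer:
            sub.index[doc_id] = doc
        sub.buffer.clear()
        sub.state = "active"
    parent.state = "inactive"
    return subs


parent = Shard("Shard2", 0, 2**32 - 1, index={"a": 1, "b": 2})
sub_shards = split_shard(parent, [("c", 3)])
```

After the call, `sub_shards` holds `Shard2_0` and `Shard2_1`, which together contain every document of the old shard plus the update that arrived during the split.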
11. Cloud Enhancements
• Request forwarding
– In a multi-collection cluster, any node can handle/forward requests for any collection
• Collection Aliases
http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=northeast&collections=NY,NJ,PA,CT,ME,MA,NH,RI,VT
• Coming Soon: Shard Aliases
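As a small sketch, the CREATEALIAS URL from this slide could be assembled programmatically. The helper name `create_alias_url` is hypothetical; the host, port, and parameters are taken from the slide, and the request itself is not sent.

```python
from urllib.parse import urlencode


def create_alias_url(base_url: str, alias: str, collections: list) -> str:
    """Build a Collections API URL that maps one alias to several collections."""
    params = {
        "action": "CREATEALIAS",
        "name": alias,
        "collections": ",".join(collections),
    }
    # safe="," keeps the collection list readable instead of encoding commas as %2C.
    return f"{base_url}/admin/collections?{urlencode(params, safe=',')}"


url = create_alias_url(
    "http://localhost:8983/solr",
    "northeast",
    ["NY", "NJ", "PA", "CT", "ME", "MA", "NH", "RI", "VT"],
)
```

Once such an alias exists, a query sent to `northeast` would be served across all of the listed collections.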
12. Schema REST API
• Restlet is now integrated with Solr
• Get a specific field
curl http://localhost:8983/solr/schema/fields/price
{"field":{
"name":"price",
"type":"float",
"indexed":true,
"stored":true
}}
• Get all fields
curl http://localhost:8983/solr/schema/fields
• Get Entire Schema!
curl http://localhost:8983/solr/schema
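A client consuming these endpoints might look roughly like the sketch below. The helper names `field_url` and `parse_field` are hypothetical, and the response body is the sample shown on the slide rather than output fetched from a live server.

```python
import json

SCHEMA_BASE = "http://localhost:8983/solr/schema"  # base URL as shown on the slide


def field_url(name=None) -> str:
    """Return the URL for one field, or for all fields when no name is given."""
    return f"{SCHEMA_BASE}/fields/{name}" if name else f"{SCHEMA_BASE}/fields"


def parse_field(body: str) -> dict:
    """Extract the field definition from a single-field response body."""
    return json.loads(body)["field"]


# Sample response copied from the slide; a real client would fetch field_url("price").
sample = '{"field":{"name":"price","type":"float","indexed":true,"stored":true}}'
price = parse_field(sample)
```

Here `price` is a plain dictionary, so a caller can check properties such as `price["type"]` or `price["stored"]` before deciding how to index or display the field.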
13. Dynamic Schema
• Add a new field (Solr 4.4)
curl -XPUT http://localhost:8983/solr/schema/fields/strength -d '{"type":"float", "indexed":"true"}'
• Works in distributed (cloud) mode too!
• Future: More schemaless
– Reality: there is no such thing for Lucene based systems
– Type guessing for fields we haven’t seen before
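The add-field call above could be prepared from Python along these lines. This is a sketch under assumptions: the helper `add_field_request` is hypothetical, the `Content-Type: application/json` header is not shown on the slide and is assumed here, and `indexed` is sent as a JSON boolean where the slide sends the string "true". The request is only constructed, not sent.

```python
import json
from urllib.request import Request

SCHEMA_BASE = "http://localhost:8983/solr/schema"  # base URL as shown on the slide


def add_field_request(name: str, field_type: str, indexed: bool = True) -> Request:
    """Build (but do not send) a PUT request that defines a new schema field."""
    body = json.dumps({"type": field_type, "indexed": indexed}).encode("utf-8")
    return Request(
        f"{SCHEMA_BASE}/fields/{name}",
        data=body,
        method="PUT",
        # Assumption: the server expects a JSON content type for this body.
        headers={"Content-Type": "application/json"},
    )


req = add_field_request("strength", "float")
```

Passing `req` to `urllib.request.urlopen` would issue the call against a running Solr instance with a managed schema.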
14. Future
• Greater scalability
• More “NoSQL”
– More ways to update & manipulate documents
• Analytics
– More powerful faceting, functions, statistics
• Improved Relational queries
• More dynamic (settings & configuration)
• Continued focus on ease of use