A tutorial on what ElasticSearch is and how to use it effectively in a real project.
The talk discusses how to integrate a search experience into an existing application, showing all the steps from downloading and configuring ElasticSearch to building the UI and wiring up the search logic (in a Rails application).
The talk was presented at RubyConf 2013.
2. Who am I?
I’m Luca Bonmassar (@openmosix)
# 31
# Italian living in San Francisco (and Stockholm)
I work at Gild
I love building products
[for fun, profit and boredom]
Tuesday, 5 November 13
3. Search use case
You’re building a product
User generated content
Let (other) users find or discover this content
4. Search is
NOT easy
It usually starts as
but then you want to support AND, OR, NOT,
double quotes on multiple fields so
and then it goes like
6. Agenda
Let’s define a “pet project”
Boilerplate (download, install, scaffold, config, bla bla bla, yadda, yadda, yadda)
Build a website w/ simple search
Build a more advanced search
What next (homework)
15. ElasticSearch III
1. Download / set up a ES cluster
2. Define settings and data mapping (opt)
3. Index Data
4. Query Index
MongoDB
Elastic Search
{
  "ok" : true,
  "status" : 200,
  "version" : {
    "number" : "0.90.5",
    "lucene_version" : "4.4"
  },
  "tagline" : "You Know, for Search"
}
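A quick way to confirm that the node is up is to parse this banner and read the version out of it. The sketch below assumes the JSON above is the body returned by the node's root URL; `es_version` is a hypothetical helper name, not part of any library.

```ruby
require 'json'

# Banner copied from the slide (ES 0.90.x).
BANNER = <<~JSON
  {
    "ok" : true,
    "status" : 200,
    "version" : { "number" : "0.90.5", "lucene_version" : "4.4" },
    "tagline" : "You Know, for Search"
  }
JSON

# Hypothetical helper: returns the version string when the node reports
# itself healthy, or nil otherwise.
def es_version(body)
  info = JSON.parse(body)
  return nil unless info['ok'] == true && info['status'] == 200

  info.dig('version', 'number')
end

puts es_version(BANNER)
```

Against a live node, the same helper could be fed the response body of a GET on http://localhost:9200/.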
curl -X GET 'http://localhost:9200/ruby_gems/_search?from=0&size=25&pretty'
> Curl
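The same search URL can be assembled from Ruby, which makes the paging parameters explicit. This is a minimal sketch using only the standard library; `search_uri` is a hypothetical helper, and the host and port are the defaults from the curl example.

```ruby
require 'uri'

# Hypothetical helper: builds the _search URL for an index, with paging.
# `from` is the offset of the first hit, `size` the number of hits returned.
def search_uri(index, from: 0, size: 25, host: 'localhost', port: 9200)
  query = URI.encode_www_form({ from: from, size: size, pretty: true })
  URI::HTTP.build(host: host, port: port, path: "/#{index}/_search", query: query)
end

puts search_uri('ruby_gems')
puts search_uri('ruby_gems', from: 25) # second page of 25 hits
```

With a node running, `Net::HTTP.get(search_uri('ruby_gems'))` would return the JSON response as a string.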
16. ES download + setup
> wget http://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.90.6.tar.gz
> tar zxvf elasticsearch-0.90.6.tar.gz
> sudo mv elasticsearch-0.90.6 /usr/local/
Hint #1: you need Java
Hint #2: you need Oracle Java
17. ES config
> ls elasticsearch-0.90.6/config/
logging.yml: where to log, how much to log
elasticsearch.yml: all server config. Defines:
Name of the cluster (change it!!!)
Node parameters (master/slave, store data/router)
Sharding and # replicas
Paths
Plugins
Memory (JVM, heap, memory locking)
Network config
“Gateway” (cluster backup)
Recovery
Discovery
Slow log + GC log
Default options are good enough for dev env
19. Profit
(you are now an
ElasticSearch
expert - go and
tell the world)
20. ElasticSearch operations
Create a “RubyGem” Index
Define a “RubyGem” Index data mapping
Index data (e.g. upload data from MongoDB to ES; index = POST)
Query (= GET)
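The four operations above map onto plain HTTP calls. The sketch below builds (but does not send) one request per step using only the Ruby standard library. The type name `ruby_gem`, the `name` and `info` fields, and all helper names are assumptions made for illustration; the `string` field type is the one used by ES releases of the 0.90 era.

```ruby
require 'json'
require 'net/http'
require 'uri'

INDEX = 'ruby_gems' # index name used in the curl example
TYPE  = 'ruby_gem'  # hypothetical type name

# Attaches a JSON body to a request object.
def with_json(req, payload)
  req['Content-Type'] = 'application/json'
  req.body = JSON.generate(payload)
  req
end

# Create the index and declare its data mapping in one call.
def create_index_request
  mapping = { TYPE => { properties: {
    name: { type: 'string' }, # hypothetical fields
    info: { type: 'string' }
  } } }
  with_json(Net::HTTP::Put.new("/#{INDEX}"), { mappings: mapping })
end

# Index one document (POST lets ES generate the document id).
def index_request(doc)
  with_json(Net::HTTP::Post.new("/#{INDEX}/#{TYPE}"), doc)
end

# Query the index with a URI search (GET).
def search_request(text)
  Net::HTTP::Get.new("/#{INDEX}/_search?" + URI.encode_www_form({ q: text }))
end

# Sends a request to a node; assumes one is listening on localhost:9200.
def send_to_es(req, host: 'localhost', port: 9200)
  Net::HTTP.start(host, port) { |http| http.request(req) }
end

[create_index_request, index_request({ name: 'tire' }), search_request('tire')].each do |req|
  puts "#{req.method} #{req.path}"
end
```

Keeping request construction separate from `send_to_es` means the requests can be inspected or tested without a running cluster.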
21. Tire
now Re-Tire ;(
A Ruby gem wrapping ElasticSearch REST APIs into a powerful Ruby DSL
ActiveModel integration
Rake tasks and utilities to load and query ElasticSearch
24. Indexing
Get a record
Convert it to JSON format (to_indexed_json)
Push it to Elastic Search (.update_index)
...under the hood...
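The three steps can be approximated without Tire to show what travels over the wire. The sketch below is a plain-Ruby stand-in, not Tire's implementation: `RubyGem` is a hypothetical Struct, `to_indexed_json` borrows the method name from the slide, and `index_request` is an invented helper that only builds the HTTP request a call like `.update_index` would presumably result in.

```ruby
require 'json'
require 'net/http'

# Plain-Ruby stand-in for a model (in the talk, a Rails model using Tire).
RubyGem = Struct.new(:id, :name, :info) do
  # Step 2: the JSON document that ends up in the index.
  def to_indexed_json
    JSON.generate({ name: name, info: info })
  end

  # Step 3: the request that pushes the document; built here, not sent.
  # Index and type names are assumptions.
  def index_request(index: 'ruby_gems', type: 'ruby_gem')
    req = Net::HTTP::Put.new("/#{index}/#{type}/#{id}")
    req['Content-Type'] = 'application/json'
    req.body = to_indexed_json
    req
  end
end

# Step 1: get a record (hard-coded here instead of loaded from a database).
record = RubyGem.new(1, 'tire', 'Ruby DSL for ElasticSearch')
req = record.index_request
puts "#{req.method} #{req.path}"
puts req.body
```

Using PUT with an explicit id means re-indexing the same record overwrites the previous document instead of creating a duplicate.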
25. Index
(all data)
Naive (POST on index for each record):
Use bulk updates:
...under the hood...
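To make the difference concrete: the naive approach issues one HTTP request per record, while the bulk API takes a single newline-delimited body with an action line followed by a source line for each document. The sketch below only builds that body; `bulk_body`, the index and type names, and the sample records are assumptions for illustration.

```ruby
require 'json'

# Builds the body for a POST to /_bulk: for every record, one action line
# and one source line, joined by newlines and terminated by a newline.
def bulk_body(records, index: 'ruby_gems', type: 'ruby_gem')
  lines = records.flat_map do |rec|
    [
      JSON.generate({ index: { _index: index, _type: type, _id: rec[:id] } }),
      JSON.generate(rec.reject { |key, _| key == :id })
    ]
  end
  lines.join("\n") + "\n"
end

# Hypothetical sample records.
records = [
  { id: 1, name: 'tire' },
  { id: 2, name: 'rails' }
]
puts bulk_body(records)
```

For large data sets the records would typically be sent in batches (for example with `each_slice`), so that each bulk request stays at a manageable size.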
40. Deployment I
Run your own cluster
Some learnings:
at least 3 nodes
memory profiling / GC
install very good monitoring (github.com/karmi/elasticsearch-paramedic)
more RAM is (always) better
Check IOPS (if on AWS)
Pros:
Total control
Cheaper (a lot cheaper)
Cons:
Can be a nightmare / Requires a dedicated DevOps engineer
41. Deployment II
ElasticSearch as a service
http://found.no
http://searchly.com
http://bonsai.io
Pros:
Get cluster up & running in a minute
Focus on dev, not troubleshooting
Professional support
Cons:
Expensive
Can be in the wrong region / hosting provider
Expensive
Did I say expensive?