SlideShare ist ein Scribd-Unternehmen logo
1 von 90
Downloaden Sie, um offline zu lesen
Real-time search in Drupal.
Meet Elasticsearch
By Alexei Gorobets
asgorobets
Elasticsearch
Flexible and powerful open
source, distributed real-time
search and analytics engine
for the cloud
Why use
Elasticsearch?
● RESTful API
● Open Source
● JSON over HTTP
● based on Lucene
● distributed
● highly available
● schema free
● massively scalable
Setup in 2 steps:
1. Extract the archive
2. > bin/elasticsearch
How to use it?
> curl -XGET localhost:9200/?pretty
> curl -XGET localhost:9200/?pretty
{
"ok" : true,
"status" : 200,
"name" : "Infinity",
"version" : {
"number" : "0.90.1",
"snapshot_build" : false,
"lucene_version" : "4.3"
},
"tagline" : "You Know, for Search"
}
> curl -XGET localhost:9200/?pretty
action (verb)
> curl -XGET localhost:9200/?pretty
node + port
> curl -XGET localhost:9200/?pretty
path
> curl -XGET localhost:9200/?pretty
query string
Let's index some data
> PUT /index/type/id
Where?
It's very similar to
database in SQL
> PUT /index/type/id
What?
Table
Content type,
Entity type,
any kind of type you decide
> PUT /index/type/id
Which?
Node ID,
Entity ID,
any kind of serial ID
> PUT /mysite/node/1 -d
{
"nid": "1",
"status": "1",
"title": "Hello elasticsearch",
"body": "First elasticsearch document"
}
> PUT /mysite/node/1 -d
{
"nid": "1",
"status": "1",
"title": "Hello elasticsearch",
"body": "First elasticsearch document"
}
{
"ok":true,
"_index":"mysite",
"_type":"node",
"_id":"1",
"_version":1
}
Let's GET some data
> GET /mysite/node/1
{
"_index" : "mysite",
"_type" : "node",
"_id" : "1",
"_version" : 1,
"exists" : true,
"_source" : {
"nid":"1",
"status":"1",
"title":"Hello elasticsearch",
"body":"First elasticsearch document"
}
> GET /mysite/node/1?fields=title,body
Get specific fields
> GET /mysite/node/1?fields=title,body
Get specific fields
> GET /mysite/node/1/_source
Get source only
Let's UPDATE some data
> PUT /mysite/node/1 -d
{
"status":"0"
}
> PUT /mysite/node/1 -d
{
"ok":true,
"_index":"mysite",
"_type":"node",
"_id":"1",
"_version":2
}
{
"status":"0"
}
UPDATE = DELETE + PUT
Let's DELETE some data
> DELETE /mysite/node/1
> DELETE /mysite/node/1
{
"ok":true,
"found":true,
"_index":"mysite",
"_type":"node",
"_id":"1",
"_version":3
}
Distributed, Highly Available
> PUT /new_index -d '{
"settings" : {
"number_of_shards" : 3,
"number_of_replicas" : 2
}
}'
Concurrency, Version control
> PUT /myapp/node/1?version=1
{
"title": "hi girl"
}
> PUT /myapp/node/1?version=1
{
"title": "hi girl"
}
{
"_index": "myapp",
"_type": "node",
"_id": "1",
"_version": 1,
"created": false
}
> PUT /myapp/node/1?version=1
{
"title": "hey boy"
}
# 200
> PUT /myapp/node/1?version=1
{
"title": "hey boy"
}
# 409
> version conflict, current [2], provided [1]
Let's SEARCH for something
> GET /_search
> GET /_search
{
"took" : 32,
"timed_out" : false,
"_shards" : {
"total" : 20,
"successful" : 20,
"failed" : 0
},
"hits" : { results... }
}
Let's SEARCH in multiple
indices and types
> GET /index/_search
> GET /index/type/_search
> GET /index1,index2/_search
> GET /myapp_*/type, entity_*/_search
Let's PAGINATE results
> GET /_search?size=10&from=20
size = results per page
from = starting from
Let's search oldschool
> GET /_search?q=title:elasticsearch
> GET /_search?q=nid:60
+title:awesome
+status:1
+created:[1369917354 TO *]
?q=title:awesome%20%2Bcreated:
[1369917354%20TO%20*]%2Bstatus:1
+title:awesome
+status:1
+created:[1369917354 TO *]
The ugly encoding =)
Query DSL style
> GET /_search -d
{
"query": {
"match": "awesome"
}
}
> GET /_search -d
{
"query": {
"match" : {
"title" : {
"query" : "+awesome -poor",
"boost" : 2.0,
}
}
}
}
Mappings and types
Core types
* string
* number
* date
* boolean
Complex types
* array type
* object type
* nested type
Others:
ip type
geo point
geo shape
attachments
Define type mapping
> PUT /myapp/node -d
{
"node" : {
"properties" : {
"message" : {
"type" : "string",
"store" : true
}
}
}
}
Indexed fields
Full text
analyzed
== is splitted into
terms
Term
not analyzed
== is stored as is
> PUT /myapp/node -d
{
"node" : {
"properties" : {
"name" : {
"type" : "string",
"store" : true,
“index”: “not_analyzed”
}
}
}
}
Dynamic mapping
Analysis and indexing
Inverted index
1. “The quick brown fox
jumped over the lazy
dog”
2. “Quick brown foxes
leap over lazy dogs in
summer”
Term Doc_1 Doc_2
-------------------------
Quick | | X
The | X |
brown | X | X
dog | X |
dogs | | X
fox | X |
foxes | | X
in | | X
jumped | X |
lazy | X | X
leap | | X
over | X | X
quick | X |
summer | | X
the | X |
Analyzer
Tokenizers
● standard
● keyword
● whitespace
● ngram
TokenFilters
standard
lowercase
stop
truncate
snowball
> GET /_analyze?analyzer=standard -d
'this is a test baby'
{
"tokens" : [ {
"token" : "test",
"start_offset" : 10,
"end_offset" : 14,
"type" : "<ALPHANUM>",
"position" : 4
}, {
"token" : "baby",
"start_offset" : 15,
"end_offset" : 19,
"type" : "<ALPHANUM>",
"position" : 5
} ]
}
Autocomplete fields
Queries & Filters
Queries & Filters
full text search
relevance score
heavy
not cacheable
exact match
show or hide
lightning fast
cacheable
Combine Filters & Queries
> GET /_search -d
{
"query": {
"filtered": {
"query": {
"match": { "title": "awesome" }
},
"filter": {
"term": { "type": "article" }
}
}
}
}
and Sorting
> GET /_search -d
{
"query": {
"filtered": {
"query": {
"match": { "title": "awesome" }
},
"filter": {
"term": { "type": "article" }
}
}
}
"sort": {"date":"desc"}
}
Relevance. Explain API
Term frequency
How often does the term appear in the field?
The more often, the more relevant.
Inverse document frequency
How often does each term appear in the
index? The more often, the less relevant. T
Field norm
How long is the field? The longer it is, the less
likely it is that words in the field will be
relevant.
and Facets
Facets on Amazon
> GET /_search -d
{
"facets": {
"home_team": {
"terms": {
"field": "field_home_team"
}
}
}
}
> GET /_search -d
{
"facets": {
"home_team": {
"terms": {
"field": "field_home_team"
}
}
}
}
Give your facet a name
> GET /_search -d
{
"facets": {
"home_team": {
"terms": {
"field": "field_home_team"
}
}
}
}
Your facet filter can be:
● Terms
● Range
● Histogram
● Date Histogram
● Filter
● Query
● Statistical
● Terms Stats
● Geo Distance
"facets" : {
"home_team" : {
"_type" : "terms",
"missing" : 203,
"total" : 100,
"other" : 42,
"terms" : [ {
"term" : "hou",
"count" : 8
}, {
"term" : "sln",
"count" : 6
}, ...
STOP! I want this in Drupal?
Available modules:
Elasticsearch
Elasticsearch Connector
Search API elasticsearch
Development directions:
1. Search API implementation
2. Field Storage API
3. Alternative backends
Available modules:
Elasticsearch
Elasticsearch Connector
Search API elasticsearch
Field Storage API implementation
Elasticsearch field storage sandbox by Damien Tournoud
Started in July 2011
Field Storage API implementation
Elasticsearch field storage sandbox by Damien Tournoud
Started in July 2011
Elasticsearch EntityFieldQuery sandbox
https://drupal.org/sandbox/asgorobets/2073151
Let's DEMO
Let the Search be with you
Real-time search in Drupal with Elasticsearch @Moldcamp

Weitere ähnliche Inhalte

Was ist angesagt?

Ako prepojiť aplikáciu s Elasticsearch
Ako prepojiť aplikáciu s ElasticsearchAko prepojiť aplikáciu s Elasticsearch
Ako prepojiť aplikáciu s Elasticsearch
bart-sk
 
How to scraping content from web for location-based mobile app.
How to scraping content from web for location-based mobile app.How to scraping content from web for location-based mobile app.
How to scraping content from web for location-based mobile app.
Diep Nguyen
 

Was ist angesagt? (20)

Go database/sql
Go database/sqlGo database/sql
Go database/sql
 
Scrapy workshop
Scrapy workshopScrapy workshop
Scrapy workshop
 
Ako prepojiť aplikáciu s Elasticsearch
Ako prepojiť aplikáciu s ElasticsearchAko prepojiť aplikáciu s Elasticsearch
Ako prepojiť aplikáciu s Elasticsearch
 
Web Crawling Modeling with Scrapy Models #TDC2014
Web Crawling Modeling with Scrapy Models #TDC2014Web Crawling Modeling with Scrapy Models #TDC2014
Web Crawling Modeling with Scrapy Models #TDC2014
 
The ultimate guide for Elasticsearch plugins
The ultimate guide for Elasticsearch pluginsThe ultimate guide for Elasticsearch plugins
The ultimate guide for Elasticsearch plugins
 
MySQL in Go - Golang NE July 2015
MySQL in Go - Golang NE July 2015MySQL in Go - Golang NE July 2015
MySQL in Go - Golang NE July 2015
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to Elasticsearch
 
Practical Elasticsearch - real world use cases
Practical Elasticsearch - real world use casesPractical Elasticsearch - real world use cases
Practical Elasticsearch - real world use cases
 
Pydata-Python tools for webscraping
Pydata-Python tools for webscrapingPydata-Python tools for webscraping
Pydata-Python tools for webscraping
 
How to scraping content from web for location-based mobile app.
How to scraping content from web for location-based mobile app.How to scraping content from web for location-based mobile app.
How to scraping content from web for location-based mobile app.
 
Data Exploration with Elasticsearch
Data Exploration with ElasticsearchData Exploration with Elasticsearch
Data Exploration with Elasticsearch
 
Elasticsearch Distributed search & analytics on BigData made easy
Elasticsearch Distributed search & analytics on BigData made easyElasticsearch Distributed search & analytics on BigData made easy
Elasticsearch Distributed search & analytics on BigData made easy
 
Fun with Python
Fun with PythonFun with Python
Fun with Python
 
Ch7(publishing my sql data on the web)
Ch7(publishing my sql data on the web)Ch7(publishing my sql data on the web)
Ch7(publishing my sql data on the web)
 
Hue: Big Data Web applications for Interactive Hadoop at Big Data Spain 2014
Hue: Big Data Web applications for Interactive Hadoop at Big Data Spain 2014Hue: Big Data Web applications for Interactive Hadoop at Big Data Spain 2014
Hue: Big Data Web applications for Interactive Hadoop at Big Data Spain 2014
 
Developing OpenResty Framework
Developing OpenResty FrameworkDeveloping OpenResty Framework
Developing OpenResty Framework
 
Node.js and Parse
Node.js and ParseNode.js and Parse
Node.js and Parse
 
Sphinx on Rails
Sphinx on RailsSphinx on Rails
Sphinx on Rails
 
How to discover 1352 Wordpress plugin 0days in one hour (not really)
How to discover 1352 Wordpress plugin 0days in one hour (not really)How to discover 1352 Wordpress plugin 0days in one hour (not really)
How to discover 1352 Wordpress plugin 0days in one hour (not really)
 
Fun with exploits old and new
Fun with exploits old and newFun with exploits old and new
Fun with exploits old and new
 

Andere mochten auch (7)

Extending media presentation
Extending media presentationExtending media presentation
Extending media presentation
 
Media management in Drupal @Moldcamp
Media management in Drupal @MoldcampMedia management in Drupal @Moldcamp
Media management in Drupal @Moldcamp
 
Dependency injection in Drupal 8
Dependency injection in Drupal 8Dependency injection in Drupal 8
Dependency injection in Drupal 8
 
Why drupal
Why drupalWhy drupal
Why drupal
 
Создание дистрибутивов Drupal. Почему, зачем и как?
Создание дистрибутивов Drupal. Почему, зачем и как?Создание дистрибутивов Drupal. Почему, зачем и как?
Создание дистрибутивов Drupal. Почему, зачем и как?
 
Managing Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using ElasticsearchManaging Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using Elasticsearch
 
Migrate in Drupal 8
Migrate in Drupal 8Migrate in Drupal 8
Migrate in Drupal 8
 

Ähnlich wie Real-time search in Drupal with Elasticsearch @Moldcamp

Ähnlich wie Real-time search in Drupal with Elasticsearch @Moldcamp (20)

DRUPAL AND ELASTICSEARCH
DRUPAL AND ELASTICSEARCHDRUPAL AND ELASTICSEARCH
DRUPAL AND ELASTICSEARCH
 
Elasticsearch in 15 Minutes
Elasticsearch in 15 MinutesElasticsearch in 15 Minutes
Elasticsearch in 15 Minutes
 
Peggy elasticsearch應用
Peggy elasticsearch應用Peggy elasticsearch應用
Peggy elasticsearch應用
 
Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam...
Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam...Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam...
Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam...
 
Elasticsearch intro output
Elasticsearch intro outputElasticsearch intro output
Elasticsearch intro output
 
elasticsearch - advanced features in practice
elasticsearch - advanced features in practiceelasticsearch - advanced features in practice
elasticsearch - advanced features in practice
 
Elastic Search
Elastic SearchElastic Search
Elastic Search
 
Academy PRO: Querying Elasticsearch
Academy PRO: Querying ElasticsearchAcademy PRO: Querying Elasticsearch
Academy PRO: Querying Elasticsearch
 
Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
 Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data... Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
 
JSON and the APInauts
JSON and the APInautsJSON and the APInauts
JSON and the APInauts
 
Montreal Elasticsearch Meetup
Montreal Elasticsearch MeetupMontreal Elasticsearch Meetup
Montreal Elasticsearch Meetup
 
Ams adapters
Ams adaptersAms adapters
Ams adapters
 
REST with Eve and Python
REST with Eve and PythonREST with Eve and Python
REST with Eve and Python
 
Full-Text Search Explained - Philipp Krenn - Codemotion Rome 2017
Full-Text Search Explained - Philipp Krenn - Codemotion Rome 2017Full-Text Search Explained - Philipp Krenn - Codemotion Rome 2017
Full-Text Search Explained - Philipp Krenn - Codemotion Rome 2017
 
Philipp Krenn | Make Your Data FABulous | Codemotion Madrid 2018
Philipp Krenn | Make Your Data FABulous | Codemotion Madrid 2018Philipp Krenn | Make Your Data FABulous | Codemotion Madrid 2018
Philipp Krenn | Make Your Data FABulous | Codemotion Madrid 2018
 
Elasticsearch sur Azure : Make sense of your (BIG) data !
Elasticsearch sur Azure : Make sense of your (BIG) data !Elasticsearch sur Azure : Make sense of your (BIG) data !
Elasticsearch sur Azure : Make sense of your (BIG) data !
 
Philipp Krenn "Make Your Data FABulous"
Philipp Krenn "Make Your Data FABulous"Philipp Krenn "Make Your Data FABulous"
Philipp Krenn "Make Your Data FABulous"
 
Finding the right stuff, an intro to Elasticsearch with Ruby/Rails
Finding the right stuff, an intro to Elasticsearch with Ruby/RailsFinding the right stuff, an intro to Elasticsearch with Ruby/Rails
Finding the right stuff, an intro to Elasticsearch with Ruby/Rails
 
RESTful API 제대로 만들기
RESTful API 제대로 만들기RESTful API 제대로 만들기
RESTful API 제대로 만들기
 
Elasticsearch in 15 minutes
Elasticsearch in 15 minutesElasticsearch in 15 minutes
Elasticsearch in 15 minutes
 

Kürzlich hochgeladen

EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Earley Information Science
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
giselly40
 

Kürzlich hochgeladen (20)

Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 

Real-time search in Drupal with Elasticsearch @Moldcamp