Elasticsearch is an open-source, distributed search and analytics engine built on Apache Lucene. It allows storing, searching, and analyzing large volumes of data quickly and in near real-time. Key concepts include being schema-free, document-oriented, and distributed. Indices can be created to store different types of documents. Mapping defines how documents are indexed. Documents can be added, retrieved, updated, and deleted via RESTful APIs. Queries can be used to search for documents matching search criteria. Faceted search provides aggregated data based on search queries. Elastica provides a PHP client for interacting with Elasticsearch.
2. Concepts
• Elasticsearch is an open-source (Apache 2), distributed, RESTful
search engine built on top of Apache Lucene
• Schema-free and document-oriented
• Supports the JSON model
• Elasticsearch lets you completely control how a JSON
document gets mapped into the search engine, on a per-type and
per-index level.
• Multi Tenancy – Support for more than one index, support for
more than one type per index
• Distributed Nature - Indices are broken down into shards, each
shard with 0 or more replicas
• In RDBMS terms, index corresponds to database, type
corresponds to table, a document corresponds to a table row and
a field corresponds to a table column.
3. Create Index
• The create index API lets you create (instantiate) an index
• curl example that creates the sales index (index names must be lowercase)
$ curl -XPOST 'http://localhost:9200/sales/'
• Each index can have specific settings associated with it. The following
example creates the index sales with 3 shards, each with 2 replicas
curl -XPOST 'http://localhost:9200/sales/' -d '{
"settings" : {
"number_of_shards" : 3,
"number_of_replicas" : 2
}
}'
• Reference link :
http://www.elasticsearch.org/guide/reference/api/admin-indices-create-index.html
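The settings body above also determines how many shard copies the cluster has to place. The following Python sketch builds the same body and counts the copies, assuming each primary shard gets `number_of_replicas` additional copies. The helper name `total_shard_copies` is made up for illustration and is not part of any Elasticsearch client.

```python
import json

# Settings body from the create index example above.
settings = {"settings": {"number_of_shards": 3, "number_of_replicas": 2}}

def total_shard_copies(body):
    """Primaries plus replicas: every primary has `number_of_replicas` extra copies."""
    s = body["settings"]
    return s["number_of_shards"] * (1 + s["number_of_replicas"])

payload = json.dumps(settings)        # string that would be passed to curl -d
print(payload)
print(total_shard_copies(settings))   # 3 primaries * (1 + 2) = 9
```

With 3 shards and 2 replicas the cluster holds 9 shard copies in total, which is worth keeping in mind when sizing a small cluster.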
4. Mapping
• Mapping is the process of defining how a document should be mapped to
the search engine
• If no mapping is defined, Elasticsearch guesses the type of each field and
maps it accordingly.
• In ES, an index may store documents of different “mapping types”
• The put mapping API lets you register a specific mapping definition for a
specific type. Example: mapping for the order1 type
curl -XPOST 'http://localhost:9200/sales/order1/_mapping' -d '
{
"order1":
{
"properties":
{
"entity_id":{"type":"integer"},
"increment_id":{"type":"string","index":"not_analyzed"},
"status":{"type":"string"}
}
}
}'
5. Mapping
• Get the mappings available in an index.
The following curl example returns all types in the sales index together
with their associated mappings
curl -XGET 'localhost:9200/sales/_mapping?pretty=1'
• Get the mapping of a single type
curl -XGET 'localhost:9200/sales/order1/_mapping?pretty=1'
• Reference link :
http://www.elasticsearch.org/guide/reference/mapping/index.html
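A mapping body is just nested JSON keyed by the type name, so it can be generated rather than typed by hand. This is a minimal Python sketch that rebuilds the order1 mapping shown earlier. `build_mapping` is a hypothetical helper, not an Elasticsearch or Elastica function.

```python
import json

def build_mapping(type_name, properties):
    """Wrap field definitions in the {type: {"properties": ...}} envelope."""
    return {type_name: {"properties": properties}}

mapping = build_mapping("order1", {
    "entity_id": {"type": "integer"},
    # not_analyzed keeps the value as a single, exact-match token
    "increment_id": {"type": "string", "index": "not_analyzed"},
    "status": {"type": "string"},
})

print(json.dumps(mapping, indent=2))  # body for the put mapping request
```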
6. Add document
• The following example inserts the JSON document into the
“sales” index, under a type called “order1” with an id of 1:
curl -XPOST 'http://localhost:9200/sales/order1/1' -d '
{
"entity_id":1,
"increment_id":"1000001",
"status":"shipped"
}'
• Reference link:
http://www.elasticsearch.org/guide/reference/api/index_.html
7. GET API (Get data)
• The get API retrieves a typed JSON document from the
index based on its id. The following example gets a JSON
document from an index called sales, under a type called
order1, with id valued 1:
curl -XGET 'localhost:9200/sales/order1/1?pretty=1'
• The get operation lets you specify the set of fields to
return by passing the fields parameter. For example:
curl -XGET 'localhost:9200/sales/order1/1?fields=entity_id&pretty=1'
• Reference link :
http://www.elasticsearch.org/guide/reference/api/get.html
• For the multi get API:
http://www.elasticsearch.org/guide/reference/api/multi-get.html
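Document URLs follow the pattern index/type/id, with optional query parameters joined by `&` after a single `?`. The Python sketch below builds such URLs. It assumes the local node address used in the curl examples, and `doc_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

BASE = "http://localhost:9200"  # assumed local node, as in the curl examples

def doc_url(index, doc_type, doc_id, **params):
    """Build /index/type/id and append query parameters, if any."""
    url = "{}/{}/{}/{}".format(BASE, index, doc_type, doc_id)
    if params:
        url += "?" + urlencode(params)  # joins parameters with '&'
    return url

print(doc_url("sales", "order1", 1))
print(doc_url("sales", "order1", 1, fields="entity_id", pretty=1))
```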
8. Search API (Search data)
• The search API executes a search query and returns the search hits that
match the query. The following query returns the documents that have
entity_id 1
curl -XGET 'http://localhost:9200/sales/order1/_search' -d '{
"query" : {
"term" : { "entity_id" : 1 }
}
}
'
• Additional parameters for the search API include from, size, search_type,
sort and fields.
curl -XGET 'http://localhost:9200/sales/order1/_search' -d '{
"query" : {
"term" : {"status" : "confirmed" }
},
"from" : 0, "size" : 1, "sort" : [{"entity_id" : "desc"}], "fields" : ["entity_id","increment_id"]
}
'
• Reference link :
http://www.elasticsearch.org/guide/reference/api/search/request-body.html
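Hand-written request bodies are easy to get wrong (one missing brace makes the whole request invalid), so it can help to build them as data and serialize them. The sketch below is one possible way to do that in Python. `search_body` and its parameter names are assumptions made for this example; only the resulting JSON keys come from the request shown above.

```python
import json

def search_body(field, value, start=0, size=10, sort=None, fields=None):
    """Term query plus optional paging, sorting and field selection."""
    body = {"query": {"term": {field: value}}, "from": start, "size": size}
    if sort:
        body["sort"] = [{name: order} for name, order in sort]
    if fields:
        body["fields"] = list(fields)
    return body

body = search_body("status", "confirmed", size=1,
                   sort=[("entity_id", "desc")],
                   fields=["entity_id", "increment_id"])
print(json.dumps(body))  # always well-formed JSON
```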
9. Multi - Search API (Search data)
• The search API can be applied to multiple types within an index,
and across multiple indices with support for the multi index syntax.
For example, we can search on all documents across all types within
the sales index:
curl -XGET 'http://localhost:9200/sales/_search?q=status:confirmed'
• We can also search within specific types:
curl -XGET 'http://localhost:9200/sales/order,order1/_search?q=status:confirmed'
• We can also search all orders with a certain field across several
indices:
curl -XGET 'http://localhost:9200/sales,newsales/order1/_search?q=entity_id:1'
• We can search all orders across all available indices using the _all
placeholder:
curl -XGET 'http://localhost:9200/_all/order1/_search?q=entity_id:1'
• We can even search across all indices and all types:
curl -XGET 'http://localhost:9200/_search?q=entity_id:1'
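The URL variants above follow one pattern: comma-separated indices, then comma-separated types, then _search, with _all standing in when types are given without indices. A small Python sketch of that pattern follows. `search_path` is a hypothetical helper written for this example.

```python
def search_path(indices=(), types=()):
    """Build the path part of a multi-index / multi-type search URL."""
    parts = []
    if indices or types:
        # _all is the placeholder when only types are restricted
        parts.append(",".join(indices) if indices else "_all")
    if types:
        parts.append(",".join(types))
    parts.append("_search")
    return "/" + "/".join(parts)

print(search_path(["sales"]))                       # all types in one index
print(search_path(["sales"], ["order", "order1"]))  # specific types
print(search_path(["sales", "newsales"], ["order1"]))
print(search_path(types=["order1"]))                # every index, one type
print(search_path())                                # everything
```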
10. Update API
• The update API updates a document based on a script
that you provide. The following example updates the status field of the
document with id 1 to a new value.
curl -XPOST 'localhost:9200/sales/order1/1/_update' -d '{
"script" : "ctx._source.status = newStatus",
"params" : {
"newStatus" : "confirmed"
}
}'
• We can also add a new field to the document:
curl -XPOST 'localhost:9200/sales/order1/1/_update' -d '{
"script" : "ctx._source.newField = \"new field introduced\""
}'
• We can also remove a field from the document:
curl -XPOST 'localhost:9200/sales/order1/1/_update' -d '{
"script" : "ctx._source.remove(\"newField\")"
}'
• Reference link :
http://www.elasticsearch.org/guide/reference/api/update.html
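Scripts that contain string literals need their inner double quotes escaped, otherwise the request body is not valid JSON. Letting a JSON library do the escaping avoids the problem. This is a short Python sketch of that idea, and the variable names are illustrative only.

```python
import json

# Script with an embedded string literal, written naturally in Python.
add_field = {"script": 'ctx._source.newField = "new field introduced"'}

# Passing the value as a parameter avoids quoting inside the script entirely.
set_status = {
    "script": "ctx._source.status = newStatus",
    "params": {"newStatus": "confirmed"},
}

encoded = json.dumps(add_field)   # inner quotes become \" automatically
print(encoded)
print(json.dumps(set_status))
```

Using params where possible keeps the script text free of embedded quotes altogether.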
11. Delete API
• The delete API deletes a typed JSON document from a
specific index based on its id. The following example deletes the
JSON document from an index called sales, under a type called
order1, with id valued 1:
curl -XDELETE 'http://localhost:9200/sales/order1/1'
• Delete an entire type
curl -XDELETE 'http://localhost:9200/sales/order1'
• The delete by query API deletes documents from one or
more indices and one or more types based on a query:
curl -XDELETE 'http://localhost:9200/sales/order1/_query?q=entity_id:1'
curl -XDELETE 'http://localhost:9200/sales/_query?q=entity_id:1'
curl -XDELETE 'http://localhost:9200/sales/order1/_query' -d '{
"term" : { "status" : "confirmed" }
}'
12. Count API
• The count API executes a query and returns the number of
matches for that query. It can be executed across one or more indices and
across one or more types.
curl -XGET 'http://localhost:9200/sales/order/_count' -d '
{
"term":{"status":"confirmed"}
}'
curl -XGET 'http://localhost:9200/_count' -d '
{
"term":{"status":"confirmed"}
}'
curl -XGET 'http://localhost:9200/sales/order,order1/_count' -d '
{
"term":{"status":"confirmed"}
}'
• Reference Link :
http://www.elasticsearch.org/guide/reference/api/count.html
13. Facet Search
• Facets provide aggregated data based on a search query.
• A terms facet can return facet counts for various facet values for a
specific field. ElasticSearch supports more facet implementations,
such as range, statistical or date histogram facets.
• The field used for facet calculations must be of type numeric,
date/time or be analyzed as a single token.
• You can give the facet a custom name and return multiple facets in
one request.
• Now, let’s query the index for products that have category id 3 and
retrieve a terms facet for the brands field. We will simply name the facet
Brands (example of a terms facet)
curl -XGET 'localhost:9200/category/products/_search?pretty=1' -d '
{
"query": {"term":{"category_id":3} },
"facets":
{
"Brands": {"terms":{"fields":["brands"],"size":10,"order":"term"}}
}
}'
• Reference link:
http://www.elasticsearch.org/guide/reference/api/search/facets/
http://www.elasticsearch.org/guide/reference/api/search/facets/terms-facet.html
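Conceptually, a terms facet counts how often each value of a field occurs among the documents matched by the query. The plain-Python sketch below mimics that over made-up product documents. It is only an illustration of the idea, not how Elasticsearch computes facets internally, and `terms_facet` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical product documents (sample data invented for this sketch).
products = [
    {"category_id": 3, "brands": "acme"},
    {"category_id": 3, "brands": "zenith"},
    {"category_id": 3, "brands": "acme"},
    {"category_id": 5, "brands": "acme"},
]

def terms_facet(docs, field, size=10):
    """Count values of `field`, ordered by term, limited to `size` entries."""
    counts = Counter(doc[field] for doc in docs if field in doc)
    return [{"term": t, "count": c} for t, c in sorted(counts.items())[:size]]

hits = [p for p in products if p["category_id"] == 3]  # the "query" part
print(terms_facet(hits, "brands"))
```

The facet is computed over the query hits only, so the product in category 5 does not contribute to the acme count.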
14. Facet search
• The range facet lets you specify a set of ranges and get both the number of docs
(count) that fall within each range, and aggregated data based either on the same field
or on another field.
curl -XGET 'localhost:9200/sales/order/_search?pretty=1' -d '
{
"query" : {"term" : {"status" : "confirmed"} },
"facets" : {
"range1" : {
"range" : {
"grand_total" : [
{ "to" : 50 },
{ "from" : 20, "to" : 70 },
{ "from" : 70, "to" : 120 },
{ "from" : 150 }
]
}
}
},
"sort":[{"entity_id":"asc"}]
}'
• Reference link :
http://www.elasticsearch.org/guide/reference/api/search/facets/range-facet.html
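The ranges in the request above overlap (20 to 50 is covered twice) and leave a gap (120 to 150), so one document can be counted in two buckets or in none. The Python sketch below makes this visible on made-up grand_total values. It assumes "from" is inclusive and "to" is exclusive, which should be checked against the reference documentation. `range_counts` is a hypothetical helper.

```python
# Ranges copied from the range facet request above.
RANGES = [{"to": 50}, {"from": 20, "to": 70}, {"from": 70, "to": 120}, {"from": 150}]

def range_counts(values, ranges):
    """Count values per range; assumes from <= value < to (an assumption)."""
    result = []
    for r in ranges:
        lo = r.get("from", float("-inf"))
        hi = r.get("to", float("inf"))
        count = sum(1 for v in values if lo <= v < hi)
        result.append(dict(r, count=count))
    return result

grand_totals = [10, 30, 60, 80, 130, 200]  # sample values invented for this sketch
for bucket in range_counts(grand_totals, RANGES):
    print(bucket)
```

Here 30 is counted in the first two buckets, while 130 falls into none of them.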
15. Elastica
• Elastica is an Open Source PHP client for the elasticsearch search
engine/database.
• Reference Link : http://www.elastica.io/en
• To use Elastica, download it and include it in your project using PHP
autoload.
function __autoload_elastica ($class)
{
$path = str_replace('_', '/', $class);
if (file_exists('/xampp/htdocs/project/Elastica/lib/' . $path . '.php'))
{
require_once('/xampp/htdocs/project/Elastica/lib/' . $path . '.php');
}
}
spl_autoload_register('__autoload_elastica');
• Connecting to Elasticsearch:
On a single node:
$elasticaClient = new Elastica_Client(array('host' => '192.168.0.27','port' => '9200'));
• It is quite easy to start an Elasticsearch cluster simply by starting multiple
instances of Elasticsearch on one server or on multiple servers. One of the
goals of the distributed search index is availability. If one server goes
down, search results should still be served.
$elasticaClient = new Elastica_Client(array('servers' => array(
array('host' => '192.168.0.27','port' => '9200'),
array('host' => '192.168.0.27','port' => '9201'))));
17. Elastica Add documents
$elasticaClient = new Elastica_Client(array('host' => '192.168.0.27','port' => '9200'));
$elasticaIndex = $elasticaClient->getIndex('sales');
$elasticaType = $elasticaIndex->getType('order');
// The id of the document
$id = 1;
// Create a document
$record = array('entity_id' => 1, 'increment_id' => '100001', 'status' => 'confirmed');
$recordDocument = new Elastica_Document($id, $record);
// Add record to type
$elasticaType->addDocument($recordDocument);
// Refresh index
$elasticaType->getIndex()->refresh();
18. Elastica Get Document
$elasticaClient = new Elastica_Client(array('host' => '192.168.0.27','port' => '9200'));
$index = $elasticaClient->getIndex('sales'); // get index
$type = $index->getType('order'); // get type
$id = 1; // id of the document to fetch
$doc = $type->getDocument($id)->getData(); // get data
19. Elastica Update Document
$elasticaClient = new Elastica_Client(array('host' => '192.168.0.27','port' => '9200'));
$index = $elasticaClient->getIndex('sales'); // get index
$type = $index->getType('order'); // get type
$id = 1; // id of the document to update
$newVal = 'confirmed'; // new value for the status field
$update = new Elastica_Script("ctx._source.status = newval", array('newval' => $newVal));
$res = $type->updateDocument($id, $update);
// An empty response or a missing 'ok' flag both count as failure
$val = !empty($res) ? $res->getData() : array();
if (!empty($val['ok']))
{
echo "updated";
}
else
{
echo "value not updated";
}
20. Elastica Search Documents
• The search API executes a search query and returns the
search hits that match the query.
• The search API consists of the following major components:
– Query String
– Term
– Terms
– Range
– Bool Query
– Filter (also includes Filter_Term, Filter_Range, etc.)
– Facets (includes Facet_Range, Facet_Terms, Facet_Filter,
Facet_Query, Facet_Statistical, etc.)
– Query (where we can set the output fields, limit and sorting)
21. Search Documents – Query String
$elasticaClient = new Elastica_Client(array('host' => '192.168.0.27','port' => '9200'));
$elasticaIndex = $elasticaClient->getIndex('sales');
$elasticaType = $elasticaIndex->getType('order');
$elasticaQueryString = new Elastica_Query_QueryString();
$elasticaQueryString->setQuery('shipped*');
$elasticaQueryString->setFields(array('status')); // one or more fields can be set for the query string
$elasticaQuery = new Elastica_Query();
$elasticaQuery->setQuery($elasticaQueryString);
$elasticaQuery->setFields(array('increment_id','entity_id','billing_name','grand_total'));
$elasticaQuery->setFrom(0);
$elasticaQuery->setLimit(20);
$sort = array("entity_id" => "desc");
$elasticaQuery->setSort($sort);
$elasticaResultSet = $elasticaType->search($elasticaQuery);
$totalResults = $elasticaResultSet->getTotalHits();
$elasticaResults = $elasticaResultSet->getResults();
foreach ($elasticaResults as $elasticaResult)
{
print_r($elasticaResult->getData());
}
22. Search Documents – Query Term
$elasticaQueryTerm = new Elastica_Query_Term();
$elasticaQueryTerm->setTerm('entity_id',1);
$elasticaQuery = new Elastica_Query();
$elasticaQuery->setQuery($elasticaQueryTerm);
$elasticaQuery->setFields(array('increment_id','entity_id','billing_name','grand_total'));
$elasticaQuery->setFrom(0);
$elasticaQuery->setLimit(20);
$sort = array("entity_id" => "asc");
$elasticaQuery->setSort($sort);
$elasticaResultSet = $elasticaType->search($elasticaQuery);
23. Search Documents – Query Terms
$elasticaQueryTerms = new Elastica_Query_Terms();
//for query terms, you can specify 1 or more than 1 value per field
$elasticaQueryTerms->setTerms('entity_id', array(1,2,3,4,5));
$elasticaQueryTerms->addTerm(6);
$elasticaQuery = new Elastica_Query();
$elasticaQuery->setQuery($elasticaQueryTerms);
$elasticaQuery->setFields(array('increment_id','entity_id','billing_name','grand_total'));
$elasticaQuery->setFrom(0);
$elasticaQuery->setLimit(20);
$sort = array("entity_id" => "asc");
$elasticaQuery->setSort($sort);
$elasticaResultSet = $elasticaType->search($elasticaQuery);
24. Search Documents – Query Range
$elasticaQueryRange = new Elastica_Query_Range();
//for range query , you can specify from, from & to or to only
$elasticaQueryRange->addField('entity_id', array('from' => 10,"to"=>14));
$elasticaQuery = new Elastica_Query();
$elasticaQuery->setQuery($elasticaQueryRange);
$elasticaQuery->setFields(array('increment_id','entity_id','billing_name','grand_total'));
$elasticaQuery->setFrom(0);
$elasticaQuery->setLimit(20);
$sort = array("entity_id" => "asc");
$elasticaQuery->setSort($sort);
$elasticaResultSet = $elasticaType->search($elasticaQuery);
25. Search Documents – Bool Query
• The bool query maps to the Lucene BooleanQuery
• A bool query contains clauses with an occurrence type: must, should or must_not
$boolQuery = new Elastica_Query_Bool();
$elasticaQueryString = new Elastica_Query_QueryString();
$elasticaQueryString->setQuery('shoh*');
$elasticaQueryString->setFields(array('billing_name', 'shipping_name'));
$boolQuery->addMust($elasticaQueryString);
$elasticaQueryTerm = new Elastica_Query_Term();
$elasticaQueryTerm->setTerm('entity_id',1);
$boolQuery->addMust($elasticaQueryTerm );
$elasticaQuery = new Elastica_Query();
$elasticaQuery->setQuery($boolQuery);
$elasticaResultSet = $elasticaType->search($elasticaQuery);
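For comparison with the curl examples earlier in the deck, the same bool query can be written as a JSON request body. The Python sketch below assembles it from the two must clauses used above. `bool_must` is a hypothetical helper, and the body it prints is a sketch of the equivalent request, not output captured from Elastica.

```python
import json

def bool_must(*clauses):
    """Wrap query clauses so that every one of them must match."""
    return {"query": {"bool": {"must": list(clauses)}}}

body = bool_must(
    {"query_string": {"query": "shoh*",
                      "fields": ["billing_name", "shipping_name"]}},
    {"term": {"entity_id": 1}},
)
print(json.dumps(body, indent=2))
```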
26. Search Documents – Query Filters
• When doing things like facet navigation, sometimes only the hits need to be filtered by
the chosen facet, while all the facets should continue to be calculated based on the original
query. The filter element within the search request can be used to accomplish this.
$elasticaQueryString = new Elastica_Query_QueryString();
$elasticaQueryString->setQuery('*');
$elasticaQueryString->setFields(array('increment_id'));
$filteredQuery = new Elastica_Query_Filtered($elasticaQueryString,
new Elastica_Filter_Range('created_at', array('from' => '2011-01-04 07:36:00', 'to' => '2013-01-04 19:36:25')));
$elasticaQuery = new Elastica_Query();
$elasticaQuery->setQuery($filteredQuery);
$elasticaResultSet = $elasticaType->search($elasticaQuery);
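The behaviour described at the top of this slide, where a filter narrows the hits while facets still reflect the original query, can be mimicked in plain Python. This is only an illustrative sketch over made-up documents, not how Elasticsearch implements filtering. `search_with_filter` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical documents already matched by the original query.
matched = [
    {"entity_id": 1, "status": "confirmed"},
    {"entity_id": 2, "status": "shipped"},
    {"entity_id": 3, "status": "confirmed"},
]

def search_with_filter(docs, field, value):
    """Return filtered hits plus facet counts taken from all query matches."""
    facet = Counter(d[field] for d in docs)        # facets see every match
    hits = [d for d in docs if d[field] == value]  # only the hits are narrowed
    return hits, dict(facet)

hits, facet = search_with_filter(matched, "status", "confirmed")
print([h["entity_id"] for h in hits])
print(facet)
```

The facet still reports the shipped order even though it is no longer among the hits, which is what lets facet navigation keep showing the other choices.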
27. Elastica - Facet Terms
$elasticaQuery = new Elastica_Query();
$elasticaQuery->setQuery($boolQuery); // set main query
$facet = new Elastica_Facet_Terms('status Facet');
$facet->setField('status');
$facet->setOrder('term'); // other options are reverse_term, count, reverse_count
$facet->setSize(5);
$elasticaQuery->addFacet($facet); // add facet to query
$elasticaResultSet = $elasticaType->search($elasticaQuery);
$facets = $elasticaResultSet->getFacets(); // get facets data
foreach($facets as $k=>$v)
{
if(isset($v['terms']) && is_array($v['terms']))
{
$data['facets'][$k]=$v['terms'];
}
}