The core Search frameworks in Liferay 7 have been significantly retooled to benefit not only from Liferay's new modular architecture, but also from one of the most innovative players in the market: Elasticsearch, which replaces Lucene as the default search engine in Portal. This session will cover topics like clustering and scalability, unveil improvements (both Elasticsearch and Solr) like aggregations, filters, geolocation, "more like this" and other new query types, and also hot new features for the Enterprise like out-of-the-box Marvel cluster monitoring and Shield security.
André "Arbo" Oliveira joined Liferay in early 2014 as a senior engineer and leads the Search Infrastructure team. He's been writing code for a living for 22 years, 14 of them as a Java developer and architect. Ever since discovering Elasticsearch, he's vowed never to write another SQL WHERE clause again.
12. Inside the Search Engine
The Index Documents Fields
Not that different from ye olde database?...
13. Indexing documents
PUT /megacorp/employee/1
{
"first_name" : "John",
"last_name" : "Smith",
"age" : 25,
"about" : "I love to go rock climbing",
"interests": [ "sports", "music" ]
}
PUT /megacorp/employee/2
{
"first_name" : "Jane",
"last_name" : "Smith",
"age" : 32,
"about" : "I like to collect rock albums",
"interests": [ "music" ]
}
PUT /megacorp/employee/3
{
"first_name" : "Douglas",
"last_name" : "Fir",
"age" : 35,
"about": "I like to build cabinets",
"interests": [ "forestry" ]
}
14. Queries and Filters
GET /megacorp/employee/_search?q=last_name:Smith "hits": [
{
"_source": {
"first_name": "John",
"last_name": "Smith",
"age": 25,
"about": "I love to go rock climbing",
"interests": [ "sports", "music" ]
}
},
{
"_source": {
"first_name": "Jane",
"last_name": "Smith",
"age": 32,
"about": "I like to collect rock albums",
"interests": [ "music" ]
}
}
]
GET /megacorp/employee/_search
{
"query" : {
"filtered" : {
"filter" : {
"range" : {
"age" : { "gt" : 21 }
}
},
"query" : {
"match" : {
"last_name" : "smith"
}
}
}
}
}
15. Full-Text Search
GET /megacorp/employee/_search
{
"query" : {
"match" : {
"about" : "rock climbing"
}
}
}
"hits": [
{
"_score": 0.16273327,
"_source": {
"first_name": "John",
"last_name": "Smith",
"age": 25,
"about": "I love to go rock climbing",
"interests": [ "sports", "music" ]
}
},
{
"_score": 0.016878016,
"_source": {
"first_name": "Jane",
"last_name": "Smith",
"age": 32,
"about": "I like to collect rock albums",
"interests": [ "music" ]
}
}
]
16. Analysis and Analyzers
Set the shape to semi-transparent by calling Set_Trans(5)
Standard analyzer
set, the, shape, to, semi, transparent, by, calling, set_trans, 5
Simple analyzer
set, the, shape, to, semi, transparent, by, calling, set, trans
Whitespace analyzer
Set, the, shape, to, semi-transparent, by, calling, Set_Trans(5)
English language analyzer
set, shape, semi, transpar, call, set_tran, 5
20. The Liferay Search architecture
Liferay
Portal
Assets:
web content,
message boards,
wiki pages...
Search
infrastructure
(Magic
happens
here)
Search
engine(s)
Indices,
documents,
analysis...
21. The Liferay Search Engine plugins
public interface SearchEngine {
public IndexSearcher getIndexSearcher();
public IndexWriter getIndexWriter();
}
public class ElasticsearchSearchEngine
extends BaseSearchEngine
public class ElasticsearchIndexSearcher
extends BaseIndexSearcher
public class ElasticsearchIndexWriter
extends BaseIndexWriter
public class SolrSearchEngine
extends BaseSearchEngine
public class SolrIndexSearcher
extends BaseIndexSearcher
public class SolrIndexWriter
extends BaseIndexWriter
35. Modularity and Search
● OSGi
● Liferay's default Search Engine: now a plugin in itself
● Extension points in Search
○ Node Settings contributors → fine tune your cluster
○ Index Settings contributors → fine tune your shards and
logs
○ Analyzers and Mappings contributors → fine tune your
fields and queries
37. Why Elasticsearch?
Best of breed
Built for modern web applications
Distributed and clusterable by design
Lucene based
Multi-tenancy
Great vendor support
Great monitoring tools: Marvel, Logstash
38. Great for Developers
Open Source
Amazing documentation
High "just works" factor, e.g. zero-config indexing and clustering
REST for queries, health, admin - everything
Update live settings programmatically
Great Java Client API
Pretty JSON for talks ;-)
42. Security: Shield
Protect your Liferay index with a username and password
SSL/TLS encryption for traffic within the Liferay Elasticsearch cluster
Elasticsearch plugin - no need for an external security solution
Restrict access to Liferay Portal instances with IP filtering