Sören Schneider, Alkacon Software 
WORKSHOP TRACK Using the SOLR Collector 
27.11.2014
1.Brief Introduction Into Solr 
2.Common Mistakes Using OpenCms & Solr 
3.Using the Solr Collector (DEMO) 
4.Spellchecking in OpenCms Using Solr 
Agenda
●Solr is a very versatile and powerful search engine that supports various features 
●This functionality comes at the price of increased complexity in handling Solr 
●Many customizations available 
●All fields composing a single document are typed 
Brief Solr Introduction
●The data structures of Solr's documents are defined in the file schema.xml 
●Changing this file requires reindexing 
●Dynamic fields mitigate that limitation 
●Fields matching a dynamic field's wildcard pattern can be used without being explicitly defined in the schema 
Defining Solr's Data Structure
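The wildcard idea behind dynamic fields can be sketched in a few lines of Python. This is only an illustration of pattern-based type lookup, not Solr's own resolution logic; the patterns, type names, and the function resolve_type are hypothetical and chosen for this example.

```python
from fnmatch import fnmatchcase

# Hypothetical dynamic-field patterns and their types (illustration only).
DYNAMIC_FIELDS = [
    ("*_s", "string"),
    ("*_txt", "text"),
    ("*_dt", "date"),
]


def resolve_type(field_name, explicit_fields=None):
    """Return the type of a field: explicit definition first, then wildcard match."""
    explicit_fields = explicit_fields or {}
    if field_name in explicit_fields:
        return explicit_fields[field_name]
    for pattern, field_type in DYNAMIC_FIELDS:
        if fnmatchcase(field_name, pattern):
            return field_type
    # No explicit definition and no matching pattern: the field is unknown.
    return None


print(resolve_type("city_s"))
print(resolve_type("id", {"id": "string"}))
print(resolve_type("unknown"))
```

A field such as city_s needs no schema entry of its own because its suffix matches a pattern, which is why adding such fields does not require editing the schema.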
Solr: Indexing Content 
a: date 
b: text 
c: string 
Solr processing (through analyzers, filters and tokenizers) 
a: date 
b: string 
c: string
●"Direct" usage of OpenCms & Solr requires a basic understanding of Solr 
●Use the proper data types for the individual use case, and learn how filters work 
●Know the query syntax (for the appropriate data types) 
●Most common mistakes of OpenCms users result from insufficient knowledge of Solr basics 
OpenCms & Solr
1.Using improper types 
●"text" vs "string" 
●Formulating correct queries 
2.Issues regarding mapping OpenCms <->Solr 
3.(Encoding Problems) 
Common Mistakes Using Solr & OpenCms
●String 
●Stores its content as an exact string 
●No tokenization or processing is performed 
●Useful when searching for an exact value 
●Text 
●Tokenization and processing are performed 
●Useful when searching for part of the content 
"text" vs "string"
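The difference can be illustrated with a small Python sketch. It assumes a deliberately simplified "analyzer" that only lowercases and splits on word boundaries; real Solr analyzer chains are configurable and do considerably more. The function names are hypothetical.

```python
import re


def tokenize(value):
    """Lowercase the value and split it into word tokens (simplified analyzer)."""
    return re.findall(r"\w+", value.lower())


def matches_string(stored, query):
    """A 'string' field only matches the exact stored value."""
    return stored == query


def matches_text(stored, query):
    """A 'text' field matches when every query token occurs in the stored tokens."""
    stored_tokens = tokenize(stored)
    return all(term in stored_tokens for term in tokenize(query))


city = "New York"
print(matches_string(city, "york"))
print(matches_string(city, "New York"))
print(matches_text(city, "york"))
```

Searching for "york" finds nothing in the string variant but succeeds in the text variant, which is the typical symptom of having picked the wrong type.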
●OpenCms copies the entire XML content into a single(!) locale-aware Solr field of type "text" for each locale 
●Specific information of a resource is made searchable in OpenCms using two approaches 
●Automatic mapping of properties to Solr fields 
●Manual definition of mappings 
Making Your Content Searchable
Indexing Content w/o Searchsettings 
Solr processing (through analyzers, filters and tokenizers) 
x: text 
a: date 
b: string 
c: string
Indexing Content with Searchsettings 
a: date 
b: text 
c: string 
Solr processing (through analyzers, filters and tokenizers) 
a: date 
b: string 
c: string
●Mapping happens in the schema of the appropriate resource type 
●Excerpt 
Solr – OpenCms Interaction: Mapping 
<xsd:schema 
… 
<xsd:annotation> 
<xsd:appinfo> 
<searchsettings> 
<searchsetting element="City" searchcontent="true"> 
<solrfield targetfield="city" sourcefield="_s" /> 
</searchsetting> … 
Resource type element name
Element Mapping Attributes 
Attribute Name | Effect on the Solr Field 
targetfield* | The resulting name 
locale | Write content only for a specific locale 
sourcefield | Defines the resulting type 
copyfields | Copies the value to a different field 
default | Sets a default value 
boost | Sets a boost for the field
●Users complain about problems with certain characters, mostly German umlauts, in Solr results 
●In nearly all cases, the sole cause is that Solr's integration into the servlet container does not use UTF-8 
●Extra note for Tomcat users: Please check whether you added the required attributes to all appropriate "<Connector>"s ;-) 
Using UTF-8 in Solr
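The typical symptom can be reproduced in a few lines of Python. The sketch assumes that the sender encodes the text as UTF-8 while the receiving side falls back to ISO-8859-1; the sample name and the helper decode_as are made up for illustration.

```python
def decode_as(data, charset):
    """Decode raw bytes with the given charset, as a receiving container would."""
    return data.decode(charset)


original = "Müller"
wire_bytes = original.encode("utf-8")  # what is actually sent over the wire

misread = decode_as(wire_bytes, "iso-8859-1")  # receiver assumes the wrong charset
correct = decode_as(wire_bytes, "utf-8")       # receiver configured for UTF-8

print(misread)  # MÃ¼ller
print(correct)  # Müller
```

The two-byte UTF-8 sequence for "ü" is read as two separate ISO-8859-1 characters, so the umlaut shows up as "Ã¼" in queries and results until both sides agree on UTF-8.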
●Live Demo 
Live Demo 
WYSIWYG Spellchecker
●The spellchecker is implemented using Solr 
●Solr already provides a flexible component named "SpellCheckComponent" 
●This component supports inline spellchecking of Solr queries 
●The source for suggestions can be specified by Solr fields or text files 
WYSIWYG Spellchecker
●The "SpellCheckComponent" is widely used to implement the "Did you mean?" feature known from popular search engines 
●The component is 
●Reliable and mature 
●Fast 
●Plus, Solr is already available in OpenCms 
Why Use Solr as a Spellchecker
●If both use cases rely on the same component, how do the implementations actually differ? 
●"Did you mean?" builds its source of suggested words from the entire data the search runs on. Usually only a single hit is returned. 
●The WYSIWYG spellchecker builds its source of suggestions from data that contains solely the dictionary for a single language 
Implementation Differences Between the Use Cases
●Spellchecking is implemented using another Solr core that resides in WEB-INF/spellcheck 
●As the only purpose of this core is to contain spellcheck information, the schema.xml file is as simple as it gets 
●Why use another Solr core instead of the default core that's used by OpenCms? 
●Dictionaries are stored as one Solr index per language 
How to model this scenario using Solr?
●Sadly, the spellchecking interfaces of tinyMCE and Solr are incompatible 
Problems regarding tinyMCE and Solr 
Comparison Spellcheck Responses 
tinyMCE: 
{ 
"id":"c0", 
"result":{"hsoue":["house", "has"]} 
} 
Solr: 
"spellcheck":{ "suggestions":[ "hsoue",{"numFound":5, "startOffset":0, "endOffset":4, "origFreq":0, "suggestion":[{"word":"house","freq":53}, {"word":"has","freq":271}, … ]}, "correctlySpelled",false, "collation","hsue" ]},
●A new component had to be implemented in OpenCms that essentially 
●Accepts spellcheck requests from tinyMCE 
●Handles tinyMCE and Solr communication and message conversion 
●Checks and (re-)builds spellcheck indices 
●The appropriate code is found in org.opencms.search.solr.spellcheck 
Gluing the Pieces Together
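The message conversion step can be sketched as follows. This is a hypothetical Python illustration based only on the two response shapes shown in the comparison, not the code in org.opencms.search.solr.spellcheck; the function name solr_to_tinymce is an assumption, and the sample data is the truncated excerpt from the slide.

```python
def solr_to_tinymce(solr_response, request_id):
    """Convert a Solr spellcheck response into the shape tinyMCE expects."""
    suggestions = solr_response.get("spellcheck", {}).get("suggestions", [])
    result = {}
    # The Solr list alternates between a key and its value. Only entries whose
    # value is a dict describe a misspelled word and its suggestions.
    for key, value in zip(suggestions[::2], suggestions[1::2]):
        if isinstance(value, dict):
            result[key] = [entry["word"] for entry in value.get("suggestion", [])]
    return {"id": request_id, "result": result}


solr_response = {
    "spellcheck": {
        "suggestions": [
            "hsoue", {
                "numFound": 5, "startOffset": 0, "endOffset": 4, "origFreq": 0,
                "suggestion": [
                    {"word": "house", "freq": 53},
                    {"word": "has", "freq": 271},
                ],
            },
            "correctlySpelled", False,
            "collation", "hsue",
        ]
    }
}

print(solr_to_tinymce(solr_response, "c0"))
```

Entries such as "correctlySpelled" and "collation" are skipped because their values are not suggestion records, so only the misspelled word and its candidate replacements reach the editor.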
●Dictionaries can be edited easily in OpenCms 
●The indices are filled automatically from flat text files containing one word per line 
●Support for multiple languages 
●To access the dictionaries, have a look at the directory org.opencms.workplace.spellcheck/resources/ 
Spellchecker in OpenCms
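The dictionary format is simple enough to sketch a reader for it. The following Python snippet only illustrates the "one word per line" layout; the function load_dictionary and the sample words are hypothetical and unrelated to how OpenCms actually builds its spellcheck indices.

```python
def load_dictionary(text):
    """Collect the words of a flat dictionary text, one word per line."""
    words = set()
    for line in text.splitlines():
        word = line.strip()
        if word:  # skip blank lines
            words.add(word)
    return words


sample = "house\nhas\n\nhorse\n"
dictionary = load_dictionary(sample)
print(sorted(dictionary))
```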
●Adding a new language 
1.Create a new Solr field in schema.xml 
2.Create a new dictionary file inside the VFS 
3.Restart OpenCms 
●Adding words to the custom dictionary 
Extending the Spellchecker
●Any Questions? 
Any Questions? 
Sören Schneider 
Alkacon Software GmbH 
http://www.alkacon.com 
http://www.opencms.org 
Thank you very much for your attention! 
OpenCms Days 2014 - Using the SOLR collector

  • 1. Sören Schneider, Alkacon Software WORKSHOP TRACK Using the SOLR Collector 27.11.2014
  • 2. 1. Brief Introduction to Solr 2. Common Mistakes Using OpenCms & Solr 3. Using the Solr Collector (DEMO) 4. Spellchecking in OpenCms Using Solr Agenda
  • 3. ●Solr is a very versatile and powerful search engine that supports various features ●This functionality comes at the price of increased complexity in handling Solr ●Many customizations are available ●All fields composing a single document are typed Brief Solr Introduction
  • 4. ●The data structures of Solr's documents are defined in the file schema.xml ●Changes to this file require reindexing ●Dynamic fields cope with that limitation ●They can be used without being explicitly defined in the schema, using wildcards Defining Solr's Data Structure
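
The wildcard idea behind dynamic fields can be illustrated with a small Python sketch. It is not Solr code: the field names, the patterns, and the first-match rule are assumptions made for this example, and a real schema.xml may resolve overlapping patterns differently.

```python
from fnmatch import fnmatchcase

# Hypothetical schema: a few explicitly declared fields plus
# dynamic-field patterns. All names here are made up for illustration.
EXPLICIT_FIELDS = {"id": "string", "lastmodified": "date"}
DYNAMIC_FIELDS = [("*_s", "string"), ("*_dt", "date"), ("*_en", "text")]


def resolve_field_type(name):
    """Return the type for a field name, or None if nothing matches.

    Explicit fields win; otherwise the first matching wildcard pattern
    is used (a simplification chosen for this sketch).
    """
    if name in EXPLICIT_FIELDS:
        return EXPLICIT_FIELDS[name]
    for pattern, field_type in DYNAMIC_FIELDS:
        if fnmatchcase(name, pattern):
            return field_type
    return None


if __name__ == "__main__":
    for field in ("id", "city_s", "released_dt", "unknown"):
        print(field, "->", resolve_field_type(field))
```

A field such as city_s never has to be declared by name: it is typed through the "*_s" pattern, so adding it does not require a schema change.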
  • 5. Solr: Indexing Content a: date b: text c: string Solr processing (through analyzers, filters and tokenizers) a: date b: string c: string
  • 6. ●"Direct" usage of OpenCms & Solr requires a basic understanding of Solr ●Use proper data types for the individual use case, and learn how filters work ●Know the query syntax (for the appropriate data types) ●The most common mistakes of OpenCms users result from insufficient knowledge of Solr basics OpenCms & Solr
  • 7. 1. Using improper types ●"text" vs "string" ●Formulating correct queries 2. Issues regarding the mapping OpenCms <-> Solr 3. (Encoding problems) Common Mistakes Using Solr & OpenCms
  • 8. ●String ●Stores its content as an exact string ●No tokenization / processing is performed ●Useful when searching for an exact value ●Text ●Tokenization and processing are performed ●Useful when searching for a part of the content "text" vs "string"
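
The difference between the two types can be mimicked in a few lines of Python. This is only a sketch of the principle, not Solr's analysis chain: the "analyzer" below (lowercasing and splitting on non-word characters) and all function names are assumptions made for illustration.

```python
import re


def index_string(value):
    # A "string" field keeps the whole value as one exact term.
    return {value}


def index_text(value):
    # A "text" field is analyzed. Here the stand-in analyzer lowercases
    # the value and splits it on non-word characters.
    return {token for token in re.split(r"\W+", value.lower()) if token}


def matches_string(terms, query):
    # Exact comparison: the query must equal the stored value.
    return query in terms


def matches_text(terms, query):
    # The query is analyzed the same way; all of its tokens must be indexed.
    query_tokens = index_text(query)
    return bool(query_tokens) and query_tokens <= terms


title = "OpenCms Days 2014"
string_terms = index_string(title)
text_terms = index_text(title)

if __name__ == "__main__":
    print(matches_string(string_terms, "OpenCms Days 2014"))
    print(matches_string(string_terms, "opencms"))
    print(matches_text(text_terms, "opencms"))
```

Searching the string field only succeeds for the complete, exactly spelled value, while the text field also matches a single word in any letter case.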
  • 9. ●OpenCms copies the entire XML content into a single(!) locale-aware Solr field of type "text" for each locale ●Particular information of a resource is made searchable in OpenCms using two approaches ●Automatic mapping of properties to Solr fields ●Manual definition of mappings Making Your Content Searchable
  • 10. Indexing Content w/o Searchsettings Solr processing (through analyzers, filters and tokenizers) x: text a: date b: string c: string
  • 11. Indexing Content with Searchsettings a: date b: text c: string Solr processing (through analyzers, filters and tokenizers) a: date b: string c: string
  • 12. ●Mapping happens in the schema of the appropriate resource type ●Excerpt: <xsd:schema …> <xsd:annotation> <xsd:appinfo> <searchsettings> <searchsetting element="City" searchcontent="true"> <solrfield targetfield="city" sourcefield="_s" /> </searchsetting> … (callouts in the slide: resource type, element name) Solr – OpenCms Interaction: Mapping
  • 13. Element Mapping Attributes
       Attribute Name | Effect on the Solr Field
       targetfield*   | The resulting name
       locale         | Write content only for a specific locale
       sourcefield    | Defines the resulting type
       copyfields     | Copies the value to a different field
       default        | Sets a default value
       boost          | Sets a boost for the field
  • 14. ●Users complain about problems with certain characters, mostly German umlauts, in Solr results ●In nearly all cases the sole problem is that the integration of Solr into the servlet container does not use UTF-8 ●Extra note for Tomcat users: please check whether you added the required attributes to all appropriate "<Connector>"s ;-) Using UTF-8 in Solr
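
The typical symptom can be reproduced outside of Solr. The Python sketch below assumes that the UTF-8 bytes of a value are decoded as ISO-8859-1 somewhere along the way, which is one common way such a mismatch arises; the sample name is made up.

```python
# A value containing a German umlaut, encoded as UTF-8 by the sender.
name = "Müller"
utf8_bytes = name.encode("utf-8")

# Assumed failure mode: the receiver decodes the bytes as ISO-8859-1.
garbled = utf8_bytes.decode("iso-8859-1")

# Correct handling: both sides agree on UTF-8.
correct = utf8_bytes.decode("utf-8")

if __name__ == "__main__":
    print(garbled)
    print(correct)
```

The two bytes that encode "ü" in UTF-8 are read as two separate characters, which is why a single umlaut shows up as a two-character sequence in the results.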
  • 15. ●Live Demo
  • 17. ●The spellchecker has been implemented using Solr ●Solr already provides a flexible component named "SpellCheckComponent" ●This component supports inline spellchecking of Solr queries ●The source for suggestions can be specified by Solr fields or text files WYSIWYG Spellchecker
  • 18. ●The "SpellCheckComponent" is widely used to implement the "Did you mean?" feature known from popular search engines ●The component is ●Reliable and mature ●Fast ●Plus, Solr is already available in OpenCms Why Use Solr as a Spellchecker
  • 19. ●If both use cases use the same component, how do the implementations actually differ? ●"Did you mean?" builds its source of suggested words from the entire data the search runs on. Usually only a single hit is returned. ●The WYSIWYG spellchecker builds its source of suggestions from data that solely contains the dictionary for a single language Differences Between the Use Cases With Regard to Implementation
  • 20. ●Spellchecking has been implemented using another Solr core that resides in WEB-INF/spellcheck ●As the only purpose of this core is to contain spellcheck information, the schema.xml file is as simple as it gets ●Why use another Solr core instead of the default core that's used by OpenCms? ●Dictionaries are stored as one Solr index per language How to Model This Scenario Using Solr?
  • 21. ●Sadly, the spellchecking interfaces of tinyMCE and Solr are incompatible Problems regarding tinyMCE and Solr Solr tinyMCE
  • 22. Comparison Spellcheck Responses tinyMCE: { "id":"c0", "result":{"hsoue":["house", "has"]} } Solr: "spellcheck":{ "suggestions":[ "hsoue",{"numFound":5, "startOffset":0, "endOffset":4, "origFreq":0, "suggestion":[{"word":"house","freq":53}, {"word":"has","freq":271}, … ]}, "correctlySpelled",false, "collation","hsue" ]},
  • 23. ●A new component had to be implemented in OpenCms that basically ●Accepts spellcheck requests from tinyMCE ●Handles the communication and message conversion between tinyMCE and Solr ●Checks and (re-)builds spellcheck indices ●The corresponding code is found in org.opencms.search.solr.spellcheck Gluing the Pieces Together
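
The message conversion step can be sketched in Python using the two response shapes from the comparison slide. This is not the code from org.opencms.search.solr.spellcheck (which is Java); the function name and the handling of non-suggestion entries are assumptions made for this illustration.

```python
def solr_to_tinymce(solr_spellcheck, request_id):
    """Convert a Solr "spellcheck" section into a tinyMCE-style response.

    Hypothetical helper: it expects the flat "suggestions" list that
    alternates between a key and its value, as shown on the slide.
    """
    suggestions = solr_spellcheck.get("suggestions", [])
    result = {}
    for key, value in zip(suggestions[0::2], suggestions[1::2]):
        if not isinstance(value, dict):
            # Entries such as "correctlySpelled" or "collation" carry
            # plain values and are not per-word suggestions.
            continue
        result[key] = [entry["word"] for entry in value.get("suggestion", [])]
    return {"id": request_id, "result": result}


# Sample input modelled on the slide (the elided entries are left out).
solr_response = {
    "suggestions": [
        "hsoue",
        {
            "numFound": 5,
            "startOffset": 0,
            "endOffset": 4,
            "origFreq": 0,
            "suggestion": [
                {"word": "house", "freq": 53},
                {"word": "has", "freq": 271},
            ],
        },
        "correctlySpelled", False,
        "collation", "hsue",
    ]
}

if __name__ == "__main__":
    print(solr_to_tinymce(solr_response, "c0"))
```

Frequencies and offsets are dropped on purpose, because the tinyMCE-side format shown on the slide only carries a list of words per misspelled term.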
  • 24. ●Dictionaries can be edited easily in OpenCms ●The indices are automatically filled from flat text files, one word per line ●Support for multiple languages ●To access the dictionaries, have a look at the directory org.opencms.workplace.spellcheck/resources/ Spellchecker in OpenCms
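
Reading such a flat dictionary file is straightforward. The Python sketch below only illustrates the "one word per line" format; skipping blank lines and duplicate words is an assumption of this example, not documented OpenCms behavior, and the function name is hypothetical.

```python
def parse_dictionary(text):
    """Return the words of a flat dictionary, one word per line.

    Blank lines and repeated words are skipped (an assumption made
    for this sketch); the original order is preserved.
    """
    words = []
    seen = set()
    for line in text.splitlines():
        word = line.strip()
        if word and word not in seen:
            seen.add(word)
            words.append(word)
    return words


if __name__ == "__main__":
    sample = "house\nHaus\n\nhouse\nmaison\n"
    print(parse_dictionary(sample))
```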
  • 25. ●Adding a new language 1. Create a new Solr field in schema.xml 2. Create a new dictionary file inside the VFS 3. Restart OpenCms ●Adding words to the custom dictionary Extending the Spellchecker
  • 26. ●Any Questions?
  • 27. Sören Schneider Alkacon Software GmbH http://www.alkacon.com http://www.opencms.org Thank you very much for your attention!