Technology 
Drives 
Business 
CUSTOM SOLR TOKENIZER 
FLEXIBLE TOKENIZER WITH JFLEX 
Berlin Buzzwords 2014
Agenda 
• ME & SHI 
• JFLEX Tokenizer 
• Motivation 
• JFlex ?! 
• Solr implementation 
• Demo 
• Q & A
Markus Klose – Search Consultant 
• Expertise in Solr, Lucene, Elasticsearch, 
Fast ESP 
• Certified Apache Solr Trainer 
• Speaker, Blogger, Coder 
• Author “Einführung in Apache Solr” 
• @markus_klose
SHI GmbH & Co KG 
2013 
2011 
Delivering mission-critical, data-driven solutions for multiple industries. 
Partnering with 
Partnering with LucidWorks 
2000 Embracing Open Source. 
1994 
Foundation. Development of home-grown information retrieval 
platform. 
2014
OUR MISSION 
Vendor-independent IT consulting and software engineering company. 
Dedicated to delivering next-generation Semantic Search, Big Data and Exploratory Data 
Analytics solutions. 
Using an Enterprise Data Hub approach for 360° data integration. 
Helping customers accelerate (e)Business through better technology adoption 
and data utilization.
CUSTOM TOKENIZER WITH JFLEX 
JFlex-based tokenizer: the idea is not new, but it is a great one
Motivation 1 
• In customer projects we very often have to 
deal with custom "meta" data 
• IDs 
• Type designations 
• Product descriptions 
• How do we tackle that problem? PatternTokenizer?
Motivation 2 
• Use and combine 
existing tools to be more 
flexible 
• Configuration over 
Coding 
• JFlex is already used in 
ClassicTokenizer / 
StandardTokenizer
UseCase – Type designation 
• Product Data 
• nymj3x1,5 / nym-j 3x1,5 / nymj 3x1,5 / nym-j 3 x 1,5 
• Search Input 
• nymj 3 1,5 / nym-j 3x1,5 
• Index 
• nymj315 / nymj / nym / j / 315 / 3 / 15
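A minimal sketch of how one type designation could be expanded into the index tokens listed above. This is plain Java with regular expressions, not the JFlex scanner from the talk; the class name `TypeDesignationTokens`, the method `tokens`, and the splitting rules (hyphen inside the type, `x` or whitespace between dimensions, comma dropped) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TypeDesignationTokens {
    // Assumed shape: a letter/hyphen type part, optional whitespace, then the dimension part.
    private static final Pattern SHAPE = Pattern.compile("\\s*([a-z-]+)\\s*(.*)");

    /** Expands one designation such as "nym-j 3x1,5" into its index tokens. */
    static List<String> tokens(String designation) {
        Matcher m = SHAPE.matcher(designation.toLowerCase(Locale.ROOT).trim());
        if (!m.matches()) {
            return new ArrayList<>();
        }
        // "nym-j" -> [nym, j]
        List<String> typeParts = nonEmpty(m.group(1).split("-"));
        // "3 x 1,5" -> drop the comma, split on 'x' or whitespace -> [3, 15]
        List<String> sizeParts = nonEmpty(m.group(2).replace(",", "").split("[x\\s]+"));
        String type = String.join("", typeParts);
        String size = String.join("", sizeParts);

        // LinkedHashSet keeps insertion order and drops duplicates.
        Set<String> out = new LinkedHashSet<>();
        addIfPresent(out, type + size);
        addIfPresent(out, type);
        out.addAll(typeParts);
        addIfPresent(out, size);
        out.addAll(sizeParts);
        return new ArrayList<>(out);
    }

    private static List<String> nonEmpty(String[] parts) {
        List<String> result = new ArrayList<>();
        for (String part : parts) {
            if (!part.isEmpty()) {
                result.add(part);
            }
        }
        return result;
    }

    private static void addIfPresent(Set<String> out, String token) {
        if (!token.isEmpty()) {
            out.add(token);
        }
    }

    public static void main(String[] args) {
        System.out.println(tokens("nym-j 3x1,5"));
        System.out.println(tokens("nymj 3 1,5"));
    }
}
```

With these assumed rules, the product value "nym-j 3x1,5" and the search input "nymj 3 1,5" share the tokens nymj315, nymj, 315, 3 and 15, which is what makes the two spellings match.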
JFlex - The Fast Scanner Generator 
• JFlex is a lexical analyzer generator (aka 
scanner generator) 
• Current version (at the time of the talk): 1.5.1 
• Download - http://jflex.de/download.html 
• Mailing Lists 
• BSD-style license 
• CLI API & GUI
JFlex - The Fast Scanner Generator 
• Input: Berlin Buzzword 26.05.2014 
• LETTERS -> "Berlin", "Buzzword" 
• LETTERS and SPACE -> "Berlin Buzzword" 
• DIGITS -> "26", "05", "2014" 
• DIGITS and . -> "26.05.2014" 
• LETTERS and SPACE or DIGITS and . -> "Berlin Buzzword", "26.05.2014"
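The effect of the rules above can be reproduced with `java.util.regex` as a rough stand-in for a JFlex specification. The class name `ScannerRuleDemo`, the constant names, and the concrete character classes are assumptions for this sketch. One real difference to keep in mind: JFlex picks the longest match among its rules, while Java regex alternation takes the first alternative that matches. For this input both behave the same.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ScannerRuleDemo {
    // Hypothetical rule definitions, named after the wording on the slide.
    static final String LETTERS = "[A-Za-z]+";
    static final String DIGITS = "[0-9]+";
    // Words joined by single spaces, e.g. "Berlin Buzzword".
    static final String LETTERS_AND_SPACE = "[A-Za-z]+( [A-Za-z]+)*";
    // Digit groups joined by dots, e.g. "26.05.2014".
    static final String DIGITS_AND_DOT = "[0-9]+(\\.[0-9]+)*";
    static final String COMBINED = LETTERS_AND_SPACE + "|" + DIGITS_AND_DOT;

    /** Returns every non-overlapping match of the rule, scanning left to right. */
    static List<String> scan(String rule, String input) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = Pattern.compile(rule).matcher(input);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "Berlin Buzzword 26.05.2014";
        System.out.println(scan(LETTERS, input));
        System.out.println(scan(LETTERS_AND_SPACE, input));
        System.out.println(scan(DIGITS, input));
        System.out.println(scan(DIGITS_AND_DOT, input));
        System.out.println(scan(COMBINED, input));
    }
}
```

Changing what counts as a token only means changing the rule string, which is the "configuration over coding" idea applied to the scanner definition.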
Custom Tokenizer – Project Setup 
• Java – TokenizerFactory 
-> typical factory, tokenizer configuration 
• Java – Tokenizer 
-> base class, token manipulation 
• JFlex – Scanner 
-> description of token patterns 
• (Java – Scanner) 
-> generated scanner
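How the three parts fit together can be sketched without any Solr or Lucene dependency. Everything below is a simplified stand-in: `Scanner`, `RegexScanner`, `SketchTokenizer`, `SketchTokenizerFactory` and the `pattern` / `lowercase` arguments are hypothetical names, and the regex-backed scanner only plays the role that the JFlex-generated class would take. In a real project the factory and tokenizer would extend the corresponding Lucene/Solr base classes, whose exact signatures depend on the version in use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenizerSetupSketch {

    /** Role of the generated scanner: returns the next raw token, or null at end of input. */
    interface Scanner {
        String nextToken();
    }

    /** Regex-backed stand-in for a JFlex-generated scanner. */
    static final class RegexScanner implements Scanner {
        private final Matcher matcher;

        RegexScanner(String pattern, String input) {
            this.matcher = Pattern.compile(pattern).matcher(input);
        }

        @Override
        public String nextToken() {
            return matcher.find() ? matcher.group() : null;
        }
    }

    /** Role of the tokenizer: pulls tokens from the scanner and manipulates them. */
    static final class SketchTokenizer {
        private final Scanner scanner;
        private final boolean lowercase;

        SketchTokenizer(Scanner scanner, boolean lowercase) {
            this.scanner = scanner;
            this.lowercase = lowercase;
        }

        List<String> tokenize() {
            List<String> tokens = new ArrayList<>();
            for (String t = scanner.nextToken(); t != null; t = scanner.nextToken()) {
                tokens.add(lowercase ? t.toLowerCase(Locale.ROOT) : t);
            }
            return tokens;
        }
    }

    /** Role of the factory: reads configuration once, then creates tokenizers per input. */
    static final class SketchTokenizerFactory {
        private final String pattern;
        private final boolean lowercase;

        SketchTokenizerFactory(Map<String, String> args) {
            this.pattern = args.getOrDefault("pattern", "[A-Za-z0-9]+");
            this.lowercase = Boolean.parseBoolean(args.getOrDefault("lowercase", "false"));
        }

        SketchTokenizer create(String input) {
            return new SketchTokenizer(new RegexScanner(pattern, input), lowercase);
        }
    }

    public static void main(String[] args) {
        SketchTokenizerFactory factory = new SketchTokenizerFactory(Map.of("lowercase", "true"));
        System.out.println(factory.create("ISBN 978-3-16").tokenize());
    }
}
```

The split mirrors the slide: token patterns live in the scanner, per-token manipulation in the tokenizer, and configuration in the factory, so the patterns can change without touching the other two.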
Demo 
ISBN Tokenizer / URL Tokenizer 
https://github.com/scherziglu
Resources 
• JFlex Tokenizer 
• GitHub (https://github.com/scherziglu) 
• Solr Source Code (e.g. ClassicTokenizer) 
• @markus_klose / @SHIEngineers 
• JFlex Websites 
http://jflex.de/ 
• Q & A
CONTACT 
SHI GmbH & Co KG 
Curt-Frenzel-Str. 12 
86167 Augsburg 
Germany 
info@shi-gmbh.com 
+49.821.74 82 633 0 
mma@shi-gmbh.com mk@shi-gmbh.com 
@markus_klose 
dwr@shi-gmbh.com 
@SHIEngineers @wrigley_dan

 
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open SourceWSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open SourceWSO2
 
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...Jittipong Loespradit
 
BUS PASS MANGEMENT SYSTEM USING PHP.pptx
BUS PASS MANGEMENT SYSTEM USING PHP.pptxBUS PASS MANGEMENT SYSTEM USING PHP.pptx
BUS PASS MANGEMENT SYSTEM USING PHP.pptxalwaysnagaraju26
 


Custom Solr Tokenizer Flexible Tokenizer with JFlex

  • 1. Technology Drives Business CUSTOM SOLR TOKENIZER FLEXIBLE TOKENIZER WITH JFLEX 2014 BerlinBuzzword
  • 2. Agenda • ME & SHI • JFLEX Tokenizer • Motivation • JFlex ?! • Solr implementation • Demo • Q & A
  • 3. Markus Klose – Search Consultant • Expertise in Solr, Lucene, Elasticsearch, Fast ESP • Certified Apache Solr Trainer • Speaker, Blogger, Coder • Author “Einführung in Apache Solr” • @markus_klose
  • 4. SHI GmbH & Co KG 2013 2011 Delivering mission-critical data-driven solution for multiple industries. Partnering with Partnering with LucidWorks 2000 Embracing Open Source. 1994 Foundation. Development of home-grown information retrieval platform. 2014
  • 5. OUR MISSION Vendor-independent IT Consulting and Software Engineering company. Dedicated to deliver next generation Semantic Search, Big Data and Exploratory Data Analytics solutions. Using Enterprise Data Hub approach for 360° data integration. And helping customers to Accelerate (e)Business through better technology adoption and data utilization.
  • 6. Technology Drives Business CUSTOM TOKENIZER WITH JFLEX JFlex based tokenizer - the idea is not new, but great
  • 7. Motivation 1 • In customer projects we have to deal very often with custom „meta“ data • IDs • Type designation • Product description • How to face that problem? PatternTokenizer?
  • 8. Motivation 2 • Use and combine existing tools to be more flexible • Configuration over Coding • JFlex already used in ClassicTokenizer / StandardTokenizer
  • 9. UseCase – Type designation • Product Data • nymj3x1,5 / nym-j 3x1,5 / nymj 3x1,5 / nym-j 3 x 1,5 • Search Input • nymj 3 1,5 / nym-j 3x1,5 • Index • nymj315 / nymj / nym / j / 315 / 3 / 15
  • 10. JFlex - The Fast Scanner Generator • JFlex is a lexical analyzer generator (aka scanner generator) • Current version 1.5.1 • Download - http://jflex.de/download.html • Mailing Lists • BSD-style license • CLI API & GUI
  • 11. JFlex - The Fast Scanner Generator • Berlin Buzzword 26.05.2014 • LETTERS -> „Berlin“, „Buzzword“ • LETTERS and SPACE -> „Berlin Buzzword“ • DIGITS -> „26“, „05“, „2014“ • DIGITS and . -> „26.05.2014“ • LETTERS and SPACE or DIGITS and . -> „Berlin Buzzword“ , „26.05.2014“
  • 12. Custom Tokenizer – Project Setup • JAVA – TokenizerFactory -> typical factory, tokenizer configuration • JAVA – Tokenizer -> base class, token manipulation • JFLEX – Scanner -> description of token patterns • (JAVA – Scanner) -> generated scanner
  • 13. Demo ISBN Tokenizer / URL Tokenizer https://github.com/scherziglu
  • 14. Resources • JFlex Tokenizer • GitHub (https://github.com/scherziglu) • Solr Source Code (e.g. ClassicTokenizer) • @markus_klose / @SHIEngineers • JFlex Website http://jflex.de/ • Q & A
  • 15. CONTACT SHI GmbH & Co KG Curt-Frenzel-Str. 12 86167 Augsburg Germany info@shi-gmbh.com +49.821.74 82 633 0 mma@shi-gmbh.com mk@shi-gmbh.com @markus_klose dwr@shi-gmbh.com @SHIEngineers @wrigley_dan
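
The type-designation use case from slide 9 can be illustrated with a small plain-Java sketch. It is not the tokenizer from the talk: it uses `java.util.regex` instead of a JFlex-generated scanner, and the class name `TypeDesignationSketch`, the method `indexTokens` and the pattern itself are assumptions made up for this illustration. It only shows how one designation such as `nym-j 3x1,5` could be expanded into the index tokens listed on the slide.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr, Lucene or JFlex.
class TypeDesignationSketch {

    // Assumed shape: letters, optional "-letters", a count, an optional "x",
    // and a size that may contain a decimal comma (e.g. "nym-j 3 x 1,5").
    private static final Pattern DESIGNATION = Pattern.compile(
            "([a-z]+)(?:-([a-z]+))?\\s*(\\d+)\\s*x?\\s*(\\d+)(?:,(\\d+))?");

    // Returns the index tokens for one designation, or an empty list
    // if the input does not fit the assumed shape.
    static List<String> indexTokens(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = DESIGNATION.matcher(input.toLowerCase(Locale.ROOT).trim());
        if (!m.matches()) {
            return tokens;
        }
        String head = m.group(1);                               // "nym" (or "nymj")
        String tail = m.group(2) == null ? "" : m.group(2);     // "j" if hyphenated
        String count = m.group(3);                              // "3"
        String size = m.group(4)
                + (m.group(5) == null ? "" : m.group(5));       // "1,5" -> "15"

        String letters = head + tail;                           // "nymj"
        String digits = count + size;                           // "315"

        tokens.add(letters + digits);                           // "nymj315"
        tokens.add(letters);
        if (!tail.isEmpty()) {                                  // parts are only known
            tokens.add(head);                                   // when a hyphen marks them
            tokens.add(tail);
        }
        tokens.add(digits);
        tokens.add(count);
        tokens.add(size);
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(indexTokens("nym-j 3x1,5"));
        System.out.println(indexTokens("nymj3x1,5"));
    }
}
```

Without the hyphen (`nymj3x1,5`) the sketch cannot know where `nym` ends and `j` begins, so it emits neither part. Producing them would need extra knowledge, such as a dictionary of known prefixes, which this sketch does not attempt.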

Editor's notes

  1. 9
  2. 36/840 E+P USB-Kabel 000(VE10) 6-30 3S+1Ö M12FR-3L 1x2
  3. JFlex is a lexical analyzer generator (also known as a scanner generator) for Java, written in Java. It is also a rewrite of the very useful tool JLex, which was developed by Elliot Berk at Princeton University. As Vern Paxson states for his C/C++ tool flex: they do not share any code, though. JFlex is designed to work together with the LALR parser generator CUP by Scott Hudson and with BYacc/J, the Java modification of Berkeley Yacc by Bob Jamison. It can also be used together with other parser generators like ANTLR, or as a standalone tool. JFlex has three mailing lists: jflex-announce is low-traffic and read-only, for announcements of new releases; jflex-users is for help and discussions; jflex-devel is for developer discussions. Purpose: creating Java classes, based on a grammar, that parse input.
  4. Show factory & solrconfig.xml; show Tokenizer -> incrementToken; show JFlex file + compilation
  5. Step 1: text only. Step 2: simple combination. Step 3: complex setup (ISBN; URL: protocol://subdomain.site.domain/directory)
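
The rules from slide 11 ("Berlin Buzzword 26.05.2014") can be tried out with a minimal Java sketch. It emulates the rules with `java.util.regex` rather than a JFlex specification, so it is an approximation: JFlex picks the longest match among all rules, while Java's alternation tries the alternatives left to right. The class name `ScannerRulesSketch` and the constants are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of the slide's rules, not a generated JFlex scanner.
class ScannerRulesSketch {

    static final String LETTERS = "[A-Za-z]+";
    static final String DIGITS = "[0-9]+";
    // Words joined by single spaces, e.g. "Berlin Buzzword".
    static final String LETTERS_AND_SPACE = LETTERS + "( " + LETTERS + ")*";
    // Digit groups joined by dots, e.g. "26.05.2014".
    static final String DIGITS_AND_DOT = DIGITS + "(\\." + DIGITS + ")*";
    // Either of the two combined rules.
    static final String COMBINED = LETTERS_AND_SPACE + "|" + DIGITS_AND_DOT;

    // Collects every non-overlapping match of the rule as one token.
    static List<String> scan(String rule, String input) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = Pattern.compile(rule).matcher(input);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "Berlin Buzzword 26.05.2014";
        String[] rules = {LETTERS, LETTERS_AND_SPACE, DIGITS, DIGITS_AND_DOT, COMBINED};
        for (String rule : rules) {
            System.out.println(rule + " -> " + scan(rule, input));
        }
    }
}
```

In a real setup, these patterns would live in the JFlex scanner file, and the Lucene `Tokenizer` would hand each match to the analysis chain in `incrementToken`.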