Lucene Performance Workshop




Lucid Imagination, Inc.




Intro


About the speaker and Lucid Imagination
Agenda
 Lucene and performance
 Lucid Gaze for Lucene: UI and API
 Key statistics
 Examples
 Q & A session




Lucene and performance
  Perceived performance issues can have different causes
  Classic JVM problems, classic solutions
   heap size
   garbage collection
   stack size
   HotSpot
  Lucene/Search-related issues: beyond JVM tuning
    Indexing performance: indexing too slow, strange slowdowns during indexing
    Search performance: search too slow in general, or for certain types of queries
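
Before looking at Lucene-specific causes, the classic JVM items (heap size, garbage collection) can be checked from inside the application. The following is a minimal sketch using the standard java.lang.management API; the class name JvmSnapshot is hypothetical and is not part of Lucene or LG4L.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Hypothetical helper (not part of Lucene or LG4L): reports basic JVM health figures. */
final class JvmSnapshot {

    /** Returns heap usage as a fraction of the maximum heap, or -1 if no maximum is defined. */
    static double heapUsedFraction() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long max = heap.getMax(); // -1 when the JVM reports no limit
        return max > 0 ? (double) heap.getUsed() / max : -1.0;
    }

    /** Sums the time spent in all garbage collectors since JVM start, in milliseconds. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report time
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("heap used fraction: " + heapUsedFraction());
        System.out.println("total GC time (ms): " + totalGcMillis());
    }
}
```

If heap usage stays near the maximum or GC time grows quickly during indexing or search, JVM tuning is probably worth trying before the Lucene-level investigation.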


Common Lucene performance issues


   Indexing:
     Too many segments being created
     Too many Token-s / TokenStream-s
     Too many Documents / Fields
   Searching:
     Too many IndexReader-s / IndexSearcher-s
     High RAM usage of IndexReader
     Slow response times for certain queries
   Application-level logging may not be up to the task
   Profiler is too low-level and too intrusive

   We need a lightweight probe to peek at vital Lucene statistics


Lucid Gaze for Lucene (LG4L)
Target audience and applications
  Tool for developers
  Performance monitoring
  Statistics collection
  Drop-in replacement for lucene-core-2.4.1.jar




Available information

Statistics (per time unit):
  IndexReader / IndexWriter:
    Number of documents and fields retrieved / created
    Number of IndexWriter / IndexReader / Directory instances created
      And the number of live instances!
    Memory consumption of IndexWriter / IndexReader instances
  Analysis:
    Number of Analyzers / TokenFilters / Tokenizers
    Number of TokenStream-s and Token-s
  Search:
    Number of searches and their average time
    Number of opened IndexSearcher-s
  Storage:
      Number of Lucene Directory instances created
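
The difference between instances created and instances still live is what makes leaks visible (for example, IndexReader-s that are opened but never closed). A minimal sketch of such a counter follows; InstanceCounter and its methods are hypothetical names used for illustration, not LG4L's actual implementation.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical probe (not LG4L's implementation): tracks how many instances
 * of a resource were created and how many of them are still open.
 */
final class InstanceCounter {
    private final AtomicLong created = new AtomicLong();
    private final AtomicLong closed = new AtomicLong();

    /** Call when a new instance is opened. */
    void onCreate() {
        created.incrementAndGet();
    }

    /** Call when an instance is closed. */
    void onClose() {
        closed.incrementAndGet();
    }

    /** Total number of instances ever created. */
    long created() {
        return created.get();
    }

    /** Number of instances created but not yet closed. */
    long live() {
        return created.get() - closed.get();
    }

    public static void main(String[] args) {
        InstanceCounter readers = new InstanceCounter();
        readers.onCreate();
        readers.onCreate();
        readers.onCreate();
        readers.onClose();
        System.out.println("created=" + readers.created() + " live=" + readers.live());
    }
}
```

A created count that keeps rising while the live count stays flat points to churn (too many short-lived instances); a live count that keeps rising points to instances that are never closed.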
Available information: metrics

Lists and histograms
  Count and a list of Analyzer, Tokenizer, TokenFilter
  instances
  Directory implementations
  Top-N queries:
    Queries with largest numbers of hits
    Queries that took longest to execute


  All this data is available as log, persistent DB and through the API
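
A top-N list such as "queries that took longest to execute" can be kept in bounded memory with a small min-heap. The sketch below shows one common way to do this; TopNQueries and Timed are hypothetical names, and the slides do not say how LG4L implements its top-N lists.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical sketch: retains only the N slowest queries seen so far. */
final class TopNQueries {

    /** A query string and its execution time; the fields are illustrative. */
    static final class Timed {
        final String query;
        final long millis;

        Timed(String query, long millis) {
            this.query = query;
            this.millis = millis;
        }
    }

    private final int n;
    // Min-heap on execution time: the head is the fastest of the retained entries.
    private final PriorityQueue<Timed> heap =
            new PriorityQueue<>(Comparator.comparingLong((Timed t) -> t.millis));

    TopNQueries(int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        this.n = n;
    }

    /** Records a query, evicting the fastest retained entry once N entries are held. */
    void record(String query, long millis) {
        if (heap.size() < n) {
            heap.add(new Timed(query, millis));
        } else if (millis > heap.peek().millis) {
            heap.poll();
            heap.add(new Timed(query, millis));
        }
    }

    /** Returns the retained queries, slowest first. */
    List<Timed> slowestFirst() {
        List<Timed> out = new ArrayList<>(heap);
        out.sort(Comparator.comparingLong((Timed t) -> t.millis).reversed());
        return out;
    }

    public static void main(String[] args) {
        TopNQueries top = new TopNQueries(2);
        top.record("title:lucene", 5);
        top.record("body:perform*", 50);
        top.record("body:\"round robin\"", 20);
        for (Timed t : top.slowestFirst()) {
            System.out.println(t.millis + " ms  " + t.query);
        }
    }
}
```

The same structure works for "queries with largest numbers of hits" by ordering on the hit count instead of the time.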

In-memory and RRD storage

Retaining historical values of collected statistics
  In-memory
    No files, no configuration hassles
    Concise overview periodically written to log (optional)
    Uses Java logging
  RRD (Round-Robin Database)
    Persistent round-robin database
        Single database of a constant size
        E.g. hourly, daily, weekly, monthly, yearly statistics
    Suitable for long-term monitoring
    Many more metrics and statistics tracked
    Can be accessed concurrently from other applications
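
The "constant size" property of a round-robin database comes from overwriting the oldest sample once the archive is full. The following is a minimal in-memory sketch of that idea only; RoundRobinArchive is a hypothetical class, and a real RRD additionally consolidates samples into coarser archives (hourly, daily, and so on) and persists them to disk.

```java
/** Hypothetical sketch of a fixed-size round-robin archive of samples. */
final class RoundRobinArchive {
    private final double[] slots;
    private int next;     // index the next write will overwrite
    private long written; // total number of samples ever stored

    RoundRobinArchive(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.slots = new double[capacity];
    }

    /** Stores a sample, overwriting the oldest one when the archive is full. */
    void add(double value) {
        slots[next] = value;
        next = (next + 1) % slots.length;
        written++;
    }

    /** Number of samples currently retained (never more than the capacity). */
    int size() {
        return (int) Math.min(written, slots.length);
    }

    /** Returns the retained samples, oldest first. */
    double[] values() {
        int size = size();
        double[] out = new double[size];
        int start = written < slots.length ? 0 : next;
        for (int i = 0; i < size; i++) {
            out[i] = slots[(start + i) % slots.length];
        }
        return out;
    }

    public static void main(String[] args) {
        RoundRobinArchive archive = new RoundRobinArchive(3);
        for (int i = 1; i <= 5; i++) {
            archive.add(i);
        }
        System.out.println(java.util.Arrays.toString(archive.values()));
    }
}
```

Because the storage never grows, such an archive suits long-term monitoring: older detail is discarded rather than accumulated.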




Configuration

Java properties or gaze.properties
  List of properties supplied as -Dlucid.gaze...
  gaze.properties on classpath
  Configurability:
    Turning on/off selected monitors
    Producing debug output
    Using in-memory or RRD log retention
    Configuring RRD archives (to scale historical data over different
    periods)
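
A common way to combine -D system properties with a properties file on the classpath is sketched below. This is not LG4L's code: the ConfigLookup class, the key names, and the precedence order (system property first, then file, then default) are all assumptions made for illustration; the actual lucid.gaze property names are not listed in these slides.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical sketch of property lookup; not LG4L's actual configuration code. */
final class ConfigLookup {

    /** Loads a properties file from the classpath; returns empty Properties if it is absent. */
    static Properties loadFromClasspath(String resourceName) {
        Properties props = new Properties();
        try (InputStream in =
                ConfigLookup.class.getClassLoader().getResourceAsStream(resourceName)) {
            if (in != null) {
                props.load(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Assumed precedence: -D system property, then file value, then the default. */
    static String resolve(Properties file, String key, String defaultValue) {
        String fromSystem = System.getProperty(key);
        if (fromSystem != null) {
            return fromSystem;
        }
        return file.getProperty(key, defaultValue);
    }

    public static void main(String[] args) {
        Properties file = loadFromClasspath("gaze.properties");
        // "example.monitor.enabled" is a made-up key, used only to show the lookup.
        System.out.println(resolve(file, "example.monitor.enabled", "true"));
    }
}
```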


API

Facade with static methods: LuceneCore
  Programmatic access to all statistics groups
  Retrieving top-N queries
  Retrieving additional metrics (e.g. histograms of analyzers,
  tokenizers, directory implementations, tracking of IndexReader /
  IndexWriter instances and their memory consumption)
  Enabling / disabling monitors
  Resetting statistics (useful for creating snapshots)


Example: indexing performance tuning

 Based on the contrib/benchmark suite
   Test impact of number of buffered docs
   Other interesting observations
     Number of documents / fields
     Number of tokens / token streams
     Number of IndexReader / Directory instances
     Number of IndexSearchers




Example: console output




Example: RRD Inspector




RRD Inspector (2)




RRD Inspector (3)




Performance impact of performance monitoring
Overhead of using LG4L
  Benchmarks (in contrib/benchmark) slower by ~10-15% on
  average, memory consumption higher by ~10%


  Remember: you can turn off some monitors!




Conclusions


Lucene can perform fantastically
   ... but it can't outmaneuver sub-optimal design or weak
  configuration
LG4L helps to understand the causes of poor performance
  Insight into high-level statistics that relate to Lucene API
  Round-robin database for tracking historical data
  LG4L is lightweight!




Q&A




Download and documentation:
  http://www.lucidimagination.com/Downloads/LucidGaze-for-Lucene
Example: LG4L with Solr


INFO: * AnalysisStats:
INFO:    counters: {toks=258}
INFO:    metrics: {tns={=2, WhitespaceTokenizer=8},
tfs={WordDelimiterFilter=8, StopFilter=8, SynonymFilter=8, LowerCaseFilter=8,
EnglishPorterFilter=8},
ans={org.apache.solr.schema.IndexSchema$SolrQueryAnalyzer=1,
org.apache.solr.analysis.TokenizerChain=6,
org.apache.solr.schema.FieldType$DefaultAnalyzer=13,
org.apache.lucene.analysis.WhitespaceAnalyzer=1,
org.apache.solr.schema.IndexSchema$SolrIndexAnalyzer=1}}
INFO: * DocumentStats:
INFO:    counters: {docs=1, fields=20}
INFO: * IndexStats:
INFO:    counters: {ir_isdC=1, ir_C=4, iw_C=0, ir_newC=7, ir_tpC=12,
iw_segs=0, iw_buf=0, ir_tdC=10, ir_ram=1343504, iw_ram=0}
INFO: * SearchStats:
INFO:    counters: {dfC=11, rwrC=3, rwrT=30265, srchrC=14, srchT=90133076,
srchC=6}
INFO: * StoreStats:
INFO:    counters: {dirC=8}
INFO:    metrics: {dir_t={FSDirectory=8}}


 … but of course you should use LucidGaze for Solr instead!
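
Counter lines like the SearchStats entry in the log above are easy to post-process. The sketch below parses one such line and derives an average; CountersLine is a hypothetical helper, and reading srchT as cumulative search time and srchC as the number of searches is an assumption, since the slides do not define the counter names or the time unit.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for reading "counters: {a=1, b=2}" style log lines. */
final class CountersLine {

    /** Parses the {...} block of a counters line into an ordered map of longs. */
    static Map<String, Long> parse(String line) {
        int open = line.indexOf('{');
        int close = line.lastIndexOf('}');
        if (open < 0 || close < open) {
            throw new IllegalArgumentException("no {...} block in: " + line);
        }
        Map<String, Long> out = new LinkedHashMap<>();
        String body = line.substring(open + 1, close).trim();
        if (body.isEmpty()) {
            return out;
        }
        for (String pair : body.split(",")) {
            String[] kv = pair.trim().split("=", 2);
            if (kv.length != 2) {
                throw new IllegalArgumentException("not a key=value pair: " + pair);
            }
            out.put(kv[0].trim(), Long.parseLong(kv[1].trim()));
        }
        return out;
    }

    public static void main(String[] args) {
        // The SearchStats line from the log, joined onto a single line.
        Map<String, Long> c = parse(
                "INFO:    counters: {dfC=11, rwrC=3, rwrT=30265, srchrC=14, srchT=90133076, srchC=6}");
        // Assumption: srchT is total search time and srchC is the search count.
        double average = (double) c.get("srchT") / c.get("srchC");
        System.out.println("average time per search (unit not stated in the slides): " + average);
    }
}
```

Under that assumption the average is roughly 15 million time units per search, which would be about 15 ms if the unit were nanoseconds.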

                                                        Lucid Imagination, Inc.   19

More Related Content

What's hot

Spring Boot on Amazon Web Services with Spring Cloud AWS
Spring Boot on Amazon Web Services with Spring Cloud AWSSpring Boot on Amazon Web Services with Spring Cloud AWS
Spring Boot on Amazon Web Services with Spring Cloud AWSVMware Tanzu
 
The Flux Capacitor of Kafka Streams and ksqlDB (Matthias J. Sax, Confluent) K...
The Flux Capacitor of Kafka Streams and ksqlDB (Matthias J. Sax, Confluent) K...The Flux Capacitor of Kafka Streams and ksqlDB (Matthias J. Sax, Confluent) K...
The Flux Capacitor of Kafka Streams and ksqlDB (Matthias J. Sax, Confluent) K...HostedbyConfluent
 
Stanford CS347 Guest Lecture: Apache Spark
Stanford CS347 Guest Lecture: Apache SparkStanford CS347 Guest Lecture: Apache Spark
Stanford CS347 Guest Lecture: Apache SparkReynold Xin
 
Simplify and Scale Data Engineering Pipelines with Delta Lake
Simplify and Scale Data Engineering Pipelines with Delta LakeSimplify and Scale Data Engineering Pipelines with Delta Lake
Simplify and Scale Data Engineering Pipelines with Delta LakeDatabricks
 
Intro to elasticsearch
Intro to elasticsearchIntro to elasticsearch
Intro to elasticsearchJoey Wen
 
Deep Dive into GPU Support in Apache Spark 3.x
Deep Dive into GPU Support in Apache Spark 3.xDeep Dive into GPU Support in Apache Spark 3.x
Deep Dive into GPU Support in Apache Spark 3.xDatabricks
 
Yahoo Cloud Serving Benchmark
Yahoo Cloud Serving BenchmarkYahoo Cloud Serving Benchmark
Yahoo Cloud Serving Benchmarkkevin han
 
Introduction to apache spark
Introduction to apache spark Introduction to apache spark
Introduction to apache spark Aakashdata
 
Apache Tez: Accelerating Hadoop Query Processing
Apache Tez: Accelerating Hadoop Query Processing Apache Tez: Accelerating Hadoop Query Processing
Apache Tez: Accelerating Hadoop Query Processing DataWorks Summit
 
The Hows and Whys of a Distributed SQL Database - Strange Loop 2017
The Hows and Whys of a Distributed SQL Database - Strange Loop 2017The Hows and Whys of a Distributed SQL Database - Strange Loop 2017
The Hows and Whys of a Distributed SQL Database - Strange Loop 2017Alex Robinson
 
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...Edureka!
 
Apache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & InternalsApache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & InternalsAnton Kirillov
 
Apache Tez - A New Chapter in Hadoop Data Processing
Apache Tez - A New Chapter in Hadoop Data ProcessingApache Tez - A New Chapter in Hadoop Data Processing
Apache Tez - A New Chapter in Hadoop Data ProcessingDataWorks Summit
 
Performance Tuning With Oracle ASH and AWR. Part 1 How And What
Performance Tuning With Oracle ASH and AWR. Part 1 How And WhatPerformance Tuning With Oracle ASH and AWR. Part 1 How And What
Performance Tuning With Oracle ASH and AWR. Part 1 How And Whatudaymoogala
 
Apache Arrow Flight Overview
Apache Arrow Flight OverviewApache Arrow Flight Overview
Apache Arrow Flight OverviewJacques Nadeau
 
Rethinking State Management in Cloud-Native Streaming Systems
Rethinking State Management in Cloud-Native Streaming SystemsRethinking State Management in Cloud-Native Streaming Systems
Rethinking State Management in Cloud-Native Streaming SystemsYingjun Wu
 

What's hot (20)

Spring Boot on Amazon Web Services with Spring Cloud AWS
Spring Boot on Amazon Web Services with Spring Cloud AWSSpring Boot on Amazon Web Services with Spring Cloud AWS
Spring Boot on Amazon Web Services with Spring Cloud AWS
 
The Flux Capacitor of Kafka Streams and ksqlDB (Matthias J. Sax, Confluent) K...
The Flux Capacitor of Kafka Streams and ksqlDB (Matthias J. Sax, Confluent) K...The Flux Capacitor of Kafka Streams and ksqlDB (Matthias J. Sax, Confluent) K...
The Flux Capacitor of Kafka Streams and ksqlDB (Matthias J. Sax, Confluent) K...
 
Stanford CS347 Guest Lecture: Apache Spark
Stanford CS347 Guest Lecture: Apache SparkStanford CS347 Guest Lecture: Apache Spark
Stanford CS347 Guest Lecture: Apache Spark
 
Simplify and Scale Data Engineering Pipelines with Delta Lake
Simplify and Scale Data Engineering Pipelines with Delta LakeSimplify and Scale Data Engineering Pipelines with Delta Lake
Simplify and Scale Data Engineering Pipelines with Delta Lake
 
Intro to elasticsearch
Intro to elasticsearchIntro to elasticsearch
Intro to elasticsearch
 
SHACL by example
SHACL by exampleSHACL by example
SHACL by example
 
Intro to Apache Spark
Intro to Apache SparkIntro to Apache Spark
Intro to Apache Spark
 
Deep Dive into GPU Support in Apache Spark 3.x
Deep Dive into GPU Support in Apache Spark 3.xDeep Dive into GPU Support in Apache Spark 3.x
Deep Dive into GPU Support in Apache Spark 3.x
 
Yahoo Cloud Serving Benchmark
Yahoo Cloud Serving BenchmarkYahoo Cloud Serving Benchmark
Yahoo Cloud Serving Benchmark
 
Introduction to apache spark
Introduction to apache spark Introduction to apache spark
Introduction to apache spark
 
Apache Spark Architecture
Apache Spark ArchitectureApache Spark Architecture
Apache Spark Architecture
 
Apache Tez: Accelerating Hadoop Query Processing
Apache Tez: Accelerating Hadoop Query Processing Apache Tez: Accelerating Hadoop Query Processing
Apache Tez: Accelerating Hadoop Query Processing
 
The Hows and Whys of a Distributed SQL Database - Strange Loop 2017
The Hows and Whys of a Distributed SQL Database - Strange Loop 2017The Hows and Whys of a Distributed SQL Database - Strange Loop 2017
The Hows and Whys of a Distributed SQL Database - Strange Loop 2017
 
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...
 
Apache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & InternalsApache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & Internals
 
Apache Tez - A New Chapter in Hadoop Data Processing
Apache Tez - A New Chapter in Hadoop Data ProcessingApache Tez - A New Chapter in Hadoop Data Processing
Apache Tez - A New Chapter in Hadoop Data Processing
 
Performance Tuning With Oracle ASH and AWR. Part 1 How And What
Performance Tuning With Oracle ASH and AWR. Part 1 How And WhatPerformance Tuning With Oracle ASH and AWR. Part 1 How And What
Performance Tuning With Oracle ASH and AWR. Part 1 How And What
 
Hive tuning
Hive tuningHive tuning
Hive tuning
 
Apache Arrow Flight Overview
Apache Arrow Flight OverviewApache Arrow Flight Overview
Apache Arrow Flight Overview
 
Rethinking State Management in Cloud-Native Streaming Systems
Rethinking State Management in Cloud-Native Streaming SystemsRethinking State Management in Cloud-Native Streaming Systems
Rethinking State Management in Cloud-Native Streaming Systems
 

Viewers also liked

Query Latency Optimization with Lucene
Query Latency Optimization with LuceneQuery Latency Optimization with Lucene
Query Latency Optimization with Lucenelucenerevolution
 
Approaching Join Index: Presented by Mikhail Khludnev, Grid Dynamics
Approaching Join Index: Presented by Mikhail Khludnev, Grid DynamicsApproaching Join Index: Presented by Mikhail Khludnev, Grid Dynamics
Approaching Join Index: Presented by Mikhail Khludnev, Grid DynamicsLucidworks
 
Faceting with Lucene Block Join Query: Presented by Oleg Savrasov, Grid Dynamics
Faceting with Lucene Block Join Query: Presented by Oleg Savrasov, Grid DynamicsFaceting with Lucene Block Join Query: Presented by Oleg Savrasov, Grid Dynamics
Faceting with Lucene Block Join Query: Presented by Oleg Savrasov, Grid DynamicsLucidworks
 
学術コンテンツサービスでの活用事例@Lucene/Solr勉強会(2015.5.13)
学術コンテンツサービスでの活用事例@Lucene/Solr勉強会(2015.5.13)学術コンテンツサービスでの活用事例@Lucene/Solr勉強会(2015.5.13)
学術コンテンツサービスでの活用事例@Lucene/Solr勉強会(2015.5.13)Ikki Ohmukai
 
第16回Lucene/Solr勉強会 – ランキングチューニングと定量評価 #SolrJP
第16回Lucene/Solr勉強会 – ランキングチューニングと定量評価 #SolrJP第16回Lucene/Solr勉強会 – ランキングチューニングと定量評価 #SolrJP
第16回Lucene/Solr勉強会 – ランキングチューニングと定量評価 #SolrJPYahoo!デベロッパーネットワーク
 
DocValues aka. Column Stride Fields in Lucene 4.0 - By Willnauer Simon
DocValues aka. Column Stride Fields in Lucene 4.0 - By Willnauer SimonDocValues aka. Column Stride Fields in Lucene 4.0 - By Willnauer Simon
DocValues aka. Column Stride Fields in Lucene 4.0 - By Willnauer Simonlucenerevolution
 
SFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for Business
SFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for BusinessSFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for Business
SFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for BusinessLucidworks (Archived)
 
Seeley yonik solr performance key innovations
Seeley yonik   solr performance key innovationsSeeley yonik   solr performance key innovations
Seeley yonik solr performance key innovationsLucidworks (Archived)
 
The mobile as a health hub, and how bluetooth low energy enables the market
The mobile as a health hub, and how bluetooth low energy enables the marketThe mobile as a health hub, and how bluetooth low energy enables the market
The mobile as a health hub, and how bluetooth low energy enables the marketPaul Williamson
 
Minneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache Solr
Minneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache SolrMinneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache Solr
Minneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache SolrLucidworks (Archived)
 
Coterie 9 11
Coterie 9 11Coterie 9 11
Coterie 9 11LaRue
 
Ecma 262 5th Edition を読む #5 第9条
Ecma 262 5th Edition を読む #5 第9条Ecma 262 5th Edition を読む #5 第9条
Ecma 262 5th Edition を読む #5 第9条彰 村地
 
Lucene rev preso bialecki solr crawlers-lr
Lucene rev preso bialecki solr crawlers-lrLucene rev preso bialecki solr crawlers-lr
Lucene rev preso bialecki solr crawlers-lrLucidworks (Archived)
 
Hellosong
HellosongHellosong
Hellosongtanica
 
Already, just, still, yet
Already, just, still, yetAlready, just, still, yet
Already, just, still, yettanica
 

Viewers also liked (20)

Query Latency Optimization with Lucene
Query Latency Optimization with LuceneQuery Latency Optimization with Lucene
Query Latency Optimization with Lucene
 
Approaching Join Index: Presented by Mikhail Khludnev, Grid Dynamics
Approaching Join Index: Presented by Mikhail Khludnev, Grid DynamicsApproaching Join Index: Presented by Mikhail Khludnev, Grid Dynamics
Approaching Join Index: Presented by Mikhail Khludnev, Grid Dynamics
 
Faceting with Lucene Block Join Query: Presented by Oleg Savrasov, Grid Dynamics
Faceting with Lucene Block Join Query: Presented by Oleg Savrasov, Grid DynamicsFaceting with Lucene Block Join Query: Presented by Oleg Savrasov, Grid Dynamics
Faceting with Lucene Block Join Query: Presented by Oleg Savrasov, Grid Dynamics
 
Block join toranomaki
Block join toranomakiBlock join toranomaki
Block join toranomaki
 
学術コンテンツサービスでの活用事例@Lucene/Solr勉強会(2015.5.13)
学術コンテンツサービスでの活用事例@Lucene/Solr勉強会(2015.5.13)学術コンテンツサービスでの活用事例@Lucene/Solr勉強会(2015.5.13)
学術コンテンツサービスでの活用事例@Lucene/Solr勉強会(2015.5.13)
 
第16回Lucene/Solr勉強会 – ランキングチューニングと定量評価 #SolrJP
第16回Lucene/Solr勉強会 – ランキングチューニングと定量評価 #SolrJP第16回Lucene/Solr勉強会 – ランキングチューニングと定量評価 #SolrJP
第16回Lucene/Solr勉強会 – ランキングチューニングと定量評価 #SolrJP
 
DocValues aka. Column Stride Fields in Lucene 4.0 - By Willnauer Simon
DocValues aka. Column Stride Fields in Lucene 4.0 - By Willnauer SimonDocValues aka. Column Stride Fields in Lucene 4.0 - By Willnauer Simon
DocValues aka. Column Stride Fields in Lucene 4.0 - By Willnauer Simon
 
Nlp4 l intro-20150513
Nlp4 l intro-20150513Nlp4 l intro-20150513
Nlp4 l intro-20150513
 
SFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for Business
SFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for BusinessSFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for Business
SFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for Business
 
Seeley yonik solr performance key innovations
Seeley yonik   solr performance key innovationsSeeley yonik   solr performance key innovations
Seeley yonik solr performance key innovations
 
The mobile as a health hub, and how bluetooth low energy enables the market
The mobile as a health hub, and how bluetooth low energy enables the marketThe mobile as a health hub, and how bluetooth low energy enables the market
The mobile as a health hub, and how bluetooth low energy enables the market
 
Minneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache Solr
Minneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache SolrMinneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache Solr
Minneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache Solr
 
Coterie 9 11
Coterie 9 11Coterie 9 11
Coterie 9 11
 
Ecma 262 5th Edition を読む #5 第9条
Ecma 262 5th Edition を読む #5 第9条Ecma 262 5th Edition を読む #5 第9条
Ecma 262 5th Edition を読む #5 第9条
 
Customized Navigation Using SOLR
Customized Navigation Using SOLRCustomized Navigation Using SOLR
Customized Navigation Using SOLR
 
Column Stride Fields aka. DocValues
Column Stride Fields aka. DocValuesColumn Stride Fields aka. DocValues
Column Stride Fields aka. DocValues
 
Lucene rev preso bialecki solr crawlers-lr
Lucene rev preso bialecki solr crawlers-lrLucene rev preso bialecki solr crawlers-lr
Lucene rev preso bialecki solr crawlers-lr
 
Hellosong
HellosongHellosong
Hellosong
 
Ob12 01st
Ob12 01stOb12 01st
Ob12 01st
 
Already, just, still, yet
Already, just, still, yetAlready, just, still, yet
Already, just, still, yet
 

Similar to Understanding Lucene Search Performance

Building a data pipeline to ingest data into Hadoop in minutes using Streamse...
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...Building a data pipeline to ingest data into Hadoop in minutes using Streamse...
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...Guglielmo Iozzia
 
Getting started faster with LucidWorks for Solr
Getting started faster with LucidWorks for SolrGetting started faster with LucidWorks for Solr
Getting started faster with LucidWorks for SolrLucidworks (Archived)
 
December 2013 HUG: Hunk - Splunk over Hadoop
December 2013 HUG: Hunk - Splunk over HadoopDecember 2013 HUG: Hunk - Splunk over Hadoop
December 2013 HUG: Hunk - Splunk over HadoopYahoo Developer Network
 
SplunkLive! Detroit April 2013 - Domino's Pizza
SplunkLive! Detroit April 2013 - Domino's PizzaSplunkLive! Detroit April 2013 - Domino's Pizza
SplunkLive! Detroit April 2013 - Domino's PizzaSplunk
 
Integrating Splunk into your Spring Applications
Integrating Splunk into your Spring ApplicationsIntegrating Splunk into your Spring Applications
Integrating Splunk into your Spring ApplicationsDamien Dallimore
 
Splunk All the Things: Our First 3 Months Monitoring Web Service APIs - Splun...
Splunk All the Things: Our First 3 Months Monitoring Web Service APIs - Splun...Splunk All the Things: Our First 3 Months Monitoring Web Service APIs - Splun...
Splunk All the Things: Our First 3 Months Monitoring Web Service APIs - Splun...Dan Cundiff
 
dlux - Splunk Technical Overview
dlux - Splunk Technical Overviewdlux - Splunk Technical Overview
dlux - Splunk Technical OverviewDavid Lutz
 
Centralized Logging System Using ELK Stack
Centralized Logging System Using ELK StackCentralized Logging System Using ELK Stack
Centralized Logging System Using ELK StackRohit Sharma
 
2015 03-16-elk at-bsides
2015 03-16-elk at-bsides2015 03-16-elk at-bsides
2015 03-16-elk at-bsidesJeremy Cohoe
 
Getting Started with Splunk Breakout Session
Getting Started with Splunk Breakout SessionGetting Started with Splunk Breakout Session
Getting Started with Splunk Breakout SessionSplunk
 
SplunkLive! Salt Lake City June 2013 - Ancestry.com
SplunkLive! Salt Lake City June 2013 - Ancestry.comSplunkLive! Salt Lake City June 2013 - Ancestry.com
SplunkLive! Salt Lake City June 2013 - Ancestry.comSplunk
 
Building an Observability Platform in 389 Difficult Steps
Building an Observability Platform in 389 Difficult StepsBuilding an Observability Platform in 389 Difficult Steps
Building an Observability Platform in 389 Difficult StepsDigitalOcean
 
Stabilizing the Jenga tower: Scaling out Ceilometer
Stabilizing the Jenga tower: Scaling out CeilometerStabilizing the Jenga tower: Scaling out Ceilometer
Stabilizing the Jenga tower: Scaling out CeilometerPradeep Kilambi
 
Stabilising the jenga tower
Stabilising the jenga towerStabilising the jenga tower
Stabilising the jenga towerGordon Chung
 
Searching for Better Code: Presented by Grant Ingersoll, Lucidworks
Searching for Better Code: Presented by Grant Ingersoll, LucidworksSearching for Better Code: Presented by Grant Ingersoll, Lucidworks
Searching for Better Code: Presented by Grant Ingersoll, LucidworksLucidworks
 
Game Analytics at London Apache Druid Meetup
Game Analytics at London Apache Druid MeetupGame Analytics at London Apache Druid Meetup
Game Analytics at London Apache Druid MeetupJelena Zanko
 
Splunk in Nordstrom: IT Operations
Splunk in Nordstrom: IT OperationsSplunk in Nordstrom: IT Operations
Splunk in Nordstrom: IT OperationsTimur Bagirov
 
Basic of python for data analysis
Basic of python for data analysisBasic of python for data analysis
Basic of python for data analysisPramod Toraskar
 

Similar to Understanding Lucene Search Performance (20)

Building a data pipeline to ingest data into Hadoop in minutes using Streamse...
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...Building a data pipeline to ingest data into Hadoop in minutes using Streamse...
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...
 
Getting started faster with LucidWorks for Solr
Getting started faster with LucidWorks for SolrGetting started faster with LucidWorks for Solr
Getting started faster with LucidWorks for Solr
 
December 2013 HUG: Hunk - Splunk over Hadoop
December 2013 HUG: Hunk - Splunk over HadoopDecember 2013 HUG: Hunk - Splunk over Hadoop
December 2013 HUG: Hunk - Splunk over Hadoop
 
SplunkLive! Detroit April 2013 - Domino's Pizza
SplunkLive! Detroit April 2013 - Domino's PizzaSplunkLive! Detroit April 2013 - Domino's Pizza
SplunkLive! Detroit April 2013 - Domino's Pizza
 
Integrating Splunk into your Spring Applications
Integrating Splunk into your Spring ApplicationsIntegrating Splunk into your Spring Applications
Integrating Splunk into your Spring Applications
 
Splunk All the Things: Our First 3 Months Monitoring Web Service APIs - Splun...
Splunk All the Things: Our First 3 Months Monitoring Web Service APIs - Splun...Splunk All the Things: Our First 3 Months Monitoring Web Service APIs - Splun...
Splunk All the Things: Our First 3 Months Monitoring Web Service APIs - Splun...
 
dlux - Splunk Technical Overview
dlux - Splunk Technical Overviewdlux - Splunk Technical Overview
dlux - Splunk Technical Overview
 
Centralized Logging System Using ELK Stack
Centralized Logging System Using ELK StackCentralized Logging System Using ELK Stack
Centralized Logging System Using ELK Stack
 
2015 03-16-elk at-bsides
2015 03-16-elk at-bsides2015 03-16-elk at-bsides
2015 03-16-elk at-bsides
 
Getting Started with Splunk Breakout Session
Getting Started with Splunk Breakout SessionGetting Started with Splunk Breakout Session
Getting Started with Splunk Breakout Session
 
SplunkLive! Salt Lake City June 2013 - Ancestry.com
SplunkLive! Salt Lake City June 2013 - Ancestry.comSplunkLive! Salt Lake City June 2013 - Ancestry.com
SplunkLive! Salt Lake City June 2013 - Ancestry.com
 
Building an Observability Platform in 389 Difficult Steps
Building an Observability Platform in 389 Difficult StepsBuilding an Observability Platform in 389 Difficult Steps
Building an Observability Platform in 389 Difficult Steps
 
Stabilizing the Jenga tower: Scaling out Ceilometer
Stabilizing the Jenga tower: Scaling out CeilometerStabilizing the Jenga tower: Scaling out Ceilometer
Stabilizing the Jenga tower: Scaling out Ceilometer
 
Stabilising the jenga tower
Stabilising the jenga towerStabilising the jenga tower
Stabilising the jenga tower
 
Searching for Better Code: Presented by Grant Ingersoll, Lucidworks
Searching for Better Code: Presented by Grant Ingersoll, LucidworksSearching for Better Code: Presented by Grant Ingersoll, Lucidworks
Searching for Better Code: Presented by Grant Ingersoll, Lucidworks
 
Game Analytics at London Apache Druid Meetup
Game Analytics at London Apache Druid MeetupGame Analytics at London Apache Druid Meetup
Game Analytics at London Apache Druid Meetup
 
Splunk in Nordstrom: IT Operations
Splunk in Nordstrom: IT OperationsSplunk in Nordstrom: IT Operations
Splunk in Nordstrom: IT Operations
 
Basic of python for data analysis
Basic of python for data analysisBasic of python for data analysis
Basic of python for data analysis
 
Using the Splunk Java SDK
Using the Splunk Java SDKUsing the Splunk Java SDK
Using the Splunk Java SDK
 
Geode Meetup Apachecon
Geode Meetup ApacheconGeode Meetup Apachecon
Geode Meetup Apachecon
 

Understanding Lucene Search Performance

  • 1. Lucene Performance Workshop. Lucid Imagination, Inc.
  • 2. Intro: About the speaker and Lucid Imagination. Agenda: Lucene and performance; Lucid Gaze for Lucene: UI and API; Key statistics; Examples; Q & A session.
  • 3. Lucene and performance: Perceived performance issues can have different causes. Classic JVM problems, classic solutions: heap size, garbage collection, stack size, HotSpot. Lucene/Search-related issues, beyond JVM tuning: Indexing performance (indexing too slow, strange slowdowns during indexing); Search performance (search too slow in general, or for certain types of queries).
  • 4. Common Lucene performance issues. Indexing: Too many segments being created; Too many Token-s / TokenStream-s; Too many Documents / Fields. Searching: Too many IndexReader-s / IndexSearcher-s; High RAM usage of IndexReader; Slow response times for certain queries. Application-level logging may not be up to the task. Profiler is too low-level and too intrusive. We need a lightweight probe to peek at vital Lucene statistics.
  • 5. Lucid Gaze for Lucene (LG4L). Target audience and applications: Tool for developers; Performance monitoring; Statistics collection; Drop-in replacement for lucene-core-2.4.1.jar.
  • 6. Available information. Statistics (per time unit). IndexReader / IndexWriter: Number of documents and fields retrieved / created; Number of IndexWriter / IndexReader / Directory instances created (and the number of live instances!); Memory consumption of IndexWriter / IndexReader instances. Analysis: Number of Analyzers / TokenFilters / Tokenizers; Number of TokenStream-s and Token-s. Search: Number of searches and their average time; Number of opened IndexSearcher-s. Storage: Number of Lucene Directory instances created.
  • 7. Available information: metrics. Lists and histograms: Count and a list of Analyzer, Tokenizer, TokenFilter instances; Directory implementations. Top-N queries: Queries with largest numbers of hits; Queries that took longest to execute. All this data is available as log, persistent DB and through the API.
  • 8. In-memory and RRD storage: Retaining historical values of collected statistics. In-memory: No files, no configuration hassles; Concise overview periodically written to log (optional); Uses Java logging. RRD (Round-Robin Database): Persistent round-robin database; Single database of a constant size; E.g. hourly, daily, weekly, monthly, yearly statistics; Suitable for long-term monitoring; Many more metrics and statistics tracked; Can be accessed concurrently from other applications.
  • 9. Configuration. Java properties or gaze.properties: List of properties supplied as -Dlucid.gaze...; gaze.properties on classpath. Configurability: Turning on/off selected monitors; Producing debug output; Using in-memory or RRD log retention; Configuring RRD archives (to scale historical data over different periods).
  • 10. API. Facade with static methods: LuceneCore. Programmatic access to all statistics groups; Retrieving top-N queries; Retrieving additional metrics (e.g. histograms of analyzers, tokenizers, directory implementations, tracking of IndexReader / IndexWriter instances and their memory consumption); Enabling / disabling monitors; Resetting statistics (useful for creating snapshots).
  • 11. Example: indexing performance tuning. Based on the contrib/benchmark suite. Test impact of number of buffered docs. Other interesting observations: Number of documents / fields; Number of tokens / token streams; Number of IndexReader / Directory instances; Number of IndexSearchers.
  • 12. Example: console output
  • 13. Example: RRD Inspector
  • 14. RRD Inspector (2)
  • 15. RRD Inspector (3)
  • 16. Performance impact of performance monitoring. Overhead of using LG4L: Benchmarks (in contrib/benchmark) slower by ~10-15% on average, memory consumption higher by ~10%. Remember: you can turn off some monitors!
  • 17. Conclusions. Lucene can perform fantastically ... but it can't outmaneuver sub-optimal design or weak configuration. LG4L helps to understand the causes of poor performance: Insight into high-level statistics that relate to Lucene API; Round-robin database for tracking historical data. LG4L is lightweight!
  • 18. Q&A. Download and documentation: http://www.lucidimagination.com/Downloads/LucidGaze-for-Lucene
  • 19. Example: LG4L with Solr
       INFO: * AnalysisStats:
       INFO: counters: {toks=258}
       INFO: metrics: {tns={=2, WhitespaceTokenizer=8}, tfs={WordDelimiterFilter=8, StopFilter=8, SynonymFilter=8, LowerCaseFilter=8, EnglishPorterFilter=8}, ans={org.apache.solr.schema.IndexSchema$SolrQueryAnalyzer=1, org.apache.solr.analysis.TokenizerChain=6, org.apache.solr.schema.FieldType$DefaultAnalyzer=13, org.apache.lucene.analysis.WhitespaceAnalyzer=1, org.apache.solr.schema.IndexSchema$SolrIndexAnalyzer=1}}
       INFO: * DocumentStats:
       INFO: counters: {docs=1, fields=20}
       INFO: * IndexStats:
       INFO: counters: {ir_isdC=1, ir_C=4, iw_C=0, ir_newC=7, ir_tpC=12, iw_segs=0, iw_buf=0, ir_tdC=10, ir_ram=1343504, iw_ram=0}
       INFO: * SearchStats:
       INFO: counters: {dfC=11, rwrC=3, rwrT=30265, srchrC=14, srchT=90133076, srchC=6}
       INFO: * StoreStats:
       INFO: counters: {dirC=8}
       INFO: metrics: {dir_t={FSDirectory=8}}
       … but of course you should use LucidGaze for Solr instead!
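
Slide 7 lists top-N queries (for instance, those that took longest to execute) among the available metrics. The slides do not say how LG4L keeps that list, so the following is only a minimal sketch of one common approach: a min-heap bounded to N entries, so memory stays constant however many queries are observed. The class `SlowQueryTracker` and its methods are hypothetical names chosen for illustration and are not part of the LG4L API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Keeps the N slowest queries seen so far (hypothetical helper, not LG4L code). */
class SlowQueryTracker {
    /** One observed query and its execution time. */
    static final class Entry {
        final String query;
        final long millis;
        Entry(String query, long millis) { this.query = query; this.millis = millis; }
    }

    private final int capacity;
    // Min-heap on execution time: the fastest retained query sits on top,
    // so it is the one evicted when a slower query arrives.
    private final PriorityQueue<Entry> heap =
        new PriorityQueue<>(Comparator.comparingLong((Entry e) -> e.millis));

    SlowQueryTracker(int capacity) {
        if (capacity < 1) throw new IllegalArgumentException("capacity must be >= 1");
        this.capacity = capacity;
    }

    /** Records one query; keeps it only if it ranks among the N slowest. */
    synchronized void record(String query, long millis) {
        if (heap.size() < capacity) {
            heap.add(new Entry(query, millis));
        } else if (heap.peek().millis < millis) {
            heap.poll();
            heap.add(new Entry(query, millis));
        }
    }

    /** Returns the retained queries, slowest first. */
    synchronized List<Entry> slowestFirst() {
        List<Entry> out = new ArrayList<>(heap);
        out.sort(Comparator.comparingLong((Entry e) -> e.millis).reversed());
        return out;
    }

    public static void main(String[] args) {
        SlowQueryTracker tracker = new SlowQueryTracker(2);
        tracker.record("title:lucene", 12);
        tracker.record("body:*a*", 950);
        tracker.record("id:42", 3);
        tracker.record("body:perf~", 310);
        for (Entry e : tracker.slowestFirst()) {
            System.out.println(e.millis + " ms  " + e.query);
        }
    }
}
```

Each `record` call costs O(log N), which keeps the probe cheap enough to run on every search.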
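
Slide 8 describes the RRD storage as a single database of a constant size. The idea behind that property can be sketched as a fixed-length ring of samples in which the newest value overwrites the oldest. This is a deliberately simplified illustration: it models one archive only, with no consolidation across hourly, daily or weekly resolutions and no persistence, and the class name `RoundRobinArchive` is hypothetical rather than anything taken from LG4L or an RRD library.

```java
/** Fixed-size round-robin archive: the newest sample overwrites the oldest. */
class RoundRobinArchive {
    private final double[] slots;
    private int next = 0;   // index the next sample will be written to
    private int count = 0;  // number of valid samples, at most slots.length

    RoundRobinArchive(int size) {
        if (size < 1) throw new IllegalArgumentException("size must be >= 1");
        slots = new double[size];
    }

    /** Stores a sample; once the archive is full, the oldest sample is dropped. */
    void add(double sample) {
        slots[next] = sample;
        next = (next + 1) % slots.length;
        if (count < slots.length) count++;
    }

    /** Returns the retained samples, oldest first. */
    double[] values() {
        double[] out = new double[count];
        int start = (next - count + slots.length) % slots.length;
        for (int i = 0; i < count; i++) {
            out[i] = slots[(start + i) % slots.length];
        }
        return out;
    }

    /** Average of the retained samples, or NaN when the archive is empty. */
    double average() {
        if (count == 0) return Double.NaN;
        double sum = 0;
        for (double v : values()) sum += v;
        return sum / count;
    }

    public static void main(String[] args) {
        RoundRobinArchive searchesPerMinute = new RoundRobinArchive(3);
        for (double sample : new double[] {1, 2, 3, 4, 5}) {
            searchesPerMinute.add(sample);
        }
        System.out.println(java.util.Arrays.toString(searchesPerMinute.values()));
        System.out.println(searchesPerMinute.average());
    }
}
```

Because the array never grows, storage cost is fixed up front, which is what makes this layout suitable for long-term monitoring.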
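
The log output on slide 19 reports each statistics group as a flat `counters: {name=value, ...}` line. For ad hoc analysis, such a line could be turned into a map with a few lines of Java. The sketch below assumes only the format visible on the slide; it handles the flat `counters` lines, not the nested `metrics` lines, and it does not interpret what individual counter names mean. `GazeCounterLine` is a hypothetical helper, not part of LG4L.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Parses one flat "counters: {k=v, ...}" log line (hypothetical helper). */
class GazeCounterLine {
    /** Returns the counters in the order they appear on the line. */
    static Map<String, Long> parse(String line) {
        int open = line.indexOf('{');
        int close = line.lastIndexOf('}');
        if (open < 0 || close < open) {
            throw new IllegalArgumentException("no {...} block in: " + line);
        }
        Map<String, Long> counters = new LinkedHashMap<>();
        String body = line.substring(open + 1, close).trim();
        if (body.isEmpty()) return counters;
        for (String pair : body.split(",")) {
            // Each pair looks like "name=123"; surrounding spaces are ignored.
            String[] kv = pair.trim().split("=", 2);
            if (kv.length != 2) {
                throw new IllegalArgumentException("malformed pair: " + pair);
            }
            counters.put(kv[0].trim(), Long.parseLong(kv[1].trim()));
        }
        return counters;
    }

    public static void main(String[] args) {
        String line = "INFO: counters: {docs=1, fields=20}";
        System.out.println(parse(line));
    }
}
```

For anything beyond quick inspection, the programmatic API mentioned on slide 10 is likely a better source than scraped log lines.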