Realtime stream and
realtime search engine
Sang Song 
fastcatsearch.org 
1
About Me 
• www.linkedin.com/profile/view?id=295484775 
• facebook.com/songaal 
• swsong@websqrd.com 
2
Agenda 
• Introduction 
• Search Architecture 
• Realtime Indexing 
3
Introduction 
4
Goal 
• Like Splunk 
• Indexing streaming log data 
• Search log data in real-time 
5
Big data 
• Data sets too large and complex for a traditional database
• Difficult to process with traditional data processing tools
• 3Vs 
• Volume : Large quantity of data 
• Variety : Diverse set of data 
• Velocity : speed of data 
Source: Wikipedia
6
About Fastcatsearch 
• Distributed system 
• Fast indexing 
• Fast queries 
• Popular keyword 
• GS certification
• 70+ references 
• Open source 
• Multi-platform
• Easy web management tool
• Dictionary management 
• Plugin extension 
7
Reference 
8
History 
• Fastcatsearch v1 (2010-2011) 
• Single machine 
• <150 QPS 
• Fastcatsearch v2 (2013-Now) 
• Distributed system 
• Multi collection result aggregation 
• 200+ queries per second
• Fastcatsearch v3 (alpha) 
• Realtime indexing/searching 
• Schema-free 
• Shard/replica 
• Geospatial search
9
Search Architecture 
10
11
Realtime Indexing 
12
Store log data 
• HDFS 
• Write-once static files
• Flume 
• Collecting, aggregating, and moving large amounts of log data
13
14
Flume config 
agent1.sources = r1 
agent1.sinks = hdfssink 
agent1.channels = c1 
agent1.sources.r1.type = netcat 
agent1.sources.r1.bind = localhost 
agent1.sources.r1.port = 44443 
agent1.sinks.hdfssink.type = hdfs 
agent1.sinks.hdfssink.hdfs.path = hdfs://192.168.189.173:9000/flume/events 
# Alternative file type: DataStream
agent1.sinks.hdfssink.hdfs.fileType = SequenceFile
agent1.sinks.hdfssink.hdfs.writeFormat = Text
agent1.sinks.hdfssink.hdfs.batchSize = 10
agent1.channels.c1.type = memory 
agent1.channels.c1.capacity = 1000 
agent1.channels.c1.transactionCapacity = 100 
agent1.sources.r1.channels = c1 
agent1.sinks.hdfssink.channel = c1 
$ ./flume-ng agent -f /home/swsong/flume/conf/flume.conf -n agent1 
15
Flume append? 
16
Fastcatsearch 
[Diagram: HDFS, Indexer, Merger, Segments, Searcher, Index File]
Issues
- Segment file commit
- Doc deletion
17
Import using Flume 
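import java.net.URI;
import java.util.Map;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.util.ReflectionUtils;

// Runs inside a method that throws IOException.
// uriPath, dirPath, parseEvent() and eventQueue are defined elsewhere.
Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(URI.create(uriPath), conf);
FileStatus[] status = fs.listStatus(new Path(dirPath));
for (int i = 0; i < status.length; i++) {
    // Open each SequenceFile that Flume wrote into the directory.
    SequenceFile.Reader.Option opt = SequenceFile.Reader.file(status[i].getPath());
    try (SequenceFile.Reader reader = new SequenceFile.Reader(conf, opt)) {
        // Instantiate key/value holders of whatever types the file declares.
        Writable key = (Writable) ReflectionUtils.newInstance(reader.getKeyClass(), conf);
        Writable value = (Writable) ReflectionUtils.newInstance(reader.getValueClass(), conf);
        while (reader.next(key, value)) {
            Map<String, Object> parsedEvent = parseEvent(key.toString(), value.toString());
            if (parsedEvent != null) {
                eventQueue.add(parsedEvent); // hand the event to the indexer
            }
        }
    }
}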
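The import loop above calls `parseEvent` without showing it. The following is only a sketch of what such a helper might look like, not the Fastcatsearch implementation. It assumes the SequenceFile key is a timestamp in milliseconds and the value is a text line of the form "LEVEL message"; the class name `EventParser` and the field names `timestamp`, `level`, and `message` are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class EventParser {
    // Hypothetical helper: returns null for blank or malformed input so the caller can skip it.
    static Map<String, Object> parseEvent(String key, String value) {
        if (value == null || value.trim().isEmpty()) {
            return null;
        }
        long timestamp;
        try {
            // Assumption: the key is a timestamp in milliseconds.
            timestamp = Long.parseLong(key.trim());
        } catch (NumberFormatException | NullPointerException e) {
            return null;
        }
        String line = value.trim();
        int space = line.indexOf(' ');
        Map<String, Object> event = new HashMap<>();
        event.put("timestamp", timestamp);
        // Assumption: the first token is the log level, the rest is the message.
        event.put("level", space < 0 ? "UNKNOWN" : line.substring(0, space));
        event.put("message", space < 0 ? line : line.substring(space + 1).trim());
        return event;
    }
}
```

Returning null for input that cannot be parsed matches the null check in the import loop, so bad lines are dropped instead of stopping the import.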
18
Making index segment 
• Index has multiple segments 
• Document writer 
• Index writer 
• Search index writer 
• Field index writer 
• Group index writer 
19
Segment commit issue 
• Update / delete documents
• No in-place updates
• An update is an append plus a delete
• Deletion in previous segments
• Documents are marked as deleted
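The commit behaviour listed above can be sketched as follows. This is a simplified illustration under stated assumptions, not Fastcatsearch code: documents are plain strings keyed by an id, each segment keeps a `BitSet` of deleted local document numbers, and the class names `Segment` and `SegmentedIndex` are hypothetical.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

class Segment {
    final List<String> ids = new ArrayList<>();
    final List<String> docs = new ArrayList<>();
    final BitSet deleted = new BitSet(); // one bit per local document number

    void append(String id, String doc) {
        ids.add(id);
        docs.add(doc);
    }

    // Marks every live copy of the id as deleted; the stored data itself is untouched.
    int markDeleted(String id) {
        int marked = 0;
        for (int i = 0; i < ids.size(); i++) {
            if (ids.get(i).equals(id) && !deleted.get(i)) {
                deleted.set(i);
                marked++;
            }
        }
        return marked;
    }

    int liveCount() {
        return ids.size() - deleted.cardinality();
    }
}

class SegmentedIndex {
    final List<Segment> segments = new ArrayList<>();

    // Commits a batch as a new segment. Each incoming document first
    // invalidates older copies with the same id, then is appended.
    Segment commit(List<String[]> idAndDocPairs) {
        Segment fresh = new Segment();
        for (String[] pair : idAndDocPairs) {
            for (Segment old : segments) {
                old.markDeleted(pair[0]);
            }
            fresh.markDeleted(pair[0]); // the same id may appear twice in one batch
            fresh.append(pair[0], pair[1]);
        }
        segments.add(fresh);
        return fresh;
    }
}
```

Because older segments are only flagged, never rewritten, the space held by deleted documents is reclaimed later, at merge time.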
20
Segment merge issue 
• Performance 
• 2(n+m) in time and space 
• Size Compaction - Deleted docs removed. 
[Diagram: segment #1, segment #2, and segment #3 merge to a new segment, segment #4]
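A minimal sketch of merge with size compaction, assuming each segment is a list of documents plus a `BitSet` of deleted entries. The names `DocSegment` and `SegmentMerger` are hypothetical, and a real merge would also rewrite the index structures, which this sketch leaves out.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

class DocSegment {
    final List<String> docs;
    final BitSet deleted; // bit i set means docs.get(i) is deleted

    DocSegment(List<String> docs, BitSet deleted) {
        this.docs = docs;
        this.deleted = deleted;
    }
}

class SegmentMerger {
    // Reads every input segment once and copies only live documents,
    // so the merged segment contains no deleted entries.
    static DocSegment merge(List<DocSegment> inputs) {
        List<String> merged = new ArrayList<>();
        for (DocSegment segment : inputs) {
            for (int i = 0; i < segment.docs.size(); i++) {
                if (!segment.deleted.get(i)) {
                    merged.add(segment.docs.get(i));
                }
            }
        }
        return new DocSegment(merged, new BitSet());
    }
}
```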
21
Segment merge issue 
• Why merge? 
• Segment count grows fast 
• Search index = Search all leaf segments in turn 
• Document deletion 
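To illustrate why the segment count matters, here is a sketch of a search that visits every segment in turn. It assumes each segment maps a term to local document numbers and has a base offset for global numbering; `LeafSegment` and `MultiSegmentSearcher` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LeafSegment {
    final Map<String, int[]> postings = new HashMap<>(); // term -> local doc numbers
    final BitSet deleted = new BitSet();
    final int base; // global doc number of local doc 0

    LeafSegment(int base) {
        this.base = base;
    }
}

class MultiSegmentSearcher {
    // One lookup per segment: the work grows with the number of segments,
    // and deleted documents must be filtered out on every query.
    static List<Integer> search(List<LeafSegment> segments, String term) {
        List<Integer> hits = new ArrayList<>();
        for (LeafSegment segment : segments) {
            int[] docs = segment.postings.get(term);
            if (docs == null) {
                continue;
            }
            for (int local : docs) {
                if (!segment.deleted.get(local)) {
                    hits.add(segment.base + local);
                }
            }
        }
        return hits;
    }
}
```

Merging reduces both costs at once: fewer segments to visit, and no deleted entries left to filter.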
22
Inverted Indexing 
Posting index: term1, term3, term5, term7
Postings: term1 posting1, term2 posting2, term3 posting3, term4 posting4, term5 posting5, term6 posting6
Good for sequential writing to disk
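The layout on this slide can be sketched as below. The sketch assumes the posting index is a sparse table that records the byte offset of every second term in the postings file, which is one reading of the diagram (term1, term3, term5, term7). The class name `InvertedIndexWriter`, the whitespace tokenizer, and the on-disk format are all hypothetical.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class InvertedIndexWriter {
    // TreeMap keeps terms sorted, so postings can be written in term order.
    private final TreeMap<String, List<Integer>> postings = new TreeMap<>();

    // Assumption: documents are added in increasing docId order.
    void add(int docId, String text) {
        for (String term : text.toLowerCase().split("\\s+")) {
            if (term.isEmpty()) {
                continue;
            }
            List<Integer> list = postings.computeIfAbsent(term, t -> new ArrayList<>());
            if (list.isEmpty() || list.get(list.size() - 1) != docId) {
                list.add(docId); // record each document once per term
            }
        }
    }

    // Writes all postings in a single forward pass (no seeks) and returns
    // the sparse term -> byte offset table for every interval-th term.
    TreeMap<String, Integer> writeTo(DataOutputStream out, int interval) {
        TreeMap<String, Integer> sparseIndex = new TreeMap<>();
        int termNumber = 0;
        try {
            for (Map.Entry<String, List<Integer>> entry : postings.entrySet()) {
                if (termNumber++ % interval == 0) {
                    sparseIndex.put(entry.getKey(), out.size());
                }
                out.writeUTF(entry.getKey());
                out.writeInt(entry.getValue().size());
                for (int doc : entry.getValue()) {
                    out.writeInt(doc);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sparseIndex;
    }
}
```

Since the segment is written once from start to end and never modified afterwards, the disk only sees sequential writes.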
23
Inverted Indexing 
How about a B-tree?
[Diagram: a tree of blocks held in memory and flushed to a file]
A flush causes a large amount of random writing to disk
24
Search in realtime 
seg #1 seg #2 seg #3 seg #4 
1. Newly created segment
Searchable data 
25
Search in realtime 
seg #1 seg #2 seg #3 seg #4 
2. Merge segments 
Searchable data 
26
Search in realtime 
seg #1 seg #2 seg #3 seg #4 seg #5 
3. New merged segment
4. Remove segments
Searchable data 
27
Search in realtime 
Searchable data 
seg #1 seg #5 
5. Searching data 
28
Search in realtime 
Searchable data 
seg #1 seg #5 
seg #6 
Newly created segment
This process repeats continuously
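The five steps on these slides can be sketched with an atomically swapped list of searchable segments. This is one possible approach, not necessarily how Fastcatsearch does it: segments are represented by name only, and the class `SearchableSegments` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

class SearchableSegments {
    // Searchers always read an immutable snapshot of the live segment list.
    private final AtomicReference<List<String>> live =
            new AtomicReference<>(Collections.<String>emptyList());

    // Step 1: a newly created segment becomes searchable once it is published.
    void publish(String segment) {
        live.updateAndGet(current -> {
            List<String> next = new ArrayList<>(current);
            next.add(segment);
            return Collections.unmodifiableList(next);
        });
    }

    // Steps 3 and 4: add the merged segment and remove its inputs in one atomic swap.
    void replaceMerged(List<String> inputs, String merged) {
        live.updateAndGet(current -> {
            List<String> next = new ArrayList<>(current);
            next.removeAll(inputs);
            next.add(merged);
            return Collections.unmodifiableList(next);
        });
    }

    // Step 5: each search runs against one consistent snapshot.
    List<String> snapshot() {
        return live.get();
    }
}
```

A search that started before the swap keeps its old snapshot, so a real system would have to delay deleting the merged input files until no running search still uses them.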
29
Visualization 
• Lucene's merge visualization 
• http://www.youtube.com/watch?v=ojcpvIY3QgA 
• Python script + Python Imaging Library + MEncoder
30
Questions? 
31
Learn More 
• http://fastcatsearch.org/ 
• https://www.facebook.com/groups/fastcatsearch/ 
32
