Enterprise Search Summit - Speeding Up Search

Speeding up
Enterprise Search
with
In-Memory Indexing
Gil Tene, CTO & co-Founder, Azul Systems
©2011 Azul Systems, Inc.

This Talk’s Purpose / Goals
Describe how in-memory indexing can help
enterprise search
Explain why in-memory and large memory use has
been delayed
Show how in-memory indexing is now (ﬁnally)
practical for interactive search applications


About me: Gil Tene
co-founder, CTO
@Azul Systems
Have been working on
a “think different” GC
approaches since 2002
Created Pauseless & C4
core GC algorithms
(Tene, Wolf)

A Long history building
Virtual & Physical
Machines, Operating
Systems, Enterprise
apps, etc...

* working on real-world trash compaction issues, circa 2004

About Azul
Vega

We make scalable Virtual
Machines
Have built “whatever it takes
to get job done” since 2002
3 generations of custom SMP
Multi-core HW (Vega)
Now Pure software for
commodity x86 (Zing)

C4

“Industry ﬁrsts” in Garbage
collection, elastic memory,
Java virtualization, memory
scale

About Azul
Supporting mission-critical deployments around


High level agenda
How can in-memory help things?
Why isn’t everyone just using it?
The “Application Memory Wall” problem
Some Garbage Collection banter
What an actual solution looks like...


Memory use
How many of you use heap sizes of:
F

more than ½ GB?

F

more than 1 GB?

F

more than 2 GB?

F

more than 4 GB?

F

more than 10 GB?

F

more than 20 GB?

F

more than 50 GB?


Why is In-Memory indexing interesting?


Interesting things happen when
an entire problem ﬁts in a single heap
No more out-of-process access for “hot” stuff
~1 microsecond instead of ~1 millisecond access
10s of GB/sec instead of 100s of MB/sec rates
Work elimination
No more reading/parsing/deserializing per access
No network or I/O load

These can amount to dramatic speed beneﬁts

Many business problems can now
comfortably fit in server memory
Product Catalogs are often in the 2-10GB size range
Overnight reports are often done on 10-20GB data sets
All of English Language Wikipedia text is ~29GB
Indexing “bloats” things, but not by that much...
A full index of English Language Wikipedia (including stored fields
and term vectors fits for highlighting) fits in ~78GB.
You’ll need even more than that in heap size...

CPU%
Heap size vs.
GC CPU %

100%

Live set

Heap size

Case in point:
The impact of in-memory index in Lucene
Lucene provides multiple index representation options
SimpleFSDirectory, NIOFSDirectory (On-disk ﬁle-based)
MMapDirectory (mapped ﬁles, better leverage of OS cache memory)
RAMDirectory (In-process, in-heap)

Comparing throughput (e.g. on English Lang. Wikipedia)
RAMDirectory ~2-3x throughput vs. MMapDirectory
Similar impact on average response.
Even higher impact with promising things in Lucene 4.0

But average is not all that matters
There is a good reason people aren’t doing this (yet...)

Why isn’t all search done with
in-memory indexes already?


The “Application Memory Wall”


Reality check: servers in 2012
Retail prices, major web server store (US $, Oct 2012)
24 vCore, 128GB server

≈ $5K


≈ $8K


≈ $14K


≈ $19K

64 vCore, 1TB server

≈ $36K

Cheap (< $1/GB/Month), and roughly linear to ~1TB
10s to 100s of GB/sec of memory bandwidth

The Application Memory Wall
A simple observation:
Application instances appear to be unable to
make effective use of modern server memory
capacities

The size of application instances as a % of a
server’s capacity is rapidly dropping


How much memory do applications need?
“640KB ought to be enough for anybody”
WRONG!
So what’s the right number?
6,400K?
64,000K?
640,000K?
6,400,000K?
64,000,000K?
There is no right number
Target moves at 50x-100x per decade

“I've said some stupid things and
some wrong things, but not that.
No one involved in computers
would ever say that a certain
amount of memory is enough for
all time …” - Bill Gates, 1996

“Tiny” application history
Assuming Moore’s Law means:
2010

“transistor counts grow at ≈2x
every ≈18 months”
It also means memory size grows
≈100x every 10 years

1990

1980

??? GB apps on 256 GB
Application
Memory Wall

2000

1GB apps on a 2 – 4 GB server

10MB apps on a 32 – 64 MB server

100KB apps on a ¼ to ½ MB Server
“Tiny”: would be “silly” to distribute


What is causing the
Application Memory Wall?
Garbage Collection is a clear and dominant cause
There seem to be practical heap size limits for
applications with responsiveness requirements
[Virtually] All current commercial JVMs will exhibit a
multi-second pause on a normally utilized 2-6GB heap.
It’s a question of “When” and “How often”, not “If”.
GC tuning only moves the “when” and the “how often” around

Root cause: The link between scale and responsiveness

What quality of GC is responsible
for the Application Memory Wall?
It is NOT about overhead or efﬁciency:
CPU utilization, bottlenecks, memory consumption and utilization

It is NOT about speed
Average speeds, 90%, even 95% speeds, are all perfectly ﬁne

It is NOT about minor GC events (right now)
GC events in the 10s of msec are usually tolerable for most apps

It is NOT about the frequency of very large pauses
It is ALL about the worst observable pause behavior
People avoid building/deploying visibly broken systems

GC Problems


Framing the discussion:
Garbage Collection at modern server scales
Modern Servers have 100s of GB of memory
Each modern x86 core (when actually used) produces
garbage at a rate of ¼ - ½ GB/sec +
We want to make use of that memory and throughput
in responsive applications
Monolithic stop-the-world operations are the cause of
the current Application Memory Wall
Even if they are done “only a few times a day”

One way to deal with Monolithic-STW GC

Hiccups&by&Time&Interval&
Max"per"Interval"

99%"

99.90%"

99.99%"

Max"

Hiccup&Dura*on&(msec)&

18000"
16000"
14000"
12000"
10000"
8000"
6000"
4000"
2000"
0"
0"

2000"

4000"

6000"

8000"

10000"

&Elapsed&Time&(sec)&

Hiccups&by&Percen*le&Distribu*on&


18000"
16000"

Max=16023.552&

14000"
12000"
10000"
8000"
6000"
4000"
2000"
0"

0%"

90%"

&

99%"

99.9%"

99.99%"

&
99.999%"
Percen*le&

99.9999%"

Another way to cope: “Creative Language”
“Guarantee a worst case of X msec, 99% of the time”
“Mostly” Concurrent, “Mostly” Incremental
Translation: “Will at times exhibit long monolithic stopthe-world pauses”
“Fairly Consistent”
Translation: “Will sometimes show results well outside
this range”
“Typical pauses in the tens of milliseconds”
Translation: “Some pauses are much longer than tens of
milliseconds”

Actually measuring things


Incontinuities in Java platform execution
Hiccups"by"Time"Interval"

Max"per"Interval"

99%"

99.90%"

99.99%"

Max"


1800"
1600"
1400"
1200"
1000"
800"
600"
400"
200"
0"
0"

200"

400"

600"

800"

1000"

1200"

1400"

1600"

1800"


Hiccups"by"Percen@le"Distribu@on"

1800"


1600"

Max=1665.024&

1400"
1200"
1000"
800"
600"
400"
200"
0"


0%"

90%"

99%"

&
99.9%"

&
Percen*le&

99.99%"

99.999%"

But reality is simple
People deal with GC problems primarily by
keeping heap sizes artiﬁcially small.
Keeping heaps small has translated to avoiding
the use of in-process, in-memory contents
That needs to change...

Solving the problems behind the
Application Memory Wall


We needed to solve the right problems
Focus on the causes of the Application Memory Wall
Scale is artiﬁcially limited by responsiveness

Responsiveness must be unlinked from scale:
Data size, Throughput, Queries per second, Concurrent Users, ...
Heap size, Live Set size, Allocation rate, Mutation rate, ...
Responsiveness must be continually sustainable
Can’t ignore “rare” events

Eliminate all Stop-The-World Fallbacks
At modern server scales, any Stop-The-World fallback is a failure

The GC problems that needed solving

Robust Concurrent Marking
In the presence of high throughput (mutation and allocation rates)

Compaction that is not stop-the-world
E.g. stay responsive while compacting 100+GB heaps
Must be robust: not just a tactic to delay STW compaction

Young-Gen that is not stop-the-world
Stay responsive while promoting multi-GB data spikes
Surprisingly little work done in this speciﬁc area


Azul’s “C4” Collector
Continuously Concurrent Compacting Collector
Concurrent guaranteed-single-pass marker
Oblivious to mutation rate
Concurrent ref (weak, soft, ﬁnal) processing

Concurrent Compactor
Objects moved without stopping mutator
References remapped without stopping mutator
Can relocate entire generation (New, Old) in every GC cycle

Concurrent, compacting old generation
Concurrent, compacting new generation
No stop-the-world fallback
Always compacts, and always does so concurrently

Benefits


Sample responsiveness behavior

๏ SpecJBB + Slow churning 2GB LRU Cache
๏ Live set is ~2.5GB across all measurements
๏ Allocation rate is ~1.2GB/sec across all measurements

A JVM for Linux/x86 servers
ELIMINATES Garbage Collection as a concern for
enterprise applications
Decouples scale metrics from response time concerns
Transaction rate, data set size, concurrent users,
heap size, allocation rate, mutation rate, etc.
Leverages elastic memory for resilient operation

Sustainable Throughput:
The throughput achieved while
safely maintaining service levels


Unsustainable
Throughout

Instance capacity test: “Fat Portal”
HotSpot CMS: Peaks at ~ 3GB / 45 concurrent users

* LifeRay portal on JBoss @ 99.9% SLA of 5 second response times

Instance capacity test: “Fat Portal”
C4: still smooth @ 800 concurrent users


Lucene: English Language Wikipedia

RAMDirectory based index

Fun with jHiccup


Oracle HotSpot CMS, 1GB in an 8GB heap

Zing 5, 1GB in an 8GB heap

Max"per"Interval"

99%"

99.90%"


99.99%"

Max"

Max"per"Interval"

12000"
10000"
8000"
6000"
4000"
2000"
0"

2000"

2500"

Max"

20"
15"
10"
5"

500"

1000"

1500"

2000"

2500"

3000"

3500"

0"

500"

1000"


1500"

3000"

3500"




14000"

25"

Max=13156.352&

12000"



99.99%"

0"
0"

10000"
8000"
6000"
4000"
2000"
0"

99.90%"

25"



14000"

99%"

0%"


90%"

99%"

&99.9%"
&
Percen*le&

99.99%"

99.999%"

20"

Max=20.384&

15"
10"
5"
0"

0%"

90%"

99%"

&
99.9%"

&
Percen*le&

99.99%"

99.999%"

99.9999%"

Oracle HotSpot CMS, 1GB in an 8GB heap

Zing 5, 1GB in an 8GB heap

Max"per"Interval"

99%"

99.90%"


99.99%"

Max"

Max"per"Interval"

12000"

99.99%"

Max"

12000"

10000"
8000"
6000"
4000"
2000"
0"

10000"
8000"
6000"
4000"
2000"
0"

0"

500"

1000"

1500"

2000"

2500"

3000"

3500"

0"

500"

1000"


2000"

2500"

3000"

3500"


14000"

14000"


Max=13156.352&

12000"
10000"
8000"
6000"
4000"
2000"
0"

1500"




99.90%"

14000"



14000"

99%"

0%"

90%"

99%"

&99.9%"
&
Percen*le&

99.99%"

99.999%"

12000"
10000"
8000"
6000"
4000"
2000"
0"

0%"

90%"
Max=20.384&

Drawn to scale

99%"

&
99.9%"

99.99%"

&
Percen*le&

99.999%"

99.9999%"

Summary:
In-memory indexing and search
Interesting content can now ﬁt in memory
In-memory indexing makes things go fast
In-memory indexing makes new things possible
Change Batch to Interactive. Change business processes.

It is ﬁnally practical to use in-memory indexing
for interactive search applications
Zing makes this work...

Q&A
Lucene In-RAM English Lang. Wikipedia test:
http:/
/blog.mikemccandless.com/2012/07/lucene-index-in-ram-withazuls-zing-jvm.html

jHiccup:
http:/
/www.azulsystems.com/dev_resources/jhiccup

GC :
G. Tene, B. Iyengar and M. Wolf
C4: The Continuously Concurrent Compacting Collector
In Proceedings of ISMM’11, ACM, pages 79-88

Enterprise Search Summit - Speeding Up Search

Empfohlen

Empfohlen

Weitere ähnliche Inhalte

Ähnlich wie Enterprise Search Summit - Speeding Up Search

Ähnlich wie Enterprise Search Summit - Speeding Up Search (20)

Mehr von Azul Systems Inc.

Mehr von Azul Systems Inc. (20)

Kürzlich hochgeladen

Kürzlich hochgeladen (20)

Enterprise Search Summit - Speeding Up Search