2. Agenda
• Motivation and goals for off-heap storage
• Off-heap features and usage
• Implementation overview
• Preliminary benchmarks: off-heap vs. heap
• Tips and best practices
4. Why Off-heap
• Increase data density and reduce memory overhead
• 50+ GB user data in one JVM
• 10+ TB user data in one cluster
• Usable out-of-box without extensive GC tuning of JVM
• Maintain existing throughput performance
6. Off-heap: How Do I Use It?
• Set the off-heap memory size for the process
– Using the new property: off-heap-memory-size
• Mark regions whose entry values should be stored off-heap
– Using the new region attribute: off-heap (false | true)
• Adjust the JVM heap memory size down accordingly
– The smaller the better; at least try to keep it below 32G
• Optionally
– Configure Resource Manager
7. Off-heap Features
• Startup options
• Interaction with other features
• Resource Manager
• Monitoring & Management
• Limitations
8. Startup Options
• --off-heap-memory-size – specifies the amount of off-heap
memory to allocate
• --lock-memory – specifies whether to lock memory in RAM so the OS
does not page it out
• Example:
gfsh start server --initial-heap=10G --max-heap=10G --off-heap-memory-size=200G --lock-memory=true
9. Off-heap Interaction with Other Features
• PDX
– Values currently copied from off-heap to create a PDXInstance
• Deltas: expensive
• Compression: compatible with off-heap
• Querying: more expensive with off-heap
• EntryEvents
– Limited availability of oldValue, newValue
• Indexes
– Functional range indexes not supported (too expensive)
10. Off-heap and Resource Manager
• Out of Memory Semantics
• Eviction and Critical Thresholds
• Resource Manager API
11. Out of Memory occurs when...
• Java heap runs out of memory
– Threads start throwing OutOfMemoryError
• Off-heap runs out of memory
– Threads start throwing OutOfOffHeapMemoryException
• => causing the Geode member to close and disconnect
– Closes the Cache to prevent reading inconsistent data
– Disconnects from the Geode cluster to prevent distribution problems
or hangs
12. Eviction and Critical Thresholds for Java Heap
• CriticalHeapPercentage
– triggers LowMemoryException for puts into heap regions
– default is 90%
– critical member informs other members that it is critical
• EvictionHeapPercentage
– triggers eviction of entries in heap regions configured with
LRU_HEAP
– default is 90% of CriticalHeapPercentage
13. Eviction and Critical Thresholds for Off-heap
• CriticalOffHeapPercentage
– triggers LowMemoryException for puts into off-heap regions
– default is 90% if --off-heap-memory-size is specified
– critical member informs other members that it is critical
• EvictionOffHeapPercentage
– triggers eviction of entries in off-heap regions configured with
LRU_HEAP
– default is 90% of CriticalOffHeapPercentage if --off-heap-memory-size
is specified
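
The threshold relationships on the two slides above come down to simple arithmetic, sketched here in plain Java. This is an illustration only and not Geode's ResourceManager: the class `ThresholdSketch`, its methods, and the `State` enum are hypothetical names, and the choice of `>=` as the trigger comparison is an assumption. The defaults used (critical at 90%, eviction at 90% of critical) are the ones stated on the slides.

```java
/** Hypothetical illustration of the threshold arithmetic; not Geode code. */
final class ThresholdSketch {
    enum State { NORMAL, EVICTION, CRITICAL }

    /** Default critical threshold as stated on the slides. */
    static final float DEFAULT_CRITICAL_PERCENTAGE = 90.0f;

    /** Default eviction threshold as stated on the slides: 90% of critical. */
    static float defaultEvictionPercentage(float criticalPercentage) {
        return criticalPercentage * 0.9f;
    }

    /** Classifies current usage against the two thresholds (">=" is assumed). */
    static State classify(long usedBytes, long maxBytes,
                          float evictionPct, float criticalPct) {
        double usedPct = 100.0 * usedBytes / maxBytes;
        if (usedPct >= criticalPct) {
            return State.CRITICAL;   // puts would be rejected with LowMemoryException
        }
        if (usedPct >= evictionPct) {
            return State.EVICTION;   // LRU_HEAP regions would start evicting
        }
        return State.NORMAL;
    }

    public static void main(String[] args) {
        float critical = DEFAULT_CRITICAL_PERCENTAGE;
        float eviction = defaultEvictionPercentage(critical);
        long max = 200L * 1024 * 1024 * 1024; // 200G, as in the gfsh example
        System.out.println("eviction threshold % = " + eviction);
        System.out.println("state at half full = "
                + classify(max / 2, max, eviction, critical));
    }
}
```

With these defaults, a member with 200G of off-heap memory would begin evicting at roughly 162G used and turn critical at 180G used.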
15. ResourceManager API
• GemFireCache#getResourceManager()
• com.gemstone.gemfire.cache.control.ResourceManager
– exposes getters/setters for all of the heap and off-heap threshold
percentages
– Examples:
▪ public void setCriticalOffHeapPercentage(float offHeapPercentage);
▪ public float getCriticalOffHeapPercentage();
17. Statistics
name             description
compactions      The total number of times off-heap memory has been compacted.
compactionTime   The total time spent compacting off-heap memory.
fragmentation    The percentage of off-heap memory fragmentation. Updated every time a compaction is performed.
fragments        The number of fragments of free off-heap memory. Updated every time a compaction is done.
freeMemory       The amount of off-heap memory, in bytes, that is not being used.
largestFragment  The largest fragment of memory found by the last compaction of off-heap memory. Updated every time a compaction is done.
maxMemory        The maximum amount of off-heap memory, in bytes. This is the amount of memory allocated at startup and does not change.
objects          The number of objects stored in off-heap memory.
reads            The total number of reads of off-heap memory.
usedMemory       The amount of off-heap memory, in bytes, that is being used to store data.
18. MBeans
MemberMXBean
getOffHeapCompactionTime -- provides the value of the compactionTime statistic
getOffHeapFragmentation -- provides the value of the fragmentation statistic
getOffHeapFreeMemory -- provides the value of the freeMemory statistic
getOffHeapObjects -- provides the value of the objects statistic
getOffHeapUsedMemory -- provides the value of the usedMemory statistic
getOffHeapMaxMemory -- provides the value of freeMemory + usedMemory
RegionMXBean
listRegionAttributes (operation)
enableOffHeapMemory (true | false)
19. Gfsh Support for Off-heap Memory
• alter disk-store: new option "--off-heap" for setting off-heap for each
region in the disk-store
• create region: new option "--off-heap" for setting off-heap
• describe member: now displays the off-heap size
• describe offline-disk-store: now shows if a region is off-heap
• describe region: now displays the off-heap region attribute
• show metrics: Now has an offheap category. The offheap metrics
are: maxMemory, freeMemory, usedMemory, objects, fragmentation,
and compactionTime
• start server: added --lock-memory, --off-heap-memory-size,
--critical-off-heap-percentage, and --eviction-off-heap-percentage
20. Off-heap Limitations
• Maximum object size limited to slightly less than 2 GB
• All data nodes must consistently configure a region to be
off-heap
• Functional Range Indexes not supported
• Keys, subscription queue entries not stored off-heap
• Fragmentation statistic is only updated during off-heap
compactions
22. Off-heap: How are We Doing It?
• Using memory that is separate from the Java heap
– Build our own Memory Manager
– Memory Manager is very finely tuned and specific to our usage
– Avoid GC overhead
▪ Avoid copying of objects for promotion between generations
▪ Garbage Collector is a major performance killer
– Use sun.misc.Unsafe API for performance
• Optimizing code to minimize usage of heap memory
• Using off-heap as primary store instead of overflowing to it
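
The idea of keeping values in memory that the garbage collector never scans can be shown with a minimal sketch. The slides say the real memory manager uses sun.misc.Unsafe; this sketch swaps in `java.nio.ByteBuffer.allocateDirect`, a supported public API, purely for illustration. The class `DirectStoreSketch`, its methods, and the length-prefix layout are all hypothetical and are not Geode's implementation.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch: serialized values live in direct (non-heap) memory. */
final class DirectStoreSketch {
    private final ByteBuffer slab;   // memory outside the Java heap
    private int nextFree = 0;        // start of the unused part of the slab

    DirectStoreSketch(int capacityBytes) {
        this.slab = ByteBuffer.allocateDirect(capacityBytes);
    }

    boolean isOffHeap() {
        return slab.isDirect();
    }

    /** Copies the value into the slab and returns its offset (the "address"). */
    int put(byte[] value) {
        long needed = (long) Integer.BYTES + value.length; // length prefix + data
        if (nextFree + needed > slab.capacity()) {
            throw new IllegalStateException("slab is full");
        }
        int address = nextFree;
        ByteBuffer view = slab.duplicate(); // independent position, same memory
        view.position(address);
        view.putInt(value.length);
        view.put(value);
        nextFree += (int) needed;
        return address;
    }

    /** Copies the value stored at the given address back onto the heap. */
    byte[] get(int address) {
        ByteBuffer view = slab.duplicate();
        view.position(address);
        byte[] value = new byte[view.getInt()];
        view.get(value);
        return value;
    }

    public static void main(String[] args) {
        DirectStoreSketch store = new DirectStoreSketch(1024);
        int address = store.put("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(store.get(address), StandardCharsets.UTF_8));
    }
}
```

Only the small integer address stays on the heap; the value bytes are copied back on-heap only when they are read, which is one reason features that read values often cost more with off-heap.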
24. Off-heap Implementation
• Memory allocated in 2GB slabs
– Max data value size: ~2GB
– Object values stored serialized; blobs stored as byte arrays
– Allocation faster for values < 128KB
▪ Controlled by a system property: gemfire.OFF_HEAP_FREE_LIST_COUNT
▪ First try to allocate from the free list; if that fails, allocate from unused memory
▪ Small values (< 8 bytes) inlined (not using any off-heap space)
• Compaction consolidates free memory to minimize
fragmentation
– Blocks writes; best to avoid by minimizing fragmentation
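
The allocation order described above (free list first, then unused memory) can be sketched as follows. This is a simplified model under stated assumptions, not Geode's memory manager: the class `FreeListAllocatorSketch`, the 8-byte rounding, and the one-list-per-rounded-size scheme are hypothetical, and addresses are plain offsets rather than real memory.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of "free list first, then unused memory" allocation. */
final class FreeListAllocatorSketch {
    private static final int ALIGNMENT = 8;  // assumed size increment
    private final long capacity;
    private long unusedStart = 0;            // everything past this is untouched
    // One free list per rounded chunk size (an assumption of this sketch).
    private final Map<Long, ArrayDeque<Long>> freeLists = new HashMap<>();

    FreeListAllocatorSketch(long capacity) {
        this.capacity = capacity;
    }

    /** Rounds a requested size up to the next multiple of ALIGNMENT. */
    static long roundUp(long size) {
        return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    long allocate(long size) {
        long chunkSize = roundUp(size);
        ArrayDeque<Long> list = freeLists.get(chunkSize);
        if (list != null && !list.isEmpty()) {
            return list.pop();               // fast path: reuse a freed chunk
        }
        if (unusedStart + chunkSize > capacity) {
            // A real memory manager might compact free chunks at this point.
            throw new IllegalStateException("out of off-heap memory");
        }
        long address = unusedStart;          // slow path: carve from unused memory
        unusedStart += chunkSize;
        return address;
    }

    void free(long address, long size) {
        freeLists.computeIfAbsent(roundUp(size), k -> new ArrayDeque<>())
                 .push(address);
    }

    public static void main(String[] args) {
        FreeListAllocatorSketch allocator = new FreeListAllocatorSketch(1024);
        long first = allocator.allocate(100);
        allocator.free(first, 100);
        System.out.println("reused: " + (allocator.allocate(97) == first));
    }
}
```

In this model a freed chunk is reused only by a request that rounds to the same size, which suggests why workloads with many updates of varying value size leave free memory fragmented.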
25. Off-heap Implementation (cont’d)
• Allocated chunks
– Header
▪ isSerialized
▪ isCompressed
▪ Size
▪ Padding size
• Free chunks
– Header
▪ Size
▪ Address of next chunk in the list
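
To make the allocated-chunk header concrete, here is one way its four fields could be packed into a single 64-bit word. The bit layout is entirely an assumption for illustration (the slides do not give one), and `ChunkHeaderSketch` and its methods are hypothetical names.

```java
/** Hypothetical bit layout for an allocated chunk's header; not Geode's. */
final class ChunkHeaderSketch {
    private static final long SERIALIZED_BIT = 1L << 63; // assumed position
    private static final long COMPRESSED_BIT = 1L << 62; // assumed position
    private static final int PADDING_SHIFT = 32;
    private static final long PADDING_MASK = 0x7L;       // padding of 0..7 bytes
    private static final long SIZE_MASK = 0xFFFFFFFFL;   // low 32 bits hold size

    static long pack(boolean serialized, boolean compressed, int size, int padding) {
        if (size < 0 || padding < 0 || padding > PADDING_MASK) {
            throw new IllegalArgumentException("size or padding out of range");
        }
        long header = size & SIZE_MASK;
        header |= ((long) padding) << PADDING_SHIFT;
        if (serialized) {
            header |= SERIALIZED_BIT;
        }
        if (compressed) {
            header |= COMPRESSED_BIT;
        }
        return header;
    }

    static boolean isSerialized(long header) {
        return (header & SERIALIZED_BIT) != 0;
    }

    static boolean isCompressed(long header) {
        return (header & COMPRESSED_BIT) != 0;
    }

    static int size(long header) {
        return (int) (header & SIZE_MASK);
    }

    static int paddingSize(long header) {
        return (int) ((header >>> PADDING_SHIFT) & PADDING_MASK);
    }

    public static void main(String[] args) {
        long header = pack(true, false, 1000, 4);
        System.out.println("size=" + size(header)
                + " padding=" + paddingSize(header)
                + " serialized=" + isSerialized(header)
                + " compressed=" + isCompressed(header));
    }
}
```

Keeping the flags and sizes in the chunk itself means the heap side needs to hold nothing more than the chunk's address.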
26. What is Stored On-heap vs. Off-heap
Stored On-heap               Stored Off-heap
Region Meta-Data             Values
Entry Meta-Data              Reference Counts
Off-Heap Addresses           Lists of Free Memory Blocks
Keys                         WAN Queue Elements
Indexes
Subscription Queue Elements
28. Off-heap: Initial Testing Results
• 256 GB user data per node across 8 nodes for total of 2 TB
of user data
• Heap-only test used about twice the CPU load (70% vs. 32%) to
produce 1/3 the updates (17,000 vs. 51,000 per sec) of the test using
Off-Heap
– Details on the next slide
• Succeeded in scaling up to much larger in-memory data
• Increased throughput of operations for large data sets
29. Heap vs. Off-Heap Comparison
                      Java Heap                Off-Heap
creates/sec           30,000                   45,000
updates/sec           17,000 (std dev: 2130)   51,000 (std dev: 737)
Java RSS size         50 GB                    32 GB
CPU load              70% (load avg 10 cpus)   32% (load avg 5 cpus)
JVM GC                ConcurrentMarkSweep      ConcurrentMarkSweep
GC ms/sec             777 ms                   24 ms
GC marks (GC pauses)  1 per 30 sec             never
31. Off-heap Rules of Thumb
• Avoid fragmentation
– In order to avoid compaction
– Avoid usage patterns that lead to fragmentation, such as many
updates of varying value size
• Avoid “unfriendly” features
– Deltas
– Functional Range Indexes
– Querying
32. Off-heap Recommendations
• Do use when
– The values are relatively uniform in size
– The values are mostly less than 128K in size
– The usage patterns involve cycles of many creates followed by
destroys or clear
– The values do not need to be frequently deserialized
• Configure all data nodes with the same
off-heap-memory-size
34. We’d appreciate your thoughts...
• Would you like an API to invoke a compaction?
• Would you like to be able to configure the slab size?
• Would you like to configure the max value size for the most
efficient off-heap allocation, or maybe the size increment?
• Anything else?
• Full spec at:
https://cwiki.apache.org/confluence/display/GEODE/Off-Heap+Memory+Spec