Customizing JVM settings for the needs of an application can be tricky, especially when running externally developed software such as Cassandra. In this talk I will share our experiences and the procedure we have used to test and validate Java tuning changes. We'll explore this through two recent experiences: changing and monitoring G1 garbage collection, and moving buffer objects off the heap.
For the talk, I'll discuss our tuning process at Knewton. I will share some of the challenges that we faced while identifying what we expected to learn. I'll discuss how we isolated and minimized variables across tests, the importance of the duration of these tests, and how we try to separate correlation from causation. I will demonstrate how to use and interpret the results of the custom scripts that we were driven to develop to gain visibility into our G1GC processes; these scripts will be open sourced.
About the Speaker
Carlos Monroy, Senior Software Engineer, Knewton
Carlos Monroy is a senior engineer on the database team at Knewton, an education company that created an adaptive learning platform. Carlos has been developing software professionally since 1998. His experience in multiple roles across the software lifecycle gives him a holistic approach. Having used over half a dozen relational database engines, he has recently come over to the NoSQL side, first working with HBase and, for the last three years, Cassandra.
11. memtable_allocation_type
Cassandra can keep memtables and key cache objects in native memory
instead of the JVM heap.
- Used for data structures that continue growing with time
- Options:
  - heap_buffers
    - default value before Cassandra 3.0
    - all the objects are kept in the JVM heap memory
  - offheap_buffers
    - cell name and values are moved to DirectBuffer objects
  - offheap_objects
    - moves the entire cell off heap, leaving only a pointer
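
When a test plan cycles through these three values, switching the setting by script avoids hand-editing mistakes. The following is a minimal sketch, not something taken from the talk: it assumes the option appears as a top-level `memtable_allocation_type:` line in the text of cassandra.yaml (possibly commented out), and the helper name `set_memtable_allocation_type` is hypothetical.

```python
import re

# The three values listed above for memtable_allocation_type.
ALLOWED = ("heap_buffers", "offheap_buffers", "offheap_objects")

# Matches the setting at the start of a line, active or commented out.
SETTING_RE = re.compile(r"^#?[ \t]*memtable_allocation_type:.*$", re.MULTILINE)


def set_memtable_allocation_type(yaml_text: str, value: str) -> str:
    """Return yaml_text with memtable_allocation_type set to value."""
    if value not in ALLOWED:
        raise ValueError(f"unsupported memtable_allocation_type: {value!r}")
    new_line = f"memtable_allocation_type: {value}"
    if SETTING_RE.search(yaml_text):
        # Replace only the first occurrence of the setting.
        return SETTING_RE.sub(new_line, yaml_text, count=1)
    # Setting absent: append it at the end of the text.
    return yaml_text.rstrip("\n") + "\n" + new_line + "\n"


# Hand-written sample text, for illustration only.
sample = "cluster_name: 'Test Cluster'\n# memtable_allocation_type: heap_buffers\n"
print(set_memtable_allocation_type(sample, "offheap_buffers"))
```

Rejecting unknown values up front keeps a typo from silently producing a configuration that differs from the one the test run is supposed to measure.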
12. Update memtable_allocation_type
The cassandra-stress tool is a great starting point when
validating changes to the database configuration
But we needed to go the extra mile with an end-to-end test
- involving the rest of the dev team
- demonstrating the positive impact of the change on the
rest of the system
15. Update memtable - Criteria
End-to-end
• Response time
– Timeouts
• Errors
• Throughput
• CPU consumption
• Memory used
Cassandra specific
• Cassandra
– Time spent for Garbage Collection
– Read and Write latencies
– Errors/Exceptions
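
To compare runs on the same footing, each run can be reduced to the same handful of numbers. The sketch below is an illustration under stated assumptions rather than the procedure used in the talk: the `Request` record and the `summarize` function are hypothetical, the sample values are made up, and it covers only response time, timeouts, errors, and throughput (CPU and memory would come from host monitoring).

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Request:
    latency_ms: float        # end-to-end response time
    timed_out: bool = False
    failed: bool = False     # any non-timeout error


def summarize(requests, duration_s):
    """Reduce one test run to the end-to-end criteria listed above."""
    completed = [r for r in requests if not (r.timed_out or r.failed)]
    return {
        "mean_latency_ms": mean(r.latency_ms for r in completed) if completed else None,
        "timeouts": sum(r.timed_out for r in requests),
        "errors": sum(r.failed for r in requests),
        # Only successfully completed requests count toward throughput.
        "throughput_rps": len(completed) / duration_s,
    }


# Made-up sample, for illustration only.
run = [
    Request(10.0),
    Request(20.0),
    Request(0.0, timed_out=True),
    Request(0.0, failed=True),
]
print(summarize(run, duration_s=2.0))
```

Excluding timed-out and failed requests from the latency average keeps a configuration that fails fast from looking better than one that answers every request.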
16. Update memtable_allocation_type
[Chart: time used for garbage collection, with one series each for offheap_buffers, offheap_objects, and heap_buffers]
Comparing garbage collection times with different values for memtable_allocation_type
20. memtable_allocation_type results
We are using offheap_buffers as it showed:
- the lowest average response time for requests
- the lowest CPU usage
- the fewest threads created
- the lowest write latency
*Results may vary
22. Garbage First Garbage Collection (G1GC)
The G1 collector uses multiple background threads to scan through the heap,
which it divides into regions.
It is named “Garbage First” (G1) because it gives preference to scanning the regions that
contain the most garbage objects first.
This collector is turned on using the -XX:+UseG1GC flag.
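
Once G1 is enabled, the pause times recorded in the GC log are the most direct way to see what the collector is doing. The sketch below is an independent illustration, not the scripts mentioned in the abstract. It assumes Java 8-style log lines such as those printed with -XX:+PrintGCDetails; other JVM versions and logging options produce different formats, and the function name `pause_stats` is hypothetical.

```python
import re

# Assumed Java 8-style pause line (format is an assumption, see above), e.g.:
#   12.345: [GC pause (G1 Evacuation Pause) (young), 0.0123456 secs]
PAUSE_RE = re.compile(r"\[GC pause \([^)]*\) \([^)]*\).*?, (\d+\.\d+) secs\]")


def pause_stats(lines):
    """Count G1 pauses and report their total and longest duration in seconds."""
    pauses = []
    for line in lines:
        match = PAUSE_RE.search(line)
        if match:
            pauses.append(float(match.group(1)))
    if not pauses:
        return {"count": 0, "total_secs": 0.0, "max_secs": 0.0}
    return {"count": len(pauses), "total_secs": sum(pauses), "max_secs": max(pauses)}


# Hand-written sample lines, for illustration only.
sample_log = [
    "1.000: [GC pause (G1 Evacuation Pause) (young), 0.2500000 secs]",
    "   [Parallel Time: 200.0 ms, GC Workers: 4]",
    "2.000: [GC pause (G1 Evacuation Pause) (mixed), 0.5000000 secs]",
]
print(pause_stats(sample_log))
```

Lines that do not describe a pause are skipped, so the whole log can be passed in without pre-filtering.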