SlideShare ist ein Scribd-Unternehmen logo
1 von 30
Qcon London 2015.
Peter Lawrey
Higher Frequency Trading Ltd
Responding rapidly when you have
100+ GB data sets in Java
Reactive system design
• Responsive (predictable and sufficient response time)
• Resilient (can cope with failures)
• Elastic (can employ more hardware resources and grow cost
effectively).
Reactive system design
Java responds *much* faster if you can keep your data in memory
Standard JVMs scale well of a specific range of memory sizes. From
• 100 MB – it is hard to make a JVM use less memory than this
(with libraries etc)
• 32 GB – above this the memory efficiency drops and you see
significant GC pauses times.
With larger data sizes, you have to be more aware of how memory is
being used, the lifecycle of your objects and which tools will best
help you utilise more memory effectively.
Reactive system design (Resilient)
The larger the node, the longer the recovery time. It if can
gracefully rebuild at a rate of 100 MB/s (without impact production
too much) the recover time can look like this.
Data Set to recover Time to replicate at 100 MB/s
10 GB < 2 minutes
100 GB 17 minutes
1 TB 3 hours
10 TB 28 hours
100 TB 12 days
1 PB 4 months
Reactive Memory Design
Knowing the limitations of your address space.
• 31–bit memory sizes (up to 2 GB)
• 35-bit memory sizes (up to 32 GB)
• 36-bit memory sizes (up to 64 GB)
• 40-bit memory sizes (up to 1 TB)
• 48-bit memory sizes (up to 256 TB)
• Beyond.
32 bit operating system (31-bit heap)
Speedment SQL Relector
• Incrementally replicates your SQL database into Java.
• Access to SQL data with in memory speeds
Java 7 memory layout
Compress Oops in Java 7 (35-bit)
Using the default of
–XX:+UseCompressedOops
• In a 64-bit JVM, it can use “compressed” memory references.
• This allows the heap to be up to 32 GB without the overhead of 64-
bit object references. The Oracle/OpenJDK JVM still uses 64-bit
class references by default.
• As all object must be 8-byte aligned, the lower 3 bits of the
address are always 000 and don’t need to be stored. This allows
the heap to reference 4 billion * 8-bytes or 32 GB.
• Uses 32-bit references.
Compressed Oops with 8 byte alignment.
Java 8 memory layout
Compress Oops in Java 8 (36 bits)
Using the default of
–XX:+UseCompressedOops
–XX:ObjectAlignmentInBytes=16
• In a 64-bit JVM, it can use “compressed” memory references.
• This allows the heap to be up to 64 GB without the overhead of 64-
bit object references. The Oracle/OpenJDK JVM still uses 64-bit
class references by default.
• As all object must be 8 or 16-byte aligned, the lower 3 or 4 bits of
the address are always zeros and don’t need to be stored. This
allows the heap to reference 4 billion * 16-bytes or 64 GB.
• Uses 32-bit references.
64-bit references in Java (100 GB?)
• A small but significant overhead on main memory use.
• Reduces the efficiency of CPU caches as less objects can fit in.
• Can address up to the limit of the free memory. Limited to main
memory.
• GC pauses become a real concern and can take tens of second or
many minutes.
Concurrent Mark Sweep (100 GB?)
64-bit references in Java with a Concurrent Collector
• A small but significant overhead on every memory access to
support concurrent collection.
• Azul Zing with a fully concurrent collector. They can get worst case
pauses down to 1 to 10 milli-seconds.
• RedHat is developing a most concurrent collector (….) which can
support heap sizes around 100 GB with sub-second pauses.
Azul Zing Concurrent Collector (100s GB)
Azul Zing Concurrent Collector (100s GB)
• Can dramatically reduce GC pauses.
• Makes it easier to see and solve non GC pauses.
• Reduced development time to get a system to meet your SLAs.
• Can have lower throughputs.
Terracotta BigMemory
• Scale to multiple machines. Can utilize more smaller machines.
Terracotta BigMemory
• Uses off heap memory to store the bulk of the data.
Hazelcast High Density Memory Store
• Advanced memory caching
solution.
• Utilizes all main memory.
• Supports a wide range of
distributed collections.
NUMA Regions (~40 bits)
• Large machine are limited in how large a single bank of memory
can be. This varies based on the architecture.
• Ivy and Sandy bridge Xeon processors are limited to addressing 40
bits of real memory.
• In Haswell this has been lifted to 46-bits.
• Each Socket has “local” access to a bank of memory, however to
access other bank it may need to use a bus. This is much slower.
• The GC of a JVM can perform very poorly if it doesn’t sit within one
NUMA region. Ideally you want a JVM to use just one NUMA
region.
NUMA Regions (~40 bits)
Virtual address space (48-bit)
Virtual address space (48-bit)
Memory Mapped files (48+ bits)
• Memory mappings are not limited to main memory size.
• 64-bit OS support 128 TiB to 256 TiB virtual memory at once.
• For larger data sizes, memory mapping need to be managed and
cached manually.
• Can be shared between processes.
• A library can hide the 48-bit limitation by caching memory
mapping.
Memory Mapped files.
• No need to have a monolithic JVM just because you want in
memory access to all the data.
Peta Byte JVMs (50+ bits)
• If you are receiving 1 GB/s down a 10 Gig-E line in two weeks you
will have received over 1 PB.
• Managing this much data in large servers is more complex than
your standard JVM.
• Replication is critical. Large complex systems, are more likely to
fail and take longer to recover.
• You can have systems which cannot be recovered in the normal
way. i.e. Unless you recover faster than new data is added, you will
never catch up.
Peta Byte JVMs (50+ bits)
Peta Byte JVMs (50+ bits)
Questions and Answers
peter.lawrey@higherfrequencytrading.com
@PeterLawrey
http://higherfrequencytrading.com

Weitere ähnliche Inhalte

Was ist angesagt?

Determinism in finance
Determinism in financeDeterminism in finance
Determinism in financePeter Lawrey
 
Introduction to OpenHFT for Melbourne Java Users Group
Introduction to OpenHFT for Melbourne Java Users GroupIntroduction to OpenHFT for Melbourne Java Users Group
Introduction to OpenHFT for Melbourne Java Users GroupPeter Lawrey
 
Thread Safe Interprocess Shared Memory in Java (in 7 mins)
Thread Safe Interprocess Shared Memory in Java (in 7 mins)Thread Safe Interprocess Shared Memory in Java (in 7 mins)
Thread Safe Interprocess Shared Memory in Java (in 7 mins)Peter Lawrey
 
Low latency in java 8 v5
Low latency in java 8 v5Low latency in java 8 v5
Low latency in java 8 v5Peter Lawrey
 
Chronicle accelerate building a digital currency
Chronicle accelerate   building a digital currencyChronicle accelerate   building a digital currency
Chronicle accelerate building a digital currencyPeter Lawrey
 
Introduction to chronicle (low latency persistence)
Introduction to chronicle (low latency persistence)Introduction to chronicle (low latency persistence)
Introduction to chronicle (low latency persistence)Peter Lawrey
 
Java in High Frequency Trading
Java in High Frequency TradingJava in High Frequency Trading
Java in High Frequency TradingViktor Sovietov
 
QCon London: Low latency Java in the real world - LMAX Exchange and the Zing JVM
QCon London: Low latency Java in the real world - LMAX Exchange and the Zing JVMQCon London: Low latency Java in the real world - LMAX Exchange and the Zing JVM
QCon London: Low latency Java in the real world - LMAX Exchange and the Zing JVMAzul Systems, Inc.
 
Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...
Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...
Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...DataStax
 
Tuning Speculative Retries to Fight Latency (Michael Figuiere, Minh Do, Netfl...
Tuning Speculative Retries to Fight Latency (Michael Figuiere, Minh Do, Netfl...Tuning Speculative Retries to Fight Latency (Michael Figuiere, Minh Do, Netfl...
Tuning Speculative Retries to Fight Latency (Michael Figuiere, Minh Do, Netfl...DataStax
 
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & PythonCassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & PythonDataStax Academy
 
Cassandra in Operation
Cassandra in OperationCassandra in Operation
Cassandra in Operationniallmilton
 
Troubleshooting redis
Troubleshooting redisTroubleshooting redis
Troubleshooting redisDaeMyung Kang
 
The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D...
The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D...The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D...
The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D...DataStax
 
Using Time Window Compaction Strategy For Time Series Workloads
Using Time Window Compaction Strategy For Time Series WorkloadsUsing Time Window Compaction Strategy For Time Series Workloads
Using Time Window Compaction Strategy For Time Series WorkloadsJeff Jirsa
 
Low latency in java 8 by Peter Lawrey
Low latency in java 8 by Peter Lawrey Low latency in java 8 by Peter Lawrey
Low latency in java 8 by Peter Lawrey J On The Beach
 
Cassandra Summit 2015: Real World DTCS For Operators
Cassandra Summit 2015: Real World DTCS For OperatorsCassandra Summit 2015: Real World DTCS For Operators
Cassandra Summit 2015: Real World DTCS For OperatorsJeff Jirsa
 
Donatas Mažionis, Building low latency web APIs
Donatas Mažionis, Building low latency web APIsDonatas Mažionis, Building low latency web APIs
Donatas Mažionis, Building low latency web APIsTanya Denisyuk
 
Chronicle Accelerate Crypto Investor conference
Chronicle Accelerate Crypto Investor conferenceChronicle Accelerate Crypto Investor conference
Chronicle Accelerate Crypto Investor conferencePeter Lawrey
 

Was ist angesagt? (20)

Determinism in finance
Determinism in financeDeterminism in finance
Determinism in finance
 
Introduction to OpenHFT for Melbourne Java Users Group
Introduction to OpenHFT for Melbourne Java Users GroupIntroduction to OpenHFT for Melbourne Java Users Group
Introduction to OpenHFT for Melbourne Java Users Group
 
Thread Safe Interprocess Shared Memory in Java (in 7 mins)
Thread Safe Interprocess Shared Memory in Java (in 7 mins)Thread Safe Interprocess Shared Memory in Java (in 7 mins)
Thread Safe Interprocess Shared Memory in Java (in 7 mins)
 
Low latency in java 8 v5
Low latency in java 8 v5Low latency in java 8 v5
Low latency in java 8 v5
 
Chronicle accelerate building a digital currency
Chronicle accelerate   building a digital currencyChronicle accelerate   building a digital currency
Chronicle accelerate building a digital currency
 
Introduction to chronicle (low latency persistence)
Introduction to chronicle (low latency persistence)Introduction to chronicle (low latency persistence)
Introduction to chronicle (low latency persistence)
 
Java in High Frequency Trading
Java in High Frequency TradingJava in High Frequency Trading
Java in High Frequency Trading
 
QCon London: Low latency Java in the real world - LMAX Exchange and the Zing JVM
QCon London: Low latency Java in the real world - LMAX Exchange and the Zing JVMQCon London: Low latency Java in the real world - LMAX Exchange and the Zing JVM
QCon London: Low latency Java in the real world - LMAX Exchange and the Zing JVM
 
Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...
Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...
Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...
 
Tuning Speculative Retries to Fight Latency (Michael Figuiere, Minh Do, Netfl...
Tuning Speculative Retries to Fight Latency (Michael Figuiere, Minh Do, Netfl...Tuning Speculative Retries to Fight Latency (Michael Figuiere, Minh Do, Netfl...
Tuning Speculative Retries to Fight Latency (Michael Figuiere, Minh Do, Netfl...
 
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & PythonCassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
 
Cassandra in Operation
Cassandra in OperationCassandra in Operation
Cassandra in Operation
 
Troubleshooting redis
Troubleshooting redisTroubleshooting redis
Troubleshooting redis
 
The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D...
The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D...The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D...
The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D...
 
Using Time Window Compaction Strategy For Time Series Workloads
Using Time Window Compaction Strategy For Time Series WorkloadsUsing Time Window Compaction Strategy For Time Series Workloads
Using Time Window Compaction Strategy For Time Series Workloads
 
Low latency in java 8 by Peter Lawrey
Low latency in java 8 by Peter Lawrey Low latency in java 8 by Peter Lawrey
Low latency in java 8 by Peter Lawrey
 
Cassandra Summit 2015: Real World DTCS For Operators
Cassandra Summit 2015: Real World DTCS For OperatorsCassandra Summit 2015: Real World DTCS For Operators
Cassandra Summit 2015: Real World DTCS For Operators
 
Donatas Mažionis, Building low latency web APIs
Donatas Mažionis, Building low latency web APIsDonatas Mažionis, Building low latency web APIs
Donatas Mažionis, Building low latency web APIs
 
Cassandra compaction
Cassandra compactionCassandra compaction
Cassandra compaction
 
Chronicle Accelerate Crypto Investor conference
Chronicle Accelerate Crypto Investor conferenceChronicle Accelerate Crypto Investor conference
Chronicle Accelerate Crypto Investor conference
 

Andere mochten auch

Streams and lambdas the good, the bad and the ugly
Streams and lambdas the good, the bad and the uglyStreams and lambdas the good, the bad and the ugly
Streams and lambdas the good, the bad and the uglyPeter Lawrey
 
Legacy lambda code
Legacy lambda codeLegacy lambda code
Legacy lambda codePeter Lawrey
 
Low latency microservices in java QCon New York 2016
Low latency microservices in java   QCon New York 2016Low latency microservices in java   QCon New York 2016
Low latency microservices in java QCon New York 2016Peter Lawrey
 
Microservices for performance - GOTO Chicago 2016
Microservices for performance - GOTO Chicago 2016Microservices for performance - GOTO Chicago 2016
Microservices for performance - GOTO Chicago 2016Peter Lawrey
 
Reactive programming with examples
Reactive programming with examplesReactive programming with examples
Reactive programming with examplesPeter Lawrey
 
Big Data for Finance – Challenges in High-Frequency Trading
Big Data for Finance – Challenges in High-Frequency TradingBig Data for Finance – Challenges in High-Frequency Trading
Big Data for Finance – Challenges in High-Frequency TradingThink Big, a Teradata Company
 

Andere mochten auch (6)

Streams and lambdas the good, the bad and the ugly
Streams and lambdas the good, the bad and the uglyStreams and lambdas the good, the bad and the ugly
Streams and lambdas the good, the bad and the ugly
 
Legacy lambda code
Legacy lambda codeLegacy lambda code
Legacy lambda code
 
Low latency microservices in java QCon New York 2016
Low latency microservices in java   QCon New York 2016Low latency microservices in java   QCon New York 2016
Low latency microservices in java QCon New York 2016
 
Microservices for performance - GOTO Chicago 2016
Microservices for performance - GOTO Chicago 2016Microservices for performance - GOTO Chicago 2016
Microservices for performance - GOTO Chicago 2016
 
Reactive programming with examples
Reactive programming with examplesReactive programming with examples
Reactive programming with examples
 
Big Data for Finance – Challenges in High-Frequency Trading
Big Data for Finance – Challenges in High-Frequency TradingBig Data for Finance – Challenges in High-Frequency Trading
Big Data for Finance – Challenges in High-Frequency Trading
 

Ähnlich wie Responding rapidly when you have 100+ GB data sets in Java

Java one2015 - Work With Hundreds of Hot Terabytes in JVMs
Java one2015 - Work With Hundreds of Hot Terabytes in JVMsJava one2015 - Work With Hundreds of Hot Terabytes in JVMs
Java one2015 - Work With Hundreds of Hot Terabytes in JVMsSpeedment, Inc.
 
JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516]
JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516]JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516]
JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516]Speedment, Inc.
 
JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516]
JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516]JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516]
JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516]Malin Weiss
 
In-memory Data Management Trends & Techniques
In-memory Data Management Trends & TechniquesIn-memory Data Management Trends & Techniques
In-memory Data Management Trends & TechniquesHazelcast
 
NYJavaSIG - Big Data Microservices w/ Speedment
NYJavaSIG - Big Data Microservices w/ SpeedmentNYJavaSIG - Big Data Microservices w/ Speedment
NYJavaSIG - Big Data Microservices w/ SpeedmentSpeedment, Inc.
 
Cassandra TK 2014 - Large Nodes
Cassandra TK 2014 - Large NodesCassandra TK 2014 - Large Nodes
Cassandra TK 2014 - Large Nodesaaronmorton
 
HBase Sizing Guide
HBase Sizing GuideHBase Sizing Guide
HBase Sizing Guidelarsgeorge
 
Linux tuning to improve PostgreSQL performance
Linux tuning to improve PostgreSQL performanceLinux tuning to improve PostgreSQL performance
Linux tuning to improve PostgreSQL performancePostgreSQL-Consulting
 
HBase: Extreme Makeover
HBase: Extreme MakeoverHBase: Extreme Makeover
HBase: Extreme MakeoverHBaseCon
 
Scaling with sync_replication using Galera and EC2
Scaling with sync_replication using Galera and EC2Scaling with sync_replication using Galera and EC2
Scaling with sync_replication using Galera and EC2Marco Tusa
 
Colvin exadata mistakes_ioug_2014
Colvin exadata mistakes_ioug_2014Colvin exadata mistakes_ioug_2014
Colvin exadata mistakes_ioug_2014marvin herrera
 
Dissecting Scalable Database Architectures
Dissecting Scalable Database ArchitecturesDissecting Scalable Database Architectures
Dissecting Scalable Database Architectureshypertable
 
Accelerating HBase with NVMe and Bucket Cache
Accelerating HBase with NVMe and Bucket CacheAccelerating HBase with NVMe and Bucket Cache
Accelerating HBase with NVMe and Bucket CacheNicolas Poggi
 
IAP09 CUDA@MIT 6.963 - Guest Lecture: Out-of-Core Programming with NVIDIA's C...
IAP09 CUDA@MIT 6.963 - Guest Lecture: Out-of-Core Programming with NVIDIA's C...IAP09 CUDA@MIT 6.963 - Guest Lecture: Out-of-Core Programming with NVIDIA's C...
IAP09 CUDA@MIT 6.963 - Guest Lecture: Out-of-Core Programming with NVIDIA's C...npinto
 
In-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great TasteIn-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great TasteDataWorks Summit
 
Accelerating hbase with nvme and bucket cache
Accelerating hbase with nvme and bucket cacheAccelerating hbase with nvme and bucket cache
Accelerating hbase with nvme and bucket cacheDavid Grier
 
IMC Summit 2016 Breakout - Per Minoborg - Work with Multiple Hot Terabytes in...
IMC Summit 2016 Breakout - Per Minoborg - Work with Multiple Hot Terabytes in...IMC Summit 2016 Breakout - Per Minoborg - Work with Multiple Hot Terabytes in...
IMC Summit 2016 Breakout - Per Minoborg - Work with Multiple Hot Terabytes in...In-Memory Computing Summit
 
Optimizing elastic search on google compute engine
Optimizing elastic search on google compute engineOptimizing elastic search on google compute engine
Optimizing elastic search on google compute engineBhuvaneshwaran R
 
Running ElasticSearch on Google Compute Engine in Production
Running ElasticSearch on Google Compute Engine in ProductionRunning ElasticSearch on Google Compute Engine in Production
Running ElasticSearch on Google Compute Engine in ProductionSearce Inc
 

Ähnlich wie Responding rapidly when you have 100+ GB data sets in Java (20)

Java one2015 - Work With Hundreds of Hot Terabytes in JVMs
Java one2015 - Work With Hundreds of Hot Terabytes in JVMsJava one2015 - Work With Hundreds of Hot Terabytes in JVMs
Java one2015 - Work With Hundreds of Hot Terabytes in JVMs
 
JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516]
JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516]JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516]
JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516]
 
JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516]
JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516]JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516]
JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516]
 
In-memory Data Management Trends & Techniques
In-memory Data Management Trends & TechniquesIn-memory Data Management Trends & Techniques
In-memory Data Management Trends & Techniques
 
NYJavaSIG - Big Data Microservices w/ Speedment
NYJavaSIG - Big Data Microservices w/ SpeedmentNYJavaSIG - Big Data Microservices w/ Speedment
NYJavaSIG - Big Data Microservices w/ Speedment
 
Cassandra TK 2014 - Large Nodes
Cassandra TK 2014 - Large NodesCassandra TK 2014 - Large Nodes
Cassandra TK 2014 - Large Nodes
 
HBase Sizing Guide
HBase Sizing GuideHBase Sizing Guide
HBase Sizing Guide
 
Linux tuning to improve PostgreSQL performance
Linux tuning to improve PostgreSQL performanceLinux tuning to improve PostgreSQL performance
Linux tuning to improve PostgreSQL performance
 
HBase: Extreme Makeover
HBase: Extreme MakeoverHBase: Extreme Makeover
HBase: Extreme Makeover
 
Scaling with sync_replication using Galera and EC2
Scaling with sync_replication using Galera and EC2Scaling with sync_replication using Galera and EC2
Scaling with sync_replication using Galera and EC2
 
Colvin exadata mistakes_ioug_2014
Colvin exadata mistakes_ioug_2014Colvin exadata mistakes_ioug_2014
Colvin exadata mistakes_ioug_2014
 
Dissecting Scalable Database Architectures
Dissecting Scalable Database ArchitecturesDissecting Scalable Database Architectures
Dissecting Scalable Database Architectures
 
Accelerating HBase with NVMe and Bucket Cache
Accelerating HBase with NVMe and Bucket CacheAccelerating HBase with NVMe and Bucket Cache
Accelerating HBase with NVMe and Bucket Cache
 
IAP09 CUDA@MIT 6.963 - Guest Lecture: Out-of-Core Programming with NVIDIA's C...
IAP09 CUDA@MIT 6.963 - Guest Lecture: Out-of-Core Programming with NVIDIA's C...IAP09 CUDA@MIT 6.963 - Guest Lecture: Out-of-Core Programming with NVIDIA's C...
IAP09 CUDA@MIT 6.963 - Guest Lecture: Out-of-Core Programming with NVIDIA's C...
 
In-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great TasteIn-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great Taste
 
Accelerating hbase with nvme and bucket cache
Accelerating hbase with nvme and bucket cacheAccelerating hbase with nvme and bucket cache
Accelerating hbase with nvme and bucket cache
 
IMC Summit 2016 Breakout - Per Minoborg - Work with Multiple Hot Terabytes in...
IMC Summit 2016 Breakout - Per Minoborg - Work with Multiple Hot Terabytes in...IMC Summit 2016 Breakout - Per Minoborg - Work with Multiple Hot Terabytes in...
IMC Summit 2016 Breakout - Per Minoborg - Work with Multiple Hot Terabytes in...
 
Optimizing elastic search on google compute engine
Optimizing elastic search on google compute engineOptimizing elastic search on google compute engine
Optimizing elastic search on google compute engine
 
Running ElasticSearch on Google Compute Engine in Production
Running ElasticSearch on Google Compute Engine in ProductionRunning ElasticSearch on Google Compute Engine in Production
Running ElasticSearch on Google Compute Engine in Production
 
CPU Caches
CPU CachesCPU Caches
CPU Caches
 

Kürzlich hochgeladen

A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 

Kürzlich hochgeladen (20)

A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 

Responding rapidly when you have 100+ GB data sets in Java

  • 1. Qcon London 2015. Peter Lawrey Higher Frequency Trading Ltd Responding rapidly when you have 100+ GB data sets in Java
  • 2. Reactive system design • Responsive (predictable and sufficient response time) • Resilient (can cope with failures) • Elastic (can employ more hardware resources and grow cost effectively).
  • 3. Reactive system design Java responds *much* faster if you can keep your data in memory Standard JVMs scale well of a specific range of memory sizes. From • 100 MB – it is hard to make a JVM use less memory than this (with libraries etc) • 32 GB – above this the memory efficiency drops and you see significant GC pauses times. With larger data sizes, you have to be more aware of how memory is being used, the lifecycle of your objects and which tools will best help you utilise more memory effectively.
  • 4. Reactive system design (Resilient) The larger the node, the longer the recovery time. It if can gracefully rebuild at a rate of 100 MB/s (without impact production too much) the recover time can look like this. Data Set to recover Time to replicate at 100 MB/s 10 GB < 2 minutes 100 GB 17 minutes 1 TB 3 hours 10 TB 28 hours 100 TB 12 days 1 PB 4 months
  • 5. Reactive Memory Design Knowing the limitations of your address space. • 31–bit memory sizes (up to 2 GB) • 35-bit memory sizes (up to 32 GB) • 36-bit memory sizes (up to 64 GB) • 40-bit memory sizes (up to 1 TB) • 48-bit memory sizes (up to 256 TB) • Beyond.
  • 6. 32 bit operating system (31-bit heap)
  • 7. Speedment SQL Relector • Incrementally replicates your SQL database into Java. • Access to SQL data with in memory speeds
  • 8. Java 7 memory layout
  • 9. Compress Oops in Java 7 (35-bit) Using the default of –XX:+UseCompressedOops • In a 64-bit JVM, it can use “compressed” memory references. • This allows the heap to be up to 32 GB without the overhead of 64- bit object references. The Oracle/OpenJDK JVM still uses 64-bit class references by default. • As all object must be 8-byte aligned, the lower 3 bits of the address are always 000 and don’t need to be stored. This allows the heap to reference 4 billion * 8-bytes or 32 GB. • Uses 32-bit references.
  • 10. Compressed Oops with 8 byte alignment.
  • 11. Java 8 memory layout
  • 12. Compress Oops in Java 8 (36 bits) Using the default of –XX:+UseCompressedOops –XX:ObjectAlignmentInBytes=16 • In a 64-bit JVM, it can use “compressed” memory references. • This allows the heap to be up to 64 GB without the overhead of 64- bit object references. The Oracle/OpenJDK JVM still uses 64-bit class references by default. • As all object must be 8 or 16-byte aligned, the lower 3 or 4 bits of the address are always zeros and don’t need to be stored. This allows the heap to reference 4 billion * 16-bytes or 64 GB. • Uses 32-bit references.
  • 13. 64-bit references in Java (100 GB?) • A small but significant overhead on main memory use. • Reduces the efficiency of CPU caches as less objects can fit in. • Can address up to the limit of the free memory. Limited to main memory. • GC pauses become a real concern and can take tens of second or many minutes.
  • 15. 64-bit references in Java with a Concurrent Collector • A small but significant overhead on every memory access to support concurrent collection. • Azul Zing with a fully concurrent collector. They can get worst case pauses down to 1 to 10 milli-seconds. • RedHat is developing a most concurrent collector (….) which can support heap sizes around 100 GB with sub-second pauses.
  • 16. Azul Zing Concurrent Collector (100s GB)
  • 17. Azul Zing Concurrent Collector (100s GB) • Can dramatically reduce GC pauses. • Makes it easier to see and solve non GC pauses. • Reduced development time to get a system to meet your SLAs. • Can have lower throughputs.
  • 18. Terracotta BigMemory • Scale to multiple machines. Can utilize more smaller machines.
  • 19. Terracotta BigMemory • Uses off heap memory to store the bulk of the data.
  • 20. Hazelcast High Density Memory Store • Advanced memory caching solution. • Utilizes all main memory. • Supports a wide range of distributed collections.
  • 21. NUMA Regions (~40 bits) • Large machine are limited in how large a single bank of memory can be. This varies based on the architecture. • Ivy and Sandy bridge Xeon processors are limited to addressing 40 bits of real memory. • In Haswell this has been lifted to 46-bits. • Each Socket has “local” access to a bank of memory, however to access other bank it may need to use a bus. This is much slower. • The GC of a JVM can perform very poorly if it doesn’t sit within one NUMA region. Ideally you want a JVM to use just one NUMA region.
  • 25. Memory Mapped files (48+ bits) • Memory mappings are not limited to main memory size. • 64-bit OS support 128 TiB to 256 TiB virtual memory at once. • For larger data sizes, memory mapping need to be managed and cached manually. • Can be shared between processes. • A library can hide the 48-bit limitation by caching memory mapping.
  • 26. Memory Mapped files. • No need to have a monolithic JVM just because you want in memory access to all the data.
  • 27. Peta Byte JVMs (50+ bits) • If you are receiving 1 GB/s down a 10 Gig-E line in two weeks you will have received over 1 PB. • Managing this much data in large servers is more complex than your standard JVM. • Replication is critical. Large complex systems, are more likely to fail and take longer to recover. • You can have systems which cannot be recovered in the normal way. i.e. Unless you recover faster than new data is added, you will never catch up.
  • 28. Peta Byte JVMs (50+ bits)
  • 29. Peta Byte JVMs (50+ bits)