Performance optimization
techniques for Java code
Who am I and why should you
        trust me? 
●   Attila-Mihály Balázs
    http://hype-free.blogspot.com/
●   Former malware researcher (”low-level
    guy”)
●   Current Java dev (”high level dude”)
●   Spent the last ~6 months optimizing a large
    (1 000 000+ LOC) legacy system
●   Will spend the next 6 months on it too (at
    least)
?
Question everything!
What's this about
●   Core principles
●   Demo 1: collections framework
●   Demo 2, 3, 4: synchronization performance
●   Demo 5: ugly code, is it worth it?
●   Demo 6, 7, 8: playing with Strings
●   Conclusions
●   Q&A
What this is not about
●   Selecting efficient algorithms
●   High level optimizations (architectural
    changes)

●   These are important too! (but require more
    effort, and we are going for the quick win
    here)
Core principles
●   Performance is a balance, an endless
    game of shifting bottlenecks; no silver
    bullets here!

[Diagram: "Your program" surrounded by CPU, Memory, Disk, and Network]
Perform on all levels!
●   Performance has many levels:
        –   Compiler (JIT): Java 5 to Java 6: 100% (1)
        –   Memory: L1/L2 cache, main memory
        –   Disk: cache, RAID, SSD
        –   Network: 10Mbit, 100Mbit, 1000Mbit
●   Until recently we had it easy (performance
    doubled every 18 months)
●   Now we need to do some work
(1) http://java.sun.com/performance/reference/whitepapers/6_performance.html
Core principles
●   Measure, measure, measure! (before,
    during, after).
●   Try using realistic data!
●   Watch out for the Heisenberg effect (more
    on this later)
●   Some things are not intuitive:
        –   Pop-question: if processing 1000
             messages takes 1 second, how long
             does the processing of 1 message take?
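The pop question has no single answer, because throughput alone does not determine latency. A minimal sketch, assuming Little's law (average items in flight = throughput × average latency), which the slides do not name explicitly; the class and method names are hypothetical.

```java
// Hypothetical illustration: the same throughput is compatible with
// very different per-message latencies.
final class ThroughputVsLatency {

    // Little's law rearranged: latency = items in flight / throughput.
    static double averageLatencySeconds(double itemsInFlight, double throughputPerSecond) {
        return itemsInFlight / throughputPerSecond;
    }

    public static void main(String[] args) {
        double throughput = 1000.0; // messages per second, as in the pop question

        // One message in flight at a time.
        System.out.println(averageLatencySeconds(1, throughput) + " s per message");

        // 50 messages in flight (for example 50 workers, or a queue of 50).
        System.out.println(averageLatencySeconds(50, throughput) + " s per message");
    }
}
```

Both runs process 1000 messages per second, yet the second one takes 50 times longer per message. To know the latency you have to measure it.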
Core principles
●   Throughput
●   Latency
●   Thread context, context switching
●   Lock contention
●   Queueing theory
●   Profiling
●   Sampling
Feasibility – ”numbers everyone
        should know” (2)
●   L1 cache reference 0.5 ns
●   Branch mispredict 5 ns
●   L2 cache reference 7 ns
●   Mutex lock/unlock 100 ns
●   Main memory reference 100 ns
●   Compress 1K bytes with Zippy 10,000 ns
●   Send 2K bytes over 1 Gbps network 20,000 ns
●   Read 1 MB sequentially from memory 250,000 ns
●   Round trip within same datacenter 500,000 ns
●   Disk seek 10,000,000 ns
●   Read 1 MB sequentially from network 10,000,000 ns
●   Read 1 MB sequentially from disk 30,000,000 ns
●   Send packet CA->Netherlands->CA 150,000,000 ns
 (2) http://research.google.com/people/jeff/stanford-295-talk.pdf
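These numbers are useful for back-of-the-envelope feasibility checks before writing any code. A rough sketch using three of the values from the table; the 100 ms budget and the count of 20 steps are made-up example inputs, and the class name is hypothetical.

```java
// Hypothetical back-of-the-envelope check: can N sequential steps fit a time budget?
final class FeasibilityCheck {

    // Latencies copied from the table above, in nanoseconds.
    static final long MAIN_MEMORY_REFERENCE_NS = 100L;
    static final long DATACENTER_ROUND_TRIP_NS = 500_000L;
    static final long DISK_SEEK_NS = 10_000_000L;

    // Lower bound for an operation that performs the steps one after another.
    static long lowerBoundNanos(long steps, long nanosPerStep) {
        return steps * nanosPerStep;
    }

    static boolean fitsBudget(long steps, long nanosPerStep, long budgetNanos) {
        return lowerBoundNanos(steps, nanosPerStep) <= budgetNanos;
    }

    public static void main(String[] args) {
        long budgetNs = 100_000_000L; // assumed budget of 100 ms

        // 20 disk seeks need at least 200 ms, so no amount of code tuning helps.
        System.out.println("20 disk seeks fit: " + fitsBudget(20, DISK_SEEK_NS, budgetNs));

        // 20 round trips inside the datacenter need at least 10 ms.
        System.out.println("20 round trips fit: " + fitsBudget(20, DATACENTER_ROUND_TRIP_NS, budgetNs));
    }
}
```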
Feasibility
●   Amdahl's law: The speedup of a program
    using multiple processors in parallel
    computing is limited by the time needed for
    the sequential fraction of the program.
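Amdahl's law is usually written as speedup = 1 / (s + p / n), where p is the parallelizable fraction of the runtime, s = 1 - p is the sequential fraction, and n is the number of processors. A small sketch of that formula; the 90% parallel fraction is an arbitrary example value and the class name is hypothetical.

```java
// Hypothetical calculator for Amdahl's law.
final class AmdahlSketch {

    // parallelFraction: share of the runtime that can run in parallel (0.0 to 1.0).
    static double speedup(double parallelFraction, int processors) {
        double sequentialFraction = 1.0 - parallelFraction;
        return 1.0 / (sequentialFraction + parallelFraction / processors);
    }

    public static void main(String[] args) {
        // With 10% sequential work the speedup can never reach 10x,
        // no matter how many processors are added.
        for (int n : new int[] {1, 2, 4, 8, 1024}) {
            System.out.println(n + " processors: " + speedup(0.9, n));
        }
    }
}
```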
Course of action
●   Have a clear (written?), measurable goal:
    operation X should take less than 100ms in
     99.9% of the cases
●   Measure (profile)
●   Is the goal met? → The End
●   Optimize hotspots → go to step 2
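A goal phrased this way can be checked mechanically against measured samples. A minimal sketch, assuming the latencies have already been collected in milliseconds; the class name, method name, and sample values are hypothetical.

```java
import java.util.Arrays;

// Hypothetical check for a goal such as "less than 100 ms in 99.9% of the cases".
final class LatencyGoal {

    // True if at least requiredFraction of the samples are strictly below the limit.
    static boolean goalMet(long[] latenciesMillis, long limitMillis, double requiredFraction) {
        if (latenciesMillis.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long withinLimit = Arrays.stream(latenciesMillis)
                .filter(latency -> latency < limitMillis)
                .count();
        return (double) withinLimit / latenciesMillis.length >= requiredFraction;
    }

    public static void main(String[] args) {
        long[] samples = new long[10_000];
        Arrays.fill(samples, 40L);   // made-up data: most calls take 40 ms
        samples[0] = 750L;           // a few slow outliers
        samples[1] = 310L;
        System.out.println("Goal met: " + goalMet(samples, 100L, 0.999));
    }
}
```

Note that an average would hide the outliers that a percentile goal is meant to catch.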
Tools
●   VisualVM
●   JProfiler
●   YourKit

●   Eclipse TPTP
●   Netbeans Profiler
Demo 1: collections framework
●   Name 3 things wrong with this code:


Vector<String> v1;
…
if (!v1.contains(s)) { v1.add(s); }
Demo 1: collections framework
●   Wrong data structure (list / array instead of
    set), hence slooow performance for large
    data sets (but not for small ones!)
●   Extra synchronization if used by a single
    thread only
●   Not actually thread safe! (only ”exception
    safe”)
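One possible fix is sketched below. It assumes Java 8 or later for ConcurrentHashMap.newKeySet(), which is newer than these slides; the class and method names are hypothetical, and for single-threaded use a plain HashSet would do.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical replacement for the Vector-based "contains, then add" code.
final class UniqueStrings {

    // Hash-based set: membership checks do not scan all elements the way
    // Vector.contains does.
    private final Set<String> seen = ConcurrentHashMap.newKeySet();

    // add() performs the check and the insert as one atomic step and
    // reports whether the value was new.
    boolean addIfAbsent(String s) {
        return seen.add(s);
    }

    int size() {
        return seen.size();
    }

    public static void main(String[] args) {
        UniqueStrings unique = new UniqueStrings();
        System.out.println(unique.addIfAbsent("a")); // first time: newly added
        System.out.println(unique.addIfAbsent("a")); // second time: already present
        System.out.println(unique.size());
    }
}
```

The original code is not thread safe because another thread can add the same string between the contains() call and the add() call, even though each call is synchronized on its own.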
Demo 1: lessons
●   Use existing classes
●   Use realistic sample data
●   Thread safety is hard!
●   Heisenberg (observer) effect
Demo 2, 3, 4: synchronization
        performance
●   If I have N units of work and use 4 threads, it
    must be faster than using a single thread, right?
●   What does lock contention look like?
●   What does a ”synchronization train(wreck)”
    look like?
Demo 2, 3, 4: lessons
●   Use existing classes
        –   ReadWriteLock
        –   java.util.concurrent.*
●   Use realistic sample data (too short / too
    long units of work)
●   Sometimes throwing a threadpool at it
    makes it worse!
●   Consider using a private copy of the
    variable for each thread
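The last point can be sketched as follows: each task accumulates into its own local variable and the partial results are combined once at the end, so the threads never contend for a shared counter. This is an illustrative example, not the code from the demos; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example: sum an array with one private accumulator per task.
final class PrivateCopySum {

    static long parallelSum(int[] data, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int chunk = (data.length + threads - 1) / threads; // slice size, rounded up
            List<Future<Long>> parts = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final int from = Math.min(data.length, t * chunk);
                final int to = Math.min(data.length, from + chunk);
                Callable<Long> task = () -> {
                    long local = 0; // private to this task: no lock, no contention
                    for (int i = from; i < to; i++) {
                        local += data[i];
                    }
                    return local;
                };
                parts.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get(); // combine the partial sums once, at the end
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] data = new int[1_000_000];
        java.util.Arrays.fill(data, 1);
        System.out.println(parallelSum(data, 4));
    }
}
```

Whether this beats a single thread still depends on the size of the work units, so it needs to be measured with realistic data.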
Demo 5: ugly code, is it worth it?
 ●   Parsing a logfile
Demo 5: lessons
●   Sometimes yes, but always profile first!
Demo 6: String.substring
●   How are strings stored in Java?
Demo 6: Lesson
●   You can look inside the JRE when needed!
Demo 7: repetitive strings
Demo 7: Lessons
●   You shouldn't use String.intern:
        –   Slow
        –   You have to use it everywhere
        –   Needs hand-tuning
●   Use a WeakHashMap for caching (don't
    forget to synchronize!)
●   Use String.equals (not ==)
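A minimal sketch of such a cache, not necessarily the one shown in the demo; the class and method names are hypothetical. The values are wrapped in WeakReference because a strong reference from a value back to its key would keep every entry alive.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical cache that returns one shared instance per distinct string value.
final class StringCache {

    // Keys are held weakly by WeakHashMap; the values must be weak too,
    // otherwise the value would keep its own key reachable forever.
    private final Map<String, WeakReference<String>> cache = new WeakHashMap<>();

    // synchronized because WeakHashMap is not thread safe.
    synchronized String canonical(String s) {
        WeakReference<String> ref = cache.get(s);
        String existing = (ref == null) ? null : ref.get();
        if (existing != null) {
            return existing;
        }
        cache.put(s, new WeakReference<>(s));
        return s;
    }

    public static void main(String[] args) {
        StringCache cache = new StringCache();
        String first = cache.canonical(new String("repeated value"));
        String second = cache.canonical(new String("repeated value"));
        // Both calls return the same instance, so only one copy stays in memory.
        System.out.println(first == second);
        // Application code should still compare with equals().
        System.out.println(first.equals(second));
    }
}
```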
Demo 8: charsets
–   ASCII
–   ISO-8859-1
–   UTF-8
–   UTF-16
Demo 8: lessons
●   Use UTF-8 where possible
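To see the trade-offs, the sketch below encodes the same short text with each charset and compares the sizes. It uses UTF-16BE for the UTF-16 case so that no byte-order mark is counted; the sample text and class name are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical comparison of encoded sizes for one sample string.
final class CharsetSizes {

    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String text = "h\u00e9llo"; // "hello" with an accented e, written as an escape

        // ASCII cannot represent the accented letter; getBytes replaces it with '?'.
        System.out.println(new String(text.getBytes(StandardCharsets.US_ASCII),
                StandardCharsets.US_ASCII));

        System.out.println("ISO-8859-1: " + encodedLength(text, StandardCharsets.ISO_8859_1));
        System.out.println("UTF-8:      " + encodedLength(text, StandardCharsets.UTF_8));
        System.out.println("UTF-16BE:   " + encodedLength(text, StandardCharsets.UTF_16BE));
    }
}
```

Passing the charset explicitly, as done here, avoids depending on the platform default encoding.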
Conclusions
●   Measure twice, cut once
●   Don't trust advice you didn't test! (including
    mine)
●   Most of the time you don't need to sacrifice
    clean code for performant code
Conclusions
●   Slides:
        –   Google Groups
        –   http://hype-free.blogspot.com/
        –   x_at_y_or_z@yahoo.com
●   Source code:
        –   http://code.google.com/p/hype-free/source/browse/#svn/trunk/java-perfopt-201003
●   Profiler evaluation licenses
Resources
●   https://visualvm.dev.java.net/
●   http://www.ej-technologies.com/
●   http://blog.ej-technologies.com/
●   http://www.yourkit.com/
●   http://www.yourkit.com/docs/index.jsp
●   http://www.yourkit.com/eap/index.jsp
Thank you!

Questions?
