SlideShare ist ein Scribd-Unternehmen logo
1 von 121
Downloaden Sie, um offline zu lesen
What the CRaC...
Coordinated Restore at Checkpoint
on the Java Virtual Machine
Gerrit Grunwald | Developer Advocate | Azul
ABOUTME.
JAVA IS
GREAT...
VIBRANT
COMMUNITY...
HUNDREDS OF


JUGs...
THOUSANDS OF


FOSS PROJECTS...
JAVA VIRTUAL


MACHINE
JAVA VIRTUAL


MACHINE
HOW DOES


IT WORK...
MyClass.java MyClass.class
SOURCE CODE COMPILER BYTE CODE
MyClass.class
BYTE CODE CLASS LOADER JVM MEMORY
JVM MEMORY
􀊫􀈒
EXECUTION ENGINE
EXECUTION ENGINE
Interpreter C1 JIT


Compiler


(client)
C2 JIT


Compiler


(server)
􀊫
Pro
fi
ler
􀈒
Garbage


Collector
EXECUTION ENGINE
Interpreter C1 JIT


Compiler


(client)
C2 JIT


Compiler


(server)
instruction set of CPU
Detects hot spots by


counting method calls
JVM
THRESHOLD


REACHED
Pass the hot spot methods


to C1 JIT Compiler
JVM C1 JIT


COMPILER
Compiles code as quickly


as possible with low optimisation
C1 JIT


COMPILER
Compiles code as quickly


as possible with low optimisation
Pro
fi
les the running code


(detecting hot code)
JVM
THRESHOLD


REACHED
Pass the "hot" code


to C2 JIT Compiler
JVM C2 JIT


COMPILER
Compiles code with best


optimisation possible (slower)
TIERED
COMPILATION
Level 0 - Interpreted code


Level 1 - C1 compiled code (no pro
fi
ling)


Level 2 - C1 compiled code (basic pro
fi
ling)


Level 3 - C1 compiled code (full pro
fi
ling, ~35% overhead)


Level 4 - C2 compiled code (uses pro
fi
le data from previous steps)
LEVELS OF EXECUTION
Level 0 - Interpreted code


Level 1 - C1 compiled code (no pro
fi
ling)


Level 2 - C1 compiled code (basic pro
fi
ling)


Level 3 - C1 compiled code (full pro
fi
ling, ~35% overhead)


Level 4 - C2 compiled code (uses pro
fi
le data from previous steps)
LEVELS OF EXECUTION
PREFERRED
EXECUTION
CYCLE
EXECUTION CYCLE
Slow
(Execution Level 0)
EXECUTION CYCLE
Finding


"hotspots"
Slow
(Execution Level 0)
EXECUTION CYCLE
Fast compile,


low optimisation
(Execution Level 3)
Finding


"hotspots"
Slow
(Execution Level 0)
EXECUTION CYCLE
Finding


"hot code"
Fast compile,


low optimisation
(Execution Level 3)
Finding


"hotspots"
Slow
(Execution Level 0)
EXECUTION CYCLE
Slower compile,


high optimisation
(Execution Level 4)
Finding


"hot code"
Fast compile,


low optimisation
(Execution Level 3)
Finding


"hotspots"
Slow
(Execution Level 0)
EXECUTION CYCLE
Can happen


(performance hit)
Slower compile,


high optimisation
(Execution Level 4)
Finding


"hot code"
Fast compile,


low optimisation
(Execution Level 3)
Finding


"hotspots"
Slow
(Execution Level 0)
DEOPTIMISATION
DEOPTIMISATION
if (value > 0)
result = 3; result = 0;
print(result);
TRUE FALSE
CODE
DEOPTIMISATION
if (value > 0)
result = 3; result = 0;
print(result);
TRUE FALSE
CODE
Hot Path (pro
fi
led)
DEOPTIMISATION
if (value > 0)
result = 3; result = 0;
print(result);
TRUE FALSE
CODE
guard (value > 0)
result = 3;


print(result);
DEOPTIMISE
TRUE FALSE
OPTIMIZED CODE
DEOPTIMISATION
if (value > 0)
result = 3; result = 0;
print(result);
TRUE FALSE
CODE
guard (value > 0)
result = 3;


print(result);
DEOPTIMISE
TRUE FALSE
OPTIMIZED CODE
value = 3
DEOPTIMISATION
if (value > 0)
result = 3; result = 0;
print(result);
TRUE FALSE
CODE
guard (value > 0)
result = 3;


print(result);
DEOPTIMISE
TRUE FALSE
OPTIMIZED CODE
value = 0
THAT'S
GREAT...
...BUT...
...IT TAKES
TIME !
JVM PERFORMANCE GRAPH
JVM STARTUP
GarbageCollector pauses
Deoptimisations
MICROSERVICE
ENVIRONMENT
MICROSERVICE ENVIRONMENT
FIRST RUN
JVM STARTUP
SECOND RUN
JVM STARTUP
THIRD RUN
JVM STARTUP
WOULDN'T IT BE GREAT...?
FIRST RUN
JVM STARTUP
SECOND RUN
NO STARTUP OVERHEAD
THIRD RUN
NO STARTUP OVERHEAD
SOLUTIONS...?
CLASS DATA
SHARING
WHAT ABOUT CDS?
Class Data Sharing (App/Dynamic)


Dump internal class representation into
fi
le


Shared on each JVM start (CDS)


No optimization or hotspot detection


Only reduces loading time of classes


Start up could be up to 1-2 seconds faster
MyClass.class
BYTE CODE CLASS LOADER JVM MEMORY
CDS
AHEAD OF TIME
COMPILATION
WHY NOT USE AOT?
No interpreting bytecodes


No analysis of hotspots


No runtime compilation of code


Start at full speed, straight away


GraalVM native image does that
PROBLEM SOLVED...?
NOT SO FAST...
AOT is, by de
fi
nition, static


Code is compiled before it is run


Compiler has no knowledge of how the
code will actually run


Pro
fi
le Guided Optimisation (PGO)
can partially help
JVM PERFORMANCE GRAPH
AOT Compiled Code
AOT Compiled Code with Profile Guided Optimisation
Needs to run once for pro
fi
ling
AOT VS JIT
Limited use of method inlining


No runtime bytecode generation


Re
fl
ection is possible but complicated


Unable to use speculative optimisations


Overall performance will typically be lower


Deployed env != Development env.


Full speed from the start


No overhead to compile code at runtime


Small memory footprint
Can use aggressive method inlining at runtime


Can use runtime bytecode generation


Re
fl
ection is simple


Can use speculative optimisations


Overall performance will typically be higher


Deployed env. == Development env.


Requires more time to start up


Overhead to compile code at runtime


Larger memory footprint
AOT JIT
JIT DISADVANTAGES
Requires more time to start up


CPU overhead to compile code at runtime


Larger memory footprint
AOT JIT
Reduced


Max Latency
Peak Throughput
Small Packaging
Startup Speed
Low Memory


Footprint
(Chart by Thomas Wuerthinger)
A DIFFERENT


APPROACH
CRIU
CHECKPOINT RESTORE IN USERSPACE
Linux project


Part of kernel >= 3.11 (2013)


Freeze a running container/application


Checkpoint its state to disk


Restore the container/application from the saved data.


Used by/integrated in OpenVZ, LXC/LXD, Docker,
Podman and others
CHECKPOINT RESTORE IN UUSERSPACE
Mainly implemented in userspace (not kernel)


Main goal is migration of containers for microservices


Heavily relies on /proc
fi
le system


It can checkpoint:


Processes and threads


Application memory, memory mapped
fi
les and shared memory


Open
fi
les, pipes and FIFOs


Sockets


Interprocess communication channels


Timers and signals


Can rebuild TCP connection from one side without need to exchange packets
CHECKPOINT RESTORE IN UUSERSPACE
Restart from saved state on another machine
(open
fi
les, shared memory etc.)


Start multiple instances of same state on same machine
(PID will be restored which will lead to problems)


A Java Virtual Machine would assume it was continuing its tasks
(very dif
fi
cult to use effectively)
CRIU CHALLENGES
CRaC
Coordinated Restore at Checkpoint
CRaC
CRIU comes bundled with the JDK


Heap is cleaned, compacted (JVM is in a safe state)


Throws CheckpointException (in case of open
fi
les/sockets)


Comes with a simple API


Creates checkpoints using code or jcmd


Uses JVM safepoint mechanism (to stop threads etc.)
openjdk.org/projects/crac
Lead by Anton Kozlov (Azul)
RUNNING APPLICATION
Aware of checkpoint


being created
RUNNING APPLICATION
Aware of restore


happening
CRaC has more restrictions (e.g. no open
fi
les, sockets etc.)
CRaC
CRaC
START
>java -XX:CRaCCheckpointTo=PATH -jar app.jar


RESTORE
>java -XX:CRaCRestoreFrom=PATH
CRaC API
<<interface>>
Resource
beforeCheckpoint()
afterRestore()
CRaC uses Resources that can be
noti
fi
ed about a Checkpoint and
Restore


Classes in application code
implement the Resource interface


The application receives callbacks
during checkpointing and
restoring


Makes it possible to close/restore
resources (e.g. open
fi
les, sockets)
CRaC API
Resource objects need to be registered with a Context so that they
can receive noti
fi
cations


There is a global Context accessible via the static
getGlobalContext() method of the Core class
CRaC API
<<interface>>
Resource
beforeCheckpoint()
afterRestore()
Core
getGlobalContext()
CRaC API
<<abstract>>
Context
register(Resource)
CRaC API
The global Context maintains a list of Resource objects


The beforeCheckpoint() methods are called in the reverse order the
Resource objects were added


The afterRestore() methods are called in the order the Resource
objects were added


Checkpoint and restore can be done programmatically by using
Core.checkpointRestore()


Checkpoint can also be created by jcmd
SIMPLE


EXAMPLE
SIMPLE EXAMPLE
Loop 100000x
Random number


1 - 100000
isPrime(number)
Cache
Scheduled Thread
Executed every


5 seconds
SIMPLE EXAMPLE
Loop 100000x
Random number


1 - 100000
isPrime(number)
Cache
Scheduled Thread
Executed every


5 seconds
SIMPLE EXAMPLE
Loop 100000x
Random number


1 - 100000
isPrime(number)
Is result for number
in cache?
Cache
Scheduled Thread
Executed every


5 seconds
SIMPLE EXAMPLE
Loop 100000x
Random number


1 - 100000
isPrime(number)
Is result for number
in cache?
Cache
Scheduled Thread
Executed every


5 seconds
SIMPLE EXAMPLE
Loop 100000x
Random number


1 - 100000
isPrime(number)
Is result for number
in cache?
yes
Cache
FAST
Scheduled Thread
Executed every


5 seconds
SIMPLE EXAMPLE
Loop 100000x
Random number


1 - 100000
isPrime(number)
Is result for number
in cache?
Calculate if prime
no
Cache
Scheduled Thread
Executed every


5 seconds
SIMPLE EXAMPLE
Loop 100000x
Random number


1 - 100000
isPrime(number)
Is result for number
in cache?
Calculate if prime
Result
no
Cache
Scheduled Thread
Executed every


5 seconds
SIMPLE EXAMPLE
Loop 100000x
Random number


1 - 100000
isPrime(number)
Is result for number
in cache?
Calculate if prime
Result
no
Cache
Scheduled Thread
Executed every


5 seconds
SIMPLE EXAMPLE
Loop 100000x
Random number


1 - 100000
isPrime(number)
Is result for number
in cache?
Calculate if prime
Result
no
Cache
SLOW
Scheduled Thread
Executed every


5 seconds
SIMPLE EXAMPLE
Loop 100000x
Random number


1 - 100000
isPrime(number)
Is result for number
in cache?
Calculate if prime
Result
no
Cache
Scheduled Thread
Executed every


5 seconds
timeout 50 sec
SIMPLE EXAMPLE
First run is slow because the cache is empty


Time to check 100 000 random numbers for
prime should decrease with every run
Application Performance
0
0.2
0.4
0.6
0.8
1
Iterations
1 2 3 4 5 6 7 8 9 10 11 12
SIMPLE EXAMPLE
DEMO...
>java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar


>
SIMPLE EXAMPLE
SHELL 1 SHELL 2
>java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar


>
De
fi
nes where to store
fi
les


for the checkpoint
SIMPLE EXAMPLE
SHELL 1 SHELL 2
>java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar


Running on CRaC (PID 4240)


First run will take up to 30 seconds...


Register Resource: GenericCache


Register Resource: Main


1. Run: 27863 ms (63303 elements cached, 63.3%)


>
SIMPLE EXAMPLE
SHELL 1 SHELL 2
>java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar


Running on CRaC (PID 4240)


First run will take up to 30 seconds...


Register Resource: GenericCache


Register Resource: Main


1. Run: 27863 ms (63303 elements cached, 63.3%)


>
SIMPLE EXAMPLE
SHELL 1 SHELL 2
>java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar


Running on CRaC (PID 4240)


First run will take up to 30 seconds...


Register Resource: GenericCache


Register Resource: Main


1. Run: 27863 ms (63303 elements cached, 63.3%)


>
SIMPLE EXAMPLE
SHELL 1 SHELL 2
>java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar


Running on CRaC (PID 4240)


First run will take up to 30 seconds...


Register Resource: GenericCache


Register Resource: Main


1. Run: 27863 ms (63303 elements cached, 63.3%)


>
SIMPLE EXAMPLE
Slow because cache is not
fi
lled yet
SHELL 1 SHELL 2
>java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar


Running on CRaC (PID 4240)


First run will take up to 30 seconds...


Register Resource: GenericCache


Register Resource: Main


1. Run: 27863 ms (63303 elements cached, 63.3%)


2. Run: 10611 ms (86580 elements cached, 86.6%)


>
SIMPLE EXAMPLE
SHELL 1 SHELL 2
>java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar


Running on CRaC (PID 4240)


First run will take up to 30 seconds...


Register Resource: GenericCache


Register Resource: Main


1. Run: 27863 ms (63303 elements cached, 63.3%)


2. Run: 10611 ms (86580 elements cached, 86.6%)


3. Run: 3965 ms (95161 elements cached, 95.2%)
>
SIMPLE EXAMPLE
SHELL 1 SHELL 2
>java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar


Running on CRaC (PID 4240)


First run will take up to 30 seconds...


Register Resource: GenericCache


Register Resource: Main


1. Run: 27863 ms (63303 elements cached, 63.3%)


2. Run: 10611 ms (86580 elements cached, 86.6%)


3. Run: 3965 ms (95161 elements cached, 95.2%)


4. Run: 1428 ms (98245 elements cached, 98.2%)
>
SIMPLE EXAMPLE
SHELL 1 SHELL 2
>java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar


Running on CRaC (PID 4240)


First run will take up to 30 seconds...


Register Resource: GenericCache


Register Resource: Main


1. Run: 27863 ms (63303 elements cached, 63.3%)


2. Run: 10611 ms (86580 elements cached, 86.6%)


3. Run: 3965 ms (95161 elements cached, 95.2%)


4. Run: 1428 ms (98245 elements cached, 98.2%)


5. Run: 533 ms (99340 elements cached, 99.3%)
>
SIMPLE EXAMPLE
SHELL 1 SHELL 2
>java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar


Running on CRaC (PID 4240)


First run will take up to 30 seconds...


Register Resource: GenericCache


Register Resource: Main


1. Run: 27863 ms (63303 elements cached, 63.3%)


2. Run: 10611 ms (86580 elements cached, 86.6%)


3. Run: 3965 ms (95161 elements cached, 95.2%)


4. Run: 1428 ms (98245 elements cached, 98.2%)


5. Run: 533 ms (99340 elements cached, 99.3%)


6. Run: 207 ms (99750 elements cached, 99.8%)
>
SIMPLE EXAMPLE
SHELL 1 SHELL 2
>java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar


Running on CRaC (PID 4240)


First run will take up to 30 seconds...


Register Resource: GenericCache


Register Resource: Main


1. Run: 27863 ms (63303 elements cached, 63.3%)


2. Run: 10611 ms (86580 elements cached, 86.6%)


3. Run: 3965 ms (95161 elements cached, 95.2%)


4. Run: 1428 ms (98245 elements cached, 98.2%)


5. Run: 533 ms (99340 elements cached, 99.3%)


6. Run: 207 ms (99750 elements cached, 99.8%)


7. Run: 87 ms (99898 elements cached, 99.9%)
>
SIMPLE EXAMPLE
SHELL 1 SHELL 2
>java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar


Running on CRaC (PID 4240)


First run will take up to 30 seconds...


Register Resource: GenericCache


Register Resource: Main


1. Run: 27863 ms (63303 elements cached, 63.3%)


2. Run: 10611 ms (86580 elements cached, 86.6%)


3. Run: 3965 ms (95161 elements cached, 95.2%)


4. Run: 1428 ms (98245 elements cached, 98.2%)


5. Run: 533 ms (99340 elements cached, 99.3%)


6. Run: 207 ms (99750 elements cached, 99.8%)


7. Run: 87 ms (99898 elements cached, 99.9%)


8. Run: 42 ms (99960 elements cached, 100.0%)


>
SIMPLE EXAMPLE
SHELL 1 SHELL 2
>java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar


Running on CRaC (PID 4240)


First run will take up to 30 seconds...


Register Resource: GenericCache


Register Resource: Main


1. Run: 27863 ms (63303 elements cached, 63.3%)


2. Run: 10611 ms (86580 elements cached, 86.6%)


3. Run: 3965 ms (95161 elements cached, 95.2%)


4. Run: 1428 ms (98245 elements cached, 98.2%)


5. Run: 533 ms (99340 elements cached, 99.3%)


6. Run: 207 ms (99750 elements cached, 99.8%)


7. Run: 87 ms (99898 elements cached, 99.9%)


8. Run: 42 ms (99960 elements cached, 100.0%)


9. Run: 25 ms (99986 elements cached, 100.0%)


>
SIMPLE EXAMPLE
SHELL 1 SHELL 2
>java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar


Running on CRaC (PID 4240)


First run will take up to 30 seconds...


Register Resource: GenericCache


Register Resource: Main


1. Run: 27863 ms (63303 elements cached, 63.3%)


2. Run: 10611 ms (86580 elements cached, 86.6%)


3. Run: 3965 ms (95161 elements cached, 95.2%)


4. Run: 1428 ms (98245 elements cached, 98.2%)


5. Run: 533 ms (99340 elements cached, 99.3%)


6. Run: 207 ms (99750 elements cached, 99.8%)


7. Run: 87 ms (99898 elements cached, 99.9%)


8. Run: 42 ms (99960 elements cached, 100.0%)


9. Run: 25 ms (99986 elements cached, 100.0%)


10. Run: 20 ms (99992 elements cached, 100.0%)
>
SIMPLE EXAMPLE
SHELL 1 SHELL 2
>java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar


Running on CRaC (PID 4240)


First run will take up to 30 seconds...


Register Resource: GenericCache


Register Resource: Main


1. Run: 27863 ms (63303 elements cached, 63.3%)


2. Run: 10611 ms (86580 elements cached, 86.6%)


3. Run: 3965 ms (95161 elements cached, 95.2%)


4. Run: 1428 ms (98245 elements cached, 98.2%)


5. Run: 533 ms (99340 elements cached, 99.3%)


6. Run: 207 ms (99750 elements cached, 99.8%)


7. Run: 87 ms (99898 elements cached, 99.9%)


8. Run: 42 ms (99960 elements cached, 100.0%)


9. Run: 25 ms (99986 elements cached, 100.0%)


10. Run: 20 ms (99992 elements cached, 100.0%)


11. Run: 17 ms (99999 elements cached, 100.0%)
>
SIMPLE EXAMPLE
SHELL 1 SHELL 2
>java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar


Running on CRaC (PID 4240)


First run will take up to 30 seconds...


Register Resource: GenericCache


Register Resource: Main


1. Run: 27863 ms (63303 elements cached, 63.3%)


2. Run: 10611 ms (86580 elements cached, 86.6%)


3. Run: 3965 ms (95161 elements cached, 95.2%)


4. Run: 1428 ms (98245 elements cached, 98.2%)


5. Run: 533 ms (99340 elements cached, 99.3%)


6. Run: 207 ms (99750 elements cached, 99.8%)


7. Run: 87 ms (99898 elements cached, 99.9%)


8. Run: 42 ms (99960 elements cached, 100.0%)


9. Run: 25 ms (99986 elements cached, 100.0%)


10. Run: 20 ms (99992 elements cached, 100.0%)


11. Run: 17 ms (99999 elements cached, 100.0%)
>
SIMPLE EXAMPLE
Fast because cache is
fi
lled
SHELL 1 SHELL 2
>java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar


Running on CRaC (PID 4240)


First run will take up to 30 seconds...


Register Resource: GenericCache


Register Resource: Main


1. Run: 27863 ms (63303 elements cached, 63.3%)


2. Run: 10611 ms (86580 elements cached, 86.6%)


3. Run: 3965 ms (95161 elements cached, 95.2%)


4. Run: 1428 ms (98245 elements cached, 98.2%)


5. Run: 533 ms (99340 elements cached, 99.3%)


6. Run: 207 ms (99750 elements cached, 99.8%)


7. Run: 87 ms (99898 elements cached, 99.9%)


8. Run: 42 ms (99960 elements cached, 100.0%)


9. Run: 25 ms (99986 elements cached, 100.0%)


10. Run: 20 ms (99992 elements cached, 100.0%)


11. Run: 17 ms (99999 elements cached, 100.0%)
SIMPLE EXAMPLE
>jcmd crac4.jar JDK.checkpoint
SHELL 1 SHELL 2
>java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar


Running on CRaC (PID 4240)


First run will take up to 30 seconds...


Register Resource: GenericCache


Register Resource: Main


1. Run: 27863 ms (63303 elements cached, 63.3%)


2. Run: 10611 ms (86580 elements cached, 86.6%)


3. Run: 3965 ms (95161 elements cached, 95.2%)


4. Run: 1428 ms (98245 elements cached, 98.2%)


5. Run: 533 ms (99340 elements cached, 99.3%)


6. Run: 207 ms (99750 elements cached, 99.8%)


7. Run: 87 ms (99898 elements cached, 99.9%)


8. Run: 42 ms (99960 elements cached, 100.0%)


9. Run: 25 ms (99986 elements cached, 100.0%)


10. Run: 20 ms (99992 elements cached, 100.0%)


11. Run: 17 ms (99999 elements cached, 100.0%)
SIMPLE EXAMPLE
>jcmd crac4.jar JDK.checkpoint CREATE CHECKPOINT
SHELL 1 SHELL 2
>java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar


Running on CRaC (PID 4240)


First run will take up to 30 seconds...


Register Resource: GenericCache


Register Resource: Main


1. Run: 27863 ms (63303 elements cached, 63.3%)


2. Run: 10611 ms (86580 elements cached, 86.6%)


3. Run: 3965 ms (95161 elements cached, 95.2%)


4. Run: 1428 ms (98245 elements cached, 98.2%)


5. Run: 533 ms (99340 elements cached, 99.3%)


6. Run: 207 ms (99750 elements cached, 99.8%)


7. Run: 87 ms (99898 elements cached, 99.9%)


8. Run: 42 ms (99960 elements cached, 100.0%)


9. Run: 25 ms (99986 elements cached, 100.0%)


10. Run: 20 ms (99992 elements cached, 100.0%)


11. Run: 17 ms (99999 elements cached, 100.0%)


beforeCheckpoint() called in Main


beforeCheckpoint() called in GenericCache


CR: Checkpoint ...


Killed


>


SIMPLE EXAMPLE
>jcmd crac4.jar JDK.checkpoint


4240:


Command executed successfully


>
SHELL 1 SHELL 2
>java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar


Running on CRaC (PID 4240)


First run will take up to 30 seconds...


Register Resource: GenericCache


Register Resource: Main


1. Run: 27863 ms (63303 elements cached, 63.3%)


2. Run: 10611 ms (86580 elements cached, 86.6%)


3. Run: 3965 ms (95161 elements cached, 95.2%)


4. Run: 1428 ms (98245 elements cached, 98.2%)


5. Run: 533 ms (99340 elements cached, 99.3%)


6. Run: 207 ms (99750 elements cached, 99.8%)


7. Run: 87 ms (99898 elements cached, 99.9%)


8. Run: 42 ms (99960 elements cached, 100.0%)


9. Run: 25 ms (99986 elements cached, 100.0%)


10. Run: 20 ms (99992 elements cached, 100.0%)


11. Run: 17 ms (99999 elements cached, 100.0%)


beforeCheckpoint() called in Main


beforeCheckpoint() called in GenericCache


CR: Checkpoint ...


Killed


>


SIMPLE EXAMPLE
>jcmd crac4.jar JDK.checkpoint


4240:


Command executed successfully


>
SHELL 1 SHELL 2
>java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar


Running on CRaC (PID 4240)


First run will take up to 30 seconds...


Register Resource: GenericCache


Register Resource: Main


1. Run: 27863 ms (63303 elements cached, 63.3%)


2. Run: 10611 ms (86580 elements cached, 86.6%)


3. Run: 3965 ms (95161 elements cached, 95.2%)


4. Run: 1428 ms (98245 elements cached, 98.2%)


5. Run: 533 ms (99340 elements cached, 99.3%)


6. Run: 207 ms (99750 elements cached, 99.8%)


7. Run: 87 ms (99898 elements cached, 99.9%)


8. Run: 42 ms (99960 elements cached, 100.0%)


9. Run: 25 ms (99986 elements cached, 100.0%)


10. Run: 20 ms (99992 elements cached, 100.0%)


11. Run: 17 ms (99999 elements cached, 100.0%)


beforeCheckpoint() called in Main


beforeCheckpoint() called in GenericCache


CR: Checkpoint ...


Killed


>java -XX:CRaCRestoreFrom=/crac-files


SIMPLE EXAMPLE
>jcmd crac4.jar JDK.checkpoint


4240:


Command executed successfully


>
SHELL 1 SHELL 2
>java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar


Running on CRaC (PID 4240)


First run will take up to 30 seconds...


Register Resource: GenericCache


Register Resource: Main


1. Run: 27863 ms (63303 elements cached, 63.3%)


2. Run: 10611 ms (86580 elements cached, 86.6%)


3. Run: 3965 ms (95161 elements cached, 95.2%)


4. Run: 1428 ms (98245 elements cached, 98.2%)


5. Run: 533 ms (99340 elements cached, 99.3%)


6. Run: 207 ms (99750 elements cached, 99.8%)


7. Run: 87 ms (99898 elements cached, 99.9%)


8. Run: 42 ms (99960 elements cached, 100.0%)


9. Run: 25 ms (99986 elements cached, 100.0%)


10. Run: 20 ms (99992 elements cached, 100.0%)


11. Run: 17 ms (99999 elements cached, 100.0%)


beforeCheckpoint() called in Main


beforeCheckpoint() called in GenericCache


CR: Checkpoint ...


Killed


>java -XX:CRaCRestoreFrom=/crac-files


SIMPLE EXAMPLE
>jcmd crac4.jar JDK.checkpoint


4240:


Command executed successfully


>
RESTORE
SHELL 1 SHELL 2
>java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar


Running on CRaC (PID 4240)


First run will take up to 30 seconds...


Register Resource: GenericCache


Register Resource: Main


1. Run: 27863 ms (63303 elements cached, 63.3%)


2. Run: 10611 ms (86580 elements cached, 86.6%)


3. Run: 3965 ms (95161 elements cached, 95.2%)


4. Run: 1428 ms (98245 elements cached, 98.2%)


5. Run: 533 ms (99340 elements cached, 99.3%)


6. Run: 207 ms (99750 elements cached, 99.8%)


7. Run: 87 ms (99898 elements cached, 99.9%)


8. Run: 42 ms (99960 elements cached, 100.0%)


9. Run: 25 ms (99986 elements cached, 100.0%)


10. Run: 20 ms (99992 elements cached, 100.0%)


11. Run: 17 ms (99999 elements cached, 100.0%)


beforeCheckpoint() called in Main


beforeCheckpoint() called in GenericCache


CR: Checkpoint ...


Killed


>java -XX:CRaCRestoreFrom=/crac-files


afterRestore() called in GenericCache


afterRestore() called in Main


12. Run: 23 ms (99999 elements cached, 100.0%)


SIMPLE EXAMPLE
>jcmd crac4.jar JDK.checkpoint


4240:


Command executed successfully


>
SHELL 1 SHELL 2
>java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar


Running on CRaC (PID 4240)


First run will take up to 30 seconds...


Register Resource: GenericCache


Register Resource: Main


1. Run: 27863 ms (63303 elements cached, 63.3%)


2. Run: 10611 ms (86580 elements cached, 86.6%)


3. Run: 3965 ms (95161 elements cached, 95.2%)


4. Run: 1428 ms (98245 elements cached, 98.2%)


5. Run: 533 ms (99340 elements cached, 99.3%)


6. Run: 207 ms (99750 elements cached, 99.8%)


7. Run: 87 ms (99898 elements cached, 99.9%)


8. Run: 42 ms (99960 elements cached, 100.0%)


9. Run: 25 ms (99986 elements cached, 100.0%)


10. Run: 20 ms (99992 elements cached, 100.0%)


11. Run: 17 ms (99999 elements cached, 100.0%)


beforeCheckpoint() called in Main


beforeCheckpoint() called in GenericCache


CR: Checkpoint ...


Killed


>java -XX:CRaCRestoreFrom=/crac-files


afterRestore() called in GenericCache


afterRestore() called in Main


12. Run: 23 ms (99999 elements cached, 100.0%)


SIMPLE EXAMPLE
>jcmd crac4.jar JDK.checkpoint


4240:


Command executed successfully


>
Continues to run...fast
SHELL 1 SHELL 2
>java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar


Running on CRaC (PID 4240)


First run will take up to 30 seconds...


Register Resource: GenericCache


Register Resource: Main


1. Run: 27863 ms (63303 elements cached, 63.3%)


2. Run: 10611 ms (86580 elements cached, 86.6%)


3. Run: 3965 ms (95161 elements cached, 95.2%)


4. Run: 1428 ms (98245 elements cached, 98.2%)


5. Run: 533 ms (99340 elements cached, 99.3%)


6. Run: 207 ms (99750 elements cached, 99.8%)


7. Run: 87 ms (99898 elements cached, 99.9%)


8. Run: 42 ms (99960 elements cached, 100.0%)


9. Run: 25 ms (99986 elements cached, 100.0%)


10. Run: 20 ms (99992 elements cached, 100.0%)


11. Run: 17 ms (99999 elements cached, 100.0%)


beforeCheckpoint() called in Main


beforeCheckpoint() called in GenericCache


CR: Checkpoint ...


Killed


>java -XX:CRaCRestoreFrom=/crac-files


afterRestore() called in GenericCache


afterRestore() called in Main


12. Run: 23 ms (99999 elements cached, 100.0%)


13. Run: 21 ms (99995 elements cached, 100.0%)
SIMPLE EXAMPLE
>jcmd crac4.jar JDK.checkpoint


4240:


Command executed successfully


>
SHELL 1 SHELL 2
>java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar


Running on CRaC (PID 4240)


First run will take up to 30 seconds...


Register Resource: GenericCache


Register Resource: Main


1. Run: 27863 ms (63303 elements cached, 63.3%)


2. Run: 10611 ms (86580 elements cached, 86.6%)


3. Run: 3965 ms (95161 elements cached, 95.2%)


4. Run: 1428 ms (98245 elements cached, 98.2%)


5. Run: 533 ms (99340 elements cached, 99.3%)


6. Run: 207 ms (99750 elements cached, 99.8%)


7. Run: 87 ms (99898 elements cached, 99.9%)


8. Run: 42 ms (99960 elements cached, 100.0%)


9. Run: 25 ms (99986 elements cached, 100.0%)


10. Run: 20 ms (99992 elements cached, 100.0%)


11. Run: 17 ms (99999 elements cached, 100.0%)


beforeCheckpoint() called in Main


beforeCheckpoint() called in GenericCache


CR: Checkpoint ...


Killed


>java -XX:CRaCRestoreFrom=/crac-files


afterRestore() called in GenericCache


afterRestore() called in Main


12. Run: 23 ms (99999 elements cached, 100.0%)


13. Run: 21 ms (99995 elements cached, 100.0%)


14. Run: 24 ms (99997 elements cached, 100.0%)
SIMPLE EXAMPLE
>jcmd crac4.jar JDK.checkpoint


4240:


Command executed successfully


>
SHELL 1 SHELL 2
FROM THE COMMAND LINE:
>jcmd YOUR_AWESOME.jar JDK.checkpoint
CREATING A CHECKPOINT
CREATING A CHECKPOINT
FROM CODE:
Core.checkpointRestore();
github.com/HanSolo/crac4
OK...BUT
HOW GOOD IS IT...?
Time to
fi
rst opera
ti
on
Spring-Boot
Micronaut
Quarkus
xml-transform
[ms]
0 1250 2500 3750 5000
4 352
980
1 001
3 898
OpenJDK
Time to
fi
rst opera
ti
on
Spring-Boot
Micronaut
Quarkus
xml-transform
[ms]
0 1250 2500 3750 5000
53
33
46
38
4 352
980
1 001
3 898
OpenJDK OpenJDK on CRaC
SUMMARY...
SUMMARY...
CRaC is a way to pause a JVM based application


And restore it at some point in the future


Bene
fi
t is potentially extremely fast time to full performance level


Eleminates the need for hotspot identi
fi
cation, method compiles,
recompiles and deoptimisations


Improved throughput from start


CRaC is an OpenJDK project


CRaC can save infrastructure cost
CPU
Utilization
0 %
25 %
50 %
75 %
100 %
Time
INFRASTRUCTURE COST
Checkpoint
JVM startup time
Interpretation +


Compilation Overhead
Start after restore
Eliminates startup time


Eliminates cpu overhead
github.com/CRaC
wiki.openjdk.org/display/crac
DEMO...
THANK
YOU

Weitere ähnliche Inhalte

Was ist angesagt?

The Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and ContainersThe Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and ContainersSATOSHI TAGOMORI
 
gRPC: The Story of Microservices at Square
gRPC: The Story of Microservices at SquaregRPC: The Story of Microservices at Square
gRPC: The Story of Microservices at SquareApigee | Google Cloud
 
Distributed load testing with k6
Distributed load testing with k6Distributed load testing with k6
Distributed load testing with k6Thijs Feryn
 
Cilium - Network security for microservices
Cilium - Network security for microservicesCilium - Network security for microservices
Cilium - Network security for microservicesThomas Graf
 
Cassandra serving netflix @ scale
Cassandra serving netflix @ scaleCassandra serving netflix @ scale
Cassandra serving netflix @ scaleVinay Kumar Chella
 
Rust system programming language
Rust system programming languageRust system programming language
Rust system programming languagerobin_sy
 
Linux Native, HTTP Aware Network Security
Linux Native, HTTP Aware Network SecurityLinux Native, HTTP Aware Network Security
Linux Native, HTTP Aware Network SecurityThomas Graf
 
ClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and howClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and howAltinity Ltd
 
ClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei MilovidovClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei MilovidovAltinity Ltd
 
Reactive programming with examples
Reactive programming with examplesReactive programming with examples
Reactive programming with examplesPeter Lawrey
 
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark Summit
 
Consistent hashing algorithmic tradeoffs
Consistent hashing  algorithmic tradeoffsConsistent hashing  algorithmic tradeoffs
Consistent hashing algorithmic tradeoffsEvan Lin
 
Lightweight Transactions in Scylla versus Apache Cassandra
Lightweight Transactions in Scylla versus Apache CassandraLightweight Transactions in Scylla versus Apache Cassandra
Lightweight Transactions in Scylla versus Apache CassandraScyllaDB
 
Introduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlIntroduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlJiangjie Qin
 
Using ClickHouse for Experimentation
Using ClickHouse for ExperimentationUsing ClickHouse for Experimentation
Using ClickHouse for ExperimentationGleb Kanterov
 
Load Balancing with Nginx
Load Balancing with NginxLoad Balancing with Nginx
Load Balancing with NginxMarian Marinov
 
TRex Realistic Traffic Generator - Stateless support
TRex  Realistic Traffic Generator  - Stateless support TRex  Realistic Traffic Generator  - Stateless support
TRex Realistic Traffic Generator - Stateless support Hanoch Haim
 
Introduction to Apache Calcite
Introduction to Apache CalciteIntroduction to Apache Calcite
Introduction to Apache CalciteJordan Halterman
 
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfDeep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfAltinity Ltd
 
Fast as C: How to Write Really Terrible Java
Fast as C: How to Write Really Terrible JavaFast as C: How to Write Really Terrible Java
Fast as C: How to Write Really Terrible JavaCharles Nutter
 

Was ist angesagt? (20)

The Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and ContainersThe Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and Containers
 
gRPC: The Story of Microservices at Square
gRPC: The Story of Microservices at SquaregRPC: The Story of Microservices at Square
gRPC: The Story of Microservices at Square
 
Distributed load testing with k6
Distributed load testing with k6Distributed load testing with k6
Distributed load testing with k6
 
Cilium - Network security for microservices
Cilium - Network security for microservicesCilium - Network security for microservices
Cilium - Network security for microservices
 
Cassandra serving netflix @ scale
Cassandra serving netflix @ scaleCassandra serving netflix @ scale
Cassandra serving netflix @ scale
 
Rust system programming language
Rust system programming languageRust system programming language
Rust system programming language
 
Linux Native, HTTP Aware Network Security
Linux Native, HTTP Aware Network SecurityLinux Native, HTTP Aware Network Security
Linux Native, HTTP Aware Network Security
 
ClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and howClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and how
 
ClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei MilovidovClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei Milovidov
 
Reactive programming with examples
Reactive programming with examplesReactive programming with examples
Reactive programming with examples
 
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
 
Consistent hashing algorithmic tradeoffs
Consistent hashing  algorithmic tradeoffsConsistent hashing  algorithmic tradeoffs
Consistent hashing algorithmic tradeoffs
 
Lightweight Transactions in Scylla versus Apache Cassandra
Lightweight Transactions in Scylla versus Apache CassandraLightweight Transactions in Scylla versus Apache Cassandra
Lightweight Transactions in Scylla versus Apache Cassandra
 
Introduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlIntroduction to Kafka Cruise Control
Introduction to Kafka Cruise Control
 
Using ClickHouse for Experimentation
Using ClickHouse for ExperimentationUsing ClickHouse for Experimentation
Using ClickHouse for Experimentation
 
Load Balancing with Nginx
Load Balancing with NginxLoad Balancing with Nginx
Load Balancing with Nginx
 
TRex Realistic Traffic Generator - Stateless support
TRex  Realistic Traffic Generator  - Stateless support TRex  Realistic Traffic Generator  - Stateless support
TRex Realistic Traffic Generator - Stateless support
 
Introduction to Apache Calcite
Introduction to Apache CalciteIntroduction to Apache Calcite
Introduction to Apache Calcite
 
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfDeep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
 
Fast as C: How to Write Really Terrible Java
Fast as C: How to Write Really Terrible JavaFast as C: How to Write Really Terrible Java
Fast as C: How to Write Really Terrible Java
 

Ähnlich wie What the CRaC - Superfast JVM startup

"Load Testing Distributed Systems with NBomber 4.0", Anton Moldovan
"Load Testing Distributed Systems with NBomber 4.0",  Anton Moldovan"Load Testing Distributed Systems with NBomber 4.0",  Anton Moldovan
"Load Testing Distributed Systems with NBomber 4.0", Anton MoldovanFwdays
 
Silicon Valley JUG: JVM Mechanics
Silicon Valley JUG: JVM MechanicsSilicon Valley JUG: JVM Mechanics
Silicon Valley JUG: JVM MechanicsAzul Systems, Inc.
 
JVM Mechanics: When Does the JVM JIT & Deoptimize?
JVM Mechanics: When Does the JVM JIT & Deoptimize?JVM Mechanics: When Does the JVM JIT & Deoptimize?
JVM Mechanics: When Does the JVM JIT & Deoptimize?Doug Hawkins
 
100 bugs in Open Source C/C++ projects
100 bugs in Open Source C/C++ projects 100 bugs in Open Source C/C++ projects
100 bugs in Open Source C/C++ projects Andrey Karpov
 
Cassandra 2.1 boot camp, Overview
Cassandra 2.1 boot camp, OverviewCassandra 2.1 boot camp, Overview
Cassandra 2.1 boot camp, OverviewJoshua McKenzie
 
Apache Con NA 2013 - Cassandra Internals
Apache Con NA 2013 - Cassandra InternalsApache Con NA 2013 - Cassandra Internals
Apache Con NA 2013 - Cassandra Internalsaaronmorton
 
Android Boot Time Optimization
Android Boot Time OptimizationAndroid Boot Time Optimization
Android Boot Time OptimizationKan-Ru Chen
 
Code Red Security
Code Red SecurityCode Red Security
Code Red SecurityAmr Ali
 
Native Java with GraalVM
Native Java with GraalVMNative Java with GraalVM
Native Java with GraalVMSylvain Wallez
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
Apache Cassandra in Bangalore - Cassandra Internals and Performance
Apache Cassandra in Bangalore - Cassandra Internals and PerformanceApache Cassandra in Bangalore - Cassandra Internals and Performance
Apache Cassandra in Bangalore - Cassandra Internals and Performanceaaronmorton
 
GroovyServ - Technical Part
GroovyServ - Technical PartGroovyServ - Technical Part
GroovyServ - Technical PartYasuharu Nakano
 
Learning spark ch10 - Spark Streaming
Learning spark ch10 - Spark StreamingLearning spark ch10 - Spark Streaming
Learning spark ch10 - Spark Streamingphanleson
 
Why async matters
Why async mattersWhy async matters
Why async matterstimbc
 
php & performance
 php & performance php & performance
php & performancesimon8410
 
002 hbase clientapi
002 hbase clientapi002 hbase clientapi
002 hbase clientapiScott Miao
 
Anton Moldovan "Load testing which you always wanted"
Anton Moldovan "Load testing which you always wanted"Anton Moldovan "Load testing which you always wanted"
Anton Moldovan "Load testing which you always wanted"Fwdays
 
Puppet Camp DC 2015: Distributed OpenSCAP Compliance Validation with MCollective
Puppet Camp DC 2015: Distributed OpenSCAP Compliance Validation with MCollectivePuppet Camp DC 2015: Distributed OpenSCAP Compliance Validation with MCollective
Puppet Camp DC 2015: Distributed OpenSCAP Compliance Validation with MCollectivePuppet
 

Ähnlich wie What the CRaC - Superfast JVM startup (20)

"Load Testing Distributed Systems with NBomber 4.0", Anton Moldovan
"Load Testing Distributed Systems with NBomber 4.0",  Anton Moldovan"Load Testing Distributed Systems with NBomber 4.0",  Anton Moldovan
"Load Testing Distributed Systems with NBomber 4.0", Anton Moldovan
 
Silicon Valley JUG: JVM Mechanics
Silicon Valley JUG: JVM MechanicsSilicon Valley JUG: JVM Mechanics
Silicon Valley JUG: JVM Mechanics
 
JVM Mechanics: When Does the JVM JIT & Deoptimize?
JVM Mechanics: When Does the JVM JIT & Deoptimize?JVM Mechanics: When Does the JVM JIT & Deoptimize?
JVM Mechanics: When Does the JVM JIT & Deoptimize?
 
100 bugs in Open Source C/C++ projects
100 bugs in Open Source C/C++ projects 100 bugs in Open Source C/C++ projects
100 bugs in Open Source C/C++ projects
 
Java On CRaC
Java On CRaCJava On CRaC
Java On CRaC
 
Cassandra 2.1 boot camp, Overview
Cassandra 2.1 boot camp, OverviewCassandra 2.1 boot camp, Overview
Cassandra 2.1 boot camp, Overview
 
Apache Con NA 2013 - Cassandra Internals
Apache Con NA 2013 - Cassandra InternalsApache Con NA 2013 - Cassandra Internals
Apache Con NA 2013 - Cassandra Internals
 
Android Boot Time Optimization
Android Boot Time OptimizationAndroid Boot Time Optimization
Android Boot Time Optimization
 
Code Red Security
Code Red SecurityCode Red Security
Code Red Security
 
Native Java with GraalVM
Native Java with GraalVMNative Java with GraalVM
Native Java with GraalVM
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Apache Cassandra in Bangalore - Cassandra Internals and Performance
Apache Cassandra in Bangalore - Cassandra Internals and PerformanceApache Cassandra in Bangalore - Cassandra Internals and Performance
Apache Cassandra in Bangalore - Cassandra Internals and Performance
 
GroovyServ - Technical Part
GroovyServ - Technical PartGroovyServ - Technical Part
GroovyServ - Technical Part
 
Learning spark ch10 - Spark Streaming
Learning spark ch10 - Spark StreamingLearning spark ch10 - Spark Streaming
Learning spark ch10 - Spark Streaming
 
Lambdas puzzler - Peter Lawrey
Lambdas puzzler - Peter LawreyLambdas puzzler - Peter Lawrey
Lambdas puzzler - Peter Lawrey
 
Why async matters
Why async mattersWhy async matters
Why async matters
 
php & performance
 php & performance php & performance
php & performance
 
002 hbase clientapi
002 hbase clientapi002 hbase clientapi
002 hbase clientapi
 
Anton Moldovan "Load testing which you always wanted"
Anton Moldovan "Load testing which you always wanted"Anton Moldovan "Load testing which you always wanted"
Anton Moldovan "Load testing which you always wanted"
 
Puppet Camp DC 2015: Distributed OpenSCAP Compliance Validation with MCollective
Puppet Camp DC 2015: Distributed OpenSCAP Compliance Validation with MCollectivePuppet Camp DC 2015: Distributed OpenSCAP Compliance Validation with MCollective
Puppet Camp DC 2015: Distributed OpenSCAP Compliance Validation with MCollective
 

Kürzlich hochgeladen

Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfIntroduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfFIDO Alliance
 
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdfThe Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdfFIDO Alliance
 
WebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceWebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceSamy Fodil
 
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...CzechDreamin
 
Designing for Hardware Accessibility at Comcast
Designing for Hardware Accessibility at ComcastDesigning for Hardware Accessibility at Comcast
Designing for Hardware Accessibility at ComcastUXDXConf
 
Intro in Product Management - Коротко про професію продакт менеджера
Intro in Product Management - Коротко про професію продакт менеджераIntro in Product Management - Коротко про професію продакт менеджера
Intro in Product Management - Коротко про професію продакт менеджераMark Opanasiuk
 
Optimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through ObservabilityOptimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through ObservabilityScyllaDB
 
Demystifying gRPC in .Net by John Staveley
Demystifying gRPC in .Net by John StaveleyDemystifying gRPC in .Net by John Staveley
Demystifying gRPC in .Net by John StaveleyJohn Staveley
 
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)Julian Hyde
 
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdfSimplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdfFIDO Alliance
 
A Business-Centric Approach to Design System Strategy
A Business-Centric Approach to Design System StrategyA Business-Centric Approach to Design System Strategy
A Business-Centric Approach to Design System StrategyUXDXConf
 
Powerful Start- the Key to Project Success, Barbara Laskowska
Powerful Start- the Key to Project Success, Barbara LaskowskaPowerful Start- the Key to Project Success, Barbara Laskowska
Powerful Start- the Key to Project Success, Barbara LaskowskaCzechDreamin
 
WSO2CONMay2024OpenSourceConferenceDebrief.pptx
WSO2CONMay2024OpenSourceConferenceDebrief.pptxWSO2CONMay2024OpenSourceConferenceDebrief.pptx
WSO2CONMay2024OpenSourceConferenceDebrief.pptxJennifer Lim
 
Connecting the Dots in Product Design at KAYAK
Connecting the Dots in Product Design at KAYAKConnecting the Dots in Product Design at KAYAK
Connecting the Dots in Product Design at KAYAKUXDXConf
 
Agentic RAG What it is its types applications and implementation.pdf
Agentic RAG What it is its types applications and implementation.pdfAgentic RAG What it is its types applications and implementation.pdf
Agentic RAG What it is its types applications and implementation.pdfChristopherTHyatt
 
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptxUnpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptxDavid Michel
 
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...CzechDreamin
 
Syngulon - Selection technology May 2024.pdf
Syngulon - Selection technology May 2024.pdfSyngulon - Selection technology May 2024.pdf
Syngulon - Selection technology May 2024.pdfSyngulon
 
What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024Stephanie Beckett
 
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya HalderCustom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya HalderCzechDreamin
 

Kürzlich hochgeladen (20)

Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfIntroduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
 
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdfThe Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
 
WebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceWebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM Performance
 
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
 
Designing for Hardware Accessibility at Comcast
Designing for Hardware Accessibility at ComcastDesigning for Hardware Accessibility at Comcast
Designing for Hardware Accessibility at Comcast
 
Intro in Product Management - Коротко про професію продакт менеджера
Intro in Product Management - Коротко про професію продакт менеджераIntro in Product Management - Коротко про професію продакт менеджера
Intro in Product Management - Коротко про професію продакт менеджера
 
Optimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through ObservabilityOptimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through Observability
 
Demystifying gRPC in .Net by John Staveley
Demystifying gRPC in .Net by John StaveleyDemystifying gRPC in .Net by John Staveley
Demystifying gRPC in .Net by John Staveley
 
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
 
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdfSimplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
 
A Business-Centric Approach to Design System Strategy
A Business-Centric Approach to Design System StrategyA Business-Centric Approach to Design System Strategy
A Business-Centric Approach to Design System Strategy
 
Powerful Start- the Key to Project Success, Barbara Laskowska
Powerful Start- the Key to Project Success, Barbara LaskowskaPowerful Start- the Key to Project Success, Barbara Laskowska
Powerful Start- the Key to Project Success, Barbara Laskowska
 
WSO2CONMay2024OpenSourceConferenceDebrief.pptx
WSO2CONMay2024OpenSourceConferenceDebrief.pptxWSO2CONMay2024OpenSourceConferenceDebrief.pptx
WSO2CONMay2024OpenSourceConferenceDebrief.pptx
 
Connecting the Dots in Product Design at KAYAK
Connecting the Dots in Product Design at KAYAKConnecting the Dots in Product Design at KAYAK
Connecting the Dots in Product Design at KAYAK
 
Agentic RAG What it is its types applications and implementation.pdf
Agentic RAG What it is its types applications and implementation.pdfAgentic RAG What it is its types applications and implementation.pdf
Agentic RAG What it is its types applications and implementation.pdf
 
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptxUnpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
 
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
 
Syngulon - Selection technology May 2024.pdf
Syngulon - Selection technology May 2024.pdfSyngulon - Selection technology May 2024.pdf
Syngulon - Selection technology May 2024.pdf
 
What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024
 
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya HalderCustom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
 

What the CRaC - Superfast JVM startup

  • 1. What the CRaC... Coordinated Restore at Checkpoint on the Java Virtual Machine
  • 2. Gerrit Grunwald | Developer Advocate | Azul ABOUTME.
  • 10. MyClass.class BYTE CODE CLASS LOADER JVM MEMORY
  • 12. EXECUTION ENGINE Interpreter C1 JIT Compiler (client) C2 JIT Compiler (server) 􀊫 Pro fi ler 􀈒 Garbage Collector
  • 13. EXECUTION ENGINE Interpreter C1 JIT Compiler (client) C2 JIT Compiler (server)
  • 14. instruction set of CPU Detects hot spots by counting method calls JVM THRESHOLD REACHED
  • 15. Pass the hot spot methods to C1 JIT Compiler JVM C1 JIT COMPILER Compiles code as quickly as possible with low optimisation
  • 16. C1 JIT COMPILER Compiles code as quickly as possible with low optimisation Pro fi les the running code (detecting hot code) JVM THRESHOLD REACHED
  • 17. Pass the "hot" code to C2 JIT Compiler JVM C2 JIT COMPILER Compiles code with best optimisation possible (slower)
  • 19. Level 0 - Interpreted code Level 1 - C1 compiled code (no pro fi ling) Level 2 - C1 compiled code (basic pro fi ling) Level 3 - C1 compiled code (full pro fi ling, ~35% overhead) Level 4 - C2 compiled code (uses pro fi le data from previous steps) LEVELS OF EXECUTION
  • 20. Level 0 - Interpreted code Level 1 - C1 compiled code (no pro fi ling) Level 2 - C1 compiled code (basic pro fi ling) Level 3 - C1 compiled code (full pro fi ling, ~35% overhead) Level 4 - C2 compiled code (uses pro fi le data from previous steps) LEVELS OF EXECUTION PREFERRED
  • 24. EXECUTION CYCLE Fast compile, low optimisation (Execution Level 3) Finding "hotspots" Slow (Execution Level 0)
  • 25. EXECUTION CYCLE Finding "hot code" Fast compile, low optimisation (Execution Level 3) Finding "hotspots" Slow (Execution Level 0)
  • 26. EXECUTION CYCLE Slower compile, high optimisation (Execution Level 4) Finding "hot code" Fast compile, low optimisation (Execution Level 3) Finding "hotspots" Slow (Execution Level 0)
  • 27. EXECUTION CYCLE Can happen (performance hit) Slower compile, high optimisation (Execution Level 4) Finding "hot code" Fast compile, low optimisation (Execution Level 3) Finding "hotspots" Slow (Execution Level 0)
  • 29. DEOPTIMISATION if (value > 0) result = 3; result = 0; print(result); TRUE FALSE CODE
  • 30. DEOPTIMISATION if (value > 0) result = 3; result = 0; print(result); TRUE FALSE CODE Hot Path (pro fi led)
  • 31. DEOPTIMISATION if (value > 0) result = 3; result = 0; print(result); TRUE FALSE CODE guard (value > 0) result = 3; print(result); DEOPTIMISE TRUE FALSE OPTIMIZED CODE
  • 32. DEOPTIMISATION if (value > 0) result = 3; result = 0; print(result); TRUE FALSE CODE guard (value > 0) result = 3; print(result); DEOPTIMISE TRUE FALSE OPTIMIZED CODE value = 3
  • 33. DEOPTIMISATION if (value > 0) result = 3; result = 0; print(result); TRUE FALSE CODE guard (value > 0) result = 3; print(result); DEOPTIMISE TRUE FALSE OPTIMIZED CODE value = 0
  • 37. JVM PERFORMANCE GRAPH JVM STARTUP GarbageCollector pauses Deoptimisations
  • 39. MICROSERVICE ENVIRONMENT FIRST RUN JVM STARTUP SECOND RUN JVM STARTUP THIRD RUN JVM STARTUP
  • 40.
  • 41. WOULDN'T IT BE GREAT...? FIRST RUN JVM STARTUP SECOND RUN NO STARTUP OVERHEAD THIRD RUN NO STARTUP OVERHEAD
  • 44. WHAT ABOUT CDS? Class Data Sharing (App/Dynamic) Dump internal class representation into fi le Shared on each JVM start (CDS) No optimization or hotspot detection Only reduces loading time of classes Start up could be up to 1-2 seconds faster
  • 45. MyClass.class BYTE CODE CLASS LOADER JVM MEMORY CDS
  • 47. WHY NOT USE AOT? No interpreting bytecodes No analysis of hotspots No runtime compilation of code Start at full speed, straight away GraalVM native image does that PROBLEM SOLVED...?
  • 48. NOT SO FAST... AOT is, by de fi nition, static Code is compiled before it is run Compiler has no knowledge of how the code will actually run Pro fi le Guided Optimisation (PGO) can partially help
  • 49. JVM PERFORMANCE GRAPH AOT Compiled Code AOT Compiled Code with Profile Guided Optimisation Needs to run once for pro fi ling
  • 50. AOT VS JIT Limited use of method inlining No runtime bytecode generation Re fl ection is possible but complicated Unable to use speculative optimisations Overall performance will typically be lower Deployed env != Development env. Full speed from the start No overhead to compile code at runtime Small memory footprint Can use aggressive method inlining at runtime Can use runtime bytecode generation Re fl ection is simple Can use speculative optimisations Overall performance will typically be higher Deployed env. == Development env. Requires more time to start up Overhead to compile code at runtime Larger memory footprint AOT JIT
  • 51. JIT DISADVANTAGES Requires more time to start up CPU overhead to compile code at runtime Larger memory footprint
  • 52. AOT JIT Reduced Max Latency Peak Throughput Small Packaging Startup Speed Low Memory Footprint (Chart by Thomas Wuerthinger)
  • 55. Linux project Part of kernel >= 3.11 (2013) Freeze a running container/application Checkpoint its state to disk Restore the container/application from the saved data. Used by/integrated in OpenVZ, LXC/LXD, Docker, Podman and others CHECKPOINT RESTORE IN UUSERSPACE
  • 56. Mainly implemented in userspace (not kernel) Main goal is migration of containers for microservices Heavily relies on /proc fi le system It can checkpoint: Processes and threads Application memory, memory mapped fi les and shared memory Open fi les, pipes and FIFOs Sockets Interprocess communication channels Timers and signals Can rebuild TCP connection from one side without need to exchange packets CHECKPOINT RESTORE IN UUSERSPACE
  • 57. Restart from saved state on another machine (open fi les, shared memory etc.) Start multiple instances of same state on same machine (PID will be restored which will lead to problems) A Java Virtual Machine would assume it was continuing its tasks (very dif fi cult to use effectively) CRIU CHALLENGES
  • 59. CRaC CRIU comes bundled with the JDK Heap is cleaned, compacted (JVM is in a safe state) Throws CheckpointException (in case of open fi les/sockets) Comes with a simple API Creates checkpoints using code or jcmd Uses JVM safepoint mechanism (to stop threads etc.)
  • 61. RUNNING APPLICATION Aware of checkpoint being created RUNNING APPLICATION Aware of restore happening CRaC has more restrictions (e.g. no open fi les, sockets etc.) CRaC
  • 62. CRaC START >java -XX:CRaCCheckpointTo=PATH -jar app.jar RESTORE >java -XX:CRaCRestoreFrom=PATH
  • 64. <<interface>> Resource beforeCheckpoint() afterRestore() CRaC uses Resources that can be noti fi ed about a Checkpoint and Restore Classes in application code implement the Resource interface The application receives callbacks during checkpointing and restoring Makes it possible to close/restore resources (e.g. open fi les, sockets) CRaC API
  • 65. Resource objects need to be registered with a Context so that they can receive noti fi cations There is a global Context accessible via the static getGlobalContext() method of the Core class CRaC API
  • 67. CRaC API The global Context maintains a list of Resource objects The beforeCheckpoint() methods are called in the reverse order the Resource objects were added The afterRestore() methods are called in the order the Resource objects were added Checkpoint and restore can be done programmatically by using Core.checkpointRestore() Checkpoint can also be created by jcmd
  • 69. SIMPLE EXAMPLE Loop 100000x Random number 1 - 100000 isPrime(number) Cache Scheduled Thread Executed every 5 seconds
  • 70. SIMPLE EXAMPLE Loop 100000x Random number 1 - 100000 isPrime(number) Cache Scheduled Thread Executed every 5 seconds
  • 71. SIMPLE EXAMPLE Loop 100000x Random number 1 - 100000 isPrime(number) Is result for number in cache? Cache Scheduled Thread Executed every 5 seconds
  • 72. SIMPLE EXAMPLE Loop 100000x Random number 1 - 100000 isPrime(number) Is result for number in cache? Cache Scheduled Thread Executed every 5 seconds
  • 73. SIMPLE EXAMPLE Loop 100000x Random number 1 - 100000 isPrime(number) Is result for number in cache? yes Cache FAST Scheduled Thread Executed every 5 seconds
  • 74. SIMPLE EXAMPLE Loop 100000x Random number 1 - 100000 isPrime(number) Is result for number in cache? Calculate if prime no Cache Scheduled Thread Executed every 5 seconds
  • 75. SIMPLE EXAMPLE Loop 100000x Random number 1 - 100000 isPrime(number) Is result for number in cache? Calculate if prime Result no Cache Scheduled Thread Executed every 5 seconds
  • 76. SIMPLE EXAMPLE Loop 100000x Random number 1 - 100000 isPrime(number) Is result for number in cache? Calculate if prime Result no Cache Scheduled Thread Executed every 5 seconds
  • 77. SIMPLE EXAMPLE Loop 100000x Random number 1 - 100000 isPrime(number) Is result for number in cache? Calculate if prime Result no Cache SLOW Scheduled Thread Executed every 5 seconds
  • 78. SIMPLE EXAMPLE Loop 100000x Random number 1 - 100000 isPrime(number) Is result for number in cache? Calculate if prime Result no Cache Scheduled Thread Executed every 5 seconds timeout 50 sec
  • 79. SIMPLE EXAMPLE First run is slow because the cache is empty Time to check 100 000 random numbers for prime should decrease with every run
  • 80. Application Performance 0 0.2 0.4 0.6 0.8 1 Iterations 1 2 3 4 5 6 7 8 9 10 11 12 SIMPLE EXAMPLE
  • 82. >java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar > SIMPLE EXAMPLE SHELL 1 SHELL 2
  • 83. >java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar > De fi nes where to store fi les for the checkpoint SIMPLE EXAMPLE SHELL 1 SHELL 2
  • 84. >java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar Running on CRaC (PID 4240) First run will take up to 30 seconds... Register Resource: GenericCache Register Resource: Main 1. Run: 27863 ms (63303 elements cached, 63.3%) > SIMPLE EXAMPLE SHELL 1 SHELL 2
  • 85. >java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar Running on CRaC (PID 4240) First run will take up to 30 seconds... Register Resource: GenericCache Register Resource: Main 1. Run: 27863 ms (63303 elements cached, 63.3%) > SIMPLE EXAMPLE SHELL 1 SHELL 2
  • 86. >java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar Running on CRaC (PID 4240) First run will take up to 30 seconds... Register Resource: GenericCache Register Resource: Main 1. Run: 27863 ms (63303 elements cached, 63.3%) > SIMPLE EXAMPLE SHELL 1 SHELL 2
  • 87. >java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar Running on CRaC (PID 4240) First run will take up to 30 seconds... Register Resource: GenericCache Register Resource: Main 1. Run: 27863 ms (63303 elements cached, 63.3%) > SIMPLE EXAMPLE Slow because cache is not fi lled yet SHELL 1 SHELL 2
  • 88. >java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar Running on CRaC (PID 4240) First run will take up to 30 seconds... Register Resource: GenericCache Register Resource: Main 1. Run: 27863 ms (63303 elements cached, 63.3%) 2. Run: 10611 ms (86580 elements cached, 86.6%) > SIMPLE EXAMPLE SHELL 1 SHELL 2
  • 89. >java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar Running on CRaC (PID 4240) First run will take up to 30 seconds... Register Resource: GenericCache Register Resource: Main 1. Run: 27863 ms (63303 elements cached, 63.3%) 2. Run: 10611 ms (86580 elements cached, 86.6%) 3. Run: 3965 ms (95161 elements cached, 95.2%) > SIMPLE EXAMPLE SHELL 1 SHELL 2
  • 90. >java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar Running on CRaC (PID 4240) First run will take up to 30 seconds... Register Resource: GenericCache Register Resource: Main 1. Run: 27863 ms (63303 elements cached, 63.3%) 2. Run: 10611 ms (86580 elements cached, 86.6%) 3. Run: 3965 ms (95161 elements cached, 95.2%) 4. Run: 1428 ms (98245 elements cached, 98.2%) > SIMPLE EXAMPLE SHELL 1 SHELL 2
  • 91. >java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar Running on CRaC (PID 4240) First run will take up to 30 seconds... Register Resource: GenericCache Register Resource: Main 1. Run: 27863 ms (63303 elements cached, 63.3%) 2. Run: 10611 ms (86580 elements cached, 86.6%) 3. Run: 3965 ms (95161 elements cached, 95.2%) 4. Run: 1428 ms (98245 elements cached, 98.2%) 5. Run: 533 ms (99340 elements cached, 99.3%) > SIMPLE EXAMPLE SHELL 1 SHELL 2
  • 92. >java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar Running on CRaC (PID 4240) First run will take up to 30 seconds... Register Resource: GenericCache Register Resource: Main 1. Run: 27863 ms (63303 elements cached, 63.3%) 2. Run: 10611 ms (86580 elements cached, 86.6%) 3. Run: 3965 ms (95161 elements cached, 95.2%) 4. Run: 1428 ms (98245 elements cached, 98.2%) 5. Run: 533 ms (99340 elements cached, 99.3%) 6. Run: 207 ms (99750 elements cached, 99.8%) > SIMPLE EXAMPLE SHELL 1 SHELL 2
  • 93. >java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar Running on CRaC (PID 4240) First run will take up to 30 seconds... Register Resource: GenericCache Register Resource: Main 1. Run: 27863 ms (63303 elements cached, 63.3%) 2. Run: 10611 ms (86580 elements cached, 86.6%) 3. Run: 3965 ms (95161 elements cached, 95.2%) 4. Run: 1428 ms (98245 elements cached, 98.2%) 5. Run: 533 ms (99340 elements cached, 99.3%) 6. Run: 207 ms (99750 elements cached, 99.8%) 7. Run: 87 ms (99898 elements cached, 99.9%) > SIMPLE EXAMPLE SHELL 1 SHELL 2
  • 94. >java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar Running on CRaC (PID 4240) First run will take up to 30 seconds... Register Resource: GenericCache Register Resource: Main 1. Run: 27863 ms (63303 elements cached, 63.3%) 2. Run: 10611 ms (86580 elements cached, 86.6%) 3. Run: 3965 ms (95161 elements cached, 95.2%) 4. Run: 1428 ms (98245 elements cached, 98.2%) 5. Run: 533 ms (99340 elements cached, 99.3%) 6. Run: 207 ms (99750 elements cached, 99.8%) 7. Run: 87 ms (99898 elements cached, 99.9%) 8. Run: 42 ms (99960 elements cached, 100.0%) > SIMPLE EXAMPLE SHELL 1 SHELL 2
  • 95. >java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar Running on CRaC (PID 4240) First run will take up to 30 seconds... Register Resource: GenericCache Register Resource: Main 1. Run: 27863 ms (63303 elements cached, 63.3%) 2. Run: 10611 ms (86580 elements cached, 86.6%) 3. Run: 3965 ms (95161 elements cached, 95.2%) 4. Run: 1428 ms (98245 elements cached, 98.2%) 5. Run: 533 ms (99340 elements cached, 99.3%) 6. Run: 207 ms (99750 elements cached, 99.8%) 7. Run: 87 ms (99898 elements cached, 99.9%) 8. Run: 42 ms (99960 elements cached, 100.0%) 9. Run: 25 ms (99986 elements cached, 100.0%) > SIMPLE EXAMPLE SHELL 1 SHELL 2
  • 96. >java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar Running on CRaC (PID 4240) First run will take up to 30 seconds... Register Resource: GenericCache Register Resource: Main 1. Run: 27863 ms (63303 elements cached, 63.3%) 2. Run: 10611 ms (86580 elements cached, 86.6%) 3. Run: 3965 ms (95161 elements cached, 95.2%) 4. Run: 1428 ms (98245 elements cached, 98.2%) 5. Run: 533 ms (99340 elements cached, 99.3%) 6. Run: 207 ms (99750 elements cached, 99.8%) 7. Run: 87 ms (99898 elements cached, 99.9%) 8. Run: 42 ms (99960 elements cached, 100.0%) 9. Run: 25 ms (99986 elements cached, 100.0%) 10. Run: 20 ms (99992 elements cached, 100.0%) > SIMPLE EXAMPLE SHELL 1 SHELL 2
  • 97. >java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar Running on CRaC (PID 4240) First run will take up to 30 seconds... Register Resource: GenericCache Register Resource: Main 1. Run: 27863 ms (63303 elements cached, 63.3%) 2. Run: 10611 ms (86580 elements cached, 86.6%) 3. Run: 3965 ms (95161 elements cached, 95.2%) 4. Run: 1428 ms (98245 elements cached, 98.2%) 5. Run: 533 ms (99340 elements cached, 99.3%) 6. Run: 207 ms (99750 elements cached, 99.8%) 7. Run: 87 ms (99898 elements cached, 99.9%) 8. Run: 42 ms (99960 elements cached, 100.0%) 9. Run: 25 ms (99986 elements cached, 100.0%) 10. Run: 20 ms (99992 elements cached, 100.0%) 11. Run: 17 ms (99999 elements cached, 100.0%) > SIMPLE EXAMPLE SHELL 1 SHELL 2
  • 98. >java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar Running on CRaC (PID 4240) First run will take up to 30 seconds... Register Resource: GenericCache Register Resource: Main 1. Run: 27863 ms (63303 elements cached, 63.3%) 2. Run: 10611 ms (86580 elements cached, 86.6%) 3. Run: 3965 ms (95161 elements cached, 95.2%) 4. Run: 1428 ms (98245 elements cached, 98.2%) 5. Run: 533 ms (99340 elements cached, 99.3%) 6. Run: 207 ms (99750 elements cached, 99.8%) 7. Run: 87 ms (99898 elements cached, 99.9%) 8. Run: 42 ms (99960 elements cached, 100.0%) 9. Run: 25 ms (99986 elements cached, 100.0%) 10. Run: 20 ms (99992 elements cached, 100.0%) 11. Run: 17 ms (99999 elements cached, 100.0%) > SIMPLE EXAMPLE Fast because cache is fi lled SHELL 1 SHELL 2
  • 99. >java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar Running on CRaC (PID 4240) First run will take up to 30 seconds... Register Resource: GenericCache Register Resource: Main 1. Run: 27863 ms (63303 elements cached, 63.3%) 2. Run: 10611 ms (86580 elements cached, 86.6%) 3. Run: 3965 ms (95161 elements cached, 95.2%) 4. Run: 1428 ms (98245 elements cached, 98.2%) 5. Run: 533 ms (99340 elements cached, 99.3%) 6. Run: 207 ms (99750 elements cached, 99.8%) 7. Run: 87 ms (99898 elements cached, 99.9%) 8. Run: 42 ms (99960 elements cached, 100.0%) 9. Run: 25 ms (99986 elements cached, 100.0%) 10. Run: 20 ms (99992 elements cached, 100.0%) 11. Run: 17 ms (99999 elements cached, 100.0%) SIMPLE EXAMPLE >jcmd crac4.jar JDK.checkpoint SHELL 1 SHELL 2
  • 100. >java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar Running on CRaC (PID 4240) First run will take up to 30 seconds... Register Resource: GenericCache Register Resource: Main 1. Run: 27863 ms (63303 elements cached, 63.3%) 2. Run: 10611 ms (86580 elements cached, 86.6%) 3. Run: 3965 ms (95161 elements cached, 95.2%) 4. Run: 1428 ms (98245 elements cached, 98.2%) 5. Run: 533 ms (99340 elements cached, 99.3%) 6. Run: 207 ms (99750 elements cached, 99.8%) 7. Run: 87 ms (99898 elements cached, 99.9%) 8. Run: 42 ms (99960 elements cached, 100.0%) 9. Run: 25 ms (99986 elements cached, 100.0%) 10. Run: 20 ms (99992 elements cached, 100.0%) 11. Run: 17 ms (99999 elements cached, 100.0%) SIMPLE EXAMPLE >jcmd crac4.jar JDK.checkpoint CREATE CHECKPOINT SHELL 1 SHELL 2
  • 101. >java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar Running on CRaC (PID 4240) First run will take up to 30 seconds... Register Resource: GenericCache Register Resource: Main 1. Run: 27863 ms (63303 elements cached, 63.3%) 2. Run: 10611 ms (86580 elements cached, 86.6%) 3. Run: 3965 ms (95161 elements cached, 95.2%) 4. Run: 1428 ms (98245 elements cached, 98.2%) 5. Run: 533 ms (99340 elements cached, 99.3%) 6. Run: 207 ms (99750 elements cached, 99.8%) 7. Run: 87 ms (99898 elements cached, 99.9%) 8. Run: 42 ms (99960 elements cached, 100.0%) 9. Run: 25 ms (99986 elements cached, 100.0%) 10. Run: 20 ms (99992 elements cached, 100.0%) 11. Run: 17 ms (99999 elements cached, 100.0%) beforeCheckpoint() called in Main beforeCheckpoint() called in GenericCache CR: Checkpoint ... Killed > SIMPLE EXAMPLE >jcmd crac4.jar JDK.checkpoint 4240: Command executed successfully > SHELL 1 SHELL 2
  • 102. >java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar Running on CRaC (PID 4240) First run will take up to 30 seconds... Register Resource: GenericCache Register Resource: Main 1. Run: 27863 ms (63303 elements cached, 63.3%) 2. Run: 10611 ms (86580 elements cached, 86.6%) 3. Run: 3965 ms (95161 elements cached, 95.2%) 4. Run: 1428 ms (98245 elements cached, 98.2%) 5. Run: 533 ms (99340 elements cached, 99.3%) 6. Run: 207 ms (99750 elements cached, 99.8%) 7. Run: 87 ms (99898 elements cached, 99.9%) 8. Run: 42 ms (99960 elements cached, 100.0%) 9. Run: 25 ms (99986 elements cached, 100.0%) 10. Run: 20 ms (99992 elements cached, 100.0%) 11. Run: 17 ms (99999 elements cached, 100.0%) beforeCheckpoint() called in Main beforeCheckpoint() called in GenericCache CR: Checkpoint ... Killed > SIMPLE EXAMPLE >jcmd crac4.jar JDK.checkpoint 4240: Command executed successfully > SHELL 1 SHELL 2
  • 103. >java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar Running on CRaC (PID 4240) First run will take up to 30 seconds... Register Resource: GenericCache Register Resource: Main 1. Run: 27863 ms (63303 elements cached, 63.3%) 2. Run: 10611 ms (86580 elements cached, 86.6%) 3. Run: 3965 ms (95161 elements cached, 95.2%) 4. Run: 1428 ms (98245 elements cached, 98.2%) 5. Run: 533 ms (99340 elements cached, 99.3%) 6. Run: 207 ms (99750 elements cached, 99.8%) 7. Run: 87 ms (99898 elements cached, 99.9%) 8. Run: 42 ms (99960 elements cached, 100.0%) 9. Run: 25 ms (99986 elements cached, 100.0%) 10. Run: 20 ms (99992 elements cached, 100.0%) 11. Run: 17 ms (99999 elements cached, 100.0%) beforeCheckpoint() called in Main beforeCheckpoint() called in GenericCache CR: Checkpoint ... Killed >java -XX:CRaCRestoreFrom=/crac-files SIMPLE EXAMPLE >jcmd crac4.jar JDK.checkpoint 4240: Command executed successfully > SHELL 1 SHELL 2
  • 104. >java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar Running on CRaC (PID 4240) First run will take up to 30 seconds... Register Resource: GenericCache Register Resource: Main 1. Run: 27863 ms (63303 elements cached, 63.3%) 2. Run: 10611 ms (86580 elements cached, 86.6%) 3. Run: 3965 ms (95161 elements cached, 95.2%) 4. Run: 1428 ms (98245 elements cached, 98.2%) 5. Run: 533 ms (99340 elements cached, 99.3%) 6. Run: 207 ms (99750 elements cached, 99.8%) 7. Run: 87 ms (99898 elements cached, 99.9%) 8. Run: 42 ms (99960 elements cached, 100.0%) 9. Run: 25 ms (99986 elements cached, 100.0%) 10. Run: 20 ms (99992 elements cached, 100.0%) 11. Run: 17 ms (99999 elements cached, 100.0%) beforeCheckpoint() called in Main beforeCheckpoint() called in GenericCache CR: Checkpoint ... Killed >java -XX:CRaCRestoreFrom=/crac-files SIMPLE EXAMPLE >jcmd crac4.jar JDK.checkpoint 4240: Command executed successfully > RESTORE SHELL 1 SHELL 2
  • 105. >java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar Running on CRaC (PID 4240) First run will take up to 30 seconds... Register Resource: GenericCache Register Resource: Main 1. Run: 27863 ms (63303 elements cached, 63.3%) 2. Run: 10611 ms (86580 elements cached, 86.6%) 3. Run: 3965 ms (95161 elements cached, 95.2%) 4. Run: 1428 ms (98245 elements cached, 98.2%) 5. Run: 533 ms (99340 elements cached, 99.3%) 6. Run: 207 ms (99750 elements cached, 99.8%) 7. Run: 87 ms (99898 elements cached, 99.9%) 8. Run: 42 ms (99960 elements cached, 100.0%) 9. Run: 25 ms (99986 elements cached, 100.0%) 10. Run: 20 ms (99992 elements cached, 100.0%) 11. Run: 17 ms (99999 elements cached, 100.0%) beforeCheckpoint() called in Main beforeCheckpoint() called in GenericCache CR: Checkpoint ... Killed >java -XX:CRaCRestoreFrom=/crac-files afterRestore() called in GenericCache afterRestore() called in Main 12. Run: 23 ms (99999 elements cached, 100.0%) SIMPLE EXAMPLE >jcmd crac4.jar JDK.checkpoint 4240: Command executed successfully > SHELL 1 SHELL 2
  • 106. >java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar Running on CRaC (PID 4240) First run will take up to 30 seconds... Register Resource: GenericCache Register Resource: Main 1. Run: 27863 ms (63303 elements cached, 63.3%) 2. Run: 10611 ms (86580 elements cached, 86.6%) 3. Run: 3965 ms (95161 elements cached, 95.2%) 4. Run: 1428 ms (98245 elements cached, 98.2%) 5. Run: 533 ms (99340 elements cached, 99.3%) 6. Run: 207 ms (99750 elements cached, 99.8%) 7. Run: 87 ms (99898 elements cached, 99.9%) 8. Run: 42 ms (99960 elements cached, 100.0%) 9. Run: 25 ms (99986 elements cached, 100.0%) 10. Run: 20 ms (99992 elements cached, 100.0%) 11. Run: 17 ms (99999 elements cached, 100.0%) beforeCheckpoint() called in Main beforeCheckpoint() called in GenericCache CR: Checkpoint ... Killed >java -XX:CRaCRestoreFrom=/crac-files afterRestore() called in GenericCache afterRestore() called in Main 12. Run: 23 ms (99999 elements cached, 100.0%) SIMPLE EXAMPLE >jcmd crac4.jar JDK.checkpoint 4240: Command executed successfully > Continues to run...fast SHELL 1 SHELL 2
  • 107. >java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar Running on CRaC (PID 4240) First run will take up to 30 seconds... Register Resource: GenericCache Register Resource: Main 1. Run: 27863 ms (63303 elements cached, 63.3%) 2. Run: 10611 ms (86580 elements cached, 86.6%) 3. Run: 3965 ms (95161 elements cached, 95.2%) 4. Run: 1428 ms (98245 elements cached, 98.2%) 5. Run: 533 ms (99340 elements cached, 99.3%) 6. Run: 207 ms (99750 elements cached, 99.8%) 7. Run: 87 ms (99898 elements cached, 99.9%) 8. Run: 42 ms (99960 elements cached, 100.0%) 9. Run: 25 ms (99986 elements cached, 100.0%) 10. Run: 20 ms (99992 elements cached, 100.0%) 11. Run: 17 ms (99999 elements cached, 100.0%) beforeCheckpoint() called in Main beforeCheckpoint() called in GenericCache CR: Checkpoint ... Killed >java -XX:CRaCRestoreFrom=/crac-files afterRestore() called in GenericCache afterRestore() called in Main 12. Run: 23 ms (99999 elements cached, 100.0%) 13. Run: 21 ms (99995 elements cached, 100.0%) SIMPLE EXAMPLE >jcmd crac4.jar JDK.checkpoint 4240: Command executed successfully > SHELL 1 SHELL 2
  • 108. >java -XX:CRaCCheckpointTo=/crac-files -jar crac4.jar Running on CRaC (PID 4240) First run will take up to 30 seconds... Register Resource: GenericCache Register Resource: Main 1. Run: 27863 ms (63303 elements cached, 63.3%) 2. Run: 10611 ms (86580 elements cached, 86.6%) 3. Run: 3965 ms (95161 elements cached, 95.2%) 4. Run: 1428 ms (98245 elements cached, 98.2%) 5. Run: 533 ms (99340 elements cached, 99.3%) 6. Run: 207 ms (99750 elements cached, 99.8%) 7. Run: 87 ms (99898 elements cached, 99.9%) 8. Run: 42 ms (99960 elements cached, 100.0%) 9. Run: 25 ms (99986 elements cached, 100.0%) 10. Run: 20 ms (99992 elements cached, 100.0%) 11. Run: 17 ms (99999 elements cached, 100.0%) beforeCheckpoint() called in Main beforeCheckpoint() called in GenericCache CR: Checkpoint ... Killed >java -XX:CRaCRestoreFrom=/crac-files afterRestore() called in GenericCache afterRestore() called in Main 12. Run: 23 ms (99999 elements cached, 100.0%) 13. Run: 21 ms (99995 elements cached, 100.0%) 14. Run: 24 ms (99997 elements cached, 100.0%) SIMPLE EXAMPLE >jcmd crac4.jar JDK.checkpoint 4240: Command executed successfully > SHELL 1 SHELL 2
  • 109. FROM THE COMMAND LINE: >jcmd YOUR_AWESOME.jar JDK.checkpoint CREATING A CHECKPOINT
  • 110. CREATING A CHECKPOINT FROM CODE: Core.checkpointRestore();
  • 113. Time to fi rst opera ti on Spring-Boot Micronaut Quarkus xml-transform [ms] 0 1250 2500 3750 5000 4 352 980 1 001 3 898 OpenJDK
  • 114. Time to fi rst opera ti on Spring-Boot Micronaut Quarkus xml-transform [ms] 0 1250 2500 3750 5000 53 33 46 38 4 352 980 1 001 3 898 OpenJDK OpenJDK on CRaC
  • 116. SUMMARY... CRaC is a way to pause a JVM based application And restore it at some point in the future Bene fi t is potentially extremely fast time to full performance level Eleminates the need for hotspot identi fi cation, method compiles, recompiles and deoptimisations Improved throughput from start CRaC is an OpenJDK project CRaC can save infrastructure cost
  • 117. CPU Utilization 0 % 25 % 50 % 75 % 100 % Time INFRASTRUCTURE COST Checkpoint JVM startup time Interpretation + Compilation Overhead Start after restore Eliminates startup time Eliminates cpu overhead