For Java developers, the Just-In-Time (JIT) compiler is key to improved performance. However, in a container world, the performance gains are often negated due to CPU and memory consumption constraints. To help solve this issue, the Eclipse OpenJ9 JVM provides JITServer technology, which separates the JIT compiler from the application.
JITServer allows the user to employ much smaller containers, enabling a higher density of applications and resulting in cost savings for end-users and/or cloud providers. Because the CPU and memory surges due to JIT compilation are eliminated, the user has a much easier task of provisioning resources for their application. Additional advantages include faster ramp-up time, better control over resources devoted to compilation, increased reliability (JIT compiler bugs no longer crash the application) and amortization of compilation costs across many application instances.
We will dig into JITServer technology, showing the challenges of implementation, detailing its strengths and weaknesses and illustrating its performance characteristics. For the cloud audience we will show how it can be deployed in containers, demonstrate its advantages compared to a traditional JIT compilation technique and offer practical recommendations about when to use this technology.
6. Agenda
PROBLEM: Java on Cloud - a bad fit for microservices
REASON: JVM and JIT compiler - the good and the bad
JIT-as-a-Service
7. Agenda
PROBLEM: Java on Cloud - a bad fit for microservices
REASON: JVM and JIT compiler - the good and the bad
SOLUTION: JIT-as-a-Service to the rescue
9. Legacy Java Apps
• Java monolith on dedicated server
• Plenty of CPU power and memory
• Never went down
• 6 month upgrade/refresh schedule
10. Moving to the Cloud
- Running in containers
- Managed by Cloud Provider (and K8s)
- Auto-scaling to meet demand
Cloud native App talking to Microservices
11. Main Motivators
-Flexible & scalable
-Easier to roll-out new releases more frequently
-Take advantage of latest-greatest Cloud technologies
-Less infrastructure to maintain and manage
-Saving money
14. Java Virtual Machine (JVM)
The Good
• Device independent - write once, run anywhere
• > 25 years of improvements
• JIT produces optimized machine code through use of Profilers
• Efficient garbage collection
• Longer it runs, the better it runs (JVM collects more profile data, JIT compiles more methods)
15. Java Virtual Machine (JVM)
The Bad
• Initial execution run is "interpreted", which is relatively slow
• "Hot Spot" methods compiled by JIT can create CPU and memory spikes
• CPU spikes cause lower QoS
• Memory spikes cause OOM issues, including crashes
• Slow start-up time
• Slow ramp-up time
16. Java Virtual Machine (JVM)
[Figure: Daytrader7 CPU consumption. CPU utilization (%) vs. time (sec), 0-90 s. Annotation: CPU spikes caused by JIT compilation]
[Figure: Daytrader7 memory footprint. Resident set size (KB) vs. time (sec), 0-90 s. Annotation: footprint spikes caused by JIT compilation]
18. Container Size
Main issues:
• Need to over-provision to avoid OOM
• Very hard to do - JVMs have a non-deterministic behavior
[Figure: Daytrader7 memory footprint. Resident set size (KB) vs. time (sec), 0-90 s. Annotation: footprint spikes caused by JIT compilation]
22. JIT-as-a-Service
Decouple the JIT compiler from the JVM and let it run as an independent process
Offload JIT compilation to remote process
[Diagram: two JVMs, each with a local JIT, connected to two Remote JIT processes, with the Kubernetes Control Plane]
Treat JIT compilation as a cloud service
• Auto-managed by orchestrator
• A mono-to-micro solution
• Local JIT still available
23. Eclipse OpenJ9 JITServer
• JITServer feature is available in the Eclipse OpenJ9 JVM
• "Semeru Cloud Compiler" when used with Semeru Runtimes
• OpenJ9 combines with OpenJDK to form a full JDK
Link to GitHub repo: https://github.com/eclipse-openj9/openj9
24. Overview of Eclipse OpenJ9
Designed from the start to span all the operating systems needed by IBM products
This JVM can go from small to large
Can handle constrained environments or memory rich ones
Renowned for its small footprint, fast start-up and ramp-up time
Is used by the largest enterprises on the planet
26. IBM Semeru Runtimes
"The part of Java that's really in the clouds"
IBM-built OpenJDK runtimes powered by the Eclipse OpenJ9 JVM
No cost, stable, secure, high performance, cloud optimized, multi-platform, ready for development and production use
Open Edition
• Open source license (GPLv2+CE)
• Available for Java 8, 11, 17, 18 (soon 19)
Certified Edition
• IBM license
• Java SE TCK certified.
• Available for Java 11, 17
27. JITServer advantages for JVM Clients
Provisioning
Easier to size; only consider the needs of the application
Performance
Improved ramp-up time due to JITServer supplying extra CPU power when the JVM needs it the most.
Reduced CPU consumption with JITServer AOT cache
Cost
Reduced memory consumption means increased application density and reduced operational cost.
Efficient auto-scaling - only pay for what you need/use.
Resiliency
If the JITServer crashes, the JVM can continue to run and compile with its local JIT
32. JITServer value in Kubernetes
• https://blog.openj9.org/2021/10/20/save-money-with-jitserver-on-the-cloud-an-aws-experiment/
• Experimental test bed
  • ROSA (Red Hat OpenShift Service on AWS)
  • Demonstrate that JITServer is not tied to IBM HW or SW
  • OCP cluster: 3 master nodes, 2 infra nodes, 3 worker nodes
  • Worker nodes have 8 vCPUs and 16 GB RAM (only ~12.3 GB available)
• Four different applications
  • AcmeAir Microservices
  • AcmeAir Monolithic
  • Petclinic (Spring Boot framework)
  • Quarkus
• Low amount of load to simulate conditions seen in practice
• OpenShift Scheduler to manage pod and node deployments/placement
33. JITServer improves container density and cost
[Diagram: placement of containers on worker nodes. Each box is labeled with an application code from the legend and its memory limit in MB]
Default config (3 worker nodes): Total=8250 MB, Total=8550 MB, Total=8600 MB
JITServer config (2 worker nodes): Total=9250 MB, Total=9850 MB
6.3 GB less
Legend:
AM: AcmeAir monolithic
A: Auth service
B: Booking service
C: Customer service
D: Database (mongo/postgres)
F: Flight service
J: JITServer
M: Main service
P: Petclinic
Q: Quarkus
35. Conclusions from high density experiments
• JITServer can improve container density and reduce operational costs of Java applications running in the cloud by 20-30%
• Steady-state throughput is the same despite using fewer nodes
36. Horizontal Pod Autoscaling in Kubernetes
• Better autoscaling behavior with JITServer due to faster ramp-up
• Less risk of tricking the HPA with transient JIT compilation overhead
Setup:
Single node MicroK8s cluster (16 vCPUs, 16 GB RAM)
JVMs limited to 1 CPU, 500MB
JITServer limited to 8 CPUs and has AOT cache enabled
Load applied with JMeter, 100 threads, 10 ms think-time, 60s ramp-up time
Autoscaler: scales up when average CPU utilization exceeds 0.5P. Up to 15 AcmeAir instances
[Figure: AcmeAir throughput when using Kubernetes autoscaling. Throughput (pages/sec) vs. time (sec), 0-480 s. Series: Baseline, JITServer+AOTcache]
38. JITServer usage basics
• One JDK, three different personas
  • Normal JVM: $JAVA_HOME/bin/java MyApp
  • JITServer: $JAVA_HOME/bin/jitserver
  • Client JVM: $JAVA_HOME/bin/java -XX:+UseJITServer MyApp
• Optional further configuration through JVM command line options
  • At the server:
    -XX:JITServerPort=… default: 38400
  • At the client:
    -XX:JITServerAddress=… default: "localhost"
    -XX:JITServerPort=… default: 38400
• Full list of options: https://www.eclipse.org/openj9/docs/jitserver/
• Note: Java version and OpenJ9 release at client and server must match
39. JITServer usage in Kubernetes
• Typically we create/configure
  • JITServer deployment
  • JITServer service (clients interact with service)
• Use
  • Yaml files
  • Helm charts: repo https://raw.githubusercontent.com/eclipse/openj9-utils/master/helm-chart/
  • Certified Operators
• Tutorial: https://developer.ibm.com/tutorials/using-openj9-jitserver-in-kubernetes/
40. JITServer encryption/authentication through TLS
• Needs additional JVM options
  • Server: -XX:JITServerSSLKey=key.pem -XX:JITServerSSLCert=cert.pem
  • Client: -XX:JITServerSSLRootCerts=cert.pem
• Certificates and keys can be provided using Kubernetes TLS Secrets
  • Create TLS secret:
    kubectl create secret tls my-tls-secret --key <private-key-filename> --cert <certificate-filename>
  • Use a volume to map "pem" files
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
  - name: my-container-name
    image: my-image
    volumeMounts:
    - name: secret-volume
      mountPath: /etc/secret-volume
  volumes:
  - name: secret-volume
    secret:
      secretName: my-tls-secret
41. Monitoring
• Support for custom metrics for Prometheus
  • Metrics scraping: GET request to http://<jitserveraddress>:<port>/metrics
  • Command line options:
    -XX:+JITServerMetrics -XX:JITServerMetricsPort=<port>
  • Metrics available
    • jitserver_cpu_utilization
    • jitserver_available_memory
    • jitserver_connected_clients
    • jitserver_active_threads
• Verbose logging
  • Print client/server connections
    -XX:+JITServerLogConnections
  • Heart-beat: periodically print to verbose log some JITServer stats
    -Xjit:statisticsFrequency=<period-in-ms>
  • Print detailed information about client/server behavior
    -Xjit:verbose={JITServer},verbose={compilePerformance},vlog=…
42. JITServer usage recommendations
When to use it:
• JVM needs to compile many methods in a relatively short time
• JVM is running in a CPU/memory constrained environment, which can worsen interference from the JIT compiler
• The network latency between JITServer and client VM is relatively low (<1ms)
  • To keep network latency low, use the "latency-performance" profile for tuned and configure your VM with SR-IOV
43. JITServer usage recommendations
Recommendations:
• 10-20 client JVMs connected to a single JITServer instance
• JITServer needs 1-2 GB of RAM
• Better performance if the compilation phases from different JVM clients do not overlap (stagger)
• Encryption adds to the communication overhead; avoid if possible
• In K8s use "sessionAffinity" to ensure a client always connects to the same server
• Enable JITServer AOT cache: -XX:+JITServerUseAOTCache (client needs to have shared class cache enabled)
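The sizing guidance on this slide (10-20 client JVMs per JITServer, 1-2 GB of RAM per JITServer) can be turned into a rough planning estimate. The following is only a sketch under stated assumptions: the client count of 45 is hypothetical, and the slide's ranges are treated as simple lower and upper planning bounds rather than as hard limits.

```python
import math

def jitserver_instances(client_jvms, clients_per_server):
    """Number of JITServer instances needed for the given client count."""
    return math.ceil(client_jvms / clients_per_server)

clients = 45  # hypothetical number of client JVMs

# Slide guidance: 10-20 client JVMs per JITServer instance.
fewest_servers = jitserver_instances(clients, 20)  # densest packing
most_servers = jitserver_instances(clients, 10)    # most conservative packing

# Slide guidance: each JITServer needs 1-2 GB of RAM.
ram_low_gb = fewest_servers * 1
ram_high_gb = most_servers * 2

print(f"JITServer instances: {fewest_servers} to {most_servers}")
print(f"JITServer RAM budget: {ram_low_gb} to {ram_high_gb} GB")
```

The right point in that range depends on how much the clients' compilation phases overlap, which is why the slide also recommends staggering them.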
44. Final thoughts
• JIT provides advantage, but compilation adds overhead… So…
• Disaggregate JIT from JVM -> JIT compilation as a service
• Eclipse OpenJ9 JITServer (a.k.a. Semeru Cloud Compiler)
  • Available now on Linux for Java 8, Java 11 and Java 17 (IBM Semeru Runtimes)
  • Especially good for constrained environments (micro-containers)
  • Kubernetes ready (Helm chart available, Prometheus integration)
  • Can improve ramp-up, autoscaling and performance of short lived applications
  • Can reduce peak memory footprint, increase app density and reduce costs
• Java solution to Java problem, with no compromise
This talk is entitled "The next frontier in Open Source Java Compilers: Just-in-Time Compilation as a Service"
Introduction
I assume everyone here is a Java developer and knows what a JVM and a JIT are. As you know, the JVM, or Java Virtual Machine, executes your Java application,
and the JIT, or Just-in-Time compiler, is invoked by the JVM at run time to compile the most frequently called, or HOT, methods.
With this in mind, today we will be talking about the concept of JIT-as-a-Service, and why we need it.
We are going to break this talk down into 3 parts:
1. First we'll discuss the problem we want to address: Java is not a good fit for the cloud, specifically in a distributed and dynamic architecture like microservices
2. Then we'll talk about the reason for this by taking a look at the JVM and the JIT compiler, which have a great history but also some side effects that can affect performance at start-up
3. Finally, we'll discuss a way to get around these start-up issues by using JIT-as-a-Service
Let's start with some background on running Java apps in the cloud
For contrast, let's start with how we all used to run our Java enterprise apps.
It was a monolith application running on a dedicated server, and to make sure we didn't have any performance issues, we loaded that server with plenty of CPUs and memory.
And, of course, that application ran great. It didn't matter if it took 10 minutes to start because it never went down.
Maybe once every 6 months it would be taken offline to launch a new version with some library upgrades, a couple new features, and some bug fixes.
Now let's fast forward to today where the trend is to deploy apps to the cloud.
That same monolith application is now composed of many small microservices that all talk to each other, all running in containers, and managed by some cloud provider. And, depending on demand, there may be multiple instances of each microservice.
And we do this for a couple of reasons:
More agile and dynamic
Can implement new releases more easily and frequently
Positioned to take advantage of new cloud technologies
Less infrastructure we have to maintain and manage - going from a constrained to a utility use model
And of course, a major motivator is to save money
But how do we ensure that performance is still acceptable to our customers while still minimizing cost so that we actually do save money?
The main variables controlling cost and performance are how big our containers are, and how many instances of each are running.
Container size we can control, but scaling of instances is left to the cloud orchestrator to manage. But we can do a lot to ensure that scaling is efficient and effective. More on this later.
This graph shows the various ways we can get these variables wrong. Of course, if we under-provision our containers, and not enough instances can be efficiently run, we save money, but the performance is unacceptable.
On the opposite side, if we over-provision our containers and have too many instances running, we have great performance, but we're wasting money.
Our goal is to get to the bottom-right quadrant, the sweet spot.
But this is extremely hard to accomplish. Getting this right is the new focus for Java vendors, all coming out with new technologies to address this problem.
Why is this so hard? To better understand, we need to go over some background on how Java applications execute.
For Java applications, it's all about the JVM and the JIT, which are great and time-tested technologies, but they have some not so good side-effects, especially during start-up
One reason Java really took off early on was because it was device independent - write once, run anywhere.
And it's been around for a long time, constantly improving over time.
It uses a JIT to dynamically compile "hot" methods using Profilers to generate very optimized machine code, much more optimized than you can get using a static or Ahead-of-Time compiler.
It has great garbage collection.
And because it takes time for the JVM to profile and the JIT to compile, Java apps actually run better the longer they run.
But there are some trade-offs.
Before the JIT is invoked, the code is "interpreted", which is relatively slow
And when the JIT is invoked, it can cause CPU and memory spikes.
CPU spikes at the very least can lower QoS, and memory spikes cause OOM issues, including crashes. One of the main reasons JVMs crash is due to OOM issues.
Both CPU and memory spikes slow down start-up time and ramp-up time.
Start-up time is the time it takes for the app to be ready to process its first request, and ramp-up time is the time it takes for the JIT to compile all of the hot methods and to be running fully optimized.
Here's a graph of a typical Java app at start-up, and you can see the CPU spikes on the left, and memory spikes on the right.
A lot of the CPU spikes are caused by JIT compilations, and you can see the biggest spikes occur at the start when the JIT is the most active. The result of these spikes can cause lower QoS, which means sluggish performance.
This is also true for memory spikes. Again, you can see the biggest spikes are related to JIT compiles during ramp-up time. Memory spikes are particularly bad because they can cause OOM issues, including crashing the JVM.
So now that we have some good background info, let's revisit our 2 variables for determining cost vs performance.
Remember, our goal is to find that sweet spot - where we have just the right amount of resources provisioned for our containers, and we have containers set up for efficient auto-scaling.
For container size, we now know why it's hard to get the right size.
We need to over-provision in order to avoid any OOM issues. We need to handle the initial spikes, but those resources are wasted once the app reaches steady-state.
And the amount to over-provision is hard to determine because Java is non-deterministic, meaning we can run the same application twice and get different spike levels. You really need to run a series of different load tests to get this even close to right.
For auto-scaling container instances, we now know we have 2 main issues.
Slow start-up and ramp-up times make scaling ineffective: new instances take too long to start up, causing QoS issues. The alternative is to just start more instances than you think you need and effectively eliminate any auto-scaling.
Another problem is that CPU spikes due to JIT compiles can cause issues with auto-scalers. These spikes may be interpreted incorrectly by the auto-scaler as demand load and may result in unnecessary instance launches. One way to minimize this problem is to set your thresholds very high, but again, this makes your auto-scaler less effective.
The solution is pretty clear: we need to flatten out those CPU and memory spikes, and we need to improve start-up and ramp-up times.
Which leads us to JIT-as-a-Service.
The basic premise here is to decouple the JIT compiler from the JVM and let it run as an independent process.
Here we show a couple of JVMs on the left, and remote JIT services on the right.
The JVMs will no longer use their local JIT, and will offload their JIT compilations to the remote JIT services.
Here we show the remote JIT processes containerized and made available as a cloud service.
This gives us an added benefit: the service can be managed by orchestrators like Kubernetes, which can make sure it is always running and scaled properly to handle demand.
And this solution is just like any other monolith to micro-services conversion. In this case the JVM is the monolith that is turned into 2 loosely coupled micro-services: the JIT and the rest of the JVM.
Note that on the diagram we show the JVM JIT crossed-out, but it still can be used if the remote JIT should become unavailable.
This service already exists. It is called the JITServer and is a feature of the Eclipse OpenJ9 JVM, which is totally open source and free to download.
It also goes by the name "Semeru Cloud Compiler", because it is mostly distributed with the IBM Semeru Runtimes (which we will talk about in a minute).
For distribution, OpenJ9 combines with the OpenJDK binaries to form a full JDK.
As I mentioned, the Eclipse OpenJ9 JVM, and by extension JITServer technology, is completely open source - here is a link to its GitHub repo.
A little background on the OpenJ9 JVM:
It originally started life as the J9 JVM, which was developed by IBM over 20 years ago to run all of their Java based workloads on IBM hardware.
It was open sourced to the Eclipse Foundation around 5 years ago and re-branded as OpenJ9
It works with any Java workload, from micro-services to monoliths, and is specifically designed to work in constrained environments.
And it's well-known for its small footprint, fast start-up and ramp-up times.
Over time it has been used by many Fortune 500 companies to run their enterprise Java applications. So it has a long history of success.
Here we show how it compares to the popular HotSpot JVM. OpenJ9 is the green, and HotSpot is orange. And this comparison is independent of any JITServer advantages.
These graphs are based on startup and ramp-up times. Remember, the distinction is start-up time is initial application load time, while ramp-up time is the time it takes to be running at peak throughput with optimized compiled code.
Going left to right, you can see that start-up time can be 51% faster than HotSpot.
Next we see that OpenJ9 has a 50% smaller footprint after start-up, which means more resources for the application.
Next, we see faster ramp-up time. Notice how much longer it takes HotSpot to match the level of OpenJ9.
And finally, you see OpenJ9 still has a smaller footprint, even after fully ramping up.
All of these metrics are important when running in constrained environments.
Semeru Runtimes is IBM's distribution of the OpenJDK binaries, and it is the only distribution that comes with the OpenJ9 JVM.
It comes in 2 flavors: an Open and a Certified Edition. Both are free to download; the only difference is licensing and supported platforms.
If you are wondering where the name came from, the connection is that Mount Semeru is the tallest mountain on the island of Java.
Back to the JITServer - let's take a look at the advantages, from the perspective of the JVM clients that will be utilizing the JITServer.
For provisioning:
Since there are no more JIT compilation spikes, sizing becomes much easier. There is no need to add in any "just-in-case" resources, and you can just focus on what the application needs.
As for performance:
It will be much more predictable, because the JIT will no longer be stealing CPU cycles.
And because the JITServer can provide additional CPU cycles from the start, ramp-up times will be improved. And this is especially true for short-lived apps, since a majority of their life span is during ramp-up.
The JITServer also has its own AOT cache, which means that any new replicated instances can have access to already compiled methods.
As for cost:
Fewer resources are needed, and more efficient auto-scaling means only paying for what you need and use.
And finally for resiliency:
The JVM and the JITServer are separate processes, so the JVM can continue if the JITServer crashes. And the JVM still has use of its local JIT.
Let's take a closer look at some test results which show both cost savings and improved performance.
This is the setup for the demo
Acmeair app: This application shows an implementation of a fictitious airline called "Acme Air".
The application was built with some key business requirements: the ability to scale to billions of web API calls per day, the need to develop and deploy the application targeting multiple cloud platforms (including Public, Private and hybrid).
The application can be deployed both on-prem as well as on Cloud platforms
This and the next few slides can be substituted by running the actual demo, or showing a copy of the demo located at https://ibm.ent.box.com/file/1014972472931
Now let's take a look at another test to see how JITServer can help with provisioning.
The experiment was conducted on a Red Hat OpenShift cluster on AWS. It has 3 worker nodes, with around 12GB of RAM to play with.
We will be running 4 test applications: 2 versions of the AcmeAir application, one as a monolith and one with micro-services,
and a Spring Boot and a Quarkus application.
We will apply a real-world load to the applications to simulate activity, and we will let the OpenShift Scheduler determine how to deploy and replicate the applications.
***
Let's look at a more complex example that demonstrates the value of JITServer in a Kubernetes setting.
These experiments were performed on Red Hat OpenShift Service on AWS (for those of you that don't know, OpenShift is a Kubernetes distribution from Red Hat).
Our cluster has 3 worker nodes with 8 vCPUs and 16 GB of RAM, out of which only 12.3 GB are available (the rest is used by OS and OCP related applications)
As workload we have 4 different applications: AcmeAir Microservices and AcmeAir Monolithic based on OpenLiberty, Petclinic (which is based on the Spring Boot framework) and a Quarkus based app.
We apply a low amount of load to better reflect conditions seen in practice (I have seen quite a few studies showing that the level of utilization is somewhere between 6 and 15%, while another study from Google gives more generous numbers, between 10 and 50%).
So these slides show how the OpenShift scheduler decided to place the various pods on the worker nodes. Note that each application has a different color, and each application is replicated multiple times.
The size of the shape indicates its relative container size. The number in the shape is the memory limit for that container.
As you can see in the top row, all 3 worker nodes are used, and the containers are all larger than in the bottom row. This is due to building in extra memory to avoid OOM issues and improve throughput.
The bottom row uses the JITServer, which results in only 2 worker nodes being used, despite the fact that the JITServer containers (shown in brown) are the largest containers in the node. The savings come from being able to scale down each of the application containers.
The end result is a 33% cost savings by using one less worker node.
***
This slide is an illustration of how OpenShift scheduler decided to place the various containers on nodes.
There are two different configurations: above we have the default configuration without JITServer which needs 3 worker nodes.
Below we have the JITServer configuration that only uses 2 worker nodes.
The colored boxes represent containers and the legend on the right will help you decipher which application each container is running.
The number in each box represents the memory limit for that container. These values were experimentally determined so that the application can run without an OOM or drastically reduced throughput.
The boxes were drawn at scale meaning that a bigger box represents a proportionally larger amount of memory given to that container.
At a glance you can see that in the default config you cannot fit all those containers in just 2 nodes; you need 3 nodes.
In contrast, the JITServer config uses 6.3 GB less and we are able to fit all those containers in just 2 nodes. This happens even after we account for the presence of JITServer
(as you can see we have two JITServer instances, one on each node).
The takeaway is that JITServer technology allows you to increase application density in the cloud and therefore reduce cost.
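The 6.3 GB figure can be checked against the per-node totals printed on the slide. This is a minimal sketch, assuming the three totals on the first row belong to the default configuration (3 worker nodes) and the two totals on the second row belong to the JITServer configuration (2 worker nodes); the variable names are illustrative only.

```python
# Per-node memory totals in MB, as printed on the density slide.
# Assumption: the first three totals are the default configuration,
# the last two are the JITServer configuration.
default_node_totals_mb = [8250, 8550, 8600]
jitserver_node_totals_mb = [9250, 9850]

default_total_mb = sum(default_node_totals_mb)
jitserver_total_mb = sum(jitserver_node_totals_mb)
saved_mb = default_total_mb - jitserver_total_mb

# Fraction of worker nodes no longer needed (3 nodes down to 2).
node_savings = 1 - len(jitserver_node_totals_mb) / len(default_node_totals_mb)

print(f"default: {default_total_mb} MB, JITServer: {jitserver_total_mb} MB")
print(f"saved: {saved_mb} MB (about {saved_mb / 1000:.1f} GB)")
print(f"worker nodes saved: {node_savings:.0%}")
```

Under that assumption the totals differ by 6300 MB, which matches the "6.3 GB less" label, and dropping one of three worker nodes corresponds to the one-third node saving mentioned in the notes.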
Now let's take a look at how each of the applications performed. The orange line represents the top row from the previous page, and the blue line represents the bottom row from the previous page, which uses the JITServer.
Each graph represents each of the applications.
You can see that the performance is pretty even, despite the fact that the JITServer configuration is working with fewer worker node CPUs. The small blue lags are likely caused by the noisy-neighbor effect due to all the apps loaded up at the same time.
***
I have here 4 graphs, one for each application and the blue line shows how throughput with JITServer varies in time, while the orange line represents the throughput of the baseline.
As you can see, the steady state throughput for the 2 configurations is the same.
From the ramp-up point of view JITServer is doing quite well for Petclinic and Quarkus, while for AcmeAir mono and micro there is a minuscule ramp-up lag, which I would say is negligible.
On the graphs you can also notice some dips in throughput, more pronounced for AcmeAir monolithic.
This is due to interference between applications, or the so-called noisy neighbor effect.
Since in practice applications are not likely to be loaded at the exact same time, in these experiments we apply load to the 4 applications in a staggered fashion, 2 minutes apart, starting with AcmeAir microservices and continuing with AcmeAir monolithic, Petclinic and Quarkus.
Those throughput dips correspond to these 2 minute intervals when the next application starts to become exercised causing a flurry of JIT compilations to happen.
If you pay close attention, you'll observe that the Baseline configuration is affected by the noisy neighbor effect too, but to a lesser extent because Baseline has 50% more CPUs at its disposal (3 nodes vs 2).
So, to summarize, the experiments demonstrate that JITServer can increase container density in the cloud without sacrificing throughput.
And this results in reducing operational cost of Java applications by 20 to 30%.
And just to note - Ramp-up time can be slightly affected in high density scenarios due to limited computing resources, but how much depends on the level of load and the number of pods concurrently active.
****
In conclusion, the experiments conducted on Amazon AWS demonstrate that JITServer can increase container density in the cloud without sacrificing throughput and therefore reduces operational cost of Java applications by 20 to 30%.
Ramp-up can be slightly affected in high density scenarios due to limited computing resources, but the extent of this depends on the level of load and the number of pods concurrently active.
And for our last experiment, we wanted to see how JITServer affects autoscaling in Kubernetes.
You can see a description of the test bed on the right. The autoscaler instantiates a new pod when the average CPU utilization exceeds 50%.
The graph shows the throughput of the AcmeAir app while increasing amounts of load are applied.
The orange line is the baseline, and the blue line represents using the JITServer.
As you can see, the throughput curve continues to rise, and then it plateaus. The dips are associated with the launch of new pods, which burn a lot of CPU.
Comparing the two curves, JITServer gives you better behavior because it is able to warm up the newly spawned pods faster.
Also, without the JITServer there is the danger that the autoscaler will interpret the CPU used for compilation as load and be misled into launching even more pods. The JITServer makes this less likely.
And because the JITServer caches compiled methods, new instances of the same app will have these methods available to them at start-up.
***
HPA = Horizontal Pod Autoscaler
We performed some experiments to see how JITServer affects the autoscaling behavior in Kubernetes.
A description of the experimental test-bed can be found on the right.
Let's focus on the graph on the left, which shows the throughput of the AcmeAir app while increasing load is applied to it.
HPA monitors the average CPU usage of the AcmeAir pods and if that exceeds our target of 0.5P, new pods are instantiated.
Overall the throughput curve goes up and at some point it plateaus.
Interestingly, the curve shows some transient dips in throughput and those correlate with HPA decisions to launch new pods.
The new pods will burn a lot of CPU but yield poor performance until they warm up. To maintain fairness the load balancer gives an equal number of requests to each pod, but because the new pods can only process a limited number of requests per second, the older pods will match that level of performance.
Comparing the two curves, JITServer gives you better behavior because it is able to warm up the newly spawned pods faster.
Moreover, without JITServer there is the danger that the autoscaler will interpret the CPU used for compilation as load and be misled into launching even more pods.
This can be avoided by making the autoscaler less sensitive to CPU spikes. However, not reacting fast enough to a legitimate increase in load is also bad.
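To see why a transient compilation burst can mislead the autoscaler, it helps to look at the replica calculation that the Kubernetes Horizontal Pod Autoscaler documents: desired replicas = ceil(current replicas x current metric / target metric), with no change when the ratio is within a tolerance (0.1 by default). The sketch below applies that rule with the 0.5 target and the 15-instance cap from the test setup; the replica counts and CPU utilization values are hypothetical, and "0.5P" is read here as 50% average CPU utilization.

```python
import math

def desired_replicas(current_replicas, avg_cpu_utilization,
                     target_utilization=0.5, max_replicas=15, tolerance=0.1):
    """Replica count following the documented HPA scaling rule."""
    ratio = avg_cpu_utilization / target_utilization
    # Within the tolerance band the autoscaler leaves the count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return max(1, min(max_replicas, math.ceil(current_replicas * ratio)))

# Hypothetical steady state: 4 pods running right at the 50% target.
steady = desired_replicas(4, 0.50)

# Hypothetical JIT burst: same request load, but compilation pushes
# average CPU utilization to 90%, so the autoscaler asks for more pods.
during_jit_burst = desired_replicas(4, 0.90)

print(steady, during_jit_burst)
```

In this sketch the burst doubles the requested pod count even though the request load has not changed, which is the effect the notes describe. Offloading compilation to JITServer removes most of that extra CPU from the metric the autoscaler sees.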
If running from the command line, the JITServer can be started from the OpenJ9 bin directory by simply typing "jitserver".
I would like to point out that the JITServer is just an OpenJ9 JVM, but running under a different persona.
To use it from your app, just add the -XX:+UseJITServer option when starting your Java application.
And there are a number of other options, such as specifying the address and port number, if you need more than just the default values.
Here is a link to all of the options.
And note that the JITServer and its clients need to be on the same Java version and OpenJ9 release.
#####
The port can be changed if there is a conflict with another service using the same port number.
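The options mentioned above can be assembled roughly as follows. This is a sketch that only builds the command lines as Python lists: `-XX:+UseJITServer`, `-XX:JITServerAddress` and `-XX:JITServerPort` are the OpenJ9 option names as I understand them from the documentation, while the host name, port value and jar name are placeholders.

```python
import shlex


def client_command(jar, address=None, port=None):
    """Build a java command line for a JITServer client JVM.

    address/port are only added when given, so the JVM defaults apply otherwise.
    """
    cmd = ["java", "-XX:+UseJITServer"]
    if address is not None:
        cmd.append(f"-XX:JITServerAddress={address}")
    if port is not None:
        cmd.append(f"-XX:JITServerPort={port}")
    cmd += ["-jar", jar]
    return cmd


def server_command(port=None):
    """Build the command line for the jitserver launcher in the OpenJ9 bin directory."""
    cmd = ["jitserver"]
    if port is not None:
        cmd.append(f"-XX:JITServerPort={port}")
    return cmd


# Placeholder host, port and jar names, for illustration only.
print(shlex.join(server_command(port=38401)))
print(shlex.join(client_command("app.jar", address="jitserver.example", port=38401)))
```

The lists can be passed to `subprocess.run` as they are; the server and the clients must of course agree on the port.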
For Kubernetes, you need to set up a JITServer deployment and service.
You do this with YAML files or Helm charts, and an Operator is now available.
Here is a link to a tutorial that walks you through the steps.
- You can establish trust and encrypt the communication between client and server using TLS.
- This can be done with the command line options shown in blue, which specify the certificate file and private key to be used.
- The certificate and the private key files can be stored as Kubernetes TLS secrets and mapped into the container using volumes.
- I am showing here an excerpt of a YAML file, though other ways are possible.
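As a rough illustration of the TLS setup, the sketch below builds the extra options for each side. The option names (`-XX:JITServerSSLCert` and `-XX:JITServerSSLKey` on the server, `-XX:JITServerSSLRootCerts` on the client) are what I understand the OpenJ9 documentation to use, so check them against your release. The mount directory is a hypothetical path; `tls.crt` and `tls.key` are the key names Kubernetes uses inside a TLS secret.

```python
import posixpath

# Hypothetical directory where the Kubernetes TLS secret is mounted as a volume.
TLS_MOUNT_DIR = "/etc/jitserver-tls"


def server_tls_options(mount_dir=TLS_MOUNT_DIR):
    """Options for the JITServer side: its certificate and its private key."""
    return [
        f"-XX:JITServerSSLCert={posixpath.join(mount_dir, 'tls.crt')}",
        f"-XX:JITServerSSLKey={posixpath.join(mount_dir, 'tls.key')}",
    ]


def client_tls_options(mount_dir=TLS_MOUNT_DIR):
    """Options for the client JVM: the certificate it should trust."""
    return [f"-XX:JITServerSSLRootCerts={posixpath.join(mount_dir, 'tls.crt')}"]


print(["jitserver"] + server_tls_options())
print(["java", "-XX:+UseJITServer"] + client_tls_options() + ["-jar", "app.jar"])
```

Only the server needs the private key; the client side gets just the certificate, so the secret mounted into client pods can omit `tls.key`.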
The JITServer can be queried for metrics. Here is a list, which is what we showed in the demo.
You can also specify logging options.
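Here is a minimal sketch of how such metrics could be read programmatically, assuming they are exposed in the Prometheus text format with one `name value` pair per line and no labels. The metric names follow the OpenJ9 documentation as far as I know; the sample values are invented.

```python
def parse_metrics(text):
    """Parse simple Prometheus text lines of the form 'name value [timestamp]'.

    Assumption: metrics carry no labels, so the first token is the name.
    Comment lines (# HELP, # TYPE) and blank lines are skipped.
    """
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        metrics[parts[0]] = float(parts[1])
    return metrics


# Invented sample; a real response would come from the server's metrics endpoint.
SAMPLE = """\
# TYPE jitserver_cpu_utilization gauge
jitserver_cpu_utilization 12
jitserver_available_memory 32036
jitserver_connected_clients 3
jitserver_active_threads 1
"""

for name, value in parse_metrics(SAMPLE).items():
    print(f"{name}: {value}")
```

In a real setup the text would be fetched over HTTP from the JITServer rather than taken from a hard-coded string, and a monitoring system such as Prometheus would normally do the scraping for you.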
Here are some recommendations on when to use the JITServer.
One use case is when your JVM needs to compile many methods in a relatively short time.
Or you are running in a constrained environment, where you can't afford CPU spikes from compilations.
And only use it if you have low network latency. Communication between the JVM and the JITServer can create a lot of traffic. You should also use any available latency-performance settings to tune your environment.
*****
SR-IOV = single root I/O Virtualization
Kubernetes uses "requests" and "limits" for resources like CPU and memory. "requests" is the amount of CPU/memory that a process is guaranteed to get. This value is used for pod scheduling. "limits" is the maximum amount of CPU/memory a pod is allowed to consume.
By setting a smaller CPU "request" value, Kubernetes has more flexibility in scheduling the JITServer. By setting a larger CPU "limit" value, you give the JITServer the ability to use any unused CPU cycles.
Typically the "request" value should be set to what the process consumes at steady state (if you can define such a thing for the JITServer).
-Xshareclasses on client: this turns on AOT
As for recommendations on how to use it:
The JITServer does require additional resources, so to get the most net benefit, you should try to have between 10 and 20 connected client JVMs.
It needs at least 1-2 GB of RAM.
If you are using it with Kubernetes, always set the CPU/memory limits (the maximum) much larger than the requests (the minimum). This can really help in handling CPU usage spikes.
And as we saw with one of our experiments, it performs better if its clients are not all started at the same time.
Don't use encryption unless you really need it. It does add a lot of overhead.
In Kubernetes, definitely use "sessionAffinity" to make sure JITServers and their clients stay connected.
And the last tip is to use the AOT cache feature of OpenJ9. If both the client and the JITServer have this enabled, AOT code can be cached on the server side and shared with all of the JVM instances.
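To make the Kubernetes tips concrete, the sketch below writes the relevant manifest fragments as Python dictionaries and prints the Service as JSON, which `kubectl` accepts alongside YAML. All names, sizes and the port are placeholders chosen for illustration, and `-XX:+JITServerUseAOTCache` is the AOT cache option name as I understand it from the OpenJ9 documentation.

```python
import json

# Hypothetical sizes: limits set well above requests, as recommended.
jitserver_resources = {
    "requests": {"cpu": "1", "memory": "2Gi"},
    "limits": {"cpu": "8", "memory": "4Gi"},
}

# Service in front of the JITServer pods; ClientIP affinity keeps each
# client JVM talking to the same JITServer instance.
jitserver_service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "jitserver"},
    "spec": {
        "selector": {"app": "jitserver"},
        "ports": [{"protocol": "TCP", "port": 38400, "targetPort": 38400}],
        "sessionAffinity": "ClientIP",
    },
}

# Client JVM flags with the AOT cache enabled (also needed on the server side).
client_jvm_flags = ["-XX:+UseJITServer", "-XX:+JITServerUseAOTCache", "-Xshareclasses"]

print(json.dumps(jitserver_service, indent=2))
print(json.dumps(jitserver_resources, indent=2))
print(" ".join(client_jvm_flags))
```

The `jitserver_resources` dictionary would go under the `resources` key of the JITServer container in its Deployment.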
So, what did we learn today?
JIT compilation adds overhead, and one solution is to disaggregate the JIT from the JVM and perform JIT compilations as a service.
We demonstrated the OpenJ9 implementation, which is the JITServer, also known as the Semeru Cloud Compiler.