The document discusses various techniques for optimizing Java application performance, including:
1. Classic techniques like tuning JVM options, garbage collection, and memory settings.
2. Code optimizations like avoiding object creation, using primitives over wrappers, and micro-optimizations.
3. Tools for analyzing performance like JVisualVM, JMH, and different profilers.
The document provides examples and comparisons of different optimizations and their performance impacts. It also discusses considerations for optimizing embedded and Android applications differently due to their unique constraints.
Confidential
• The goal is that each processing element be kept as busy as possible
doing useful work. This entails satisfying four requirements: breaking
problems into independent subproblems that can be executed
concurrently, distributing these subproblems appropriately among the
processing elements, making sure that the necessary data is close to its
processing element, and overlapping communication with computation
where possible.
Performance
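The four requirements above map directly onto Java's Fork/Join framework, which breaks a problem into independent subtasks and distributes them across worker threads. A minimal sketch (the class name, threshold, and workload are illustrative):

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class ParallelSum extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000; // below this, sum sequentially
    private final long[] data;
    private final int from, to;

    public ParallelSum(long[] data, int from, int to) {
        this.data = data; this.from = from; this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {       // small enough: do useful work directly
            long sum = 0;
            for (int i = from; i < to; i++) sum += data[i];
            return sum;
        }
        int mid = (from + to) / 2;          // break into independent subproblems
        ParallelSum left = new ParallelSum(data, from, mid);
        ParallelSum right = new ParallelSum(data, mid, to);
        left.fork();                        // distribute the left half to another worker
        long rightSum = right.compute();    // keep this thread busy in the meantime
        return left.join() + rightSum;      // overlap computation with the join
    }

    public static long sum(long[] data) {
        return ForkJoinPool.commonPool().invoke(new ParallelSum(data, 0, data.length));
    }
}
```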
In computer science, the time complexity is the
computational complexity that describes the amount of
time it takes to run an algorithm.
Time complexity
1. A Graph is a non-linear data
structure consisting of nodes
and edges. The nodes are
sometimes also referred to as
vertices and the edges are lines
or arcs that connect any two
nodes in the graph.
Graph
Adjacency list is a collection of
unordered lists used to represent a
finite graph. Each list describes the
set of neighbors of a vertex in the
graph.
Graph: Adjacency list
Input | Output
1 | 5
2 | 6
3 | 2, 5
4 | 5
5 | 1, 4
6 | 2
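The mapping above can be represented directly with a map from each vertex to its neighbor list. A minimal sketch using only JDK collections (the class name is ours):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AdjacencyListGraph {
    // Each vertex maps to the unordered list of its neighbors
    private final Map<Integer, List<Integer>> adj = new HashMap<>();

    public void addEdge(int from, int to) {
        adj.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    public List<Integer> neighbors(int v) {
        return adj.getOrDefault(v, List.of());
    }
}
```

Populating it with the edges from the table (1→5, 3→2, 3→5, …) reproduces exactly the Output column per vertex.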
Adjacency matrix is a square matrix used to represent a finite graph.
The elements of the matrix indicate whether pairs of vertices are
adjacent or not in the graph.
Graph: Adjacency matrix
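The same structure as a matrix: a boolean cell per vertex pair, costing O(V²) space but giving O(1) edge lookup. A minimal sketch (class name is ours):

```java
public class AdjacencyMatrixGraph {
    private final boolean[][] adjacent; // adjacent[i][j] == true iff edge i -> j

    public AdjacencyMatrixGraph(int vertexCount) {
        adjacent = new boolean[vertexCount][vertexCount];
    }

    public void addEdge(int from, int to) {
        adjacent[from][to] = true;
    }

    public boolean hasEdge(int from, int to) {
        return adjacent[from][to];
    }
}
```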
• JGraphT if you are more interested in data structures and algorithms.
• JGraph if your primary focus is visualization.
• JUNG, yWorks, and BFG are other options that have seen use.
• Prefuse is best avoided, since most of it has to be rewritten.
• Google Guava if you only need good data structures.
• Apache Commons Graph.
Java & Graph
import java.net.URL;

import org.jgrapht.Graph;
import org.jgrapht.graph.DefaultDirectedGraph;
import org.jgrapht.graph.DefaultEdge;

Graph<URL, DefaultEdge> g = new DefaultDirectedGraph<>(DefaultEdge.class);

// URL's constructor declares MalformedURLException
URL amazon = new URL("http://www.amazon.com");
URL yahoo = new URL("http://www.yahoo.com");
URL ebay = new URL("http://www.ebay.com");

// add the vertices
g.addVertex(amazon);
g.addVertex(yahoo);
g.addVertex(ebay);

// add edges to create the linking structure
g.addEdge(yahoo, amazon);
g.addEdge(yahoo, ebay);
(#8.1)
"You stick it [the option] into your config, it actually improves something, and you award yourself a little star: 'I know how to tune the JVM'."
JVM Options
• The following shows the relationship between the memory size, the number of GC executions, and the GC execution time.
• Large memory size
- decreases the number of GC executions.
- increases the GC execution time.
• Small memory size
- decreases the GC execution time.
- increases the number of GC executions.
• 10 GB is fine if the server resources are good and a Full GC can be completed within 1 second even with the memory set to 10 GB. But most servers are not in that state: when the memory is set to 10 GB, a Full GC takes about 10-30 seconds. Of course, the time may vary according to the object sizes.
Setting Memory Size
• How should we set the memory size?
• Based on the current status before GC tuning, check the memory size left after a Full GC. If there is about 300 MB left after a Full GC, it is good to set the memory to 1 GB (300 MB for default usage + 500 MB minimum for the Old area + 200 MB of free memory). That means you should allow more than 500 MB for the Old area. Therefore, if you have three operation servers, set one server to 1 GB, one to 1.5 GB, and one to 2 GB, and then compare the results.
Setting Memory Size
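The sizing experiment above translates into plain heap flags on each server. A sketch of the three candidate configurations (app.jar is a placeholder):

```shell
# Compare three heap sizes after measuring post-Full-GC usage
java -Xms1g    -Xmx1g    -jar app.jar   # server 1
java -Xms1536m -Xmx1536m -jar app.jar   # server 2
java -Xms2g    -Xmx2g    -jar app.jar   # server 3
```

Setting -Xms equal to -Xmx avoids heap resizing pauses during the comparison.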
• NewRatio is the ratio of the New area to the Old area. If -XX:NewRatio=1, New area:Old area is 1:1. For 1 GB, New area:Old area is 500 MB:500 MB. If NewRatio is 2, New area:Old area is 1:2. Therefore, as the value gets larger, the Old area gets larger and the New area gets smaller.
• If the New area is small, more objects are promoted to the Old area, causing frequent Full GCs that take a long time to handle.
Setting Memory Size: NewRatio
• In the analysis, focus on the following. The most important factor in choosing a GC option is the Full GC execution time.
Analyzing GC Tuning Results
• Case 1: -XX:+UseParallelGC -Xms1536m -Xmx1536m -XX:NewRatio=2
• Case 2: -XX:+UseParallelGC -Xms1536m -Xmx1536m -XX:NewRatio=3
• Case 3: -XX:+UseParallelGC -Xms1g -Xmx1g -XX:NewRatio=3
• Case 4: -XX:+UseParallelOldGC -Xms1536m -Xmx1536m -XX:NewRatio=2
• Case 5: -XX:+UseParallelOldGC -Xms1536m -Xmx1536m -XX:NewRatio=3
• Case 6: -XX:+UseParallelOldGC -Xms1g -Xmx1g -XX:NewRatio=3
GC Cases
• If you try to decrease the Old area size to shorten the Full GC execution time, OutOfMemoryError may occur or the number of Full GCs may increase.
• Alternatively, if you try to decrease the number of Full GCs by increasing the Old area size, each execution will take longer.
GC Tuning: jstat
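jstat attaches to any running JVM by process id, with no startup flags required. A typical invocation (the pid 12345 is a placeholder):

```shell
# Print heap-area utilization (%) and GC counts/times every 1000 ms, 10 samples
jstat -gcutil 12345 1000 10
```

Watching the FGC and FGCT columns over time shows how often Full GCs run and how long they take in total.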
• -verbosegc is a JVM option specified when starting a Java application. While jstat can monitor any JVM application without requiring any startup options, -verbosegc has to be specified up front, so it may look like an unnecessary option (since jstat can be used instead). However, because -verbosegc prints easy-to-understand output whenever a GC occurs, it is very helpful for getting a rough picture of GC behavior.
• HPjmeter
-verbosegc
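Unlike jstat, this flag must be present at startup. A sketch (app.jar is a placeholder; on JDK 9+ the unified-logging form -Xlog:gc is the equivalent):

```shell
# Classic form: print a line on every GC event
java -verbose:gc -jar app.jar

# JDK 9+ unified logging equivalent
java -Xlog:gc -jar app.jar
```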
• In embedded environments we usually have very restricted resources in terms of memory consumption, CPU performance, and battery lifetime. Additionally, there are often other restrictions, like real-time requirements.
Embedded or Android Java Project
Embedded Project
• When I was given the task of writing an
Android application, I decided to research
Dependency Injection Frameworks for
mobiles … It soon dawned on us that the
root of the problem was the DI framework.
It was searching for all injection resources
and references, while the app was starting
and trying to perform all the wiring at the
beginning of the app’s life cycle … Our
solution was to hard-wire the resources.
Even though this gave us an “uglier” app,
the app started up in lightning speed,
solving our performance issue.
Embedded or Android Java Project
Android Java Project
1. Don't use any frameworks for server applications
2. Divide your development process into two separate parts:
1) Designing a classic application using minimal frameworks
2) Applying performance-tuning templates and anti-refactoring
3. Use tools to analyze performance on the target device
Embedded or Android Java Project
Throughput: operations per unit of time
• Format: ops/s
• Higher value is better

AverageTime: average time per operation
• Format: ns/op
• Smaller value is better
Micro optimizations
Summa by Iterator:
int sum = 0;
for (int i = 0; i < vals.length; i++) {
    sum += vals[i];
}

Summa by Lambda Reduction:
OptionalInt summa = Arrays.stream(vals)
    .parallel()
    .reduce((a, b) -> a + b);
Micro optimizations (#8.3)
Find by Lambda Reduction:
int findVal = 3;
int[] ar = new int[] {1, 2, 3};
boolean isFind = IntStream.of(ar)
    .anyMatch(i -> i == findVal);

Find by Iterator:
int findVal = 3;
int[] ar = new int[] {1, 2, 3};
boolean isFind = false;
for (int i = 0; i < ar.length; i++)
    if (ar[i] == findVal) {
        isFind = true;
        break;
    }
Micro optimizations (~10x)
• JMH is a Java harness for building, running, and analyzing nano/micro/milli/macro benchmarks written in Java and other languages targeting the JVM.
• Part of the OpenJDK code-tools project
• Used extensively within OpenJDK to test the internals
• Keeps pace with changes in the JVM
• Brings a scientific approach to benchmarking
Java Benchmarking: org.openjdk.jmh
• A Maven-based project. Bundles the benchmark code with the working jar
• A quick common annotation list
A quick look at JMH working
Annotation | Function
@Benchmark | Marks the method for benchmarking
@BenchmarkMode | Defines the mode of the benchmark, like AverageTime or Throughput
@Warmup | Defines the warm-up cycles
@Measurement | Defines the measurement iterations
@Fork | Number of VM forks
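A real benchmark would use the annotations above and requires the org.openjdk.jmh dependency. As a dependency-free sketch of the same pattern — warm-up iterations first, then measured iterations reporting ops/s, like @Warmup / @Measurement / @BenchmarkMode(Throughput) — the class and method names here are ours:

```java
import java.util.function.LongSupplier;

public class TinyHarness {
    // Warm the workload up (letting the JIT compile it), then measure throughput.
    public static double throughput(LongSupplier workload, int warmupOps, int measuredOps) {
        for (int i = 0; i < warmupOps; i++) {
            workload.getAsLong();                 // warm-up cycles, results discarded
        }
        long start = System.nanoTime();
        long sink = 0;
        for (int i = 0; i < measuredOps; i++) {
            sink += workload.getAsLong();         // keep the result live
        }
        long elapsedNs = System.nanoTime() - start;
        if (sink == 42) System.out.println(sink); // defeat dead-code elimination
        return measuredOps / (elapsedNs / 1e9);   // operations per second
    }
}
```

JMH does far more than this (forked VMs, blackholes, statistical analysis), which is exactly why hand-rolled harnesses like this one are only a teaching sketch.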
• The Simple Logging Facade for Java (slf4j) is a simple facade for various logging frameworks, like JDK logging (java.util.logging), log4j, or logback. It even contains a binding that delegates all logger operations to another well-known logging facade, Jakarta Commons Logging (JCL).
• Logback is the successor of the log4j logger API; in fact, both projects have the same author, but logback offers some advantages over log4j, like better performance and lower memory consumption, automatic reloading of configuration files, or filter capabilities, to cite a few features.
Log
1. Did this ever make sense?
2. Yes, on these assumptions:
- can ignore constant factors
- all instructions have the same duration
- memory doesn’t matter
- instruction execution dominates performance
3. But instruction execution is only one
bottleneck:
- Disk/Network
- Garbage Collection
- Resource Contention and more…
Which List Implementation?
get() add() remove(0)
ArrayList O(1) O(1) O(N)
LinkedList O(N) O(1) O(1)
COWArrayList O(1) O(N) O(N)
Memory access time
action | approximate time (ns)
typical processor instruction | 1
fetch from L1 cache | 0.5
branch misprediction | 5
fetch from L2 cache | 7
mutex lock/unlock | 25
fetch from main memory | 100
send 2 kB over a 1 Gb/s network | 20 000
seek to a new disk location | 8 000 000
read 1 MB sequentially from disk | 20 000 000
• Linked List: node size is 24 bytes
• Running on Intel Core i5:
- L1data: 128K
- L2: 512K
- L3: 3M
• Each new list item is 40 bytes (24 + 16)
- L1 cache will be full at <3K items
• ArrayList is better: each new item is 20 bytes (4 + 16)
What’s Going On?
public final class String ... {
    private final char[] value;
    private int offset;
    private int count;
    public boolean equals(Object anObject) ...
Substring, JDK < 1.7.0_06
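Before JDK 1.7.0_06, substring() shared the parent's char[] through the offset/count fields shown above, so a tiny substring could pin a huge string's backing array in memory. The classic workaround copied the characters out. An illustrative sketch (class and method names are ours):

```java
public class SubstringLeak {
    // Pre-7u06: the substring view kept a reference to the entire backing
    // array of 'big'. Copying into a new String drops that reference, so
    // only the four needed characters stay reachable.
    public static String keyFrom(String big) {
        String view = big.substring(0, 4); // shares big's char[] on old JDKs
        return new String(view);           // copies just the 4 chars
    }
}
```

Since 7u06, substring() copies by itself, so the extra new String(...) is no longer needed.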
charAt:
public int charAt() {
    int r = 0;
    for (int c = 0; c < text.length(); c++) {
        r += text.charAt(c);
    }
    return r;
}

toCharArray:
public int toCharArray() {
    int r = 0;
    char[] chars = text.toCharArray();
    for (int c = 0; c < text.length(); c++) {
        r += chars[c];
    }
    return r;
}
charAt vs toCharArray
charAt:
public int charAt() {
    int r = 0;
    for (int c = 0; c < text.length(); c++) {
        emptyMethod();
        r += text.charAt(c);
    }
    return r;
}

toCharArray:
public int toCharArray() {
    int r = 0;
    char[] chars = text.toCharArray();
    for (int c = 0; c < text.length(); c++) {
        emptyMethod();
        r += chars[c];
    }
    return r;
}
charAt vs toCharArray
• Option #1 – synchronized methods:
public class SynchronizedCounterMethod {
    private int c = 0;
    public synchronized void increment() {
        c++;
        System.out.println("Current count value is " + c);
    }
}
The synchronized keyword on an instance method locks on this: while one thread holds the lock, any other thread calling a synchronized method on the same object blocks.
Synchronization
• Option #2 – synchronized blocks:
public class SynchronizedCounterCode {
    private int c = 0;
    public void increment() {
        synchronized (this) {
            c++;
        }
        System.out.println("Current count value is " + c);
    }
}
When synchronizing a block, the object to lock on must be supplied explicitly (here, this), which also lets the lock cover only the critical section.
Synchronization
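For a simple counter, both options above can be replaced by a lock-free counter from java.util.concurrent.atomic, which uses a single compare-and-swap instead of a monitor lock. A sketch (the wrapper class name is ours):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicCounter {
    private final AtomicInteger c = new AtomicInteger();

    public int increment() {
        return c.incrementAndGet(); // single CAS, no monitor lock, no blocking
    }

    public int value() {
        return c.get();
    }
}
```

Under contention this typically outperforms both synchronized variants, because threads never park waiting for a monitor.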
1) The main responsibility of a scheduler is to share a resource between consumers
2) Linux schedulers:
a) Filesystem (I/O) scheduler
b) Network scheduler
c) CPU scheduler
Linux scheduler
• The Completely Fair Scheduler (CFS) is a process scheduler which was
merged into the 2.6.23 (October 2007) release of the Linux kernel and is
the default scheduler. It handles CPU resource allocation for
executing processes, and aims to maximize overall CPU utilization while
also maximizing interactive performance.
Scheduler of CPU
• To preempt running threads, the OS uses interrupts
• Interrupts are generated by the timer
• CPU interrupt handling:
- Execution of the current instruction completes
- The Program Counter (PC) is saved
- The interrupt handler is invoked
• The interrupt handler launches the scheduler
Interrupts
• Traditionally, the timer fires HZ times a second
• Starting with kernel 2.6.21, the CONFIG_NO_HZ option appeared
• Starting with kernel 3.10, the CONFIG_NO_HZ_FULL option appeared
grep 'CONFIG_HZ=' /boot/config-$(uname -r)
How often does the timer fire?
● Real-time scheduling policies - sched/rt.c
○ SCHED_FIFO - a thread may be preempted only by a higher-priority thread
○ SCHED_RR - a thread may be preempted by a higher-priority thread or when its time quantum has expired
● Non-realtime scheduling policies - sched/fair.c
○ SCHED_BATCH - the thread is always assumed to be CPU-intensive
○ SCHED_IDLE - the thread is executed very rarely
○ SCHED_OTHER - the most commonly used policy
● Controlled by the sched_setscheduler system call
Scheduling policies
● Static priority
○ 0 - ordinary applications
○ 1-99 - real-time applications. Used for SCHED_FIFO and SCHED_RR
● The nice value (-20..19) is only used for SCHED_OTHER and SCHED_BATCH
Static priority and nice value
• We will begin by looking at a single CPU
• The CPU has a runqueue - the queue of tasks in the READY state
• The runqueue is a priority queue implemented as a red-black tree
• vruntime is used as the priority
Completely Fair Scheduler (CFS)
● vruntime = real runtime / task weight + start_min_vruntime
● vruntime is updated every time the timer fires
● Task weight depends on the nice value
● Nice value range: (-20, 19)
● Task weight = 1024 / 1.25^nice
vruntime
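The weight formula above can be checked numerically. A sketch (the kernel actually uses a precomputed prio_to_weight table, of which 1024 / 1.25^nice is the close approximation; the class name is ours):

```java
public class CfsWeight {
    // Approximate CFS task weight from the nice value: each nice step changes
    // the CPU share by about 1.25x, and nice 0 maps to the base weight 1024.
    public static long weight(int nice) {
        return Math.round(1024.0 / Math.pow(1.25, nice));
    }
}
```

So a task at nice -1 (weight 1280) earns vruntime 1.25x more slowly than a nice 0 task, receiving proportionally more CPU time.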
● Each processor core uses an independent runqueue
● Threads must be moved between queues to balance the load
CFS scheduler multicore
• The queue length cannot be used as the balancing criterion, because threads can have different priorities
• The total thread weight cannot be used either, because threads can sleep, so the cores would be loaded unevenly
• A load metric is needed that accounts for both thread weight and CPU utilization.
Balancing of threads between queues
• Another important feature of Java is its ability to load your compiled Java
classes (bytecode) following the start-up of the JVM. Depending on the
size of your application, the class loading process can be intrusive and
significantly degrade the performance of your application under high
load following a fresh restart. This short-term penalty can also be
explained by the fact that the internal JIT compiler has to start over its
optimization work following a restart.
Class Loading
• Profile your application for possible memory leaks using tools such as Plumbr (a Java memory leak detector).
• Performance Tip: Focus your analysis on the biggest Java object accumulation points. It is important to realize that reducing your application's memory footprint will translate into improved performance due to reduced GC activity.
• SET PATH=C:\Tools\plumbr\plumbr\win64;%PATH%
set CATALINA_OPTS=-agentlib:plumbr -javaagent:C:\Tools\plumbr\plumbr\plumbr.jar
Memory leak
• The Eclipse Memory Analyzer is a fast and feature-rich Java heap
analyzer that helps you find memory leaks and reduce memory
consumption.
• Use the Memory Analyzer to analyze production heap dumps with hundreds of millions of objects, quickly calculate the retained sizes of objects, see who is preventing the Garbage Collector from collecting objects, and run a report to automatically extract leak suspects.
• https://www.eclipse.org/mat/
Memory Analyzer (MAT) (#7)
• Memory prices are low and getting lower, and retrieving data from disk
or via a network is still expensive. Caching is certainly one aspect of
application performance we shouldn’t overlook.
• Of course, introducing a stand-alone caching system into the topology
of an application does add complexity to the architecture – so a good
way to start leveraging caching is to make good use of existing caching
capabilities in the libraries and frameworks we’re already using.
Architectural Improvements: Caching
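Before reaching for a stand-alone caching system, a minimal in-process cache needs no extra library at all: LinkedHashMap in access order evicts the least recently used entry. A sketch (the class name and capacity are illustrative):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        super(16, 0.75f, true); // accessOrder = true: get() refreshes an entry
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // evict the least recently used entry
    }
}
```

Note this class is not thread-safe; for concurrent use, wrap it with Collections.synchronizedMap or move to a dedicated cache library.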
• No matter how much hardware we throw at a single instance, at some
point that won’t be enough. Simply put, scaling up has natural
limitations, and when the system hits these – scaling out is the only way
to grow, evolve and simply handle more load.
• Finally, an additional advantage of scaling with the help of a cluster,
beyond pure Java performance – is that adding new nodes also leads to
redundancy and better techniques of dealing with failure, leading to
overall higher availability of the system.
Architectural Improvements: Scaling Out