12/4/2018 Speed bumps ahead
http://localhost:3002/clojurex-2018/?print-pdf 1/75

WHENCE?
NLP Infrastructure Technical Lead @ Grammarly
Clojure, Common Lisp, Java
Services that improve writing of 30 million users (15 million daily)

WHY SHOULD YOU CARE ABOUT PERFORMANCE?
premature optimization is the root of all evil.
— Donald Knuth

WHY SHOULD YOU CARE ABOUT PERFORMANCE?
"We should forget about small efficiencies, say about 97% of the
time: premature optimization is the root of all evil. Yet we should
not pass up our opportunities in that critical 3%." — Donald Knuth

PERFORMANCE OPTIMIZATION FALLACIES
"Hardware is cheap, programmers are expensive."
A.K.A. "Just throw more machines into it."

PERFORMANCE OPTIMIZATION FALLACIES
 3 c5.9xlarge EC2 instances:  $3,351 monthly
10 c5.9xlarge EC2 instances: $11,170 monthly
Is it worth spending 1 person-month to optimize from 10 to 3?
Probably.
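The arithmetic behind that answer can be spelled out at the REPL. This is a minimal sketch that uses only the two monthly figures quoted above; the var names are illustrative, and the cost of a person-month is left for the reader to plug in.

```clojure
;; Monthly EC2 bills quoted on the slide, in US dollars.
(def cost-10-instances 11170)
(def cost-3-instances 3351)

;; Saving from running 3 instances instead of 10.
(def monthly-saving (- cost-10-instances cost-3-instances))
(def yearly-saving (* 12 monthly-saving))

(println "Monthly saving:" monthly-saving) ; prints Monthly saving: 7819
(println "Yearly saving:" yearly-saving)   ; prints Yearly saving: 93828
```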

PERFORMANCE OPTIMIZATION FALLACIES
"Hardware is cheap, programmers are expensive."
A.K.A. "Just throw more machines into it."
"Docker/Kubernetes/microservices/cloud/whatever allows you to
scale horizontally."

PERFORMANCE OPTIMIZATION FALLACIES
There's no such thing as effortless horizontal scaling.
At each next order of magnitude you get new headaches:
More infrastructure (balancers, service discovery, queues, …)
Configuration management
Observability
Deployment story
Debugging story
Complexity of setting up testing environments
A whole bunch of second-order effects
Mental tax
You hire more devops/platform engineers/SREs to deal with this.

WHY SHOULD YOU CARE ABOUT PERFORMANCE?
The ability to distinguish between that 97% and that 3% is crucial to
building effective software.
That ability requires:
Knowledge
Tools
Experience
Experience comes from practice.

WHAT DOES CLOJURE HAVE TO DO WITH ANY OF THIS?

CLOJURE IS FAST
Dynamically compiled language
World-class JVM JIT for free
Data structures with performance in mind
Conservative polymorphism features
Ability to drop down to Java where necessary

CLOJURE IS VERSATILE
The REPL is the best software design tool you can get.
It applies to performance work too.
Hundreds of people work on creating tools for measuring and
improving performance on the JVM.
These tools are easily usable from Clojure.

WAYS TO MEASURE HOW FAST/SLOW SOMETHING IS
1. "Feels slow"
2. Wrist stopwatch
3. (time ...)
4. (time (dotimes [_ 10000] ...))
5. Criterium
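Steps 3 and 4 of this ladder can be sketched with clojure.core alone. The `mean-ns` helper below is hypothetical, written only for illustration; it warms the code up before timing, which is a small part of what Criterium does (Criterium also adds statistical analysis, among other things).

```clojure
(defn mean-ns
  "Runs f `warmup` times untimed so the JIT has a chance to compile it,
  then returns the mean wall-clock time of `n` timed runs, in nanoseconds.
  Hypothetical helper for illustration, not part of any library."
  [f warmup n]
  (dotimes [_ warmup] (f))
  (let [start (System/nanoTime)]
    (dotimes [_ n] (f))
    (/ (double (- (System/nanoTime) start)) n)))

(def s "This gotta be good")

;; Step 3: a one-shot measurement, dominated by noise and cold code.
(time (.substring ^String s 5 18))

;; Step 4, with a warm-up phase: closer to steady-state behaviour.
(println "mean ns:" (mean-ns #(.substring ^String s 5 18) 10000 100000))
```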

REFLECTION
(require '[criterium.core :as crit])
(def s "This gotta be good")
(crit/quick-bench (.substring s 5 18))
;; Execution time mean : 2.760464 µs
(crit/quick-bench (.substring ^String s 5 18))
;; Execution time mean : 14.897897 ns
185x speedup! But why?

REFLECTION
Reflection is Java's introspection mechanism for resolving and
calling the program's building blocks (classes, fields, methods) at
runtime.
In the same spirit as Clojure's resolve, ns-publics, and apply.
The common explanation is "reflection is slow".
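For readers who have not used the Clojure-side analogues, here is a minimal sketch of looking something up by name at runtime and then calling it. It uses only clojure.core; the var names `f` and `result` are illustrative.

```clojure
;; Look a function up by its name at runtime instead of referring
;; to it directly in the source code.
(def f (resolve 'clojure.core/str))   ; a Var, found by symbol

;; Vars are callable, so apply can invoke whatever was resolved.
(def result (apply f ["a" "b" "c"]))
(println result) ; prints abc

;; ns-publics enumerates what a namespace exposes, much as
;; Class.getMethods enumerates what a class exposes.
(println (contains? (ns-publics 'clojure.core) 'str)) ; prints true
```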

REFLECTION
We can use Java Reflection directly from Clojure.
(import 'java.lang.reflect.Method)
(def m (.getDeclaredMethod String "substring"
                           (into-array Class [Integer/TYPE Integer/TYPE])))
;; returns java.lang.reflect.Method object
(crit/quick-bench (.invoke ^Method m s
                           (object-array [(Integer. 5) (Integer. 18)])))
;; Execution time mean : 107.801748 ns
Turns out the reflective call itself is not that slow. Maybe it's the
resolution of the method?
(crit/quick-bench
 (let [^Method m (.getDeclaredMethod
                  String "substring"
                  (into-array Class [Integer/TYPE Integer/TYPE]))]
   (.invoke m s (object-array [(Integer. 5) (Integer. 18)]))))
;; Execution time mean : 648.579085 ns

REFLECTION
What's really going on when Clojure performs a reflective call?
One way is to dig into clojure.lang.Compiler (9k SLOC).
Another way is to use the clj-java-decompiler library.

CLJ-JAVA-DECOMPILER
https://github.com/clojure-goes-fast/clj-java-decompiler
Non-reflective call:
(require '[clj-java-decompiler.core :refer [decompile]])
(decompile (.substring ^String s 5 18))
And you'll get:
Var const__0 = RT.var("slides", "s");
// ...
((String)const__0.getRawRoot()).substring(RT.intCast(5L), RT.intCast(18L));

CLJ-JAVA-DECOMPILER
Reflective call:
(decompile (.substring s 5 18))
Var const__0 = RT.var("slides", "s");
Object const__1 = 5L;
Object const__2 = 18L;
// ...
Reflector.invokeInstanceMethod(const__0.getRawRoot(),
"substring",
new Object[] { const__1, const__2 });

INSIDE CLOJURE/LANG/REFLECTOR.JAVA
// Abridged from clojure/lang/Reflector.java
static Object invokeInstanceMethod(Object target, String methodName,
                                   Object[] args) throws Exception {
    Class c = target.getClass();
    List<Method> methods = getMethods(c, args.length, methodName, false);
    return invokeMatchingMethod(methodName, methods, target, args);
}
static List<Method> getMethods(Class c, int arity, String name, boolean getStatics) {
    ArrayList<Method> methods = new ArrayList<>();
    // Linear scan over every public method of the class.
    for (Method m : c.getMethods())
        if (name.equals(m.getName()))
            methods.add(m);
    return methods;
}
static Object invokeMatchingMethod(String methodName, List<Method> methods,
                                   Object target, Object[] args) throws Exception {
    Method foundm = null;
    // Pick the overload whose parameter types fit the actual arguments.
    for (Method m : methods) {
        Class[] params = m.getParameterTypes();
        if (isCongruent(params, args))
            foundm = m;
    }
    return foundm.invoke(target, args);
}

REFLECTION
On a reflective call, Clojure looks through all methods of the class
linearly, at runtime.
No wonder reflective calls are so slow!
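To get a feel for the size of that linear scan, the same lookup can be reproduced by hand from the REPL. This is a sketch using plain Java reflection; the var names are illustrative, and the total method count is deliberately not shown because it depends on the JDK version.

```clojure
(import 'java.lang.reflect.Method)

;; Every public method of String: the list a reflective call walks through.
(def all-methods
  (let [^Class c String]
    (.getMethods c)))

;; Only the overloads named "substring" survive the name filter.
(def substring-overloads
  (filter (fn [^Method m] (= "substring" (.getName m))) all-methods))

(println "public methods on String:" (count all-methods))
(println "named substring:" (count substring-overloads)) ; prints named substring: 2
```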

WAYS TO COMBAT REFLECTION
Enable *warn-on-reflection*
Use type hints
And occasionally check with clj-java-decompiler.
(set! *warn-on-reflection* true)
(.substring s 5 18)
;; Reflection warning, .../slides.clj:114:12 - call to
;; method substring can't be resolved (target class is unknown).
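A type hint on a function parameter is usually the least intrusive fix. The sketch below uses a hypothetical helper named `middle`; with `*warn-on-reflection*` enabled as shown above, a definition like this should compile without a warning because the target class is known.

```clojure
;; The ^String hint tells the compiler the class of s, so it can emit a
;; direct call to String.substring(int, int) instead of a reflective lookup.
(defn middle
  "Hypothetical helper: characters 5 (inclusive) to 18 (exclusive) of s."
  [^String s]
  (.substring s 5 18))

(println (middle "This gotta be good")) ; prints gotta be good
```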

SHOULD REFLECTION BE WEEDED OUT EVERYWHERE?
There's nothing wrong with having a zero-reflection policy.
But a few stray reflective calls won't hurt if they aren't executed
often.
You should profile to know for sure.

CLJ-ASYNC-PROFILER
The most convenient profiler-as-a-library for Clojure.
https://github.com/clojure-goes-fast/clj-async-profiler
(require '[clj-async-profiler.core :as prof])
(prof/profile
(crit/quick-bench (.substring s 5 18)))

[Flame graph of the reflective (.substring s 5 18) benchmark. Visible frames include criterium/core$execute_expr, clojure/lang/Reflector.invokeInstanceMethod, clojure/lang/Reflector.getMethods, java/lang/Class.getMethods and java/lang/Class.privateGetPublicMethods.]

[Flame graph of a second profile. Visible frames include boot/user$run, boot/user$parse_edn (clojure/edn$read_string, clojure/lang/EdnReader.read, clojure/lang/EdnReader$MapReader) and boot/user$parse_json (cheshire/core$parse_string, cheshire/parse$parse).]

CLJ-ASYNC-PROFILER
Profiler that is controllable from your code.
Instant feedback without leaving the REPL.
Flamegraphs are a great representation.
Intuitive and portable.

BOXING
Boxing means wrapping primitive types into objects.
(let [nums (vec (range 1e6))]
  (crit/quick-bench (reduce + nums)))
;; Execution time mean : 18.384708 ms
(let [^longs nums (into-array Long/TYPE (range 1e6))]
  (crit/quick-bench
   (areduce nums i acc 0
            (+ acc (aget nums i)))))
;; Execution time mean : 971.487253 µs
19x difference, not bad!

BOXING
(decompile
(let [^longs nums (into-array Long/TYPE (range 1e6))]
(areduce nums i acc 0
(+ acc (aget nums i)))))
final Object nums = core$into_array.invokeStatic(
(Object)Long.TYPE, core$range.invokeStatic(100000));
final int lng = ((long[])nums).length;
long i = 0L;
long acc = 0L;
while (i < lng) {
final long n = RT.intCast(i) + 1;
acc = Numbers.add(acc, ((long[])nums)[RT.intCast(i)]);
i = n;
}
return Numbers.num(acc);

WAYS TO COMBAT BOXING
Profile to ensure that boxing is really a problem.
Arrays instead of lists and vectors.
Primitive type hints and casts.
(set! *unchecked-math* :warn-on-boxed)

:WARN-ON-BOXED
(set! *unchecked-math* :warn-on-boxed)
(let [init (fn [] 1)]
(loop [i (init), res (init)]
(if (< i 10)
(recur (inc i) (* res i))
res)))
;; Boxed math warning, ../slides.clj:25:9 -
;; call: clojure.lang.Numbers.lt(Object,long).
;; Boxed math warning, ../slides.clj:26:14 -
;; call: clojure.lang.Numbers.unchecked_inc(Object).
;; Boxed math warning, ../slides.clj:26:22 -
;; call: clojure.lang.Numbers.unchecked_multiply(Object,Object).

:WARN-ON-BOXED
(set! *unchecked-math* :warn-on-boxed)
(let [init (fn [] 1)]
(loop [i (long (init)), res (long (init))]
(if (< i 10)
(recur (inc i) (* res i))
res)))
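When the same loop lives in a named function, primitive hints on the argument vector and the return value do the job of the `long` casts. A minimal sketch, with a hypothetical function name; it mirrors the loop above rather than computing a conventional factorial.

```clojure
(defn product-below
  "Multiplies 1 * 1 * 2 * ... * (n - 1), mirroring the loop above.
  The ^long hints keep n, i and res as primitive longs, so the
  comparison and arithmetic compile to primitive operations."
  ^long [^long n]
  (loop [i 1, res 1]
    (if (< i n)
      (recur (inc i) (* res i))
      res)))

(println (product-below 10)) ; prints 362880
```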

WAYS TO COMBAT BOXING
Profile to ensure that boxing is really a problem.
Arrays instead of lists and vectors.
Primitive type hints.
(set! *unchecked-math* :warn-on-boxed)
clj-java-decompiler

CLJ-JAVA-DECOMPILER
(decompile
(let [init (fn [] 1)]
(loop [i (init), res (init)]
(if (< i 10)
(recur (inc i) (* res i))
res))))
Object init = new slides$fn__17198$init__17199();
Object i = ((IFn)init).invoke();
Object res = ((IFn)init).invoke();
while (Numbers.lt(i, 10L)) {
final Object i2 = Numbers.inc(i);
res = Numbers.multiply(res, i);
i = i2;
}
return res;

CLJ-JAVA-DECOMPILER
(decompile
(let [init (fn [] 1)]
(loop [i (long (init)), res (long (init))]
(if (< i 10)
(recur (inc i) (* res i))
res))))
Object init = new slides$fn__17198$init__17199();
long i = ((IFn)init).invoke();
long res = ((IFn)init).invoke();
while (Numbers.lt(i, 10L)) {
final Object i2 = Numbers.inc(i);
res = Numbers.multiply(res, i);
i = i2;
}
return res;

WAYS TO COMBAT BOXING
Profile to ensure that boxing is really a problem.
Arrays instead of lists and vectors.
Primitive type hints.
(set! *unchecked-math* :warn-on-boxed)
clj-java-decompiler
Write number crunching in Java.

WRITE JAVA IN STYLE
Compile Java code without leaving or restarting your REPL.
Use new classes immediately in your Clojure code.
You still have access to all Clojure development tools.
https://github.com/ztellman/virgil

INSUFFICIENT MEMORY
If you don't specify -Xmx, the JVM will start with the default maximum heap size.
Usually, it's 1/4 of the physical RAM.
$ java -XX:+PrintFlagsFinal -version | grep MaxHeapSize
uintx MaxHeapSize := 2353004544
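The same number can be read from inside a running REPL, which is handy when you are not sure which flags the process was started with. A minimal sketch using `Runtime.maxMemory`; the var names are illustrative.

```clojure
;; Runtime.maxMemory reports, in bytes, the maximum amount of memory
;; the running JVM will attempt to use for the heap.
(def max-heap-bytes (.maxMemory (Runtime/getRuntime)))
(def max-heap-mb (quot max-heap-bytes (* 1024 1024)))

(println "Max heap:" max-heap-mb "MB")
```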

INSUFFICIENT MEMORY
If there's not enough free memory, GC might get too busy and slow
you down.
$ clj -J-Xmx2g
user=> (time
         (dotimes [_ 10]
           (reduce + (vec (repeat 5e7 1)))))
;; Elapsed time: 17084.184121 msecs
user=> (def live-set (byte-array 1.3e9))
user=> (time
         (dotimes [_ 10]
           (reduce + (vec (repeat 5e7 1)))))
;; Elapsed time: 125614.873082 msecs
Same code, 7x slower without any apparent reason.

WAYS TO DETECT MEMORY SHORTAGE
(In development) VisualVM

VISUALVM
Normal: [VisualVM screenshot]
GC is overworked: [VisualVM screenshot]

WAYS TO DETECT MEMORY SHORTAGE
(In development) VisualVM
(In production) VisualVM over JMX, jstat, …
clj-memory-meter to understand what occupies memory.
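The counters these tools read are also exposed in-process through the standard `java.lang.management` API, so a quick check from the REPL is possible too. This is a sketch; `gc-stats` is a hypothetical helper, and collector names vary with the JVM and the GC in use.

```clojure
(import '(java.lang.management ManagementFactory GarbageCollectorMXBean))

(defn gc-stats
  "Returns one map per garbage collector with its cumulative collection
  count and total collection time in milliseconds since JVM start.
  Hypothetical helper for illustration."
  []
  (mapv (fn [^GarbageCollectorMXBean gc]
          {:name    (.getName gc)
           :count   (.getCollectionCount gc)
           :time-ms (.getCollectionTime gc)})
        (ManagementFactory/getGarbageCollectorMXBeans)))

;; Call it before and after a workload: a large jump in :time-ms
;; relative to the elapsed time suggests the GC is overworked.
(doseq [stat (gc-stats)]
  (println stat))
```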

CLJ-MEMORY-METER
Reports how much heap an object consumes.
https://github.com/clojure-goes-fast/clj-memory-meter
(require '[clj-memory-meter.core :as mm])
(mm/measure "Hello, world!")
;; "72 B"
(mm/measure (reduce #(assoc %1 %2 (str %2)) {} (range 100)))
;; "9.6 KB"
(mm/measure (vec (repeat 5e7 1)))
;; "258.4 MB"

IMMUTABILITY
We love immutability, but sometimes, it is unnecessary.
(crit/quick-bench
(let [obj (Object.)]
(loop [i 0, res []]
(if (< i 1e6)
(recur (inc i) (conj res obj))
res))))
;; Execution time mean : 31.455536 ms

WAYS TO COMBAT IMMUTABILITY
Profiler
Transients
(crit/quick-bench
 (let [obj (Object.)]
   (loop [i 0, res (transient [])]
     (if (< i 1e6)
       (recur (inc i) (conj! res obj))
       (persistent! res)))))
;; Execution time mean : 14.115719 ms
2.2x speedup.

WAYS TO COMBAT IMMUTABILITY
Profiler
Transients
Mutable Java collections
(import 'java.util.ArrayList)
(crit/quick-bench
 (let [obj (Object.)
       res (ArrayList.)]
   (loop [i 0]
     (when (< i 1e6)
       (.add res obj)
       (recur (inc i))))
   res))
;; Execution time mean : 6.344132 ms
5x speedup.

CAVEAT EMPTOR
If you need the resulting collection to be a Clojure structure,
transients are more efficient than Java classes.
(crit/quick-bench
(let [obj (Object.)
res (ArrayList.)]
(loop [i 0]
(when (< i 1e6)
(.add res obj)
(recur (inc i))))
(vec res)))
;; Execution time mean : 19.435359 ms
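When the result has to be a Clojure vector anyway, `into` is often the simplest route, since it builds editable collections such as vectors through a transient internally. A small sketch comparing it with a hand-written transient loop; the var names are illustrative and no timings are claimed.

```clojure
(def obj (Object.))

;; Hand-written transient loop, in the style of the earlier slide.
(def by-hand
  (loop [i 0, res (transient [])]
    (if (< i 1000)
      (recur (inc i) (conj! res obj))
      (persistent! res))))

;; into uses a transient under the hood for vectors,
;; so no explicit conj!/persistent! is needed.
(def via-into (into [] (repeat 1000 obj)))

(println (= by-hand via-into)) ; prints true
(println (vector? via-into))   ; prints true
```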

LAZINESS
Increases allocation pressure -> more work for GC
Worse memory locality
Harder to debug and profile

LAZINESS
Everyone did this at least once in their career:
(time (dotimes [_ 1e6]
        (map inc (range 1e6))))
;; Elapsed time: 30.931708 msecs
Wow, Clojure is fast! /s
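What went wrong in that measurement can be checked directly: `map` returns an unrealized lazy sequence, so nothing was computed inside `time`. A minimal sketch with a smaller range; the var name `xs` is illustrative.

```clojure
;; map returns immediately: no element has been computed yet.
(def xs (map inc (range 1000)))
(println (realized? xs)) ; prints false

;; doall walks the whole sequence, forcing every element.
(doall xs)
(println (realized? xs)) ; prints true
(println (last xs))      ; prints 1000

;; So an honest measurement has to force the sequence inside time:
(time (doall (map inc (range 1e6))))
```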

LAZINESS
(defn burn-cpu []
(let [start (System/nanoTime)]
(loop [res 0]
(if (< (- (System/nanoTime) start) 1e9)
(recur (inc res))
res))))
(prof/profile
(let [nums (map (fn [_] (burn-cpu)) (range 10))]
(reduce + nums)))

LAZINESS
[Flame graph of the profile above. Visible frames include clojure/core$reduce, clojure/core/protocols$seq_reduce, clojure/lang/LazySeq.seq, clojure/lang/LazySeq.sval, clojure/core$map$fn__5587 and slides$burn_cpu.]

WAYS TO COMBAT LAZINESS
doall, mapv, filterv
Transducers
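A short sketch of the eager alternatives listed above, using only clojure.core. The var names are illustrative; each form does its work at the call site instead of deferring it to whoever consumes the result.

```clojure
;; mapv and filterv are eager and return vectors.
(def incremented (mapv inc (range 5)))
(def odds (filterv odd? (range 10)))

;; A transducer pipeline: map and filter are fused into one pass,
;; with no intermediate lazy sequences.
(def evens (into [] (comp (map inc) (filter even?)) (range 10)))

;; transduce reduces eagerly through the same kind of pipeline.
(def total (transduce (map inc) + 0 (range 10)))

(println incremented) ; prints [1 2 3 4 5]
(println odds)        ; prints [1 3 5 7 9]
(println evens)       ; prints [2 4 6 8 10]
(println total)       ; prints 55
```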

TOP SPEED BUMPS SO FAR
Reflection
Boxing
Insufficient memory
Immutability
Laziness
Redundant allocations
Coarsely-synchronized data structures
Context switching overhead
…
GC pauses
Megamorphic callsites
Heap fragmentation
…
Cache incoherence
TLB misses (page walks)
Branch misprediction
NUMA foreign access
…
Magnetic disturbances
CPU overheating
…

PERFORMANCE IS HARD
Abstractions are constantly leaking.
The more you learn, the less you know.
Your assumptions are constantly getting invalidated.

PERFORMANCE IS FUN AND USEFUL
You learn things behind those leaky abstractions.
You get a more holistic view of the system.
You save money and the environment.

PERFORMANCE PROBLEMS ARE NOT UNIQUE TO CLOJURE
But we are in a great position to solve them.
There is plenty of prior art, especially for the JVM.
Tools, blog posts, experiments, reports.
The REPL allows us to use all of this much more easily.

TOOLS
Criterium: https://github.com/hugoduncan/criterium
clj-java-decompiler: https://github.com/clojure-goes-fast/clj-java-decompiler
clj-async-profiler: https://github.com/clojure-goes-fast/clj-async-profiler
clj-memory-meter: https://github.com/clojure-goes-fast/clj-memory-meter
virgil: https://github.com/ztellman/virgil
VisualVM: https://visualvm.github.io
JMH: http://clojure-goes-fast.com/blog/using-jmh-with-clojure-part1/

RESOURCES
http://clojure-goes-fast.com
https://groups.google.com/forum/#!forum/mechanical-sympathy
Aleksey Shipilëv's blog: https://shipilev.net/
Nitsan Wakart's blog: http://psy-lob-saw.blogspot.com/

INSTEAD OF A CONCLUSION
First, make it work.
Then, make it right.
Then, make it fast.
But please, don't stop at the first.
(into [] (map answer) questions)
Integris Security - Hacking With Glue ℠
 
#MBLTdev: Разработка первоклассных SDK для Android (Twitter)
#MBLTdev: Разработка первоклассных SDK для Android (Twitter)#MBLTdev: Разработка первоклассных SDK для Android (Twitter)
#MBLTdev: Разработка первоклассных SDK для Android (Twitter)
 
DDD on example of Symfony (SfCampUA14)
DDD on example of Symfony (SfCampUA14)DDD on example of Symfony (SfCampUA14)
DDD on example of Symfony (SfCampUA14)
 
JUDCon London 2011 - Bin packing with drools planner by example
JUDCon London 2011 - Bin packing with drools planner by exampleJUDCon London 2011 - Bin packing with drools planner by example
JUDCon London 2011 - Bin packing with drools planner by example
 
Open Source XMPP for Cloud Services
Open Source XMPP for Cloud ServicesOpen Source XMPP for Cloud Services
Open Source XMPP for Cloud Services
 
MySQL-Performance Schema- What's new in MySQL-5.7 DMRs
MySQL-Performance Schema- What's new in MySQL-5.7 DMRsMySQL-Performance Schema- What's new in MySQL-5.7 DMRs
MySQL-Performance Schema- What's new in MySQL-5.7 DMRs
 
Performance schema in_my_sql_5.6_pluk2013
Performance schema in_my_sql_5.6_pluk2013Performance schema in_my_sql_5.6_pluk2013
Performance schema in_my_sql_5.6_pluk2013
 
Enterprise application performance - Understanding & Learnings
Enterprise application performance - Understanding & LearningsEnterprise application performance - Understanding & Learnings
Enterprise application performance - Understanding & Learnings
 
Build, train and deploy your ML models with Amazon Sage Maker
Build, train and deploy your ML models with Amazon Sage MakerBuild, train and deploy your ML models with Amazon Sage Maker
Build, train and deploy your ML models with Amazon Sage Maker
 
Whatever it takes - Fixing SQLIA and XSS in the process
Whatever it takes - Fixing SQLIA and XSS in the processWhatever it takes - Fixing SQLIA and XSS in the process
Whatever it takes - Fixing SQLIA and XSS in the process
 
Automated scaling of microservice stacks for JavaEE applications - JEEConf 2017
Automated scaling of microservice stacks for JavaEE applications - JEEConf 2017Automated scaling of microservice stacks for JavaEE applications - JEEConf 2017
Automated scaling of microservice stacks for JavaEE applications - JEEConf 2017
 
JEEconf 2017
JEEconf 2017JEEconf 2017
JEEconf 2017
 
Deep learning
Deep learningDeep learning
Deep learning
 
Machine learning key to your formulation challenges
Machine learning key to your formulation challengesMachine learning key to your formulation challenges
Machine learning key to your formulation challenges
 
Innovative Specifications for Better Performance Logging and Monitoring
Innovative Specifications for Better Performance Logging and MonitoringInnovative Specifications for Better Performance Logging and Monitoring
Innovative Specifications for Better Performance Logging and Monitoring
 
Building Twitter's SDKs for Android
Building Twitter's SDKs for AndroidBuilding Twitter's SDKs for Android
Building Twitter's SDKs for Android
 
Monitoring using Prometheus and Grafana
Monitoring using Prometheus and GrafanaMonitoring using Prometheus and Grafana
Monitoring using Prometheus and Grafana
 
10 ways to make your code rock
10 ways to make your code rock10 ways to make your code rock
10 ways to make your code rock
 
Nhibernate Part 2
Nhibernate   Part 2Nhibernate   Part 2
Nhibernate Part 2
 

Recently uploaded

Xen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdfXen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdfStefano Stabellini
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Matt Ray
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Angel Borroy López
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...OnePlan Solutions
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsAhmed Mohamed
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityNeo4j
 
How to Track Employee Performance A Comprehensive Guide.pdf
How to Track Employee Performance A Comprehensive Guide.pdfHow to Track Employee Performance A Comprehensive Guide.pdf
How to Track Employee Performance A Comprehensive Guide.pdfLivetecs LLC
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Andreas Granig
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxTier1 app
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmSujith Sukumaran
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureDinusha Kumarasiri
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作qr0udbr0
 
PREDICTING RIVER WATER QUALITY ppt presentation
PREDICTING  RIVER  WATER QUALITY  ppt presentationPREDICTING  RIVER  WATER QUALITY  ppt presentation
PREDICTING RIVER WATER QUALITY ppt presentationvaddepallysandeep122
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Natan Silnitsky
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanyChristoph Pohl
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWave PLM
 

Recently uploaded (20)

Xen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdfXen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdf
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML Diagrams
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
How to Track Employee Performance A Comprehensive Guide.pdf
How to Track Employee Performance A Comprehensive Guide.pdfHow to Track Employee Performance A Comprehensive Guide.pdf
How to Track Employee Performance A Comprehensive Guide.pdf
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalm
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with Azure
 
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort ServiceHot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
 
PREDICTING RIVER WATER QUALITY ppt presentation
PREDICTING  RIVER  WATER QUALITY  ppt presentationPREDICTING  RIVER  WATER QUALITY  ppt presentation
PREDICTING RIVER WATER QUALITY ppt presentation
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
 
2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need It
 
Advantages of Odoo ERP 17 for Your Business
Advantages of Odoo ERP 17 for Your BusinessAdvantages of Odoo ERP 17 for Your Business
Advantages of Odoo ERP 17 for Your Business
 

Speed bumps ahead

  • 1. 12/4/2018 Speed bumps ahead http://localhost:3002/clojurex-2018/?print-pdf 1/75 1 . 1
  • 2. WHENCE?
    NLP Infrastructure Technical Lead @ Grammarly
    Clojure, Common Lisp, Java
    Services that improve the writing of 30 million users (15 million daily)
  • 3. WHY SHOULD YOU CARE ABOUT PERFORMANCE?
    premature optimization is the root of all evil. (Donald Knuth)
  • 4. WHY SHOULD YOU CARE ABOUT PERFORMANCE?
    "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%." (Donald Knuth)
  • 5. PERFORMANCE OPTIMIZATION FALLACIES
    "Hardware is cheap, programmers are expensive." A.K.A. "Just throw more machines at it."
  • 6. PERFORMANCE OPTIMIZATION FALLACIES
     3 c5.9xlarge EC2 instances: $3,351 monthly
    10 c5.9xlarge EC2 instances: $11,170 monthly
    Is it worth spending 1 person-month to optimize from 10 to 3? Probably.
  • 7. PERFORMANCE OPTIMIZATION FALLACIES
    "Hardware is cheap, programmers are expensive." A.K.A. "Just throw more machines at it."
    "Docker/Kubernetes/microservices/cloud/whatever allows you to scale horizontally."
  • 8. PERFORMANCE OPTIMIZATION FALLACIES
    There's no such thing as effortless horizontal scaling. With each order of magnitude you get new headaches:
      • More infrastructure (balancers, service discovery, queues, …)
      • Configuration management
      • Observability
      • Deployment story
      • Debugging story
      • Complexity of setting up testing environments
      • A whole bunch of second-order effects
      • Mental tax
    You hire more devops/platform engineers/SREs to deal with this.
  • 9. PERFORMANCE OPTIMIZATION FALLACIES
    "Hardware is cheap, programmers are expensive." A.K.A. "Just throw more machines at it."
    "Docker/Kubernetes/microservices/cloud/whatever allow us to scale horizontally."
  • 10. WHY SHOULD YOU CARE ABOUT PERFORMANCE?
    The ability to distinguish between that 97% and that 3% is crucial in building effective software. That ability requires:
      • Knowledge
      • Tools
      • Experience
    Experience comes from practice.
  • 11. WHAT DOES CLOJURE HAVE TO DO WITH ANY OF THIS?
  • 12. CLOJURE IS FAST
      • Dynamically compiled language
      • World-class JVM JIT for free
      • Data structures designed with performance in mind
      • Conservative polymorphism features
      • Ability to drop down to Java where necessary
  • 13. CLOJURE IS VERSATILE
    The REPL is the best software design tool you can get. That applies to performance work too.
    Hundreds of people work on creating tools for measuring and improving performance on the JVM. These tools are easily usable from Clojure.
  • 14. WAYS TO MEASURE HOW FAST/SLOW SOMETHING IS
      1. "Feels slow"
      2. Wrist stopwatch
      3. (time ...)
      4. (time (dotimes [_ 10000] ...))
      5. Criterium
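To make step 4 of the list above concrete, here is a minimal Java sketch of a hand-rolled timing loop, the JVM-level equivalent of `(time (dotimes ...))`. The class name `NaiveTimer` and its parameters are hypothetical, chosen only for illustration. It averages over many iterations after a warm-up phase, but it does none of the statistical work (outlier detection, GC accounting, confidence intervals) that a library such as Criterium performs, so treat its numbers as rough.

```java
import java.util.function.Supplier;

class NaiveTimer {
    // Volatile sink so the JIT cannot discard the benchmarked call as dead code.
    static volatile Object sink;

    // Mean nanoseconds per call over `iterations` runs, after `warmup` untimed runs.
    static double meanNanos(Supplier<Object> body, int warmup, int iterations) {
        for (int i = 0; i < warmup; i++) {
            sink = body.get();
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink = body.get();
        }
        long elapsed = System.nanoTime() - start;
        return (double) elapsed / iterations;
    }

    public static void main(String[] args) {
        String s = "This gotta be good";
        double ns = meanNanos(() -> s.substring(5, 18), 100_000, 1_000_000);
        System.out.printf("mean: %.1f ns per call%n", ns);
    }
}
```

A single mean hides variance between runs, which is the main reason to prefer a dedicated benchmarking library once the numbers matter.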
  • 15. REFLECTION
  • 16. REFLECTION
      (require '[criterium.core :as crit])
      (def s "This gotta be good")

      (crit/quick-bench (.substring s 5 18))
      ;; Execution time mean : 2.760464 µs

      (crit/quick-bench (.substring ^String s 5 18))
      ;; Execution time mean : 14.897897 ns
    185x speedup! But why?
  • 17. REFLECTION
    Reflection is Java's introspection mechanism for resolving and calling the program's building blocks (classes, fields, methods) at runtime. It is in the same spirit as Clojure's resolve, ns-publics, apply.
    The common explanation is "reflection is slow".
  • 18. REFLECTION
    We can use Java Reflection directly from Clojure.
      (import 'java.lang.reflect.Method)

      (def m (.getDeclaredMethod String "substring"
                                 (into-array Class [Integer/TYPE Integer/TYPE])))
      ;; returns java.lang.reflect.Method object

      (crit/quick-bench
       (.invoke ^Method m s (object-array [(Integer. 5) (Integer. 18)])))
      ;; Execution time mean : 107.801748 ns
    Turns out the reflective call itself is not that slow. Maybe it's the resolution of the method?
      (crit/quick-bench
       (let [^Method m (.getDeclaredMethod String "substring"
                                           (into-array Class [Integer/TYPE Integer/TYPE]))]
         (.invoke m s (object-array [(Integer. 5) (Integer. 18)]))))
      ;; Execution time mean : 648.579085 ns
  • 19. REFLECTION
    What's really going on when Clojure performs a reflective call?
    One way is to dig into clojure.lang.Compiler (9k SLOC).
    Another way is to use the clj-java-decompiler library.
  • 20. CLJ-JAVA-DECOMPILER
    https://github.com/clojure-goes-fast/clj-java-decompiler
    Non-reflective call:
      (require '[clj-java-decompiler.core :refer [decompile]])
      (decompile (.substring ^String s 5 18))
    And you'll get:
      Var const__0 = RT.var("slides", "s");
      // ...
      ((String)const__0.getRawRoot()).substring(RT.intCast(5L), RT.intCast(18L));
  • 21. CLJ-JAVA-DECOMPILER
    Reflective call:
      (decompile (.substring s 5 18))

      Var const__0 = RT.var("slides", "s");
      Object const__1 = 5L;
      Object const__2 = 18L;
      // ...
      Reflector.invokeInstanceMethod(const__0.getRawRoot(), "substring",
                                     new Object[] { const__1, const__2 });
  • 22. INSIDE CLOJURE/LANG/REFLECTOR.JAVA
      // Simplified excerpt; exception handling omitted.
      static Object invokeInstanceMethod(Object target, String methodName, Object[] args) {
          Class c = target.getClass();
          List<Method> methods = getMethods(c, args.length, methodName, false);
          return invokeMatchingMethod(methodName, methods, target, args);
      }

      static List<Method> getMethods(Class c, int arity, String name, boolean getStatics) {
          ArrayList<Method> methods = new ArrayList<>();
          for (Method m : c.getMethods())
              if (name.equals(m.getName()))
                  methods.add(m);
          return methods;
      }

      static Object invokeMatchingMethod(String methodName, List<Method> methods,
                                         Object target, Object[] args) {
          Method foundm = null;
          for (Method m : methods) {
              Class[] params = m.getParameterTypes();
              if (isCongruent(params, args))
                  foundm = m;
          }
          return foundm.invoke(target, args);
      }
  • 23. REFLECTION
    On a reflective call, Clojure looks through all methods of the class linearly, at runtime.
    No wonder reflective calls are so slow!
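The difference between resolving a method on every call and resolving it once can be sketched in plain Java. This is an illustrative sketch, not Clojure's actual Reflector code: the class `ReflectiveLookup` and its methods are hypothetical, and the lookup is deliberately simplified to match by name and argument count only, without the argument-type check that the real code performs.

```java
import java.lang.reflect.Method;

class ReflectiveLookup {
    // Per-call lookup: scan every public method of the target's class, then invoke.
    // Simplification (assumption): the first name-and-arity match wins.
    static Object invokeByName(Object target, String name, Object... args) {
        try {
            for (Method m : target.getClass().getMethods()) {
                if (m.getName().equals(name) && m.getParameterCount() == args.length) {
                    return m.invoke(target, args);
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
        throw new IllegalArgumentException(
            "No method " + name + " taking " + args.length + " arguments");
    }

    // One-time resolution: the returned Method can be stored and reused across calls.
    static Method resolveOnce(Class<?> c, String name, Class<?>... paramTypes) {
        try {
            return c.getMethod(name, paramTypes);
        } catch (NoSuchMethodException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        String s = "This gotta be good";
        // Linear scan on every call.
        System.out.println(invokeByName(s, "substring", 5, 18));
        // Resolve once, then only pay for Method.invoke.
        Method substring = resolveOnce(String.class, "substring", int.class, int.class);
        System.out.println(substring.invoke(s, 5, 18));
    }
}
```

Both calls return the same substring; only the first repeats the method scan each time, which is the cost a type hint lets the Clojure compiler avoid entirely.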
  • 24. WAYS TO COMBAT REFLECTION
      • Enable *warn-on-reflection*
      • Use type hints
      • And occasionally check with clj-java-decompiler.
      (set! *warn-on-reflection* true)
      (.substring s 5 18)
      ;; Reflection warning, .../slides.clj:114:12 - call to
      ;; method substring can't be resolved (target class is unknown).
  • 25. SHOULD REFLECTION BE WEEDED OUT EVERYWHERE?
    There's nothing wrong with having a zero-reflection policy.
    But a few stray reflective calls won't hurt if they aren't called often.
    You should profile to know for sure.
  • 26. CLJ-ASYNC-PROFILER
    The most convenient profiler-as-a-library for Clojure.
    https://github.com/clojure-goes-fast/clj-async-profiler
      (require '[clj-async-profiler.core :as prof])
      (prof/profile
       (crit/quick-bench (.substring s 5 18)))
  • 27-28. [Flame graph: profile of the reflective (.substring s 5 18) benchmark run under criterium from an nREPL session. Visible frames include clojure/lang/Reflector.invokeInstanceMethod, clojure/lang/Reflector.getMethods, java/lang/Class.getMethods and java/lang/Class.privateGetPublicMethods.]
  • 29-30. [Flame graph: profile of boot/user$run from an nREPL session. Visible frames include boot/user$parse_edn (clojure/edn$read_string, clojure/lang/EdnReader.readDelimitedList, clojure/lang/EdnReader$MapReader.invoke) and boot/user$parse_json (cheshire/core$parse_string, cheshire/parse$parse).]
  • 31. 12/4/2018 Speed bumps ahead http://localhost:3002/clojurex-2018/?print-pdf 31/75 CLJ-ASYNC-PROFILERCLJ-ASYNC-PROFILER Profiler that is controllable from your code. Instant feedback without leaving the REPL. Flamegraphs are a great representation. Intuitive and portable. 29 . 1
  • 32. 12/4/2018 Speed bumps ahead http://localhost:3002/clojurex-2018/?print-pdf 32/75 BOXINGBOXING 30 . 1
  • 33. 12/4/2018 Speed bumps ahead http://localhost:3002/clojurex-2018/?print-pdf 33/75 BOXINGBOXING 31 . 1
  • 34. 12/4/2018 Speed bumps ahead http://localhost:3002/clojurex-2018/?print-pdf 34/75 BOXINGBOXING Boxing means wrapping primitive types into objects. 19x difference — not bad! (let [nums (vec (range 1e6))] (crit/quick-bench (reduce + nums))) ;; Execution time mean : 18.384708 ms (let [^longs nums (into-array Long/TYPE (range 1e6))] (crit/quick-bench (areduce nums i acc 0 (+ acc (aget nums i))))) ;; Execution time mean : 971.487253 µs 32 . 1
  • 35. 12/4/2018 Speed bumps ahead http://localhost:3002/clojurex-2018/?print-pdf 35/75 BOXINGBOXING (decompile (let [^longs nums (into-array Long/TYPE (range 1e6))] (areduce nums i acc 0 (+ acc (aget nums i))))) final Object nums = core$into_array.invokeStatic( (Object)Long.TYPE, core$range.invokeStatic(100000)); final int lng = ((long[])nums).length; long i = 0L; long acc = 0L; while (i < lng) { final long n = RT.intCast(i) + 1; acc = Numbers.add(acc, ((long[])nums)[RT.intCast(i)]); i = n; } return Numbers.num(acc); 33 . 1
  • 36. 12/4/2018 Speed bumps ahead http://localhost:3002/clojurex-2018/?print-pdf 36/75 WAYS TO COMBAT BOXINGWAYS TO COMBAT BOXING Profile to ensure that boxing is really a problem. Arrays instead of lists and vectors. Primitive type hints and casts. (set! *unchecked-math* :warn-on-boxed) 34 . 1
  • 37. 12/4/2018 Speed bumps ahead http://localhost:3002/clojurex-2018/?print-pdf 37/75 :WARN-ON-BOXED:WARN-ON-BOXED (set! *unchecked-math* :warn-on-boxed) (let [init (fn [] 1)] (loop [i (init), res (init)] (if (< i 10) (recur (inc i) (* res i)) res))) ;; Boxed math warning, ../slides.clj:25:9 - ;; call: clojure.lang.Numbers.lt(Object,long). ;; Boxed math warning, ../slides.clj:26:14 - ;; call: clojure.lang.Numbers.unchecked_inc(Object). ;; Boxed math warning, ../slides.clj:26:22 - ;; call: clojure.lang.Numbers.unchecked_multiply(Object,Object). 35 . 1
  • 38. 12/4/2018 Speed bumps ahead http://localhost:3002/clojurex-2018/?print-pdf 38/75 :WARN-ON-BOXED:WARN-ON-BOXED (set! *unchecked-math* :warn-on-boxed) (let [init (fn [] 1)] (loop [i (long (init)), res (long (init))] (if (< i 10) (recur (inc i) (* res i)) res))) 36 . 1
  • 39. 12/4/2018 Speed bumps ahead http://localhost:3002/clojurex-2018/?print-pdf 39/75 WAYS TO COMBAT BOXINGWAYS TO COMBAT BOXING Profile to ensure that boxing is really a problem. Arrays instead of lists and vectors. Primitive type hints. (set! *unchecked-math* :warn-on-boxed) clj-java-decompiler 37 . 1
  • 40. CLJ-JAVA-DECOMPILER
      (decompile
        (let [init (fn [] 1)]
          (loop [i (init), res (init)]
            (if (< i 10)
              (recur (inc i) (* res i))
              res))))

      Object init = new slides$fn__17198$init__17199();
      Object i = ((IFn)init).invoke();
      Object res = ((IFn)init).invoke();
      while (Numbers.lt(i, 10L)) {
          final Object i2 = Numbers.inc(i);
          res = Numbers.multiply(res, i);
          i = i2;
      }
      return res;
  • 41. CLJ-JAVA-DECOMPILER
      (decompile
        (let [init (fn [] 1)]
          (loop [i (long (init)), res (long (init))]
            (if (< i 10)
              (recur (inc i) (* res i))
              res))))

      Object init = new slides$fn__17198$init__17199();
      long i = ((IFn)init).invoke();
      long res = ((IFn)init).invoke();
      while (Numbers.lt(i, 10L)) {
          final Object i2 = Numbers.inc(i);
          res = Numbers.multiply(res, i);
          i = i2;
      }
      return res;
  • 42. WAYS TO COMBAT BOXING
    Profile to ensure that boxing is really a problem.
    Arrays instead of lists and vectors.
    Primitive type hints.
    (set! *unchecked-math* :warn-on-boxed)
    clj-java-decompiler
    Write number crunching in Java.
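The "primitive type hints" item also applies at function boundaries, not only inside `loop`. A minimal sketch, assuming a hypothetical helper `sum-below` that is not part of the talk: `^long` before the argument vector declares a primitive return type, and `^long` on `n` declares a primitive parameter, so the loop below should run on unboxed longs.

```clojure
;; Hypothetical helper, not from the talk: sums the integers in [0, n).
;; ^long before the arg vector declares a primitive long return value;
;; ^long on n declares a primitive long parameter.
(defn sum-below ^long [^long n]
  ;; i and acc start from long literals, so they are primitive loop locals.
  (loop [i 0, acc 0]
    (if (< i n)
      (recur (inc i) (+ acc i))
      acc)))

(sum-below 10) ;; => 45
```

Decompiling such a function with clj-java-decompiler, as on the slides above, is one way to check whether the arithmetic really stays primitive.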
  • 43. WRITE JAVA IN STYLE
    Compile Java code without leaving or restarting your REPL.
    Use new classes immediately in your Clojure code.
    You still have access to all Clojure development tools.
    https://github.com/ztellman/virgil
  • 44. INSUFFICIENT MEMORY
  • 45. INSUFFICIENT MEMORY
    If you don't specify -Xmx, the JVM starts with a default maximum heap size. Usually, it's 1/4 of available RAM.
      $ java -XX:+PrintFlagsFinal -version | grep MaxHeapSize
      uintx MaxHeapSize := 2353004544
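The same limit can be read from inside a running REPL. A minimal sketch, assuming a hypothetical helper name `max-heap-mb` (not from the talk); it relies only on `java.lang.Runtime.maxMemory`, which reports the maximum amount of memory the JVM will attempt to use.

```clojure
;; Hypothetical helper, not from the talk.
(defn max-heap-mb
  "Returns the maximum heap size the JVM will attempt to use, in megabytes."
  []
  (quot (.maxMemory (Runtime/getRuntime)) (* 1024 1024)))

;; The value depends on the machine and on -Xmx, so no expected output is shown.
(max-heap-mb)
```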
  • 46. INSUFFICIENT MEMORY
    If there's not enough free memory, GC might get too busy and slow you down.
    Same code, 7x slower without any apparent reason.
      $ clj -J-Xmx2g
      user=> (time (dotimes [_ 10] (reduce + (vec (repeat 5e7 1)))))
      ;; Elapsed time: 17084.184121 msecs

      user=> (def live-set (byte-array 1.3e9))
      user=> (time (dotimes [_ 10] (reduce + (vec (repeat 5e7 1)))))
      ;; Elapsed time: 125614.873082 msecs
  • 47. WAYS TO DETECT MEMORY SHORTAGE
    (In development) VisualVM
  • 48. VISUALVM
    Normal: [VisualVM screenshot]
  • 49. VISUALVM
    GC is overworked: [VisualVM screenshot]
  • 50. WAYS TO DETECT MEMORY SHORTAGE
    (In development) VisualVM
    (In production) VisualVM over JMX, jstat, …
    clj-memory-meter to understand what occupies memory.
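The counters that VisualVM and jstat display are also exposed through the standard `java.lang.management` beans, so a rough check is possible straight from the REPL. A minimal sketch, assuming a hypothetical helper `gc-stats` (not from the talk); the collector names it returns depend on which GC the JVM runs with.

```clojure
(import '(java.lang.management ManagementFactory GarbageCollectorMXBean))

;; Hypothetical helper, not from the talk.
(defn gc-stats
  "Returns one map per garbage collector with its name, the number of
  collections so far, and the accumulated collection time in milliseconds."
  []
  (mapv (fn [^GarbageCollectorMXBean gc]
          {:name    (.getName gc)
           :count   (.getCollectionCount gc)
           :time-ms (.getCollectionTime gc)})
        (ManagementFactory/getGarbageCollectorMXBeans)))

;; Output varies by JVM and workload, so none is shown here.
(gc-stats)
```

Calling it before and after a workload and comparing the growth of `:time-ms` with the elapsed wall-clock time gives a crude estimate of how much of the run went to garbage collection.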
  • 51. CLJ-MEMORY-METER
    Reports how much heap an object consumes.
    https://github.com/clojure-goes-fast/clj-memory-meter
      (require '[clj-memory-meter.core :as mm])

      (mm/measure "Hello, world!")
      ;; "72 B"

      (mm/measure (reduce #(assoc %1 %2 (str %2)) {} (range 100)))
      ;; "9.6 KB"

      (mm/measure (vec (repeat 5e7 1)))
      ;; "258.4 MB"
  • 52. IMMUTABILITY
  • 53. IMMUTABILITY
    We love immutability, but sometimes, it is unnecessary.
      (crit/quick-bench
        (let [obj (Object.)]
          (loop [i 0, res []]
            (if (< i 1e6)
              (recur (inc i) (conj res obj))
              res))))
      ;; Execution time mean : 31.455536 ms
  • 54. WAYS TO COMBAT IMMUTABILITY
    Profiler
    Transients
    2.2x speedup.
      (crit/quick-bench
        (let [obj (Object.)]
          (loop [i 0, res (transient [])]
            (if (< i 1e6)
              (recur (inc i) (conj! res obj))
              (persistent! res)))))
      ;; Execution time mean : 14.115719 ms
  • 55. WAYS TO COMBAT IMMUTABILITY
    Profiler
    Transients
    Mutable Java collections
    5x speedup.
      (import 'java.util.ArrayList)

      (crit/quick-bench
        (let [obj (Object.)
              res (ArrayList.)]
          (loop [i 0]
            (when (< i 1e6)
              (.add res obj)
              (recur (inc i))))
          res))
      ;; Execution time mean : 6.344132 ms
  • 56. CAVEAT EMPTOR
    If you need the resulting collection to be a Clojure structure, transients are more efficient than Java classes.
      (import 'java.util.ArrayList)

      (crit/quick-bench
        (let [obj (Object.)
              res (ArrayList.)]
          (loop [i 0]
            (when (< i 1e6)
              (.add res obj)
              (recur (inc i))))
          (vec res)))
      ;; Execution time mean : 19.435359 ms
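When the result has to be a Clojure vector anyway, `into` is often the shortest route to the transient path, because `clojure.core/into` uses transients internally for editable collections such as vectors. A minimal sketch with hypothetical function names (not from the talk) that builds the same vector both ways; no timings are claimed for it.

```clojure
;; Hypothetical helper: the explicit transient loop from the slides,
;; wrapped in a function.
(defn build-with-transient [n obj]
  (loop [i 0, res (transient [])]
    (if (< i n)
      (recur (inc i) (conj! res obj))
      (persistent! res))))

;; Hypothetical helper: the same result via into, which switches to a
;; transient vector under the hood.
(defn build-with-into [n obj]
  (into [] (repeat n obj)))

(= (build-with-transient 5 :x) (build-with-into 5 :x)) ;; => true
```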
  • 57. LAZINESS
  • 58. LAZINESS
    Increases allocation pressure -> more work for GC
    Worse memory locality
    Harder to debug and profile
  • 59. LAZINESS
    Everyone did this at least once in their career:
      (time (dotimes [_ 1e6] (map inc (range 1e6))))
      ;; Elapsed time: 30.931708 msecs
    Wow, Clojure is fast! /s
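The timing above is misleading because `map` only returns an unrealized lazy sequence; nothing is incremented inside the loop. A minimal sketch of the difference, using `realized?` to inspect the sequence and `doall` to force it (the var names are illustrative, not from the talk):

```clojure
;; map only builds a lazy sequence; no element has been computed yet.
(def lazy-nums (map inc (range 1000)))
(realized? lazy-nums) ;; => false

;; doall walks the whole sequence, forcing every element.
(def forced-nums (doall (map inc (range 1000))))
(realized? forced-nums) ;; => true
```

Wrapping the measured expression in `doall`, or using an eager function such as `mapv`, makes the benchmark include the actual work.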
  • 60. LAZINESS
      (defn burn-cpu []
        (let [start (System/nanoTime)]
          (loop [res 0]
            (if (< (- (System/nanoTime) start) 1e9)
              (recur (inc res))
              res))))

      (prof/profile
        (let [nums (map (fn [_] (burn-cpu)) (range 10))]
          (reduce + nums)))
  • 61. LAZINESS
    [Flame graph of the profiled expression. Frames visible in the stack include clojure/core$reduce, clojure/core/protocols$seq_reduce, clojure/lang/LazySeq.sval, clojure/core$map$fn__5587 and slides$burn_cpu.]
  • 62. [Flame graph, continued: the bottom frames are java/lang/Thread.run, java/util/concurrent/ThreadPoolExecutor and the nREPL evaluation middleware.]
  • 63. WAYS TO COMBAT LAZINESS
    doall, mapv, filterv
    Transducers
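A small sketch of the eager alternatives listed above, using only `clojure.core` functions; the input data and the var names `nums` and `xf` are illustrative and not taken from the talk.

```clojure
(def nums (range 10))

;; Eager variants return fully realized vectors.
(mapv inc nums)        ;; => [1 2 3 4 5 6 7 8 9 10]
(filterv even? nums)   ;; => [0 2 4 6 8]

;; A transducer describes the transformation once, without building
;; intermediate lazy sequences between the steps.
(def xf (comp (map inc) (filter even?)))

(into [] xf nums)        ;; => [2 4 6 8 10]
(transduce xf + 0 nums)  ;; => 30
```

With transducers, `comp` applies the steps left to right, so each number is incremented first and filtered second.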
  • 64. TOP SPEED BUMPS SO FAR
    Reflection, Boxing, Insufficient memory, Immutability, Laziness
  • 65. TOP SPEED BUMPS SO FAR
    Reflection, Boxing, Insufficient memory, Immutability, Laziness
    Redundant allocations, Coarsely-synchronized data structures, Context switching overhead, …
  • 66. TOP SPEED BUMPS SO FAR
    Reflection, Boxing, Insufficient memory, Immutability, Laziness
    Redundant allocations, Coarsely-synchronized data structures, Context switching overhead, …
    GC pauses, Megamorphic callsites, Heap fragmentation, …
  • 67. TOP SPEED BUMPS SO FAR
    Reflection, Boxing, Insufficient memory, Immutability, Laziness
    Redundant allocations, Coarsely-synchronized data structures, Context switching overhead, …
    GC pauses, Megamorphic callsites, Heap fragmentation, …
    Cache incoherence, TLB misses (page walks), Branch misprediction, NUMA foreign access, …
  • 68. TOP SPEED BUMPS SO FAR
    Reflection, Boxing, Insufficient memory, Immutability, Laziness
    Redundant allocations, Coarsely-synchronized data structures, Context switching overhead, …
    GC pauses, Megamorphic callsites, Heap fragmentation, …
    Cache incoherence, TLB misses (page walks), Branch misprediction, NUMA foreign access, …
    Magnetic disturbances, CPU overheating, …
  • 69. PERFORMANCE IS HARD
    Abstractions are constantly leaking.
    The more you learn, the less you know.
    Your assumptions are constantly getting invalidated.
  • 70. PERFORMANCE IS FUN AND USEFUL
    You learn things behind those leaky abstractions.
    You get a more holistic view of the system.
    You save money and the environment.
  • 71. PERFORMANCE PROBLEMS ARE NOT UNIQUE TO CLOJURE
    But we are in a great position to solve them.
    There is plenty of prior art, especially for the JVM.
    Tools, blog posts, experiments, reports.
    The REPL allows us to use all of this much more easily.
  • 72. TOOLS
    Criterium: https://github.com/hugoduncan/criterium
    clj-java-decompiler: https://github.com/clojure-goes-fast/clj-java-decompiler
    clj-async-profiler: https://github.com/clojure-goes-fast/clj-async-profiler
    clj-memory-meter: https://github.com/clojure-goes-fast/clj-memory-meter
    virgil: https://github.com/ztellman/virgil
    VisualVM: https://visualvm.github.io
    JMH: http://clojure-goes-fast.com/blog/using-jmh-with-clojure-part1/
  • 73. RESOURCES
    http://clojure-goes-fast.com
    https://groups.google.com/forum/#!forum/mechanical-sympathy
    Aleksey Shipilëv's blog: https://shipilev.net/
    Nitsan Wakart's blog: http://psy-lob-saw.blogspot.com/
  • 74. INSTEAD OF A CONCLUSION
    First, make it work.
    Then, make it right.
    Then, make it fast.
    But please, don't stop at the first.
  • 75. (into [] (map answer) questions)