3. ● Asynchronous, protocol-agnostic RPC framework for JVM languages
● Provides async client/server abstractions and hides Netty’s low-level network details
● Out-of-the-box support for protocols such as HTTP, Thrift, Redis, and Memcached, but can be extended to others
● Implemented in Scala
● Open-sourced by Twitter
Part 1. What is Finagle?
7. Client Modules
Clients are tricky and contain the bulk of the fault-tolerance logic. By default, they
are optimized for a high success rate and low latency.
12. ● Designed for handling workloads that mix compute and I/O.
● Each server can handle thousands of requests.
● Uses just two threads per core (Netty’s default, but it’s configurable).
How does it scale?
10. Part 2. Programming with Futures
● What is a Future?
○ A container holding the value of an async computation, which may be either a
success or a failure.
● History - Introduced in Java 1.5 as java.util.concurrent.Future, but with limited
functionality: isDone() and get() [blocking]
● Twitter Futures - Are more powerful and add composability!
● Part of util-core package and not tied to any thread pool model.
12. How to consume Futures?
● Someone gives you a future, you act on it, and pass it on (kind of like a hot potato)
Typical actions:
● Transform the value [map(), handle()]
● Log it, update stats [side-effect/callbacks - onSuccess(), onFailure()]
● Trigger another async computation and return that result [flatMap(), rescue()]
Most of these handlers are variations of the basic combinator transform():
Future<B> transform(Function<Try<A>, Future<B>> f);
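Twitter’s transform() isn’t in the JDK, but its shape can be sketched with Java 8’s CompletableFuture.handle(), which likewise sees both the value and the failure. A minimal, hypothetical sketch (backendFoo and describe are stand-ins, not Finagle code):

```java
import java.util.concurrent.CompletableFuture;

public class TransformSketch {
    // Hypothetical stand-in for an async backend call; completes with a value
    // or fails with an exception.
    static CompletableFuture<Integer> backendFoo(boolean fail) {
        if (fail) {
            CompletableFuture<Integer> f = new CompletableFuture<>();
            f.completeExceptionally(new RuntimeException("backend down"));
            return f;
        }
        return CompletableFuture.completedFuture(42);
    }

    // handle() receives both the value and the failure, much like transform()
    // receives a Try<A>: exactly one of (value, error) is non-null here.
    static CompletableFuture<String> describe(boolean fail) {
        return backendFoo(fail).handle((value, error) ->
                error == null ? "ok: " + value : "error: " + error.getMessage());
    }

    public static void main(String[] args) {
        System.out.println(describe(false).join()); // ok: 42
        System.out.println(describe(true).join());
    }
}
```

Like transform(), the callback runs whether the future succeeded or failed, which is why the other combinators can be expressed in terms of it.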
13. Example 1
The backend I am calling returns an int, but I need to return a
string to my caller. What do I use?
14. Example 1
The backend I am calling returns an int, but I need to return a
string to my caller. What do I use?
Answer: map!
public Future<String> foo() {
return backend.foo().map(new Function<Integer, String>() {
public String apply(Integer i) {
return i.toString();
}
});
}
15. Example 1
The backend I am calling returns an int, but I need to return a
string to my caller. What do I use?
Answer: map!
import static com.twitter.util.Function.func;
public Future<String> foo() {
return backend.foo().map(func(i -> i.toString()));
}
16. Example 2
I consult a cache for a value, but on a miss, need to talk to a
database. What do I use?
17. Example 2
I consult a cache for a value, but on a miss, need to talk to a
database. What do I use?
Answer: flatMap!
public Future<Value> fetch(Key k) {
  return cache.fetch(k).flatMap(
      new Function<Value, Future<Value>>() {
        public Future<Value> apply(Value v) {
          if (v != null) return Future.value(v);
          return db.fetch(k);
        }
      });
}
18. Handling Exceptions
● Don’t forget: map/flatMap will only execute for successful futures
● To deal with exceptions, handle/rescue are the failure-side equivalents:
Future<A> handle(Function<Throwable, A> f)
Future<A> rescue(Function<Throwable, Future<A>> f)
19. Example 1
If the backend I am calling throws an exception, I want to return
an error code. What do I use?
20. Example 1
If the backend I am calling throws an exception, I want to return
an error code. What do I use?
Answer: handle!
public Future<Result> foo() {
  return backend.foo().handle(
      new Function<Throwable, Result>() {
        public Result apply(Throwable t) {
          Result r = new Result();
          r.setErrorCode(errorCodeFromThrowable(t));
          return r;
        }
      });
}
21. Example 2
I consult a cache for a value, but if that failed, need to talk to a
database. What do I use?
22. Example 2
I consult a cache for a value, but if that failed, need to talk to a
database. What do I use?
Answer: rescue!
public Future<Value> get(Key k) {
return cache.fetch(k).rescue(
new Function<Throwable, Future<Value>>() {
public Future<Value> apply(Throwable t) {
LOG.error("Cache lookup failed", t);
return db.fetch(k);
}
});
}
23. Other handlers
More sequential composition - join()
Concurrent composition, return after all futures are satisfied - collect()
Concurrent composition, return when any one future is satisfied - select()
Finish within a timeout - within()
Delayed execution - delayed()
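For a rough feel of collect() and select(), the JDK offers loose analogues in CompletableFuture.allOf() and anyOf(). A minimal sketch using plain JDK types (not the Finagle API; collectAll and selectAny are hypothetical helpers):

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

public class ComposeSketch {
    // collect()-style: wait until every future is satisfied, then gather the values.
    static CompletableFuture<List<Integer>> collectAll(List<CompletableFuture<Integer>> fs) {
        return CompletableFuture.allOf(fs.toArray(new CompletableFuture[0]))
                .thenApply(ignored -> fs.stream()
                        .map(CompletableFuture::join) // safe: all futures are complete here
                        .collect(Collectors.toList()));
    }

    // select()-style: completes as soon as any one of the futures is satisfied.
    static CompletableFuture<Object> selectAny(List<CompletableFuture<Integer>> fs) {
        return CompletableFuture.anyOf(fs.toArray(new CompletableFuture[0]));
    }

    public static void main(String[] args) {
        List<CompletableFuture<Integer>> fs = Arrays.asList(
                CompletableFuture.completedFuture(1),
                CompletableFuture.completedFuture(2));
        System.out.println(collectAll(fs).join()); // [1, 2]
        System.out.println(selectAny(fs).join());
    }
}
```

Twitter’s collect()/select() do the equivalent work in one call and preserve typed results, but the coordination pattern is the same.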
24. Common Pitfalls
● Never block on a Future in production code (OK in unit tests)
○ Avoid future.get(), future.apply(), and Await.result(future): they tie up I/O processing
threads and degrade Finagle’s performance considerably.
○ If you really need to block because you are dealing with synchronous libraries such as JDBC
or Jedis, use a dedicated FuturePool.
● Avoid ThreadLocal<T>. Use com.twitter.finagle.context.LocalContext instead
● Don't use parallel streams in Java 8
● Request concurrency leak - never return null instead of a Future<A>
Future<String> getPinJson(long pinId) {
  return null; // Bad: callers expect a Future, never null
  // Instead: return Future.value(null);
}
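The FuturePool advice above — push blocking calls onto a dedicated pool so I/O threads stay free — can be sketched with a plain ExecutorService (a JDK analogy, not Finagle’s FuturePool API; fetchUserName is a hypothetical blocking-backed call):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class BlockingCallSketch {
    // Dedicated pool for blocking work (JDBC, Jedis, ...) so that event-loop
    // threads are never tied up; Finagle's FuturePool plays the same role.
    // Daemon threads let this demo JVM exit without an explicit shutdown.
    static final ExecutorService BLOCKING_POOL = Executors.newFixedThreadPool(8, r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    });

    static CompletableFuture<String> fetchUserName(long id) {
        // supplyAsync runs the (potentially blocking) call on the dedicated pool
        return CompletableFuture.supplyAsync(() -> "user-" + id, BLOCKING_POOL);
    }

    public static void main(String[] args) {
        System.out.println(fetchUserName(7L).join()); // user-7
    }
}
```

The key point is the second argument to supplyAsync: without it, the work would run on the common pool, which is exactly the kind of shared resource you don’t want blocked.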
25. Part 3. Java Service Framework Features
Standardized Metrics - per client, per method success/fail counts and latency stats
Logging - slow log, exception log
Rate limiting - Enforce quotas for clients
Genesis - Tool to generate the required stubs to bootstrap a finagle-thrift service
Warm up hook
Graceful shutdown
26. You need to enable these options via the proxy builder
ServiceFrameworkProxy<UserService.ServiceIface> serviceFrameworkProxy =
new ServiceFrameworkProxyBuilder<UserService.ServiceIface>()
.setHandler(serviceHandler)
.setServiceName(serviceName)
.setClusterName(serviceName.toLowerCase())
.setServerSetPath(serverSetPath)
.setClientNameProvider(new DefaultClientNameProvider())
.setRootLog(LOG)
.setFailureLog(FAILURE_LOG)
.enableExceptionTypeForFailureCount()
.disableLoggingForThrowable(ClientDiscardedRequestException.class)
.disableThrowablesAsServiceFailure(
Arrays.asList(ClientDiscardedRequestException.class,
DataValidationException.class))
.enableMethodNameForSuccessCountV2()
.enableMethodNameForFailureCountV2()
.enableMethodNameForResponseTimeMetricsV2()
.enableClientNameTagForSuccessCount()
.enableClientNameTagForFailureCount()
.enableClientNameTagForResponseTimeMetrics()
.enableExceptionLog()
.build();
27. Complaint 1:
● Clients are noticing higher latency or timeouts during deploys or restarts. The
first few requests take longer than at steady state due to connection
establishment, Java’s HotSpot JIT warm-up, etc.
Solution: Use warmUp hook and then join serverset
public static boolean warmUp(Callable<Boolean> warmUpCall)
// By default, invokes warmUpCall 100 times concurrently and expects it to succeed
// for at least 80% of the calls
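A rough sketch of that warm-up contract (invoke the call N times concurrently, require a minimum success rate) using plain JDK concurrency; this is illustrative only, not the framework’s implementation:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class WarmUpSketch {
    // Run the warm-up call n times concurrently; report success if at least
    // minSuccessRate of the invocations returned true.
    static boolean warmUp(Callable<Boolean> call, int n, double minSuccessRate)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(Math.min(n, 16));
        AtomicInteger ok = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(n);
        for (int i = 0; i < n; i++) {
            pool.submit(() -> {
                try {
                    if (Boolean.TRUE.equals(call.call())) ok.incrementAndGet();
                } catch (Exception ignored) {
                    // a throwing warm-up call simply counts as a failure
                } finally {
                    done.countDown();
                }
            });
        }
        done.await();
        pool.shutdown();
        return ok.get() >= n * minSuccessRate;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(warmUp(() -> true, 100, 0.8)); // true
    }
}
```

In practice the warm-up call would exercise the real request path (connection setup, JIT) before the server joins the serverset.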
28. Graceful Shutdown
● Unjoins from the serverset, waits for duration/2 seconds, and then tries to gracefully
shut down the server by draining existing requests within the remaining duration/2
seconds
ServiceShutdownHook.register(server, Duration.fromSeconds(10), status)
public static void register(final Server server, final Duration gracePeriod,
final ServerSet.EndpointStatus endpointStatus)
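The same two-phase behavior can be sketched with plain JDK types; the Server and Endpoint interfaces below are hypothetical stand-ins for illustration, not the framework’s API:

```java
import java.time.Duration;

public class ShutdownSketch {
    interface Endpoint { void leave(); }
    interface Server { void drainAndClose(Duration deadline); }

    // Two-phase shutdown as described above: leave the serverset first, wait
    // half the grace period so clients notice, then drain in-flight requests
    // within the remaining half.
    static void shutdown(Server server, Endpoint endpoint, Duration grace)
            throws InterruptedException {
        endpoint.leave();                         // stop receiving new traffic
        Thread.sleep(grace.toMillis() / 2);       // let service discovery catch up
        server.drainAndClose(grace.dividedBy(2)); // finish what is in flight
    }

    public static void main(String[] args) throws InterruptedException {
        StringBuilder log = new StringBuilder();
        shutdown(d -> log.append("drain"), () -> log.append("leave;"),
                Duration.ofMillis(10));
        System.out.println(log); // leave;drain
    }
}
```

Leaving the serverset before draining is the important ordering: otherwise clients keep sending requests to a server that is already refusing them.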
29. Complaint 2:
Client is seeing rate-limiting exceptions even though rate limits are set to a high
value
Happens when the server cluster is huge, so the local rate limit per node becomes
small, and the Python client is running on only a few nodes (pinlater, offline jobs, etc.)
Solution: Try reducing max_connection_lifespan_ms if it’s a Python thriftmux client
30. Next steps
● Finagle upgrade to 6.43
○ Unlocks Retry Budgets
○ Defaults to the P2C load balancer instead of the heap balancer
○ Toggle between Netty3 and Netty4
○ A couple of performance fixes in the Future scheduler
○ Many more...
31. Resources:
“Your server as a function” paper - https://dl.acm.org/citation.cfm?id=2525538
Source code: https://github.com/twitter/finagle
Finaglers - https://groups.google.com/d/forum/finaglers
Blogs:
[1] https://twitter.github.io/scala_school/finagle.html
[2] https://twitter.github.io/finagle/guide/developers/Futures.html
[3] http://vkostyukov.net/posts/finagle-101/