Fault tolerance made easy
Patterns for fault tolerance implemented surprisingly easy

Uwe Friedrichsen, codecentric AG, 2013-2014
@ufried
Uwe Friedrichsen | uwe.friedrichsen@codecentric.de | http://slideshare.net/ufried | http://ufried.tumblr.com
It's all about production!
Production
Availability
Resilience
Fault Tolerance
Your web server doesn't look good …
Pattern #1

Timeouts
Timeouts (1)
// Basics
myObject.wait(); // Do not use this by default
myObject.wait(TIMEOUT); // Better use this
// Some more basics
myThread.join(); // Do not use this by default
myThread.join(TIMEOUT); // Better use this
Timeouts (2)
// Using the Java concurrent library
Callable<MyActionResult> myAction = <My Blocking Action>
ExecutorService executor = Executors.newSingleThreadExecutor();
Future<MyActionResult> future = executor.submit(myAction);
MyActionResult result = null;
try {
    // result = future.get(); // Do not use this by default: blocks indefinitely
    result = future.get(TIMEOUT, TIMEUNIT); // Better use this
} catch (TimeoutException e) { // Only thrown if timeouts are used
    future.cancel(true); // Interrupt the action that is still running
    ...
} catch (...) {
    ...
}
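A minimal runnable sketch of the same idea, assuming a blocking action that returns a String. The class name TimeoutDemo, the method callWithTimeout and the fallback value are illustrative and not part of the original slides.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not from the slides.
class TimeoutDemo {
    // Runs the action on a worker thread and waits at most timeoutMillis.
    // Returns the fallback if the action times out, fails, or is interrupted.
    static String callWithTimeout(Callable<String> action, long timeoutMillis, String fallback) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<String> future = executor.submit(action);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker so it does not linger
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return fallback;
        } catch (ExecutionException e) {
            return fallback; // the action itself threw an exception
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Creating one executor per call keeps the sketch short; a real service would more likely share a bounded pool.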
Timeouts (3)
// Using Guava SimpleTimeLimiter (API as of 2013-2014; later Guava releases
// use SimpleTimeLimiter.create(ExecutorService) and drop the boolean argument)
Callable<MyActionResult> myAction = <My Blocking Action>
SimpleTimeLimiter limiter = new SimpleTimeLimiter();
MyActionResult result = null;
try {
    result = limiter.callWithTimeout(myAction, TIMEOUT, TIMEUNIT, false);
} catch (UncheckedTimeoutException e) {
    ...
} catch (...) {
    ...
}
Determining Timeout Duration

Configurable Timeouts

Self-Adapting Timeouts

Timeouts in JavaEE Containers
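One possible reading of "Self-Adapting Timeouts", sketched under assumptions: the timeout follows a smoothed average of observed call durations, multiplied by a safety factor and clamped to fixed bounds. The class AdaptiveTimeout and all of its parameters are hypothetical; the slides do not prescribe a specific algorithm.

```java
// Hypothetical sketch, not from the slides.
class AdaptiveTimeout {
    private final long minMillis;   // lower bound for the timeout
    private final long maxMillis;   // upper bound for the timeout
    private final double factor;    // safety margin on top of the average latency
    private final double alpha;     // weight of a new observation (0..1)
    private double averageMillis;   // exponentially weighted moving average

    AdaptiveTimeout(long minMillis, long maxMillis, double factor, double alpha, long initialMillis) {
        this.minMillis = minMillis;
        this.maxMillis = maxMillis;
        this.factor = factor;
        this.alpha = alpha;
        this.averageMillis = initialMillis;
    }

    // Feed in the measured duration of a completed call.
    synchronized void record(long observedMillis) {
        averageMillis = alpha * observedMillis + (1.0 - alpha) * averageMillis;
    }

    // Timeout to use for the next call, clamped to [minMillis, maxMillis].
    synchronized long currentTimeoutMillis() {
        long proposed = Math.round(averageMillis * factor);
        return Math.max(minMillis, Math.min(maxMillis, proposed));
    }
}
```

The clamping matters: without an upper bound, a slowly degrading resource would drag the timeout up with it.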
Pattern #2

Circuit Breaker
Circuit Breaker (1)
[Diagram: a Client sends a Request through the Circuit Breaker to the Resource. Labels: Resource available, Resource unavailable. Lifecycle states: Closed, Open, Half-Open.]
Circuit Breaker (2)
Closed
    on call / pass through
    call succeeds / reset count
    call fails / count failure
    threshold reached / trip breaker
Open
    on call / fail
    on timeout / attempt reset
Half-Open
    on call / pass through
    call succeeds / reset
    call fails / trip breaker
Transitions
    Closed -> Open: trip breaker
    Open -> Half-Open: attempt reset
    Half-Open -> Closed: reset
    Half-Open -> Open: trip breaker
Source: M. Nygard, "Release It!"
Circuit Breaker (3)
public class CircuitBreaker implements MyResource {
    public enum State { CLOSED, OPEN, HALF_OPEN }

    final MyResource resource; // the protected resource
    State state;
    int counter;               // consecutive failures
    long tripTime;             // time the breaker was last tripped (ms)

    public CircuitBreaker(MyResource r) {
        resource = r;
        state = State.CLOSED;
        counter = 0;
        tripTime = 0L;
    }
    ...
Circuit Breaker (4)
...
    public Result access(...) { // resource access
        Result r = null;
        if (state == State.OPEN) {
            checkTimeout(); // may switch to HALF_OPEN for the next call
            throw new ResourceUnavailableException();
        }
        try {
            r = resource.access(...); // should use timeout
        } catch (Exception e) {
            fail();
            throw e;
        }
        success();
        return r;
    }
...
Circuit Breaker (5)
...
    private void success() {
        reset();
    }

    private void fail() {
        counter++;
        if (counter > THRESHOLD) {
            tripBreaker();
        }
    }

    private void reset() {
        state = State.CLOSED;
        counter = 0;
    }
...
Circuit Breaker (6)
...
    private void tripBreaker() {
        state = State.OPEN;
        tripTime = System.currentTimeMillis();
    }

    private void checkTimeout() {
        if ((System.currentTimeMillis() - tripTime) > TIMEOUT) {
            state = State.HALF_OPEN;
            counter = THRESHOLD; // a single failure in HALF_OPEN trips the breaker again
        }
    }

    public State getState() {
        return state;
    }
}
Thread-Safe Circuit Breaker

Failure Types

Tuning Circuit Breakers

Available Implementations
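A minimal sketch of one way to make the breaker thread-safe, assuming coarse-grained synchronization is acceptable. The class SafeCircuitBreaker, its constructor parameters and the injected clock are hypothetical. Unlike the slide code, this variant lets the call that detects the expired timeout through as the probe and trips once the failure count reaches the threshold.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical sketch, not from the slides.
class SafeCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int threshold;            // consecutive failures before tripping
    private final long resetTimeoutMillis;  // how long to stay OPEN before probing
    private final LongSupplier clock;       // injectable time source (ms), eases testing
    private State state = State.CLOSED;
    private int failures = 0;
    private long tripTime = 0L;

    SafeCircuitBreaker(int threshold, long resetTimeoutMillis, LongSupplier clock) {
        this.threshold = threshold;
        this.resetTimeoutMillis = resetTimeoutMillis;
        this.clock = clock;
    }

    <T> T call(Supplier<T> action) {
        beforeCall(); // throws IllegalStateException while the breaker is open
        T result;
        try {
            result = action.get(); // runs outside the lock so slow calls do not block others
        } catch (RuntimeException e) {
            onFailure();
            throw e;
        }
        onSuccess();
        return result;
    }

    private synchronized void beforeCall() {
        if (state == State.OPEN) {
            if (clock.getAsLong() - tripTime > resetTimeoutMillis) {
                state = State.HALF_OPEN; // let this call through as a probe
            } else {
                throw new IllegalStateException("circuit open");
            }
        }
    }

    private synchronized void onSuccess() {
        state = State.CLOSED;
        failures = 0;
    }

    private synchronized void onFailure() {
        failures++;
        if (state == State.HALF_OPEN || failures >= threshold) {
            state = State.OPEN;
            tripTime = clock.getAsLong();
        }
    }

    synchronized State getState() {
        return state;
    }
}
```

As a simplification, several concurrent callers may pass through while the breaker is half-open; limiting that to a single probe would need extra bookkeeping.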
Pattern #3

Fail Fast
Fail Fast (1)
[Diagram: the Client sends a Request to an Expensive Action, which uses Resources.]
Fail Fast (2)
[Diagram: the Client's Request first reaches a Fail Fast Guard, which checks the availability of the Resources and only then forwards the Request to the Expensive Action that uses them.]
Fail Fast (3)
public class FailFastGuard {
    private FailFastGuard() {}

    public static void checkResources(Set<CircuitBreaker> resources) {
        for (CircuitBreaker r : resources) {
            if (r.getState() != CircuitBreaker.State.CLOSED) {
                throw new ResourceUnavailableException(r);
            }
        }
    }
}
Fail Fast (4)
public class MyService {
Set<CircuitBreaker> requiredResources;
// Initialize resources
...
public Result myExpensiveAction(...) {
FailFastGuard.checkResources(requiredResources);
// Execute core action
...
}
}
The dreaded SiteTooSuccessfulException …
Pattern #4

Shed Load
Shed Load (1)
[Diagram: Clients send too many Requests to the Server.]
Shed Load (2)
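[Diagram: Clients send too many Requests. A Gate Keeper in front of the Server passes Requests on and sheds the rest (Shed Requests). A Monitor monitors the Server's load and provides request load data to the Gate Keeper.]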
Shed Load (3)
public class ShedLoadFilter implements Filter {
Random random;
public void init(FilterConfig fc) throws ServletException {
random = new Random(System.currentTimeMillis());
}
public void destroy() {
random = null;
}
...
Shed Load (4)
...
public void doFilter(ServletRequest request,
ServletResponse response,
FilterChain chain)
throws java.io.IOException, ServletException {
int load = getLoad();
if (shouldShed(load)) {
HttpServletResponse res = (HttpServletResponse)response;
res.setIntHeader("Retry-After", RECOMMENDATION);
res.sendError(HttpServletResponse.SC_SERVICE_UNAVAILABLE);
return;
}
chain.doFilter(request, response);
}
...
Shed Load (5)
...
private boolean shouldShed(int load) { // Example implementation
if (load < THRESHOLD) {
return false;
}
double shedBoundary =
((double)(load - THRESHOLD))/
((double)(MAX_LOAD - THRESHOLD));
return random.nextDouble() < shedBoundary;
}
}
Shed Load (6)
Shed Load (7)
Shedding Strategy

Retrieving Load

Tuning Load Shedders

Alternative Strategies
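For "Retrieving Load", one simple option is to count requests currently in flight. The sketch below assumes that measure; the class InFlightLoadMonitor and its method names are hypothetical. It also restates the linear shedding rule from the example filter, with the random draw passed in so the rule can be checked deterministically.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, not from the slides.
class InFlightLoadMonitor {
    private final AtomicInteger inFlight = new AtomicInteger();

    // Call when a request enters the server (e.g. at the start of doFilter).
    void requestStarted() {
        inFlight.incrementAndGet();
    }

    // Call in a finally block when the request has been handled.
    void requestFinished() {
        inFlight.decrementAndGet();
    }

    // Current load, measured as the number of concurrently processed requests.
    int getLoad() {
        return inFlight.get();
    }

    // Linear strategy: never shed below threshold, always shed at or above maxLoad,
    // and shed with linearly rising probability in between. draw is expected in [0, 1).
    static boolean shouldShed(int load, int threshold, int maxLoad, double draw) {
        if (load < threshold) {
            return false;
        }
        double shedBoundary = (double) (load - threshold) / (double) (maxLoad - threshold);
        return draw < shedBoundary;
    }
}
```

Other load measures (CPU usage, queue length, response times) would plug into the same shedding rule.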
Pattern #5

Deferrable Work
Deferrable Work (1)
[Diagram: Client Requests go to Request Processing. Request Processing and Routine Work both use the same Resources, which leads to OVERLOAD.]
Deferrable Work (2)
[Diagram: without Deferrable Work, Request Processing plus Routine Work exceed 100% of capacity (OVERLOAD). With Deferrable Work, Request Processing and Routine Work together stay within 100%.]
// Do or wait variant
ProcessingState state = initBatch();
while(!state.done()) {
int load = getLoad();
if (load > THRESHOLD) {
waitFixedDuration();
} else {
state = processNext(state);
}
}
void waitFixedDuration() {
Thread.sleep(DELAY); // try-catch left out for better readability
}
Deferrable Work (3)
// Adaptive load variant
ProcessingState state = initBatch();
while(!state.done()) {
waitLoadBased();
state = processNext(state);
}
void waitLoadBased() {
int load = getLoad();
long delay = calcDelay(load);
Thread.sleep(delay); // try-catch left out for better readability
}
long calcDelay(int load) { // Simple example implementation
if (load < THRESHOLD) {
return 0L;
}
return (load - THRESHOLD) * DELAY_FACTOR;
}
Deferrable Work (4)
Delay Strategy

Retrieving Load

Tuning Deferrable Work
I can hardly hear you …
Pattern #6

Leaky Bucket
Leaky Bucket (1)
[Diagram: each time a problem occurred, the Leaky Bucket is filled. It leaks periodically. If it has overflowed, error handling is triggered.]
public class LeakyBucket { // Very simple implementation
    private final int capacity;
    private int level;
    private boolean overflow;

    public LeakyBucket(int capacity) {
        this.capacity = capacity;
        drain();
    }

    public void drain() {
        this.level = 0;
        this.overflow = false;
    }
...
Leaky Bucket (2)
...
    public void fill() {
        level++;
        if (level > capacity) {
            overflow = true;
        }
    }

    public void leak() {
        level--;
        if (level < 0) {
            level = 0;
        }
    }

    public boolean overflowed() {
        return overflow;
    }
}
Leaky Bucket (3)
Thread-Safe Leaky Bucket

Leaking strategies

Tuning Leaky Bucket

Available Implementations
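A sketch of one thread-safe variant with a time-based leaking strategy, under stated assumptions: instead of a background thread calling leak(), the bucket computes the leaks that became due whenever it is touched. The class TimedLeakyBucket, its parameters and the injected clock are hypothetical. Unlike the simple implementation, overflow is reported per fill() call and is not sticky.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch, not from the slides.
class TimedLeakyBucket {
    private final int capacity;
    private final long leakIntervalMillis; // one unit leaks per interval
    private final LongSupplier clock;      // injectable time source (ms), eases testing
    private int level = 0;
    private long lastLeak;

    TimedLeakyBucket(int capacity, long leakIntervalMillis, LongSupplier clock) {
        this.capacity = capacity;
        this.leakIntervalMillis = leakIntervalMillis;
        this.clock = clock;
        this.lastLeak = clock.getAsLong();
    }

    // Records one problem; returns true if the bucket is now over capacity.
    synchronized boolean fill() {
        leak();
        level++;
        return level > capacity;
    }

    // Current fill level after applying all leaks that are due.
    synchronized int level() {
        leak();
        return level;
    }

    // Applies the leaks that became due since the last call; callers hold the lock.
    private void leak() {
        long now = clock.getAsLong();
        long due = (now - lastLeak) / leakIntervalMillis;
        if (due > 0) {
            level = (int) Math.max(0L, level - due);
            lastLeak += due * leakIntervalMillis;
        }
    }
}
```

Computing leaks lazily avoids a timer thread per bucket, at the cost of doing a little arithmetic on every access.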
Pattern #7

Limited Retries
// doAction returns true if successful, false otherwise
// General pattern
boolean success = false;
int tries = 0;
while (!success && (tries < MAX_TRIES)) {
success = doAction(...);
tries++;
}
// Alternative one-retry-only variant
success = doAction(...) || doAction(...);
Limited Retries (1)
Idempotent Actions

Closures / Lambdas

Tuning Retries
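With lambdas (Java 8 and later), the general retry loop can be wrapped once and reused. The following is a minimal sketch; the class Retry and the method withRetries are hypothetical names. Retrying like this is only safe when the action is idempotent, since a failed attempt may already have had side effects.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch, not from the slides.
class Retry {
    // Runs the action until it reports success or maxTries attempts are used up.
    // Returns true if one of the attempts succeeded.
    static boolean withRetries(BooleanSupplier action, int maxTries) {
        for (int tries = 0; tries < maxTries; tries++) {
            if (action.getAsBoolean()) {
                return true;
            }
        }
        return false;
    }
}
```

A call then reads `Retry.withRetries(() -> doAction(...), MAX_TRIES)`. Tuning would typically add a delay between attempts, which is left out here.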
More Patterns






•  Complete Parameter Checking
•  Marked Data
•  Routine Audits
Further reading

1. Michael T. Nygard, Release It!, Pragmatic Bookshelf, 2007
2. Robert S. Hanmer, Patterns for Fault Tolerant Software, Wiley, 2007
3. James Hamilton, On Designing and Deploying Internet-Scale Services, 21st LISA Conference 2007
4. Andrew Tanenbaum, Marten van Steen, Distributed Systems – Principles and Paradigms, Prentice Hall, 2nd Edition, 2006
Fault tolerance made easy

Weitere ähnliche Inhalte

Was ist angesagt?

J unit스터디슬라이드
J unit스터디슬라이드J unit스터디슬라이드
J unit스터디슬라이드
ksain
 

Was ist angesagt? (20)

Load Testing with RedLine13: Or getting paid to DoS your own systems
Load Testing with RedLine13: Or getting paid to DoS your own systemsLoad Testing with RedLine13: Or getting paid to DoS your own systems
Load Testing with RedLine13: Or getting paid to DoS your own systems
 
Defcon_Oracle_The_Making_of_the_2nd_sql_injection_worm
Defcon_Oracle_The_Making_of_the_2nd_sql_injection_wormDefcon_Oracle_The_Making_of_the_2nd_sql_injection_worm
Defcon_Oracle_The_Making_of_the_2nd_sql_injection_worm
 
Oaktable World 2014 Toon Koppelaars: database constraints polite excuse
Oaktable World 2014 Toon Koppelaars: database constraints polite excuseOaktable World 2014 Toon Koppelaars: database constraints polite excuse
Oaktable World 2014 Toon Koppelaars: database constraints polite excuse
 
LA Cassandra Day 2015 - Testing Cassandra
LA Cassandra Day 2015  - Testing CassandraLA Cassandra Day 2015  - Testing Cassandra
LA Cassandra Day 2015 - Testing Cassandra
 
Unit testing patterns for concurrent code
Unit testing patterns for concurrent codeUnit testing patterns for concurrent code
Unit testing patterns for concurrent code
 
Unit testing with Spock Framework
Unit testing with Spock FrameworkUnit testing with Spock Framework
Unit testing with Spock Framework
 
Docker and jvm. A good idea?
Docker and jvm. A good idea?Docker and jvm. A good idea?
Docker and jvm. A good idea?
 
Painless JavaScript Testing with Jest
Painless JavaScript Testing with JestPainless JavaScript Testing with Jest
Painless JavaScript Testing with Jest
 
Performance tests with Gatling (extended)
Performance tests with Gatling (extended)Performance tests with Gatling (extended)
Performance tests with Gatling (extended)
 
Connect2017 DEV-1550 Why Java 8? Or, What's a Lambda?
Connect2017 DEV-1550 Why Java 8? Or, What's a Lambda?Connect2017 DEV-1550 Why Java 8? Or, What's a Lambda?
Connect2017 DEV-1550 Why Java 8? Or, What's a Lambda?
 
Real world functional reactive programming
Real world functional reactive programmingReal world functional reactive programming
Real world functional reactive programming
 
Rules With Drools
Rules With DroolsRules With Drools
Rules With Drools
 
Stress test your backend with Gatling
Stress test your backend with GatlingStress test your backend with Gatling
Stress test your backend with Gatling
 
ScalaSwarm 2017 Keynote: Tough this be madness yet theres method in't
ScalaSwarm 2017 Keynote: Tough this be madness yet theres method in'tScalaSwarm 2017 Keynote: Tough this be madness yet theres method in't
ScalaSwarm 2017 Keynote: Tough this be madness yet theres method in't
 
Qunit Java script Un
Qunit Java script UnQunit Java script Un
Qunit Java script Un
 
Spock Framework
Spock FrameworkSpock Framework
Spock Framework
 
Unit testing in JavaScript with Jasmine and Karma
Unit testing in JavaScript with Jasmine and KarmaUnit testing in JavaScript with Jasmine and Karma
Unit testing in JavaScript with Jasmine and Karma
 
C++ Unit Test with Google Testing Framework
C++ Unit Test with Google Testing FrameworkC++ Unit Test with Google Testing Framework
C++ Unit Test with Google Testing Framework
 
From Elixir to Akka (and back) - ElixirConf Mx 2017
From Elixir to Akka (and back) - ElixirConf Mx 2017From Elixir to Akka (and back) - ElixirConf Mx 2017
From Elixir to Akka (and back) - ElixirConf Mx 2017
 
J unit스터디슬라이드
J unit스터디슬라이드J unit스터디슬라이드
J unit스터디슬라이드
 

Andere mochten auch

The promises and perils of microservices
The promises and perils of microservicesThe promises and perils of microservices
The promises and perils of microservices
Uwe Friedrichsen
 

Andere mochten auch (20)

How to survive in a BASE world
How to survive in a BASE worldHow to survive in a BASE world
How to survive in a BASE world
 
Dr. Hectic and Mr. Hype - surviving the economic darwinism
Dr. Hectic and Mr. Hype - surviving the economic darwinismDr. Hectic and Mr. Hype - surviving the economic darwinism
Dr. Hectic and Mr. Hype - surviving the economic darwinism
 
Self healing data
Self healing dataSelf healing data
Self healing data
 
Devops for Developers
Devops for DevelopersDevops for Developers
Devops for Developers
 
Patterns of resilience
Patterns of resiliencePatterns of resilience
Patterns of resilience
 
Resilience reloaded - more resilience patterns
Resilience reloaded - more resilience patternsResilience reloaded - more resilience patterns
Resilience reloaded - more resilience patterns
 
The promises and perils of microservices
The promises and perils of microservicesThe promises and perils of microservices
The promises and perils of microservices
 
Cloud fuer entscheider
Cloud fuer entscheiderCloud fuer entscheider
Cloud fuer entscheider
 
Komplexität - Na und?
Komplexität - Na und?Komplexität - Na und?
Komplexität - Na und?
 
Cloud Compliance - Bestimmungen, Zertifizierungen und all das
Cloud Compliance - Bestimmungen, Zertifizierungen und all dasCloud Compliance - Bestimmungen, Zertifizierungen und all das
Cloud Compliance - Bestimmungen, Zertifizierungen und all das
 
Scalability patterns
Scalability patternsScalability patterns
Scalability patterns
 
Emergent architecture
Emergent architectureEmergent architecture
Emergent architecture
 
Der Business Case für Architektur
Der Business Case für ArchitekturDer Business Case für Architektur
Der Business Case für Architektur
 
The agile Architect - Craftsmanship on a new Level
The agile Architect - Craftsmanship on a new LevelThe agile Architect - Craftsmanship on a new Level
The agile Architect - Craftsmanship on a new Level
 
Down with the_sandals
Down with the_sandalsDown with the_sandals
Down with the_sandals
 
No crash allowed - Fault tolerance patterns
No crash allowed - Fault tolerance patternsNo crash allowed - Fault tolerance patterns
No crash allowed - Fault tolerance patterns
 
Hochskalierbare Cloud-Architekturen
Hochskalierbare Cloud-ArchitekturenHochskalierbare Cloud-Architekturen
Hochskalierbare Cloud-Architekturen
 
Kommunikation und Qualität - Java Forum Nord 2016
Kommunikation und Qualität - Java Forum Nord 2016Kommunikation und Qualität - Java Forum Nord 2016
Kommunikation und Qualität - Java Forum Nord 2016
 
OAuth 2.0
OAuth 2.0OAuth 2.0
OAuth 2.0
 
OpenStack Summit :: Redundancy Doesn't Always Mean "HA" or "Cluster"
OpenStack Summit :: Redundancy Doesn't Always Mean "HA" or "Cluster"OpenStack Summit :: Redundancy Doesn't Always Mean "HA" or "Cluster"
OpenStack Summit :: Redundancy Doesn't Always Mean "HA" or "Cluster"
 

Ähnlich wie Fault tolerance made easy

33rd Degree 2013, Bad Tests, Good Tests
33rd Degree 2013, Bad Tests, Good Tests33rd Degree 2013, Bad Tests, Good Tests
33rd Degree 2013, Bad Tests, Good Tests
Tomek Kaczanowski
 
Effective testing for spark programs Strata NY 2015
Effective testing for spark programs   Strata NY 2015Effective testing for spark programs   Strata NY 2015
Effective testing for spark programs Strata NY 2015
Holden Karau
 
Java 5 concurrency
Java 5 concurrencyJava 5 concurrency
Java 5 concurrency
priyank09
 
Java Concurrency in Practice
Java Concurrency in PracticeJava Concurrency in Practice
Java Concurrency in Practice
ericbeyeler
 
Os Practical Assignment 1
Os Practical Assignment 1Os Practical Assignment 1
Os Practical Assignment 1
Emmanuel Garcia
 

Ähnlich wie Fault tolerance made easy (20)

Java Concurrency
Java ConcurrencyJava Concurrency
Java Concurrency
 
2012 JDays Bad Tests Good Tests
2012 JDays Bad Tests Good Tests2012 JDays Bad Tests Good Tests
2012 JDays Bad Tests Good Tests
 
33rd Degree 2013, Bad Tests, Good Tests
33rd Degree 2013, Bad Tests, Good Tests33rd Degree 2013, Bad Tests, Good Tests
33rd Degree 2013, Bad Tests, Good Tests
 
Developer Test - Things to Know
Developer Test - Things to KnowDeveloper Test - Things to Know
Developer Test - Things to Know
 
How to Start Test-Driven Development in Legacy Code
How to Start Test-Driven Development in Legacy CodeHow to Start Test-Driven Development in Legacy Code
How to Start Test-Driven Development in Legacy Code
 
The Promised Land (in Angular)
The Promised Land (in Angular)The Promised Land (in Angular)
The Promised Land (in Angular)
 
Effective testing for spark programs Strata NY 2015
Effective testing for spark programs   Strata NY 2015Effective testing for spark programs   Strata NY 2015
Effective testing for spark programs Strata NY 2015
 
Sane Async Patterns
Sane Async PatternsSane Async Patterns
Sane Async Patterns
 
Java 5 concurrency
Java 5 concurrencyJava 5 concurrency
Java 5 concurrency
 
Resiliency & Security_Ballerina Day CMB 2018
Resiliency & Security_Ballerina Day CMB 2018  Resiliency & Security_Ballerina Day CMB 2018
Resiliency & Security_Ballerina Day CMB 2018
 
Ten useful JavaScript tips & best practices
Ten useful JavaScript tips & best practicesTen useful JavaScript tips & best practices
Ten useful JavaScript tips & best practices
 
Multithreading in Java
Multithreading in JavaMultithreading in Java
Multithreading in Java
 
Confitura 2012 Bad Tests, Good Tests
Confitura 2012 Bad Tests, Good TestsConfitura 2012 Bad Tests, Good Tests
Confitura 2012 Bad Tests, Good Tests
 
Java util concurrent
Java util concurrentJava util concurrent
Java util concurrent
 
GeeCON 2012 Bad Tests, Good Tests
GeeCON 2012 Bad Tests, Good TestsGeeCON 2012 Bad Tests, Good Tests
GeeCON 2012 Bad Tests, Good Tests
 
Tricks to Making a Realtime SurfaceView Actually Perform in Realtime - Maarte...
Tricks to Making a Realtime SurfaceView Actually Perform in Realtime - Maarte...Tricks to Making a Realtime SurfaceView Actually Perform in Realtime - Maarte...
Tricks to Making a Realtime SurfaceView Actually Perform in Realtime - Maarte...
 
Nevyn — Promise, It's Async! Swift Language User Group Lightning Talk 2015-09-24
Nevyn — Promise, It's Async! Swift Language User Group Lightning Talk 2015-09-24Nevyn — Promise, It's Async! Swift Language User Group Lightning Talk 2015-09-24
Nevyn — Promise, It's Async! Swift Language User Group Lightning Talk 2015-09-24
 
Beyond parallelize and collect - Spark Summit East 2016
Beyond parallelize and collect - Spark Summit East 2016Beyond parallelize and collect - Spark Summit East 2016
Beyond parallelize and collect - Spark Summit East 2016
 
Java Concurrency in Practice
Java Concurrency in PracticeJava Concurrency in Practice
Java Concurrency in Practice
 
Os Practical Assignment 1
Os Practical Assignment 1Os Practical Assignment 1
Os Practical Assignment 1
 

Mehr von Uwe Friedrichsen

Timeless design in a cloud-native world
Timeless design in a cloud-native worldTimeless design in a cloud-native world
Timeless design in a cloud-native world
Uwe Friedrichsen
 
Deep learning - a primer
Deep learning - a primerDeep learning - a primer
Deep learning - a primer
Uwe Friedrichsen
 
Real-world consistency explained
Real-world consistency explainedReal-world consistency explained
Real-world consistency explained
Uwe Friedrichsen
 
The 7 quests of resilient software design
The 7 quests of resilient software designThe 7 quests of resilient software design
The 7 quests of resilient software design
Uwe Friedrichsen
 
Excavating the knowledge of our ancestors
Excavating the knowledge of our ancestorsExcavating the knowledge of our ancestors
Excavating the knowledge of our ancestors
Uwe Friedrichsen
 
The truth about "You build it, you run it!"
The truth about "You build it, you run it!"The truth about "You build it, you run it!"
The truth about "You build it, you run it!"
Uwe Friedrichsen
 
Resilient Functional Service Design
Resilient Functional Service DesignResilient Functional Service Design
Resilient Functional Service Design
Uwe Friedrichsen
 
DevOps is not enough - Embedding DevOps in a broader context
DevOps is not enough - Embedding DevOps in a broader contextDevOps is not enough - Embedding DevOps in a broader context
DevOps is not enough - Embedding DevOps in a broader context
Uwe Friedrichsen
 
Modern times - architectures for a Next Generation of IT
Modern times - architectures for a Next Generation of ITModern times - architectures for a Next Generation of IT
Modern times - architectures for a Next Generation of IT
Uwe Friedrichsen
 

Mehr von Uwe Friedrichsen (20)

Timeless design in a cloud-native world
Timeless design in a cloud-native worldTimeless design in a cloud-native world
Timeless design in a cloud-native world
 
Deep learning - a primer
Deep learning - a primerDeep learning - a primer
Deep learning - a primer
 
Life after microservices
Life after microservicesLife after microservices
Life after microservices
 
The hitchhiker's guide for the confused developer
The hitchhiker's guide for the confused developerThe hitchhiker's guide for the confused developer
The hitchhiker's guide for the confused developer
 
Digitization solutions - A new breed of software
Digitization solutions - A new breed of softwareDigitization solutions - A new breed of software
Digitization solutions - A new breed of software
 
Real-world consistency explained
Real-world consistency explainedReal-world consistency explained
Real-world consistency explained
 
The 7 quests of resilient software design
The 7 quests of resilient software designThe 7 quests of resilient software design
The 7 quests of resilient software design
 
Excavating the knowledge of our ancestors
Excavating the knowledge of our ancestorsExcavating the knowledge of our ancestors
Excavating the knowledge of our ancestors
 
The truth about "You build it, you run it!"
The truth about "You build it, you run it!"The truth about "You build it, you run it!"
The truth about "You build it, you run it!"
 
Resilient Functional Service Design
Resilient Functional Service DesignResilient Functional Service Design
Resilient Functional Service Design
 
Watch your communication
Watch your communicationWatch your communication
Watch your communication
 
Life, IT and everything
Life, IT and everythingLife, IT and everything
Life, IT and everything
 
DevOps is not enough - Embedding DevOps in a broader context
DevOps is not enough - Embedding DevOps in a broader contextDevOps is not enough - Embedding DevOps in a broader context
DevOps is not enough - Embedding DevOps in a broader context
 
Production-ready Software
Production-ready SoftwareProduction-ready Software
Production-ready Software
 
Towards complex adaptive architectures
Towards complex adaptive architecturesTowards complex adaptive architectures
Towards complex adaptive architectures
 
Conway's law revisited - Architectures for an effective IT
Conway's law revisited - Architectures for an effective ITConway's law revisited - Architectures for an effective IT
Conway's law revisited - Architectures for an effective IT
 
Microservices - stress-free and without increased heart attack risk
Microservices - stress-free and without increased heart attack riskMicroservices - stress-free and without increased heart attack risk
Microservices - stress-free and without increased heart attack risk
 
Modern times - architectures for a Next Generation of IT
Modern times - architectures for a Next Generation of ITModern times - architectures for a Next Generation of IT
Modern times - architectures for a Next Generation of IT
 
The Next Generation (of) IT
The Next Generation (of) ITThe Next Generation (of) IT
The Next Generation (of) IT
 
Why resilience - A primer at varying flight altitudes
Why resilience - A primer at varying flight altitudesWhy resilience - A primer at varying flight altitudes
Why resilience - A primer at varying flight altitudes
 

Kürzlich hochgeladen

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 

Kürzlich hochgeladen (20)

From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 

Fault tolerance made easy

  • 1. Fault tolerance made easy Patterns for fault tolerance implemented surprisingly easy Uwe Friedrichsen, codecentric AG, 2013-2014
  • 2. @ufried Uwe Friedrichsen | uwe.friedrichsen@codecentric.de | http://slideshare.net/ufried | http://ufried.tumblr.com
  • 3. It‘s all about production!
  • 5. Your web server doesn‘t look good …
  • 7. Timeouts (1) // Basics myObject.wait(); // Do not use this by default myObject.wait(TIMEOUT); // Better use this // Some more basics myThread.join(); // Do not use this by default myThread.join(TIMEOUT); // Better use this
  • 8. Timeouts (2) // Using the Java concurrent library Callable<MyActionResult> myAction = <My Blocking Action> ExecutorService executor = Executors.newSingleThreadExecutor(); Future<MyActionResult> future = executor.submit(myAction); MyActionResult result = null; try { result = future.get(); // Do not use this by default result = future.get(TIMEOUT, TIMEUNIT); // Better use this } catch (TimeoutException e) { // Only thrown if timeouts are used ... } catch (...) { ... }
  • 9. Timeouts (3) // Using Guava SimpleTimeLimiter Callable<MyActionResult> myAction = <My Blocking Action> SimpleTimeLimiter limiter = new SimpleTimeLimiter(); MyActionResult result = null; try { result = limiter.callWithTimeout(myAction, TIMEOUT, TIMEUNIT, false); } catch (UncheckedTimeoutException e) { ... } catch (...) { ... }
  • 10. Determining Timeout Duration Configurable Timeouts Self-Adapting Timeouts Timeouts in JavaEE Containers
  • 12. Circuit Breaker (1) Client Resource Circuit Breaker Request Resource unavailable Resource available Closed Open Half-Open Lifecycle
  • 13. Circuit Breaker (2) Closed on call / pass through call succeeds / reset count call fails / count failure threshold reached / trip breaker Open on call / fail on timeout / attempt reset trip breaker Half-Open on call / pass through call succeeds / reset call fails / trip breaker trip breaker attempt reset reset Source: M. Nygard, „Release It!“
  • 14. Circuit Breaker (3) public class CircuitBreaker implements MyResource { public enum State { CLOSED, OPEN, HALF_OPEN } final MyResource resource; State state; int counter; long tripTime; public CircuitBreaker(MyResource r) { resource = r; state = CLOSED; counter = 0; tripTime = 0L; } ...
  • 15. Circuit Breaker (4) ... public Result access(...) { // resource access Result r = null; if (state == OPEN) { checkTimeout(); throw new ResourceUnavailableException(); } try { r = r.access(...); // should use timeout } catch (Exception e) { fail(); throw e; } success(); return r; } ...
  • 16. Circuit Breaker (5) ... private void success() { reset(); } private void fail() { counter++; if (counter > THRESHOLD) { tripBreaker(); } } private void reset() { state = CLOSED; counter = 0; } ...
  • 17. Circuit Breaker (6)
          ...
          private void tripBreaker() {
              state = State.OPEN;
              tripTime = System.currentTimeMillis();
          }

          private void checkTimeout() {
              if ((System.currentTimeMillis() - tripTime) > TIMEOUT) {
                  state = State.HALF_OPEN;
                  counter = THRESHOLD; // a single failure in HALF_OPEN trips the breaker again
              }
          }

          public State getState() {
              return state;
          }
      }
  • 18. Circuit Breaker: further topics
      •  Thread-Safe Circuit Breaker
      •  Failure Types
      •  Tuning Circuit Breakers
      •  Available Implementations
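One possible way to approach the "Thread-Safe Circuit Breaker" topic is to guard all state changes with the object's monitor while running the protected call outside the lock. The following is a minimal sketch under stated assumptions, not the author's implementation: the class name `SyncCircuitBreaker`, the `Supplier`-based `call` method, the use of `IllegalStateException` as a stand-in for `ResourceUnavailableException`, and the injectable `clock` are all hypothetical. Unlike the slide version, the call that detects the expired timeout is itself let through as the trial call.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

class SyncCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int threshold;       // failures tolerated before tripping
    private final long openMillis;     // how long the breaker stays OPEN
    private final LongSupplier clock;  // injectable time source (assumption, eases testing)

    private State state = State.CLOSED;
    private int failures = 0;
    private long tripTime = 0L;

    SyncCircuitBreaker(int threshold, long openMillis, LongSupplier clock) {
        this.threshold = threshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    <T> T call(Supplier<T> action) {
        beforeCall();
        T result;
        try {
            result = action.get();     // runs outside the lock; should itself use a timeout
        } catch (RuntimeException e) {
            onFailure();
            throw e;
        }
        onSuccess();
        return result;
    }

    private synchronized void beforeCall() {
        if (state == State.OPEN) {
            if (clock.getAsLong() - tripTime > openMillis) {
                state = State.HALF_OPEN;   // attempt reset: let this call through
            } else {
                throw new IllegalStateException("resource unavailable");
            }
        }
    }

    private synchronized void onSuccess() {
        state = State.CLOSED;
        failures = 0;
    }

    private synchronized void onFailure() {
        failures++;
        if (state == State.HALF_OPEN || failures > threshold) {
            state = State.OPEN;
            tripTime = clock.getAsLong();
        }
    }

    synchronized State getState() {
        return state;
    }
}
```

A caller would typically pass `System::currentTimeMillis` as the clock. Note that several concurrent calls may pass while the breaker is half-open; limiting that to a single trial call would need additional bookkeeping.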
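  • 20. Fail Fast (1)
      [Diagram: the Client sends a Request to the Expensive Action, which uses the Resources.]
  • 21. Fail Fast (2)
      [Diagram: the Client sends its Request to the Fail Fast Guard, which checks the availability of the Resources and only then forwards the request to the Expensive Action, which uses the Resources.]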
  • 22. Fail Fast (3)
      public class FailFastGuard {
          private FailFastGuard() {}

          public static void checkResources(Set<CircuitBreaker> resources) {
              for (CircuitBreaker r : resources) {
                  if (r.getState() != CircuitBreaker.State.CLOSED) {
                      throw new ResourceUnavailableException(r);
                  }
              }
          }
      }
  • 23. Fail Fast (4)
      public class MyService {
          Set<CircuitBreaker> requiredResources;

          // Initialize resources
          ...

          public Result myExpensiveAction(...) {
              FailFastGuard.checkResources(requiredResources);
              // Execute core action
              ...
          }
      }
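  • 26. Shed Load (1)
      [Diagram: the Clients send too many Requests to the Server.]
  • 27. Shed Load (2)
      [Diagram: the Clients send too many Requests; a Gate Keeper in front of the Server monitors the requests and requests load data from a Monitor, which monitors the server load. Excess requests are shed.]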
  • 28. Shed Load (3)
      public class ShedLoadFilter implements Filter {
          Random random;

          public void init(FilterConfig fc) throws ServletException {
              random = new Random(System.currentTimeMillis());
          }

          public void destroy() {
              random = null;
          }
          ...
  • 29. Shed Load (4)
          ...
          public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
                  throws java.io.IOException, ServletException {
              int load = getLoad();
              if (shouldShed(load)) {
                  HttpServletResponse res = (HttpServletResponse)response;
                  res.setIntHeader("Retry-After", RECOMMENDATION);
                  res.sendError(HttpServletResponse.SC_SERVICE_UNAVAILABLE);
                  return;
              }
              chain.doFilter(request, response);
          }
          ...
  • 30. Shed Load (5)
          ...
          private boolean shouldShed(int load) { // Example implementation
              if (load < THRESHOLD) {
                  return false;
              }
              double shedBoundary = ((double)(load - THRESHOLD)) / ((double)(MAX_LOAD - THRESHOLD));
              return random.nextDouble() < shedBoundary;
          }
      }
  • 33. Shed Load: further topics
      •  Shedding Strategy
      •  Retrieving Load
      •  Tuning Load Shedders
      •  Alternative Strategies
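The slides leave `getLoad()` open. One possible source of load data on a JVM is the operating system load average exposed through the standard `OperatingSystemMXBean`. The sketch below is an assumption, not the author's approach: the class name `LoadProbe`, the scaling to a percentage of the available processors, and the helper `shedProbability` (which restates the linear shedding rule from `shouldShed` as a pure function) are all introduced here for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

class LoadProbe {

    /**
     * Returns the system load average as a percentage of the available
     * processors (may exceed 100), or -1 if the platform does not report it.
     */
    static int getLoad() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        double average = os.getSystemLoadAverage(); // negative if not available
        if (average < 0) {
            return -1;
        }
        return (int) Math.round(100.0 * average / os.getAvailableProcessors());
    }

    /**
     * Probability of shedding a request: 0 below the threshold, rising
     * linearly to 1 at maxLoad, and capped at 1 beyond that.
     */
    static double shedProbability(int load, int threshold, int maxLoad) {
        if (load < threshold) {
            return 0.0;
        }
        double p = (double) (load - threshold) / (double) (maxLoad - threshold);
        return Math.min(1.0, p);
    }
}
```

With a threshold of 60 and a maximum load of 100, a load of 80 sheds roughly every second request. A load average reacts slowly, so a request counter or queue length may be a better signal for short spikes.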
  • 35. Deferrable Work (1)
      [Diagram: Client Requests feed Request Processing; Request Processing and Routine Work both use the same Resources.]
  • 36. Deferrable Work (2)
      [Charts: resource usage of Request Processing and Routine Work relative to the 100% line and the OVERLOAD zone, without deferrable work and with deferrable work.]
  • 37. Deferrable Work (3)
      // Do or wait variant
      ProcessingState state = initBatch();
      while(!state.done()) {
          int load = getLoad();
          if (load > THRESHOLD) {
              waitFixedDuration();
          } else {
              state = processNext(state);
          }
      }

      void waitFixedDuration() {
          Thread.sleep(DELAY); // try-catch left out for better readability
      }
  • 38. Deferrable Work (4)
      // Adaptive load variant
      ProcessingState state = initBatch();
      while(!state.done()) {
          waitLoadBased();
          state = processNext(state);
      }

      void waitLoadBased() {
          int load = getLoad();
          long delay = calcDelay(load);
          Thread.sleep(delay); // try-catch left out for better readability
      }

      long calcDelay(int load) { // Simple example implementation
          if (load < THRESHOLD) {
              return 0L;
          }
          return (load - THRESHOLD) * DELAY_FACTOR;
      }
  • 40. I can hardly hear you …
  • 42. Leaky Bucket (1)
      [Diagram: whenever a problem occurred, the Leaky Bucket is filled; the bucket leaks periodically; error handling checks whether it has overflowed.]
  • 43. Leaky Bucket (2)
      public class LeakyBucket { // Very simple implementation
          private final int capacity;
          private int level;
          private boolean overflow;

          public LeakyBucket(int capacity) {
              this.capacity = capacity;
              drain();
          }

          public void drain() {
              this.level = 0;
              this.overflow = false;
          }
          ...
  • 44. Leaky Bucket (3)
          ...
          public void fill() {
              level++;
              if (level > capacity) {
                  overflow = true;
              }
          }

          public void leak() {
              level--;
              if (level < 0) {
                  level = 0;
              }
          }

          public boolean overflowed() {
              return overflow;
          }
      }
  • 45. Leaky Bucket: further topics
      •  Thread-Safe Leaky Bucket
      •  Leaking strategies
      •  Tuning Leaky Bucket
      •  Available Implementations
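For the "Thread-Safe Leaky Bucket" and "Leaking strategies" topics, one option is to keep the level in an `AtomicInteger` and let a scheduler call `leak()` at a fixed rate. This is a minimal sketch under assumptions: the class name `AtomicLeakyBucket`, the `level()` accessor and the `startLeaking` helper are hypothetical and not taken from the deck.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class AtomicLeakyBucket {
    private final int capacity;
    private final AtomicInteger level = new AtomicInteger(0);
    private volatile boolean overflow = false;

    AtomicLeakyBucket(int capacity) {
        this.capacity = capacity;
    }

    /** Records one problem; marks the bucket as overflowed once capacity is exceeded. */
    void fill() {
        if (level.incrementAndGet() > capacity) {
            overflow = true;
        }
    }

    /** Removes one unit, never dropping below zero. */
    void leak() {
        level.updateAndGet(current -> Math.max(0, current - 1));
    }

    boolean overflowed() {
        return overflow;
    }

    int level() {
        return level.get();
    }

    /** Resets level and overflow flag; the two writes are not one atomic step. */
    void drain() {
        level.set(0);
        overflow = false;
    }

    /** Fixed-rate leaking strategy; the caller must shut down the returned scheduler. */
    ScheduledExecutorService startLeaking(long period, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(this::leak, period, period, unit);
        return scheduler;
    }
}
```

A typical use would call `fill()` from an exception handler and escalate only once `overflowed()` returns true, so that sporadic errors are tolerated while error bursts are not.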
  • 47. Limited Retries (1)
      // doAction returns true if successful, false otherwise

      // General pattern
      boolean success = false;
      int tries = 0;
      while (!success && (tries < MAX_TRIES)) {
          success = doAction(...);
          tries++;
      }

      // Alternative one-retry-only variant
      success = doAction(...) || doAction(...);
  • 48. Limited Retries: further topics
      •  Idempotent Actions
      •  Closures / Lambdas
      •  Tuning Retries
  • 49. More Patterns
      •  Complete Parameter Checking
      •  Marked Data
      •  Routine Audits
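The "Closures / Lambdas" topic suggests passing the action into a reusable retry helper. A minimal sketch, assuming Java 8 lambdas; the class name `Retry` and the method `withRetries` are hypothetical names chosen for illustration.

```java
import java.util.function.BooleanSupplier;

class Retry {

    /**
     * Runs the action up to maxTries times and returns true as soon as one
     * attempt succeeds; returns false if all attempts fail.
     */
    static boolean withRetries(int maxTries, BooleanSupplier action) {
        for (int tries = 0; tries < maxTries; tries++) {
            if (action.getAsBoolean()) {
                return true;
            }
        }
        return false;
    }
}
```

A call could look like `Retry.withRetries(3, () -> doAction())`. As the slide's first topic implies, the retried action should be idempotent, because a failed attempt may already have had side effects.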
  • 50. Further reading
      1.  Michael T. Nygard, Release It!, Pragmatic Bookshelf, 2007
      2.  Robert S. Hanmer, Patterns for Fault Tolerant Software, Wiley, 2007
      3.  James Hamilton, On Designing and Deploying Internet-Scale Services, 21st LISA Conference 2007
      4.  Andrew Tanenbaum, Marten van Steen, Distributed Systems: Principles and Paradigms, Prentice Hall, 2nd Edition, 2006
  • 51. It's all about production!
  • 52. @ufried Uwe Friedrichsen | uwe.friedrichsen@codecentric.de | http://slideshare.net/ufried | http://ufried.tumblr.com