Handling Failure With Grace in Kafka Streams With Walker Carlson | Current 2022

Kafka Streams has recently expanded its options for handling thread death. Historically, when a Streams task hit a fatal exception, the thread running it would shut down and trigger a rebalance. A different thread would then pick up the task and hit the same error. This cascading thread failure could take a while, and performance suffered throughout because of the constant rebalances and a suboptimal number of threads. Eventually no threads were left alive and processing halted entirely. Only then did the client's state change, alerting users to the issue.
In this talk, we will cover the changes to the threading model that made more dynamic error handling possible. We will also introduce the Streams uncaught exception handler, which lets applications react immediately in cases that would previously cause cascading thread death. Further improvements included modifying the state machine to clarify the meaning of the ERROR state. Together, KIPs 671, 696, and 663 allow much more flexibility in exceptional cases.
After this talk, the audience will be able to use the new handler to react to exceptional cases in Kafka Streams. They will also understand the updates to the threading model and the changes in the state machine.
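
The clarified meaning of the ERROR state can be illustrated with a small model. This is only a sketch: the state names follow the `KafkaStreams.State` enum, but `StopCause`, `StopPaths`, and the reduced two-path transition logic are hypothetical and ignore every other transition in the real state machine. It assumes a close call ends in NOT_RUNNING via PENDING_SHUTDOWN, while an exception-driven shutdown ends in ERROR via PENDING_ERROR.

```java
// State names follow the KafkaStreams.State enum; the rest of this file is a hypothetical model.
enum ClientState { CREATED, REBALANCING, RUNNING, PENDING_SHUTDOWN, NOT_RUNNING, PENDING_ERROR, ERROR }

// Why the client stopped: a deliberate close call or an uncaught exception.
enum StopCause { CLOSE_CALL, UNCAUGHT_EXCEPTION }

final class StopPaths {
    private StopPaths() { }

    // Intermediate state while the client is still shutting down.
    static ClientState pendingState(StopCause cause) {
        return cause == StopCause.CLOSE_CALL
                ? ClientState.PENDING_SHUTDOWN
                : ClientState.PENDING_ERROR;
    }

    // Final state once shutdown has completed.
    static ClientState terminalState(StopCause cause) {
        return cause == StopCause.CLOSE_CALL
                ? ClientState.NOT_RUNNING
                : ClientState.ERROR;
    }

    // Example alerting rule: only exception-driven shutdowns should page someone.
    static boolean shouldAlert(ClientState newState) {
        return newState == ClientState.PENDING_ERROR || newState == ClientState.ERROR;
    }
}
```

In a real application, a rule like `shouldAlert` would typically sit inside a state listener registered with `KafkaStreams#setStateListener`, so that an alert fires as soon as the client leaves its normal states rather than after every thread has died.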

  1. Handling Failure with Grace: Streams' uncaught exception handler. Walker Carlson, wcarlson@confluent.io
  2. KIPs 663, 671 and 696 • KIP-663: The threading model becomes more dynamic • KIP-671: Introduces a new handler to prevent issues we have seen in the past • KIP-696: Changes the state machine
  3. ERROR state: Same as NOT_RUNNING, but triggered by an exception instead of a close call. States: CREATED, RUNNING, REBALANCING, PENDING_SHUTDOWN, PENDING_ERROR, NOT_RUNNING, ERROR
  4. Old Handler • Uses the Java runtime handler behind the scenes • Kills the thread • Cascading thread death is possible if you are not aware of it • Is deprecated and will be removed soon
  5. New Handler • User code returns an enum value to choose the behavior after its notification logic is done • Takes advantage of the threading model to implement a thread replacement algorithm • Gives more flexibility • Does not do anything new with global threads
  6. Shutdown Client • Closest to the old behavior • Closes the whole client into the ERROR state instead of losing one thread at a time
  7. Shutdown Application • Uses the rebalance protocol to try to shut down all clients of the application • May fail if a different client is not in the group • Fastest/only internal way to stop all clients
  8. Replace Thread • A new thread is brought up with a new ID • Rebalances • Works with static membership • Not always a good idea
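
The three handler responses on the slides can be sketched as a small decision policy. This is a minimal, self-contained illustration, not the speaker's code: the `Response` enum is a local mirror of the constants in Kafka Streams' `StreamsUncaughtExceptionHandler.StreamThreadExceptionResponse`, while `ReplaceThenShutdownPolicy`, its replacement cap, and the choice of `IllegalStateException` as the "stop everything" case are hypothetical. The cap reflects one plausible reading of "not always a good idea": if the failure is deterministic, a replacement thread would likely hit the same error again.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Local mirror of the three responses offered by the new handler
// (the constant names match Kafka Streams' StreamThreadExceptionResponse).
enum Response { REPLACE_THREAD, SHUTDOWN_CLIENT, SHUTDOWN_APPLICATION }

// Hypothetical policy: replace the failed thread a limited number of times,
// then fall back to shutting down the client.
final class ReplaceThenShutdownPolicy {
    private final int maxReplacements;
    private final AtomicInteger replacements = new AtomicInteger();

    ReplaceThenShutdownPolicy(int maxReplacements) {
        this.maxReplacements = maxReplacements;
    }

    Response handle(Throwable error) {
        // Notification logic (logging, alerting) would run here before a response is chosen.
        if (error instanceof IllegalStateException) {
            // Arbitrary stand-in for an error assumed to be fatal on every client.
            return Response.SHUTDOWN_APPLICATION;
        }
        if (replacements.incrementAndGet() <= maxReplacements) {
            // Still under the cap: bring up a fresh thread and keep processing.
            return Response.REPLACE_THREAD;
        }
        // Replacement did not help: stop this client, which moves it to ERROR.
        return Response.SHUTDOWN_CLIENT;
    }
}
```

In a real application the same decision would live in a `StreamsUncaughtExceptionHandler` registered through `KafkaStreams#setUncaughtExceptionHandler`, returning the matching `StreamThreadExceptionResponse` constant. The counter is atomic because several stream threads may fail, and call the handler, concurrently.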
