Access to real-time data is increasingly important for many organizations. At Lyft, we process millions of events per second in real time to compute prices, balance marketplace dynamics, and detect fraud, among many other use cases. To do so, we run dozens of Apache Flink and Apache Beam pipelines. Flink provides a powerful framework that makes it easy for non-experts to write correct, high-scale streaming jobs, while Beam extends that power to our large base of Python programmers.
Historically, we have run Flink clusters on bare, custom-managed EC2 instances. To achieve greater elasticity and reliability, we decided to rebuild our streaming platform on top of Kubernetes. In this session, I’ll cover how we designed and built an open-source Kubernetes operator for Flink and Beam, some of the unique challenges of running a complex, stateful application on Kubernetes, and some of the lessons we learned along the way.
Status:
  Deploy Hash: b6c4bb26
  Failed Deploy Hash:
  Job Status:
    Completed Checkpoint Count: 3908
    Entry Class: com.lyft.wordcount.WordCount
    Failed Checkpoint Count: 0
    Health: Green
    Jar Name: wordcount-1.0-SNAPSHOT.jar
    Job ID: 1ebd3cd9445dda09d1ebe5b28b1661ee
    Job Restart Count: 1
    Last Checkpoint Time: 2019-09-12T01:04:24Z
    Last Failing Time: <nil>
    Parallelism: 8
    Restore Time: 2019-09-10T21:18:42Z
    Start Time: 2019-09-10T21:18:40Z
    State: RUNNING
  Last Seen Error: <nil>
  Last Updated At: 2019-09-12T01:04:38Z
  Phase: Running
  Retry Count: 0
Events:
  Type    Reason           Age    From              Message
  ----    ------           ----   ----              -------
  Normal  CreatingCluster  2m29s  flinkK8sOperator  Creating Flink cluster for deploy a931a6c4
  Normal  CancellingJob    93s    flinkK8sOperator  Cancelling job 1ebd3cd9445dda09d1ebe5b28b1661ee with a final savepoint
  Normal  CanceledJob      63s    flinkK8sOperator  Canceled job with savepoint s3://streamingplatform/wordcount/savepoints/savepoint-1ebd3c-4e3f35489444
  Normal  JobSubmitted     59s    flinkK8sOperator  Flink job submitted to cluster with id 0c1790aa33fdc8dd2798bad1d55ddfa8
  Normal  ToreDownCluster  33s    flinkK8sOperator  Deleted old cluster with hash b6c4bb26
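
The events above suggest the order in which an update is rolled out: a new cluster is created for the new deploy hash, the running job is cancelled with a final savepoint, the job is submitted to the new cluster, and only then is the old cluster deleted. As a rough illustration of that ordering only, here is a minimal Python sketch against an in-memory stand-in. `FakeFlink`, `update_application`, the `memory://` savepoint path, and the generated job id are all hypothetical and are not part of the operator's API; a real operator would also need to handle failures and retries, which this sketch ignores.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class FakeFlink:
    """Hypothetical in-memory stand-in for the cluster and job calls."""
    clusters: Set[str] = field(default_factory=set)
    events: List[Tuple[str, str]] = field(default_factory=list)

    def emit(self, reason: str, message: str) -> None:
        # Record an event as a (reason, message) pair.
        self.events.append((reason, message))

    def cancel_with_savepoint(self, job_id: str) -> str:
        # Assumption: cancelling returns the location of the final savepoint.
        return f"memory://savepoints/{job_id}"

    def submit_job(self, cluster_hash: str, savepoint: str) -> str:
        # Assumption: submitting returns the id of the new job.
        if cluster_hash not in self.clusters:
            raise RuntimeError(f"cluster {cluster_hash} does not exist")
        return f"job-on-{cluster_hash}"


def update_application(flink: FakeFlink, old_hash: str, new_hash: str, job_id: str) -> str:
    """Replay the update ordering seen in the event log."""
    # 1. Bring up the new cluster before touching the running job.
    flink.emit("CreatingCluster", f"Creating Flink cluster for deploy {new_hash}")
    flink.clusters.add(new_hash)

    # 2. Stop the old job, capturing its state in a final savepoint.
    flink.emit("CancellingJob", f"Cancelling job {job_id} with a final savepoint")
    savepoint = flink.cancel_with_savepoint(job_id)
    flink.emit("CanceledJob", f"Canceled job with savepoint {savepoint}")

    # 3. Start the job on the new cluster, passing along the savepoint.
    new_job_id = flink.submit_job(new_hash, savepoint)
    flink.emit("JobSubmitted", f"Flink job submitted to cluster with id {new_job_id}")

    # 4. Only now remove the old cluster.
    flink.clusters.discard(old_hash)
    flink.emit("ToreDownCluster", f"Deleted old cluster with hash {old_hash}")
    return new_job_id


flink = FakeFlink(clusters={"b6c4bb26"})
new_job = update_application(
    flink, "b6c4bb26", "a931a6c4", "1ebd3cd9445dda09d1ebe5b28b1661ee"
)
reasons = [reason for reason, _ in flink.events]
```

In this sketch the old cluster stays in place until the new job has been submitted, so a failed submission would still leave the old cluster and the savepoint available for recovery.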