Diese Präsentation wurde erfolgreich gemeldet.
Wir verwenden Ihre LinkedIn Profilangaben und Informationen zu Ihren Aktivitäten, um Anzeigen zu personalisieren und Ihnen relevantere Inhalte anzuzeigen. Sie können Ihre Anzeigeneinstellungen jederzeit ändern.

Ufuc Celebi – Stream & Batch Processing in one System

6.671 Aufrufe

Veröffentlicht am

Flink Forward 2015

Veröffentlicht in: Technologie

Ufuc Celebi – Stream & Batch Processing in one System

  1. 1. Ufuk Celebi uce@apache.org Flink Forward October 13, 2015 Stream & Batch Processing in One System Apache Flink’s Streaming Data Flow Engine
  2. 2. System Architecture Deployment
 Local (Single JVM) · Cluster (Standalone, YARN) DataStream API Unbounded Data DataSet API Bounded Data Runtime Distributed Streaming Data Flow Libraries Machine Learning · Graph Processing · SQL-like API 1
  3. 3. User Deployment
 Local (Single JVM) · Cluster (Standalone, YARN) DataStream API Unbounded Data DataSet API Bounded Data Runtime Distributed Streaming Data Flow Libraries Machine Learning · Graph Processing · SQL-like API 1
  4. 4. System Deployment
 Local (Single JVM) · Cluster (Standalone, YARN) DataStream API Unbounded Data DataSet API Bounded Data Runtime Distributed Streaming Data Flow Libraries Machine Learning · Graph Processing · SQL-like API 1
  5. 5. Today
 Journey from APIs to Parallel Execution A look behind the scenes. You don’t have to worry about this.
  6. 6. Components JobManager Master Client TaskManager Worker TaskManager Worker TaskManager Worker TaskManager Worker User System public class WordCount { public static void main(String[] args) throws Exception { // Flink’s entry point StreamExecutionEnvironment env = StreamExecutionEnvironment .getExecutionEnvironment(); DataStream<String> data = env.fromElements( "O Romeo, Romeo! wherefore art thou Romeo?", "Deny thy father and refuse thy name", "Or, if thou wilt not, be but sworn my love,", "And I'll no longer be a Capulet."); // Split by whitespace to (word, 1) and sum up ones DataStream<Tuple2<String, Integer>> counts = data .flatMap(new SplitByWhitespace()) .keyBy(0) .timeWindow(Time.of(10, TimeUnit.SECONDS)) .sum(1); counts.print(); // Today: What happens now? env.execute(); } } 2
  7. 7. Components JobManager Master Client TaskManager Worker TaskManager Worker TaskManager Worker TaskManager Worker User System public class WordCount { public static void main(String[] args) throws Exception { // Flink’s entry point StreamExecutionEnvironment env = StreamExecutionEnvironment .getExecutionEnvironment(); DataStream<String> data = env.fromElements( "O Romeo, Romeo! wherefore art thou Romeo?", "Deny thy father and refuse thy name", "Or, if thou wilt not, be but sworn my love,", "And I'll no longer be a Capulet."); // Split by whitespace to (word, 1) and sum up ones DataStream<Tuple2<String, Integer>> counts = data .flatMap(new SplitByWhitespace()) .keyBy(0) .timeWindow(Time.of(10, TimeUnit.SECONDS)) .sum(1); counts.print(); // Today: What happens now? env.execute(); } } Submit Program 2
  8. 8. Components JobManager Master Client TaskManager Worker TaskManager Worker TaskManager Worker TaskManager Worker User System public class WordCount { public static void main(String[] args) throws Exception { // Flink’s entry point StreamExecutionEnvironment env = StreamExecutionEnvironment .getExecutionEnvironment(); DataStream<String> data = env.fromElements( "O Romeo, Romeo! wherefore art thou Romeo?", "Deny thy father and refuse thy name", "Or, if thou wilt not, be but sworn my love,", "And I'll no longer be a Capulet."); // Split by whitespace to (word, 1) and sum up ones DataStream<Tuple2<String, Integer>> counts = data .flatMap(new SplitByWhitespace()) .keyBy(0) .timeWindow(Time.of(10, TimeUnit.SECONDS)) .sum(1); counts.print(); // Today: What happens now? env.execute(); } } Submit Program Schedule 2
  9. 9. Components JobManager Master Client TaskManager Worker TaskManager Worker TaskManager Worker TaskManager Worker User System public class WordCount { public static void main(String[] args) throws Exception { // Flink’s entry point StreamExecutionEnvironment env = StreamExecutionEnvironment .getExecutionEnvironment(); DataStream<String> data = env.fromElements( "O Romeo, Romeo! wherefore art thou Romeo?", "Deny thy father and refuse thy name", "Or, if thou wilt not, be but sworn my love,", "And I'll no longer be a Capulet."); // Split by whitespace to (word, 1) and sum up ones DataStream<Tuple2<String, Integer>> counts = data .flatMap(new SplitByWhitespace()) .keyBy(0) .timeWindow(Time.of(10, TimeUnit.SECONDS)) .sum(1); counts.print(); // Today: What happens now? env.execute(); } } Submit Program Schedule Execute 2
  10. 10. Client Translates the API code to 
 a data flow graph called JobGraph and submits it to the JobManager. Source Transform Sink public class WordCount { public static void main(String[] args) throws Exception { // Flink’s entry point StreamExecutionEnvironment env = StreamExecutionEnvironment .getExecutionEnvironment(); DataStream<String> data = env.fromElements( "O Romeo, Romeo! wherefore art thou Romeo?", "Deny thy father and refuse thy name", "Or, if thou wilt not, be but sworn my love,", "And I'll no longer be a Capulet."); // Split by whitespace to (word, 1) and sum up ones DataStream<Tuple2<String, Integer>> counts = data .flatMap(new SplitByWhitespace()) .keyBy(0) .timeWindow(Time.of(10, TimeUnit.SECONDS)) .sum(1); counts.print(); // Today: What happens now? env.execute(); } } Translate 3
  11. 11. JobGraph JobVertex Intermediate
 Result Computation Data 4
  12. 12. JobGraph JobVertex Intermediate
 Result JobVertex Intermediate
 Result Produce Computation Data 4
  13. 13. JobGraph JobVertex Intermediate
 Result JobVertex Intermediate
 Result JobVertex Intermediate
 Result Produce Consume Computation Data 4
  14. 14. The JobGraph Vertices and results are combined to a directed acyclic graph (DAG) representing the user program. 5 Source Source Sink SinkJoin Map
  15. 15. JobGraph Translation • Translation includes optimizations like chaining: f g f · g • DataSet API translation with cost-based optimization 6
  16. 16. JobGraph JobVertex Parameters • Parallelism • Code to run • Consumed result(s) • Connection pattern JobGraph is common abstraction for both DataStream and DataSet API. Result Parameters • Producer • Type Runtime is agnostic to the respective API. It’s only a question of JobGraph parameterization. 7
  17. 17. JobGraph JobVertex Parameters • Parallelism • Code to run • Consumed result(s) • Connection pattern JobGraph is common abstraction for both DataStream and DataSet API. Result Parameters • Producer • Type Runtime is agnostic to the respective API. It’s only a question of JobGraph parameterization. 7
  18. 18. TaskManagerTaskManager Coordination • Coordination between components via Akka Actors • Actors exchange asynchronous messages • Each actor has own isolated state JobManager Master Client Actor SystemActor System 8 TaskManager
  19. 19. System Components JobManager Master Client TaskManager Worker TaskManager Worker TaskManager Worker TaskManager Worker User System public class WordCount { public static void main(String[] args) throws Exception { // Flink’s entry point StreamExecutionEnvironment env = StreamExecutionEnvironment .getExecutionEnvironment(); DataStream<String> data = env.fromElements( "O Romeo, Romeo! wherefore art thou Romeo?", "Deny thy father and refuse thy name", "Or, if thou wilt not, be but sworn my love,", "And I'll no longer be a Capulet."); // Split by whitespace to (word, 1) and sum up ones DataStream<Tuple2<String, Integer>> counts = data .flatMap(new SplitByWhitespace()) .keyBy(0) .timeWindow(Time.of(10, TimeUnit.SECONDS)) .sum(1); counts.print(); // Today: What happens now? env.execute(); } } Submit Program
  20. 20. JobManager • All coordination via JobManager (master): • Scheduling programs for execution • Checkpoint coordination (Till’s talk later today) • Monitoring workers Actor System Scheduling Checkpoint Coordination 9
  21. 21. ExecutionGraph • Receive JobGraph and span out to ExecutionGraph JobVertex Result JobVertex 10
  22. 22. ExecutionGraph • Receive JobGraph and span out to ExecutionGraph EV1 EV3 EV2 EV4 RP1 RP2 RP3 RP4 EV1 EV2 Point to Point JobVertex Result ExecutionVertex (EV) ResultPartition (RP) JobVertex 10
  23. 23. ExecutionGraph • Receive JobGraph and span out to ExecutionGraph EV1 EV3 EV2 EV4 RP1 RP2 RP3 RP4 EV1 EV2 All to All JobVertex Result ExecutionVertex (EV) ResultPartition (RP) JobVertex 10
  24. 24. TaskManager Actor System Task SlotTask SlotTask SlotTask Slot • All data processing in TaskManager (worker): • Communicate with JobManager via Actor messages • Exchange data between themselves via dedicated data connections • Expose task slots for execution I/O Manager Memory Manager 11
  25. 25. Scheduling • Each ExecutionVertex will be executed one or more times • The JobManager maps Execution to task slots • Pipelined execution in same slot where applicable 12 p=4 p=4 p=3 All to allPointwise TaskManager 1 TaskManager 2
  26. 26. Scheduling TaskManager 1 TaskManager 2 • Each ExecutionVertex will be executed one or more times • The JobManager maps Execution to task slots • Pipelined execution in same slot where applicable p=4 p=4 p=3 All to allPointwise 12
  27. 27. Scheduling TaskManager 1 TaskManager 2 • Each ExecutionVertex will be executed one or more times • The JobManager maps Execution to task slots • Pipelined execution in same slot where applicable p=4 p=4 p=3 All to allPointwise 12
  28. 28. Scheduling TaskManager 1 TaskManager 2 • Each ExecutionVertex will be executed one or more times • The JobManager maps Execution to task slots • Pipelined execution in same slot where applicable p=4 p=4 p=3 All to allPointwise 12
  29. 29. Scheduling TaskManager 1 TaskManager 2 • Each ExecutionVertex will be executed one or more times • The JobManager maps Execution to task slots • Pipelined execution in same slot where applicable p=4 p=4 p=3 All to allPointwise 12
  30. 30. Scheduling TaskManager 1 TaskManager 2 • Each ExecutionVertex will be executed one or more times • The JobManager maps Execution to task slots • Pipelined execution in same slot where applicable p=4 p=4 p=3 All to allPointwise 12
  31. 31. Scheduling • Scheduling happens from the sources • Later tasks are scheduled during runtime • Depending on the result type JobManager Master Actor System TaskManager Worker Actor System Submit Task State Updates 13
  32. 32. Execution • The ExecutionGraph tracks the state of each parallel Execution • Asynchronous messages from the 
 TaskManager and Client Failed FinishedCancellingCancelled Created Scheduled RunningDeploying 14
  33. 33. Task Execution • TaskManager receives Task per Execution • Task descriptor is limited to: • Location of consumed results • Produced results • Operator & user code User 
 Code Operator Task 15 ? ?
  34. 34. Task Execution DataStream<Tuple2<String, Integer>> counts = data.flatMap(new SplitByWhitespace()); User 
 Code 17
  35. 35. Task Execution DataStream<Tuple2<String, Integer>> counts = data.flatMap(new SplitByWhitespace()); User 
 Code for (…) {
 out.collect(new Tuple2<>(w, 1));
 } 17
  36. 36. Task Execution DataStream<Tuple2<String, Integer>> counts = data.flatMap(new SplitByWhitespace()); User 
 Code StreamTask with
 StreamFlatMap operator for (…) {
 out.collect(new Tuple2<>(w, 1));
 } 17
  37. 37. Task Execution DataStream<Tuple2<String, Integer>> counts = data.flatMap(new SplitByWhitespace()); User 
 Code StreamTask with
 StreamFlatMap operator Task with one consumed and one produced result for (…) {
 out.collect(new Tuple2<>(w, 1));
 } 17
  38. 38. Data Connections • Input Gates request input from local and remote channels on first read Task Result ResultManager TaskManager ResultManager TaskManager NetworkManagerNetworkManager Input Gate 18
  39. 39. Data Connections • Input Gates request input from local and remote channels on first read Task Result ResultManager TaskManager ResultManager TaskManager NetworkManagerNetworkManager Input Gate 1.Initiate TCP connection 18
  40. 40. Data Connections • Input Gates request input from local and remote channels on first read Task Result ResultManager TaskManager ResultManager TaskManager NetworkManagerNetworkManager Input Gate 2. Request 1.Initiate TCP connection 18
  41. 41. Data Connections • Input Gates request input from local and remote channels on first read Task Result ResultManager TaskManager ResultManager TaskManager NetworkManagerNetworkManager Input Gate 2. Request 3. Send via TCP 1.Initiate TCP connection 18
  42. 42. Result Characteristics vs. vs. Ephemeral Checkpointed Pipelined Blocking How and when to do data exchange? How long to keep results around? 20
  43. 43. Map Pipelined Result 1101 0101 0100 Pipelined Results 21
  44. 44. Map Pipelined Result11010101 0100 Pipelined Results 21
  45. 45. Map Pipelined Result 1101 0101 0100 Pipelined Results 21
  46. 46. Map Pipelined Result Reduce 1101 0101 0100 Pipelined Results 21
  47. 47. Map Pipelined Result Reduce 1101 0101 0100 Pipelined Results 21
  48. 48. Map Pipelined Result Reduce 1101 0101 0100 Pipelined Results 21
  49. 49. Map Pipelined Result Reduce 1101 0101 0100 Pipelined Results 21
  50. 50. Map Pipelined Result Reduce 1101 0101 0100 Pipelined Results 21
  51. 51. Map Pipelined Result Reduce 1101 0101 0100 Pipelined Results 21
  52. 52. Map Pipelined Result Reduce 1101 01010100 Pipelined Results 21
  53. 53. Map Pipelined Result Reduce 1101 0101 0100 Pipelined Results 21
  54. 54. Map Pipelined Result Reduce 1101 0101 0100 Pipelined Results 21
  55. 55. Map Blocking Result 1101 0101 0100 Blocking Results 22
  56. 56. Map Blocking Result11010101 0100 Blocking Results 22
  57. 57. Map Blocking Result 1101 0101 0100 Blocking Results 22
  58. 58. Map Blocking Result 1101 0101 0100 Blocking Results 22
  59. 59. Map Blocking Result 1101 0101 0100 Blocking Results 22
  60. 60. Map Blocking Result 1101 01010100 Blocking Results 22
  61. 61. Map Blocking Result 1101 0101 0100 Blocking Results 22
  62. 62. Map Blocking Result Reduce 1101 0101 0100 Blocking Results 22
  63. 63. Map Blocking Result Reduce 1101 0101 0100 Blocking Results 22
  64. 64. Map Blocking Result Reduce 1101 0101 0100 Blocking Results 22
  65. 65. Map Blocking Result Reduce 1101 0101 0100 Blocking Results 22
  66. 66. Map Blocking Result Reduce 1101 0101 0100 Blocking Results 22
  67. 67. Batch Pipelines
  68. 68. Batch Pipelines Data exchange
 is mostly streamed
  69. 69. Batch Pipelines Data exchange
 is mostly streamed Some operators block (e.g. sort, hash table)
  70. 70. Recap Client JobManager TaskManager Communication Actor-only (coordination) Actor-only (coordination) Actor & Data Streams Central 
 Abstraction JobGraph ExecutionGraph Task State Tracking – Complete
 program Single Task 23
  71. 71. Thank You!
  72. 72. Stream & Batch Processing DataStream DataSet JobGraph Chaining Chaining and cost- based optimisation Intermediate
 Results Pipelined Pipelined and Blocking Operators Stream operators Batch operators User function Common interface for
 map, reduce, … 25
  73. 73. Stream & Batch Processing • Stream and Batch programs are different parameterizations of the JobGraph • Everything goes down to the same runtime • Streaming first, batch as special case • Cost-based optimizer on translation • Blocking results for less resource fragmentation • But still profit from streaming • DataSet and DataStream API are essentially all user code to the runtime 24
  74. 74. Result Characteristics vs. vs. Ephemeral Checkpointed Pipelined Blocking How and when to do data exchange? How long to keep results around?
  75. 75. Map Ephemeral Result 45 Ephemeral Results 1101 0101 0100
  76. 76. Map Ephemeral Result 45 Ephemeral Results 11010101 0100
  77. 77. Map Ephemeral Result 45 Ephemeral Results 1101 0101 0100
  78. 78. Map Ephemeral Result Reduce 45 Ephemeral Results 1101 0101 0100
  79. 79. Map Ephemeral Result Reduce 45 Ephemeral Results 1101 0101 0100
  80. 80. Map Ephemeral Result Reduce 45 Ephemeral Results 1101 0101 0100
  81. 81. Map Ephemeral Result Reduce 45 Ephemeral Results 1101 0101 0100
  82. 82. Map Ephemeral Result Reduce 45 Ephemeral Results 1101 0101 0100
  83. 83. Map Ephemeral Result Reduce 45 Ephemeral Results 1101 0101 0100
  84. 84. Map Ephemeral Result Reduce 45 Ephemeral Results 1101 0101 0100
  85. 85. Map Ephemeral Result Reduce 45 Ephemeral Results 1101 0101 0100
  86. 86. Map Ephemeral Result Reduce 45 Ephemeral Results 1101 0101 0100
  87. 87. Map Ephemeral Result Reduce 45 Ephemeral Results 11010101 0100
  88. 88. Map Ephemeral Result Reduce 45 Ephemeral Results 11010101 0100
  89. 89. Map Ephemeral Result Reduce 45 Ephemeral Results 11010101 0100
  90. 90. Map Ephemeral Result Reduce 45 Ephemeral Results 11010101 0100
  91. 91. Map Ephemeral Result Reduce 45 Ephemeral Results 11010101 0100
  92. 92. Map Checkpointed Result Reduce 46 Checkpointed Results 1101 0101 0100
  93. 93. Map Checkpointed Result Reduce 46 Checkpointed Results 11010101 0100
  94. 94. Map Checkpointed Result Reduce 46 Checkpointed Results 1101 0101 0100
  95. 95. Map Checkpointed Result Reduce 46 Checkpointed Results 1101 0101 0100 1101
  96. 96. Map Checkpointed Result Reduce 46 Checkpointed Results 1101 0101 0100 1101
  97. 97. Map Checkpointed Result Reduce 46 Checkpointed Results 1101 0101 0100 1101
  98. 98. Map Checkpointed Result Reduce 46 Checkpointed Results 1101 0101 0100 1101 0101
  99. 99. Map Checkpointed Result Reduce 46 Checkpointed Results 1101 01010100 1101 0101
  100. 100. Map Checkpointed Result Reduce 46 Checkpointed Results 1101 0101 0100 1101 0101
  101. 101. Map Checkpointed Result Reduce 46 Checkpointed Results 1101 0101 0100 1101 0101 0100
  102. 102. Map Checkpointed Result Reduce 46 Checkpointed Results 1101 0101 0100 1101 0101 0100
  103. 103. Map Checkpointed Result Reduce 46 Checkpointed Results 1101 0101 0100 1101 0101 0100
  104. 104. Map Checkpointed Result Reduce 46 Checkpointed Results 1101 0101 0100 1101 0101 0100

×