1
Kostas Kloudas
@KLOUBEN_K
BERLIN BUZZWORDS
JUNE 13, 2017
Complex Event Processing with
Flink The state of FlinkCEP
2
Original creators of Apache
Flink®
Providers of the
dA Platform, a supported
Flink distribution
What is CEP?
3
CEP: Complex Event Processing
 Detecting event patterns
 Over continuous streams of events
 Often arriving out-of-order...
CEP: Complex Event Processing
5
Input
CEP: Complex Event Processing
6
Pattern
Input
CEP: Complex Event Processing
7
Pattern
Output
Input
CEP: use-cases
 IoT
 Intrusion detection
 Inventory Management
 Click Stream Analysis
 Trend detection in financial s...
What is Stream Processing?
9
Stream Processing
10
Computation
Computations on
never-ending
“streams” of events
Distributed Stream Processing
11
Computation
Computation
spread across
many machines
Computation Computation
Stateful Stream Processing
12
Computation
State
Result depends
on history of
stream
13
Stream Processors are a natural fit
for CEP
FlinkCEP
14
Pattern
Output
FlinkCEP
Input
What does FlinkCEP offer?
15
Pattern Definition
16
Pattern
Pattern Definition
 Composed of Individual Patterns
• P1(shape == rectangle)
• P2(shape == triangle)
17
Pattern
P2
P1
Pattern Definition
 Composed of Individual Patterns
• P1(shape == rectangle)
• P2(shape == triangle)
 Combined by Contig...
FlinkCEP Individual Patterns
 Unique Name
 Condition : which elements to accept
• Simple e.g shape == rectangle
• Iterat...
FlinkCEP Complex Patterns
 Combine Individual Patterns
 Contiguity Conditions
• how to select relevant events given an i...
FlinkCEP Contiguity Conditions
21
Pattern
Input
FlinkCEP Contiguity Conditions
22
Pattern
Output Input
Strict Contiguity
• matching events strictly follow each other
FlinkCEP Contiguity Conditions
23
Pattern
Output Input
FlinkCEP Contiguity Conditions
24
Pattern
Relaxed Contiguity
• non-matching events to simply be ignored
Input Output
FlinkCEP Contiguity Conditions
25
Pattern
Input Output
FlinkCEP Contiguity Conditions
26
Pattern
Input Output
FlinkCEP Contiguity Conditions
27
Pattern
Input Output
Non-Deterministic Relaxed Contiguity
• allows non-deterministic act...
FlinkCEP Contiguity Conditions
28
Pattern
NOT patterns:
• for strict and relaxed contiguity
• for cases where an event sho...
FlinkCEP Summary
29
 Quantifiers
• oneOrMore(), times(), optional()
 Conditions
• Simple & Iterative
 Time Constraints
...
 Trace all shipments which:
• start at location A
• have at least 5 stops
• end at location B
• within the last 24h
30
Ru...
 Trace all shipments which:
• start at location A
• have at least 5 stops
• end at location B
• within the last 24h
31
Ob...
32
Observation B Quantifiers
 Start/End: single event
 Middle: multiple events
• .oneOrMore()
Start
End
Mid
ev.from == A...
33
Observation C Conditions
 Start -> Simple
• properties of the event
 Middle/End -> Iterative
• Depend on previous eve...
34
 Trace all shipments which:
• start at location A
• have at least 5 stops
• end at location B
• within the last 24h
Ob...
35
 We opt for relaxed contiguity
Observation E Contiguity
Pattern<Event, ?> pattern = Pattern
.<Event>begin("start")
.where(mySimpleCondition)
.followedBy ("middle")
.where(myItera...
Running Example Pattern Integration
Pattern<Event, ?> pattern = ...
PatternStream<Event> patternStream = CEP.pattern(input...
Documentation
 FlinkCEP documentation:
https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/libs/cep.html
43
4
Thank you!
@KLOUBEN_K
@ApacheFlink
@dataArtisans
45
Stream Processing
and Apache Flink®'s
approach to it
@StephanEwen
Apache Flink PMC
CTO @ data ArtisansFLINKFORWARD IS C...
We are hiring!
data-artisans.com/careers

Kostas Kloudas - Complex Event Processing with Flink: the state of FlinkCEP


Pattern matching over event streams is increasingly being employed in many areas, including financial services and click stream analysis. Flink, as a true stream processing engine, emerges as a natural candidate for these use cases. In this talk, we will present FlinkCEP, a library for Complex Event Processing (CEP) based on Flink. At the conceptual level, we will see the different patterns the library can support, present the main building blocks we implemented to support them, and discuss possible future additions that will further enhance the coverage of the library. At the practical level, we will show how the integration of FlinkCEP with Flink allows the former to take advantage of Flink's rich ecosystem (e.g. connectors) and its stream processing capabilities, such as support for event-time processing, exactly-once state semantics, fault-tolerance, savepoints and high throughput.

  • Hello everyone and thanks for coming!

    My name is Kostas Kloudas and I am here to talk to you about FlinkCEP, a library for complex event processing built atop Apache Flink.
  • A little bit about myself: I am a committer for Apache Flink and a software engineer at data Artisans, the original creators of Apache Flink and the providers of the dA Platform.
  • So, without further ado, let’s start by seeing what CEP, or Complex Event Processing, is.
  • Complex Event Processing is the “art” of detecting event patterns over continuous streams of data, which often arrive out of order. To visualize it...
  • Imagine that you have a stream containing elements of different shapes and colours, as shown in the figure...
  • And you want to detect sequences of events where a triangle follows a rectangle of the SAME color. A CEP library would take the input and the pattern, and return the matching sequences, ...
  • As shown in the figure.
  • Many interesting use cases fall into the category of complex event processing problems. To name a few, we have use cases from IoT...
  • We saw the basic idea behind CEP; now let’s see what stream processing is, and why a stream processor provides a good substrate for building a CEP library.
  • Stream processing, in its simplest form, stands for computations on never-ending streams of events.
  • Distributed stream processing implies that the aforementioned computation is spread across many machines.
  • Stateful distributed stream processing has the additional property that the result depends on the history of the stream. To support this, the stream processor must be able to keep state in a fault-tolerant manner. Most of the interesting computations are stateful; in fact, even a simple event counter needs to keep state. This is where Flink shines.
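
To make the remark about the event counter concrete, here is a minimal plain-Java sketch. It is not Flink code, and the class and method names are hypothetical. The map plays the role of the state that a stream processor would have to keep fault-tolerantly; in this sketch it simply lives in memory.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration: a counter whose result depends on the stream's history. */
class KeyedEventCounter {
    // The "state": one running count per key. A stream processor would have
    // to checkpoint this to survive failures; here it is only held in memory.
    private final Map<String, Long> counts = new HashMap<>();

    /** Processes one event and returns the updated count for its key. */
    long onEvent(String key) {
        return counts.merge(key, 1L, Long::sum);
    }
}
```

Feeding the keys "a", "b", "a" returns 1, 1 and 2: the third result can only be produced because the first event was remembered.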
  • From the above, it is not difficult to see that stream processors are a natural fit for CEP.
    This was the main motivation behind the first implementation of FlinkCEP, more than a year ago, and
    this talk focuses on the current capabilities of FlinkCEP, (slide)
  • ... a library that takes your input stream and your desired pattern and returns you the matching event sequences.
  • So what does FlinkCEP offer? We will start by describing the building blocks the library offers for defining a complex pattern, before describing how to integrate it in your program.
  • Pattern definition: taking our previous pattern, where we wanted to find all rectangles followed by triangles, we see that (slide)
  • A complex pattern is composed of individual patterns, or simply patterns, each of which searches for a specific type of event. In our case, we have two individual patterns, one searching for rectangles and another searching for triangles.
  • These individual patterns are combined into a complex one by specifying the contiguity condition between them. We will come back to this later, but in a nutshell, contiguity describes how to select relevant events given an input mixing relevant and irrelevant events. In our example, we say that the triangle should strictly follow the rectangle.

    Given that complex patterns are composed of individual patterns, we start by describing them first, before showing how to combine them together.
  • Individual Patterns must have a unique name and for each one of them we can define a condition based on which it accepts relevant events.
    This condition can depend on properties of the event itself, in which case it is a SIMPLE condition, or on properties or statistics over a subset of previously accepted events, in which case it is an ITERATIVE condition.

    In addition to the condition, a pattern can also have quantifiers. By default, when an individual pattern appears in a complex pattern, FlinkCEP expects the described type of event to appear exactly once in order to have a match. This is a singleton pattern. In our case, we expect exactly one rectangle, followed by exactly one triangle. The supported quantifiers are oneOrMore() for use cases where a specific type of event is expected “at least once”, times() when we want it to appear a specified number of times, and optional() if the event is optional.
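
The difference between the two kinds of condition can be sketched in plain Java. This is an illustration only, not the FlinkCEP condition API: the `Shape` type and the two predicates are hypothetical, and the iterative example follows the slide's `rectangle.surface < triangle.surface`.

```java
import java.util.List;
import java.util.function.BiPredicate;
import java.util.function.Predicate;

/** Plain-Java illustration of simple vs. iterative conditions; not FlinkCEP code. */
class ConditionSketch {
    /** Hypothetical event type with a shape name and a surface area. */
    static final class Shape {
        final String kind;
        final double surface;

        Shape(String kind, double surface) {
            this.kind = kind;
            this.surface = surface;
        }
    }

    // Simple condition: looks only at the current event.
    static final Predicate<Shape> IS_RECTANGLE = s -> s.kind.equals("rectangle");

    // Iterative condition: also looks at the events accepted earlier in the match.
    // A triangle is accepted only if every accepted rectangle has a smaller surface.
    static final BiPredicate<Shape, List<Shape>> LARGER_TRIANGLE =
        (s, accepted) -> s.kind.equals("triangle")
            && accepted.stream().allMatch(r -> r.surface < s.surface);
}
```

The iterative predicate needs the list of previously accepted events as a second argument, which is why such conditions require the library to keep partial matches as state.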

    The above are the possibilities offered when defining individual Patterns. These patterns can be combined into complex patterns (slide)
  • ...by specifying the “contiguity conditions” between individual patterns and, potentially, a time constraint using the within() clause. The time constraint allows you to express use cases where, for example, “I want all my events to happen within 24h”.
    To understand contiguity, let’s take our pattern as shown on the left-hand side, and our previous input... (slide)
  • Previously we only accepted event sequences where the triangle strictly followed the rectangle without any non-matching events in-between. This is the first form of supported contiguity, called STRICT CONTIGUITY. FlinkCEP supports 2 more modes, namely RELAXED and NON-DETERMINISTIC RELAXED contiguity.
  • To understand relaxed contiguity, let’s focus on the green highlighted sequence in the input box. We see that with strict contiguity, this sequence is rejected, because between the green rectangle and triangle there is a circle. In many use-cases, we want the non-matching events to simply be ignored, without invalidating previous partial matches. EXAMPLE user interaction
  • For these use-cases, FlinkCEP also supports Relaxed Contiguity, where non-matching events are simply ignored. EXAMPLE user interaction
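
The difference between strict and relaxed contiguity can be illustrated with a small plain-Java sketch over a finite list of events. This is not how FlinkCEP is implemented (it works on unbounded streams and keeps partial matches as state); the `Ev` type and the method names are hypothetical. In the FlinkCEP 1.3 Pattern API the two modes correspond to next() and followedBy().

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java illustration of strict vs. relaxed contiguity; not FlinkCEP code. */
class ContiguitySketch {
    /** Hypothetical event: an id, a shape and a colour. */
    static final class Ev {
        final int id;
        final String shape;
        final String colour;

        Ev(int id, String shape, String colour) {
            this.id = id;
            this.shape = shape;
            this.colour = colour;
        }
    }

    static boolean isRectangle(Ev e) {
        return e.shape.equals("rectangle");
    }

    static boolean isTriangleOfSameColour(Ev rect, Ev e) {
        return e.shape.equals("triangle") && e.colour.equals(rect.colour);
    }

    /** Strict: the triangle must be the very next event after the rectangle. */
    static List<String> strict(List<Ev> input) {
        List<String> matches = new ArrayList<>();
        for (int i = 0; i + 1 < input.size(); i++) {
            Ev a = input.get(i);
            Ev b = input.get(i + 1);
            if (isRectangle(a) && isTriangleOfSameColour(a, b)) {
                matches.add(a.id + "->" + b.id);
            }
        }
        return matches;
    }

    /** Relaxed: non-matching events are skipped; the first matching triangle is taken. */
    static List<String> relaxed(List<Ev> input) {
        List<String> matches = new ArrayList<>();
        for (int i = 0; i < input.size(); i++) {
            Ev a = input.get(i);
            if (!isRectangle(a)) {
                continue;
            }
            for (int j = i + 1; j < input.size(); j++) {
                Ev b = input.get(j);
                if (isTriangleOfSameColour(a, b)) {
                    matches.add(a.id + "->" + b.id);
                    break; // only the first matching triangle completes this match
                }
            }
        }
        return matches;
    }
}
```

For the input green rectangle (1), blue circle (2), green triangle (3), `strict` finds nothing because the circle sits in between, while `relaxed` skips the circle and returns the single match `1->3`.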
  • Finally, non-deterministic relaxed contiguity further relaxes contiguity by allowing non-deterministic actions on relevant events. To illustrate this, let’s focus on the new highlighted green sequence in the input box. For this, we see that only the sequence containing the rectangle and the first triangle was accepted (slide)
  • In some cases, we want this pair to be accepted, but also to have a match containing the rectangle and the second triangle. For these cases, we have the non-deterministic relaxed contiguity. (slide)
  • Finally, for cases where an event should invalidate a match, FlinkCEP also supports NOT patterns. More on this in the documentation. NOT patterns make it possible to express use cases like SHOPLIFTING
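
As a rough illustration of non-deterministic relaxed contiguity, the following plain-Java sketch emits one match for every later triangle of the same colour, instead of stopping at the first one. It runs over a finite list and is not FlinkCEP code; the `Ev` type and method name are hypothetical. In the FlinkCEP 1.3 Pattern API this mode corresponds to followedByAny().

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java illustration of non-deterministic relaxed contiguity; not FlinkCEP code. */
class NonDeterministicContiguitySketch {
    /** Hypothetical event: an id, a shape and a colour. */
    static final class Ev {
        final int id;
        final String shape;
        final String colour;

        Ev(int id, String shape, String colour) {
            this.id = id;
            this.shape = shape;
            this.colour = colour;
        }
    }

    /** Pairs each rectangle with EVERY later triangle of the same colour. */
    static List<String> nonDeterministicRelaxed(List<Ev> input) {
        List<String> matches = new ArrayList<>();
        for (int i = 0; i < input.size(); i++) {
            Ev a = input.get(i);
            if (!a.shape.equals("rectangle")) {
                continue;
            }
            for (int j = i + 1; j < input.size(); j++) {
                Ev b = input.get(j);
                if (b.shape.equals("triangle") && b.colour.equals(a.colour)) {
                    // No break here: a matching triangle is both taken and skipped,
                    // so later triangles can still complete further matches.
                    matches.add(a.id + "->" + b.id);
                }
            }
        }
        return matches;
    }
}
```

For the input green rectangle (1), green triangle (2), blue circle (3), green triangle (4), the sketch returns `1->2` and `1->4`, whereas plain relaxed contiguity would stop after `1->2`.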
  • For now we intentionally ignore the “marked as fragile condition”.
  • Kostas Kloudas - Complex Event Processing with Flink: the state of FlinkCEP

    1. 1. 1 Kostas Kloudas @KLOUBEN_K BERLIN BUZZWORDS JUNE 13, 2017 Complex Event Processing with Flink The state of FlinkCEP
    2. 2. 2 Original creators of Apache Flink® Providers of the dA Platform, a supported Flink distribution
    3. 3. What is CEP? 3
    4. 4. CEP: Complex Event Processing  Detecting event patterns  Over continuous streams of events  Often arriving out-of-order 4
    5. 5. CEP: Complex Event Processing 5 Input
    6. 6. CEP: Complex Event Processing 6 Pattern Input
    7. 7. CEP: Complex Event Processing 7 Pattern Output Input
    8. 8. CEP: use-cases  IoT  Intrusion detection  Inventory Management  Click Stream Analysis  Trend detection in financial sector  ...yours? 8
    9. 9. What is Stream Processing? 9
    10. 10. Stream Processing 10 Computation Computations on never-ending “streams” of events
    11. 11. Distributed Stream Processing 11 Computation Computation spread across many machines Computation Computation
    12. 12. Stateful Stream Processing 12 Computation State Result depends on history of stream
    13. 13. 13 Stream Processors are a natural fit for CEP
    14. 14. FlinkCEP 14 Pattern Output FlinkCEP Input
    15. 15. What does FlinkCEP offer? 15
    16. 16. Pattern Definition 16 Pattern
    17. 17. Pattern Definition  Composed of Individual Patterns • P1(shape == rectangle) • P2(shape == triangle) 17 Pattern P2 P1
    18. 18. Pattern Definition  Composed of Individual Patterns • P1(shape == rectangle) • P2(shape == triangle)  Combined by Contiguity Conditions • ...later 18 Pattern P2 P1
    19. 19. FlinkCEP Individual Patterns  Unique Name  Condition : which elements to accept • Simple e.g shape == rectangle • Iterative e.g rectangle.surface < triangle.surface  Quantifiers (or not) • Looping/Optional oneOrMore(),times(#),optional() 19 Pattern P2 P1
    20. 20. FlinkCEP Complex Patterns  Combine Individual Patterns  Contiguity Conditions • how to select relevant events given an input mixing relevant and irrelevant events  Time Constraints • within(time) e.g. all events have to come within 24h 20 Pattern P2 P1
    21. 21. FlinkCEP Contiguity Conditions 21 Pattern Input
    22. 22. FlinkCEP Contiguity Conditions 22 Pattern Output Input Strict Contiguity • matching events strictly follow each other
    23. 23. FlinkCEP Contiguity Conditions 23 Pattern Output Input
    24. 24. FlinkCEP Contiguity Conditions 24 Pattern Relaxed Contiguity • non-matching events to simply be ignored Input Output
    25. 25. FlinkCEP Contiguity Conditions 25 Pattern Input Output
    26. 26. FlinkCEP Contiguity Conditions 26 Pattern Input Output
    27. 27. FlinkCEP Contiguity Conditions 27 Pattern Input Output Non-Deterministic Relaxed Contiguity • allows non-deterministic actions on relevant events
    28. 28. FlinkCEP Contiguity Conditions 28 Pattern NOT patterns: • for strict and relaxed contiguity • for cases where an event should invalidate a match Input
    29. 29. FlinkCEP Summary 29  Quantifiers • oneOrMore(), times(), optional()  Conditions • Simple & Iterative  Time Constraints • Event and Processing time  Different Contiguity Constraints • Strict, relaxed, non-deterministic relaxed, NOT
    30. 30.  Trace all shipments which: • start at location A • have at least 5 stops • end at location B • within the last 24h 30 Running Example: retailer A B M1 M2 M3 M4 M5
    31. 31.  Trace all shipments which: • start at location A • have at least 5 stops • end at location B • within the last 24h 31 Observation A Individual Patterns Start End Mid ev.from == A ev[i].from == ev[i-1].to ev.to == B && size(“mid”) >= 5
    32. 32. 32 Observation B Quantifiers  Start/End: single event  Middle: multiple events • .oneOrMore() Start End Mid ev.from == A ev[i].from == ev[i-1].to ev.to == B && size(“mid”) >= 5
    33. 33. 33 Observation C Conditions  Start -> Simple • properties of the event  Middle/End -> Iterative • Depend on previous events Start End Mid ev.from == A ev[i].from == ev[i-1].to ev.to == B && size(“mid”) >= 5
    34. 34. 34  Trace all shipments which: • start at location A • have at least 5 stops • end at location B • within the last 24h Observation D Time Constraints Start End Mid ev.from == A ev[i].from == ev[i-1].to ev.to == B && size(“mid”) >= 5
    35. 35. 35  We opt for relaxed contiguity Observation E Contiguity
    36. 36. Running Example Individual Patterns (Start, Middle, End)

        Pattern<Event, ?> pattern = Pattern
            .<Event>begin("start")
            .where(mySimpleCondition)
            .followedBy("middle")
            .where(myIterativeCondition1)
            .oneOrMore()
            .followedBy("end")
            .where(myIterativeCondition2)
            .within(Time.hours(24));

    37. 37. Running Example Quantifiers (same pattern code as slide 36)
    38. 38. Running Example Conditions (same pattern code as slide 36)
    39. 39. Running Example Time Constraint (same pattern code as slide 36)
    40. 40. Running Example Pattern Integration

        Pattern<Event, ?> pattern = ...
        PatternStream<Event> patternStream = CEP.pattern(input, pattern);
        DataStream<Alert> result = patternStream.select(
            new PatternSelectFunction<Event, Alert>() {
                @Override
                public Alert select(Map<String, List<Event>> pattern) {
                    return parseMatch(pattern);
                }
            }
        );

    41. 41. Running Example Pattern Integration (same code as slide 40)
    42. 42. Running Example Pattern Integration (same code as slide 40)
    43. 43. Documentation  FlinkCEP documentation: https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/libs/cep.html 43
    44. 44. 4 Thank you! @KLOUBEN_K @ApacheFlink @dataArtisans
    45. 45. 45 Stream Processing and Apache Flink®'s approach to it @StephanEwen Apache Flink PMC CTO @ data Artisans FLINK FORWARD IS COMING BACK TO BERLIN SEPTEMBER 11-13, 2017 BERLIN.FLINK-FORWARD.ORG
    46. 46. We are hiring! data-artisans.com/careers
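
The conditions of the running example (slides 30 to 35) can be checked against a candidate sequence with a short plain-Java sketch. This is an illustration, not FlinkCEP code: it validates one finite, already-ordered sequence rather than searching a stream, and the `Leg` type, its fields and the `matches` method are hypothetical. It follows the slide conditions literally (`ev.from == A`, `ev[i].from == ev[i-1].to`, `ev.to == B && size("mid") >= 5`) and treats a span of exactly 24 hours as inside the window, which is an assumption.

```java
import java.util.List;

/** Plain-Java check of the running example's conditions; not FlinkCEP code. */
class ShipmentTraceSketch {
    /** Hypothetical event: one shipment leg with an event timestamp in milliseconds. */
    static final class Leg {
        final String from;
        final String to;
        final long timestampMillis;

        Leg(String from, String to, long timestampMillis) {
            this.from = from;
            this.to = to;
            this.timestampMillis = timestampMillis;
        }
    }

    static final long WINDOW_MILLIS = 24L * 60 * 60 * 1000;

    /** Returns true if the legs form start -> middle (at least 5) -> end within 24h. */
    static boolean matches(List<Leg> legs) {
        int midCount = legs.size() - 2; // everything between the start and the end
        if (midCount < 5) {
            return false; // end condition: size("mid") >= 5
        }
        Leg start = legs.get(0);
        Leg end = legs.get(legs.size() - 1);
        if (!start.from.equals("A")) {
            return false; // simple condition on the start event
        }
        Leg previous = start;
        for (Leg mid : legs.subList(1, legs.size() - 1)) {
            if (!mid.from.equals(previous.to)) {
                return false; // iterative condition: ev[i].from == ev[i-1].to
            }
            previous = mid;
        }
        if (!end.to.equals("B")) {
            return false; // end condition: ev.to == B
        }
        // Time constraint; treating exactly 24h as inside the window is an assumption.
        return end.timestampMillis - start.timestampMillis <= WINDOW_MILLIS;
    }
}
```

Unlike this batch check, the library evaluates these conditions incrementally as events arrive, which is why the middle and end conditions have to be iterative: they depend on events accepted earlier in the partial match.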
