We have built a Flink-based system that allows our business users to dynamically configure processing rules on a Kafka stream. Additionally, it allows state to be built dynamically by replaying targeted messages from a long-term storage system. This allows new rules to deliver results based on prior data, and existing rules to be re-run after breaking changes or a defect. Why we submitted this talk: We developed a unique solution that allows us to handle on-the-fly changes of business rules for stateful stream processing. This challenge required us to solve several problems -- data coming in from separate topics synchronized on a tracer-bullet, rebuilding state from events that are no longer on Kafka, and processing rule changes without interrupting the stream.
Flink Forward SF 2017: David Hardwick, Sean Hester & David Brelloch - Dynamically Configured Stream Processing using Flink & Kafka
1. Powering Cloud IT
Dynamically Configured
Stream Processing
Using Flink & Kafka
David Hardwick
Sean Hester
David Brelloch
https://github.com/brelloch/FlinkForward2017
[ Time : 30 seconds, 30 seconds total ]
DYNAMICALLY CONFIGURED STREAM PROCESSING using Flink and Kafka!
I have five minutes to tell you about the marketspace BetterCloud resides in and how it drove us to solve the problems that we will cover in today's session.
[ Time : 15 seconds, 45 seconds total ]
UNIFIED SAAS MANAGEMENT, that's the marketspace that we are in... so WHAT DOES THAT MEAN, EXACTLY?
[ Time : 15 seconds, 60 seconds total ]
A few years ago, the cloud was relatively easy... but now there are so many awesome SaaS apps that IT Admins struggle to control them all
[ Time : 15 seconds, 1m 15 seconds total ]
Unified SaaS Management feels like "MISSION CONTROL" for IT Admins and Support Techs... They can have intelligent alerts, centralize all their data, and orchestrate both within and across their favorite IT SaaS apps.
...let me quickly visualize what all this means, with a focus on INTELLIGENCE because that is where we leverage Flink for the solution we are covering today
[ Time : 15 seconds, 1m 30 seconds total ]
[ Time : 15 seconds, 1m 45 seconds total ]
[ Time : 15 seconds, 2m 00 seconds total ]
[ Time : 15 seconds, 2m 15 seconds total ]
[ Time : 15 seconds, 2m 30 seconds total ]
Here's a detail view... showing all the SaaS groups to which the user belongs.
[ Time : 15 seconds, 2m 45 seconds total ]
[ Time : 15 seconds, 3m 00 seconds total ]
[ Time : 15 seconds, 4m 15 seconds total ]
...the ability to create alerts based on the metadata that is being streamed into the platform.
For example: if two-factor is on for Google but off for Slack
[ Time : 15 seconds, 3m 00 seconds total ]
[ Time : 15 seconds, 3m 15 seconds total ]
If a Jira project is created, then create a new Slack channel with the project name
[ Time : 10 seconds, 1m 47.5 seconds total ]
The second way to visualize our orchestrator is from the "action engine"; this is what we added to Jira
[ Time : 10 seconds, 1m 37.5 seconds total ]
If a Jira project is created, then create a new "private-room" named the same as the "project name"
If added to a Jira group, then add to this "private-room"
If an alias is created for a team, then add team members to the alias as they are added to the team
[ Time : 15 seconds, 3m 45 seconds total ]
[ Time : 45 seconds, 5m 00 seconds total ]
An alert that fires 12 hours from now isn't very helpful
We've already had peak loads exceeding 100M events in a day
We need to stay agile, using continuous delivery techniques such as feature flags and Jenkins pipelines to ship new capabilities to customers multiple times a day
We need to show the power of the platform by providing common alerts across many thousands of customers that have multiple SaaS apps and multiple alerts per SaaS app, AND put the customers in the driver's seat by allowing them to create custom alerts for their specific needs.
And finally, we need to be able to replay history -- to go back in time in case we have to correct a bug in a global alert, or when a customer deploys a new alert that would benefit from looking at historical data and not just the data going forward from the time the alert is deployed.
----- HESTER START -----
Thanks, Hardwick! Hi everyone, I'm Sean Hester, a Director of Engineering at BetterCloud. I provide architectural support and leadership for teams in the Operational Intelligence segment of BetterCloud's platform.
Before my teammate David Brelloch takes you on a dive through some code, I wanted to give you some quick architectural backstory.
Before we transitioned into this project, both David and I had worked on a somewhat similar project for BetterCloud, and we had several "learnings" from that project that we wanted to apply here:
Event Stream Processing is a better solution for this domain than Ingest + Query.
Ingest + Query is where you read all events into a giant database, and then run a bunch of queries against it to extract the Operational Intelligence.
It has many downsides:
It's slow and inefficient--lots of redundant processing
It has high latency--best case, it takes minutes or hours to surface the latest data
It's datastore-intensive, which makes it expensive to scale
Worst of all, it focuses on state instead of state transitions. In Operational Intelligence we care at least as much about state transitions as we do the current state.
Rules engines add significantly more value than hard-coded rules, which have real costs:
They're slow to develop.
They're expensive to maintain.
No one likes to code them--they're boring and redundant.
They aren't conducive to an agile business--long cycle times from your Kanban board to production.
The first point was a fairly easy sell for us. As you may have noticed during Hardwick's intro…
The entire organization has embraced an event stream model. Everything flows from the SaaS provider APIs through the BetterCloud platform as a set of continuous streams. This made it simple for our team to take our preferred approach.
For the second point we went outside the box a bit more. I've been around long enough to have had significant experience with the XML stack back before Douglas Crockford killed it. XPath + XSLT were two of my favorite tools in that stack; I've built rules engines with them, and it turns out several folks have ported them to the JSON stack.
For those of you not familiar with JSONPath (or XPath), let me give you a quick intro so that you'll be on better footing when David starts walking through the code.
As I mentioned, we're using the Jayway JSONPath implementation, and you can find more info at the URL shown here.
It executes queries (of a sort) against JSON documents, and
It offers a simple model for creating a user interface for non-technical users
Bonus: since it's JavaScript at its core, there are pure JavaScript implementations that we've used to build tooling to validate the queries in the web browser before they're submitted to the stream processing jobs.
To put some meat on JSONPath, here's an example JSON document we've borrowed from the Jayway docs, and out to the right are three JSONPath query examples.
In JSONPath the dollar sign indicates the root node of the document tree, and from there the queries walk down the tree structure using dot notation.
As you can see in the third example it supports selection of a specific array index, which offers just a hint that JSONPath supports some non-trivial queries. The language supports a fairly robust set of WHERE-style conditional clauses, as well as a subset of standard Javascript operations.
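For readers following along outside the talk, here is a minimal sketch of those query styles against the Jayway API. The document is a trimmed version of the store/book example from the Jayway docs; the object and value names are ours, not the talk's:

```scala
import com.jayway.jsonpath.JsonPath

object JsonPathExamples extends App {
  // Trimmed version of the example document from the Jayway docs.
  val json =
    """{
      |  "store": {
      |    "book": [
      |      { "category": "reference", "author": "Nigel Rees", "title": "Sayings of the Century", "price": 8.95 },
      |      { "category": "fiction", "author": "Evelyn Waugh", "title": "Sword of Honour", "price": 12.99 }
      |    ]
      |  }
      |}""".stripMargin

  // $ is the root node; dot notation walks down the tree.
  val authors = JsonPath.read[java.util.List[String]](json, "$.store.book[*].author")

  // Array indexing selects a single element.
  val title = JsonPath.read[String](json, "$.store.book[0].title")

  // A WHERE-style predicate: all books cheaper than 10.
  val cheap = JsonPath.read[java.util.List[AnyRef]](json, "$.store.book[?(@.price < 10)]")

  println(s"$authors | $title | $cheap")
}
```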
Here's an example of one of the UI skins we've built around JSONPath queries.
If you look at the highlighted conditions section, each line there translates to an individual JSONPath statement, and those are then combined into a compound query.
The UI then sends the alert as a control event to a Kafka topic dedicated to control messages, which all of our jobs consume from.
The control topic is a Kafka topic that's a standard Data Source used in almost all of our Flink jobs
Most of our Flink functions are actually CoFlatMaps, with the flatMap1 method dedicated to control messages and flatMap2 dedicated to the main event stream
Control streams are one of our key enablers--we use them to dynamically update the behaviors of our Flink jobs in many ways (a minimal sketch of this CoFlatMap shape follows the list below):
We do use them for sending dynamic queries and/or updating alert configurations, but we also use them for...
Resetting state prior to state rebuilding operations
Coordinating behaviors across multiple Flink jobs, or behaviors that require coordinating multiple data sources
Modifying logging + monitoring behaviors
We do use both global and key-specific control messages in production... but in the demo we're going to stick to only global control messages.
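A minimal sketch of that CoFlatMap shape, with a made-up control message type and behaviors rather than the talk's actual code:

```scala
import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction
import org.apache.flink.util.Collector

// Illustrative control message; the real jobs carry richer payloads.
case class ControlMsg(kind: String, body: String)

class DynamicallyControlledFunction
    extends CoFlatMapFunction[ControlMsg, String, String] {

  // Behavior that control messages can toggle at runtime.
  private var verboseLogging = false

  // flatMap1 is dedicated to the control stream: it updates behavior
  // and emits nothing downstream.
  override def flatMap1(ctrl: ControlMsg, out: Collector[String]): Unit =
    ctrl.kind match {
      case "logging" => verboseLogging = ctrl.body.toBoolean
      case "reset"   => () // clear accumulated state ahead of a rebuild
      case _         => () // ignore unrecognized control kinds
    }

  // flatMap2 is dedicated to the main event stream.
  override def flatMap2(event: String, out: Collector[String]): Unit = {
    if (verboseLogging) println(s"event: $event")
    out.collect(event)
  }
}
```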
Since our production jobs would overwhelm our time budget, here's the simplified version David's going to walk you through.
There are two Kafka data sources: one for the main event stream and one for the control events, which is where the queries are sent.
The Filtering Function is responsible for both storing the queries and for preliminarily matching incoming events to queries that might apply to them.
The Qualifier Function is responsible for executing the pre-matched queries against each event.
And finally the events that match a query are collected to the "Unnamed" data sink.
There is one more critical high-level architectural consideration, but we'll let David speak to that after he's walked you through the code a bit. David...
CustomerEvent: comes from the events Kafka topic and contains a customer id and a JSON payload
ControlEvent: event that comes in from the control Kafka topic - note: customerId, alertId, jsonPath, and later bootstrapCustomerId
eventStream:
Returns Option[CustomerEvent], removes undefined (None) events, gets event from Some(event), keys by the customer id
controlStream: listens to control topic and splits the stream based on global alerts vs single company alerts
globalControlStream: broadcast global alerts so all task managers will have those configs
specificControlStream: keyed stream based on customerId
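A sketch of that setup, assuming a 2017-era Flink Scala API with pipe-delimited strings standing in for the real JSON parsing (all names are illustrative, not the repo's actual code; the wiring continues in the later sketches):

```scala
import java.util.Properties
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010
import org.apache.flink.streaming.util.serialization.SimpleStringSchema

// Illustrative shapes for the two event types; the real payloads are JSON.
case class CustomerEvent(customerId: String, payload: String)
case class ControlEvent(customerId: Option[String], alertId: String, jsonPath: String)

object StreamSetup {
  val env = StreamExecutionEnvironment.getExecutionEnvironment

  val props = new Properties()
  props.setProperty("bootstrap.servers", "localhost:9092")
  props.setProperty("group.id", "demo")

  // Main event stream: parse to Option, drop None, unwrap, key by customer id.
  val eventStream = env
    .addSource(new FlinkKafkaConsumer010[String]("events", new SimpleStringSchema(), props))
    .map { raw =>                       // pipe-delimited stand-in for JSON parsing
      raw.split('|') match {
        case Array(id, payload) => Some(CustomerEvent(id, payload))
        case _                  => None
      }
    }
    .filter(_.isDefined)
    .map(_.get)
    .keyBy(_.customerId)

  // Control stream: split global alerts from customer-specific ones.
  val controlStream = env
    .addSource(new FlinkKafkaConsumer010[String]("control", new SimpleStringSchema(), props))
    .map { raw =>
      raw.split('|') match {
        case Array("", alertId, path)  => Some(ControlEvent(None, alertId, path))
        case Array(cid, alertId, path) => Some(ControlEvent(Some(cid), alertId, path))
        case _                         => None
      }
    }
    .filter(_.isDefined)
    .map(_.get)
    .split(c => if (c.customerId.isEmpty) Seq("global") else Seq("specific"))

  // Broadcast global alerts so every task manager holds every global config;
  // customer-specific alerts are keyed to reach only that customer's key group.
  val globalControlStream   = controlStream.select("global").broadcast
  val specificControlStream = controlStream.select("specific").keyBy(_.customerId.get)
}
```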
filteredStream: unions the specific and global control streams and connects the event stream
FilteredFunction takes in the connected control and event streams
flatMap1: takes the control stream and tracks the list of configurations - no output
flatMap2: takes the event stream, filters for any global or customer-specific configs based on the event's customerId, outputs a FilteredEvent (the event and any matching configs)
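A hedged sketch of what FilteredFunction might look like under those notes, using the case classes from the earlier sketch (again illustrative, not the repo's code):

```scala
import scala.collection.mutable
import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction
import org.apache.flink.util.Collector

// An event paired with the configs that might apply to it.
case class FilteredEvent(event: CustomerEvent, configs: Seq[ControlEvent])

class FilteredFunction
    extends CoFlatMapFunction[ControlEvent, CustomerEvent, FilteredEvent] {

  // The demo can keep configs in a plain in-memory buffer because global
  // configs are broadcast to every parallel instance; production code
  // would favor Flink managed state.
  private val configs = mutable.ListBuffer[ControlEvent]()

  // flatMap1: track the list of configurations -- no output.
  override def flatMap1(ctrl: ControlEvent, out: Collector[FilteredEvent]): Unit = {
    configs --= configs.filter(_.alertId == ctrl.alertId) // replace on re-deploy
    configs += ctrl
  }

  // flatMap2: attach every global or customer-specific config that could
  // match this event's customerId; emit nothing if no config applies.
  override def flatMap2(event: CustomerEvent, out: Collector[FilteredEvent]): Unit = {
    val candidates = configs.filter(c =>
      c.customerId.isEmpty || c.customerId.contains(event.customerId))
    if (candidates.nonEmpty) out.collect(FilteredEvent(event, candidates.toList))
  }
}
```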
Qualifier function takes in an event and all potentially matching controls
Evaluates the event against each matching control's JSON path
JsonPath.parse should be moved out of the foreach for production - just easier to read this way
The counter counts the number of times a message matching that customer and alertId flows through
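A sketch of the qualifier under those notes, parsing once per event as the production note suggests (names are illustrative, and the match semantics here -- non-empty, non-null result -- are our assumption):

```scala
import scala.util.Try
import com.jayway.jsonpath.JsonPath
import org.apache.flink.api.common.functions.FlatMapFunction
import org.apache.flink.util.Collector

// A match result: which alert fired for which customer.
case class QualifiedEvent(customerId: String, alertId: String)

class QualifierFunction extends FlatMapFunction[FilteredEvent, QualifiedEvent] {
  override def flatMap(fe: FilteredEvent, out: Collector[QualifiedEvent]): Unit = {
    // Parse the payload once per event, outside the per-config loop
    // (the demo parses inside the loop for readability).
    val doc = JsonPath.parse(fe.event.payload)
    fe.configs.foreach { config =>
      // Treat a config as matched when its JSONPath evaluates to a
      // non-empty, non-null result.
      val matched = Try(doc.read[AnyRef](config.jsonPath)).toOption.exists {
        case list: java.util.List[_] => !list.isEmpty
        case other                   => other != null
      }
      if (matched) out.collect(QualifiedEvent(fe.event.customerId, config.alertId))
    }
  }
}
```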
Connect the stream -> filter -> qualifier -> counter
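Continuing the sketch, that wiring from connect through the counter might look like this; CounterFunction here is our stand-in (a keyed, stateful map), not the repo's implementation:

```scala
import org.apache.flink.api.common.functions.RichMapFunction
import org.apache.flink.api.common.state.{ValueState, ValueStateDescriptor}
import org.apache.flink.configuration.Configuration

// Stand-in counter: keyed state tracking how many times each
// customer + alertId combination has matched.
class CounterFunction
    extends RichMapFunction[QualifiedEvent, (QualifiedEvent, Long)] {

  @transient private var count: ValueState[Long] = _

  override def open(parameters: Configuration): Unit =
    count = getRuntimeContext.getState(
      new ValueStateDescriptor[Long]("match-count", classOf[Long]))

  override def map(q: QualifiedEvent): (QualifiedEvent, Long) = {
    val next = count.value() + 1
    count.update(next)
    (q, next)
  }
}

// connect the stream -> filter -> qualifier -> counter
val filteredStream = globalControlStream
  .union(specificControlStream)   // all alert configs
  .connect(eventStream)           // control on flatMap1, events on flatMap2
  .flatMap(new FilteredFunction)

val counts = filteredStream
  .flatMap(new QualifierFunction)
  .keyBy(q => (q.customerId, q.alertId))
  .map(new CounterFunction)
```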
Shows the DAG for the system from earlier with support for bootstrap requests
The updated counter function outputs a bootstrap request the first time this customer-alert combination is hit
Only the flatMap is shown
Takes in a bootstrap request control message
As an example, it reads "historical" events from a text file - this would normally be an API, a database, or a replay from a Kafka topic
Outputs each event that matches the customerId, paired with the configuration to qualify
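A sketch of that bootstrap flatMap, with a text file standing in for the historical store and the same pipe-delimited parsing as the earlier sketches (names are illustrative):

```scala
import scala.io.Source
import org.apache.flink.api.common.functions.FlatMapFunction
import org.apache.flink.util.Collector

// On a bootstrap request, read "historical" events from a text file
// (a stand-in for an API, a database, or a Kafka replay) and emit the
// matching ones paired with the requesting config so they can be
// qualified like live events.
class BootstrapFunction(historyFile: String)
    extends FlatMapFunction[ControlEvent, FilteredEvent] {

  override def flatMap(request: ControlEvent, out: Collector[FilteredEvent]): Unit = {
    val source = Source.fromFile(historyFile)
    try {
      source.getLines()
        .map(_.split('|'))              // same pipe-delimited stand-in
        .collect { case Array(id, payload) => CustomerEvent(id, payload) }
        .filter(e => request.customerId.contains(e.customerId))
        .foreach(e => out.collect(FilteredEvent(e, Seq(request))))
    } finally source.close()
  }
}
```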
bootstrapStream: the stream of control event bootstrap requests
bootstrapSink: sink out of the counter to the bootstrap-request Kafka topic
Union of the bootstrap and FilteredFunction streams
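Closing the loop might then look like the following sketch, where bootstrapRequests stands in for a DataStream[ControlEvent] consumed from the bootstrap-request topic (built the same way as controlStream above):

```scala
// Replay "historical" events for the requesting customer and rejoin them
// with the live filtered stream, so both paths share one QualifierFunction.
val bootstrapped = bootstrapRequests.flatMap(new BootstrapFunction("history.txt"))

val qualified = filteredStream
  .union(bootstrapped)
  .flatMap(new QualifierFunction)
  .keyBy(q => (q.customerId, q.alertId))
  .map(new CounterFunction)

qualified.print() // demo sink; the counter also emits bootstrap requests to Kafka
env.execute("dynamic-rules-demo")
```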
Ingest takes in a configurable list of topics and saves their messages keyed by customer and provider
Replay accepts a query on a Kafka topic and replays the requested data to the requested topic