Fabric
Real-time stream processing framework
Shashank Gautam
Sathish Kumar KS
What is Fabric?
Fabric is a scalable, practical and reliable real-time stream processing
framework designed for easy operability and extension.
Fabric is proven to work very well for:
● High velocity multi-destination event ingestion with guaranteed
persistence.
● Rules/Filter based real-time triggers for advertising/broadcast
● Online Fraud detection
● Real-time pattern matching
● Streaming analytics
The Problem
● Primary motivation
○ Streaming millions of messages per second
○ Connectivity to different sources - Kafka, MySQL, etc.
○ Write to different targets - DB, Queue, API or publish to other
high level applications
○ Near real time
● Desirable properties from the framework
○ High Throughput - support of batching of events
○ Data sanity - avoiding datasets that make no sense
○ Make data available for other applications to consume
○ Scalability and Data Reliability
○ Provide easy development and deployment
○ Resource effectiveness
Fabric
Core Components
Fabric Compute and Executor
Fabric Compute Framework
● Computation pipeline setup
● Batch event processing
● Event passing among components
● Acknowledgements
Fabric Compute and Executor continued...
Fabric-executor
Responsible for :
● Launching, monitoring and managing deployed computations
● 1:1 mapping - each computation instance runs in its own fabric-executor process
● The Fabric executor is a single JVM process within a Docker container
Fabric Terminologies
● Compute Framework
○ Realtime event processing framework
○ Core event orchestration
○ Perform user-defined operations
● EventSet
○ Collection (of configurable size) of events
○ Basic transmission unit within the computation
● Computation/Topology
○ Pipeline for data flow using fabric components created by user
○ Components can be of two types, Source and Processor
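To make the EventSet idea concrete, here is a minimal plain-Java sketch of grouping a stream of events into batches of a configurable size. It is an illustration only: `EventSetBatcher` and `toEventSets` are hypothetical names, not part of the Fabric API, and events are represented as plain strings.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Fabric: groups events into fixed-size batches. */
final class EventSetBatcher {
    private EventSetBatcher() { }

    /** Splits events into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> toEventSets(List<T> events, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> eventSets = new ArrayList<>();
        for (int start = 0; start < events.size(); start += batchSize) {
            int end = Math.min(start + batchSize, events.size());
            // Copy the slice so each batch is independent of the input list.
            eventSets.add(new ArrayList<>(events.subList(start, end)));
        }
        return eventSets;
    }

    public static void main(String[] args) {
        List<String> events = List.of("e1", "e2", "e3", "e4", "e5");
        System.out.println(toEventSets(events, 2)); // three batches of sizes 2, 2 and 1
    }
}
```

Passing whole batches between components, rather than single events, is what lets a pipeline amortise per-message overhead.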
Fabric Terminologies continued...
● Source
○ Sources event sets into the computation
○ Manages the QoS of the events ingested into the computation
● Processor
○ Performs computation on an incoming event set
○ Emits an outgoing event set
○ Types:
■ Streaming Processor: triggered whenever an event set is sent to
the processor.
■ Scheduled Processor: triggered periodically, each time a fixed
interval elapses.
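The difference between the two trigger models can be sketched in plain Java. This is not the Fabric API: `TriggerModels`, `UppercaseStreaming`, `CountingScheduled`, `onEventSet` and `onTick` are hypothetical names, and the assumption that a scheduled processor buffers arrivals between ticks is ours, not the slides'. The timer is replaced by an explicit `onTick()` call so the behaviour is deterministic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch, not the Fabric API: contrasts the two trigger models. */
final class TriggerModels {
    private TriggerModels() { }

    /** Streaming style: work happens as soon as an event set arrives. */
    static final class UppercaseStreaming {
        List<String> onEventSet(List<String> eventSet) {
            List<String> out = new ArrayList<>();
            for (String event : eventSet) {
                out.add(event.toUpperCase(Locale.ROOT));
            }
            return out;
        }
    }

    /** Scheduled style: arrivals are only buffered; work happens on each timer tick. */
    static final class CountingScheduled {
        private int buffered;

        void onEventSet(List<String> eventSet) {
            buffered += eventSet.size();
        }

        /** Called once per period; returns the count for that period and resets it. */
        int onTick() {
            int count = buffered;
            buffered = 0;
            return count;
        }
    }

    public static void main(String[] args) {
        UppercaseStreaming streaming = new UppercaseStreaming();
        System.out.println(streaming.onEventSet(List.of("a", "b")));

        CountingScheduled scheduled = new CountingScheduled();
        scheduled.onEventSet(List.of("a", "b"));
        scheduled.onEventSet(List.of("c"));
        System.out.println(scheduled.onTick()); // 3 events arrived in this period
        System.out.println(scheduled.onTick()); // 0, nothing arrived since the last tick
    }
}
```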
Management And Deployment
Fabric Manager
● A Dropwizard web service that runs inside a Docker container
● Provides APIs to register components - sources and processors
● Provides APIs to perform CRUD on computations
● Management APIs to deploy, scale, get, delete computations
● Application resource exposes APIs for deployment related operations of computations.
● Deployment Env: Marathon and Mesos
Sample Resources
● Components, e.g. POST /v1/components
○ Other APIs - get, search, register, etc.
● Computation, e.g. POST /v1/computations/{tenant}
○ Other APIs - get, search, update, deactivate, etc.
● Application, e.g. POST /v1/applications/{tenant}/{computation_name}
○ Other APIs - get, scale, suspend, etc.
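As an illustration of calling one of these resources, the sketch below builds (but does not send) a POST to /v1/computations/{tenant} with the JDK's `java.net.http` API. Only the path comes from the slides. The host, the tenant name, the JSON content type and the request body are placeholders and assumptions, and `ManagerRequests` is a hypothetical helper.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical client-side sketch; host, tenant and payload are placeholders. */
final class ManagerRequests {
    private ManagerRequests() { }

    /** Builds (but does not send) a POST to /v1/computations/{tenant}. */
    static HttpRequest createComputation(String baseUrl, String tenant, String specJson) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/v1/computations/" + tenant))
                // Assumption: the manager accepts a JSON body.
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(specJson))
                .build();
    }

    public static void main(String[] args) {
        // "http://localhost:8080" and "my-tenant" are made-up values for the example.
        HttpRequest request = createComputation("http://localhost:8080", "my-tenant", "{}");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending the request would be a matter of passing it to `HttpClient.send` against a running Fabric Manager.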
Fabric
Sample
Fabric Components
Create Components using maven archetype
Maven archetype command -
mvn archetype:generate -DarchetypeGroupId=com.olacabs.fabric -DarchetypeArtifactId=fabric-processor-archetype -DarchetypeVersion=1.0.0-SNAPSHOT -DartifactId=<artifact_id_of_your_project> -DgroupId=<group_id_of_your_project> -DinteractiveMode=true
Example -
mvn archetype:generate -DarchetypeGroupId=com.olacabs.fabric -DarchetypeArtifactId=fabric-processor-archetype -DarchetypeVersion=1.0.0-SNAPSHOT -DartifactId=fabric-my-processor -DgroupId=com.olacabs.fabric -DinteractiveMode=true
What it does -
Creates the POM project for the processor, with up-to-date versions of the compute and other related
JARs.
Creates boilerplate code, with examples, for scheduled and streaming processors. You can modify the
example Java file as needed.
Sample Fabric Source
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Properties;
import java.util.Random;
import java.util.stream.Collectors;
// Fabric framework imports omitted.

/**
 * A sample Source implementation that generates
 * random sentences.
 */
@Source(namespace = "global", name = "random-sentence-source", version = "0.1",
        description = "Sample source", cpu = 0.1, memory = 64, requiredProperties = {},
        optionalProperties = {"randomGeneratorSeed"})
public class RandomSentenceSource implements PipelineSource {
    Random random;
    String[] sentences = {
        "A quick brown fox jumped over the lazy dog",
        "Life is what happens to you when you are busy making other plans"
        // . . .
        // . . .
    };

    @Override
    public void initialize(final String instanceName, final Properties global, final Properties local,
            final ProcessingContext processingContext, final ComponentMetadata componentMetadata) throws Exception {
        // Seed the generator from the optional property (the last argument, 42, appears to be the default).
        int seed = ComponentPropertyReader.readInteger(local, global, "randomGeneratorSeed", instanceName, componentMetadata, 42);
        random = new Random(seed);
    }
Sample Fabric Source continued...
    @Override
    public RawEventBundle getNewEvents() {
        return RawEventBundle.builder().events(
                getSentences(5).stream().map(sentence -> Event.builder()
                        .id(random.nextInt())
                        .data(sentence.toLowerCase())
                        .build())
                .collect(Collectors.toCollection(ArrayList::new)))
                .meta(Collections.emptyMap())
                .partitionId(Integer.MAX_VALUE)
                .transactionId(Integer.MAX_VALUE)
                .build();
    }

    private List<String> getSentences(int n) {
        List<String> listOfSentences = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            listOfSentences.add(sentences[random.nextInt(sentences.length)]);
        }
        return listOfSentences;
    }
}
Sample Fabric Processor
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Properties;
// Fabric framework imports omitted.

/**
 * A sample Processor implementation that
 * takes the data (sentences) and splits it on a delimiter.
 */
@Processor(namespace = "global", name = "splitter-processor", version = "0.1", cpu = 0.1, memory = 32,
        description = "A processor that splits sentences by a given delimiter",
        processorType = ProcessorType.EVENT_DRIVEN,
        requiredProperties = {}, optionalProperties = {"delimiter"})
public class SplitterProcessor extends StreamingProcessor {
    private String delimiter;

    @Override
    public void initialize(final String instanceName, final Properties global, final Properties local,
            final ComponentMetadata componentMetadata) throws InitializationException {
        delimiter = ComponentPropertyReader.readString(local, global, "delimiter", instanceName, componentMetadata, ",");
    }
Sample Fabric Processor continued...
    @Override
    protected EventSet consume(final ProcessingContext processingContext, final EventSet eventSet)
            throws ProcessingException {
        List<Event> events = new ArrayList<>();
        eventSet.getEvents().forEach(event -> {
            String sentence = (String) event.getData();
            // Note: String.split treats the delimiter as a regular expression.
            String[] words = sentence.split(delimiter);
            events.add(Event.builder()
                    .data(words)
                    .id(Integer.MAX_VALUE)
                    .properties(Collections.emptyMap())
                    .build());
        });
        return EventSet.eventFromEventBuilder()
                .partitionId(eventSet.getPartitionId())
                .events(events)
                .build();
    }

    @Override
    public void destroy() {
        // do some cleanup if necessary
    }
}
Sample Computation / Topology
A sample topology -
● Selects a random sentence from an in-memory list
● Splits the sentence on a delimiter
● Counts the words
● Prints the counts to the console
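Independently of Fabric, the split and count stages of this topology amount to the following plain-Java logic. This is a sketch under stated assumptions: `WordCountSketch` is a hypothetical class, the delimiter is treated as a literal string via `Pattern.quote` rather than as a regular expression, and the sentence is made up for the example.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Pattern;

/** Plain-Java sketch of the split and count stages; not Fabric code. */
final class WordCountSketch {
    private WordCountSketch() { }

    /** Splits a sentence on a literal delimiter (quoted so regex metacharacters are safe). */
    static List<String> split(String sentence, String delimiter) {
        return Arrays.asList(sentence.split(Pattern.quote(delimiter)));
    }

    /** Counts occurrences of each word, preserving first-seen order. */
    static Map<String, Integer> count(List<String> words) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String sentence = "The quick fox jumped over the lazy dog".toLowerCase(Locale.ROOT);
        Map<String, Integer> counts = count(split(sentence, " "));
        System.out.println(counts); // "the" maps to 2, every other word to 1
    }
}
```

In a real topology each stage would live in its own processor, with event sets carrying the intermediate results between them.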
Sample Computation / Topology Spec continued...
{
  "name": "word-count-print-topology",
  "sources": [
    {
      "id": "random-sentence-source",
      "meta": {
        // … meta for source
      },
      "properties": {
        // … properties for source
      }
    }
  ],
  "processors": [
    {
      "id": "splitter-processor",
      "meta": {
        // … meta for processor
      },
      "properties": {
        // … properties for processor
      }
    },
    {
      "id": "word-count-processor",
      "meta": {
        // … meta for processor
      },
      "properties": {
        // … properties for processor
      }
    },
    {
      "id": "console-printer-processor",
      "meta": {
        // … meta for processor
      },
      "properties": {
        // … properties for processor
      }
    }
  ],
  "connections": [
    {
      "fromType": "SOURCE",
      "from": "random-sentence-source",
      "to": "splitter-processor"
    },
    {
      "fromType": "PROCESSOR",
      "from": "splitter-processor",
      "to": "word-count-processor"
    },
    {
      "fromType": "PROCESSOR",
      "from": "word-count-processor",
      "to": "console-printer-processor"
    }
  ],
  "properties": {
    // … global properties
  }
}
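A spec of this shape can only be wired up if every connection endpoint names a declared source or processor id. The sketch below shows one way to check that before submitting a spec. It is a hypothetical helper (`TopologyCheck`, `Connection`, `undeclaredEndpoints` are made-up names), and whether Fabric performs an equivalent check itself is not stated in the slides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical pre-submission check; not part of Fabric. */
final class TopologyCheck {
    private TopologyCheck() { }

    /** One edge of the topology, identified by the component ids at both ends. */
    static final class Connection {
        final String from;
        final String to;

        Connection(String from, String to) {
            this.from = from;
            this.to = to;
        }
    }

    /** Returns every connection endpoint that is not a declared component id. */
    static List<String> undeclaredEndpoints(Set<String> declaredIds, List<Connection> connections) {
        List<String> missing = new ArrayList<>();
        for (Connection connection : connections) {
            if (!declaredIds.contains(connection.from)) {
                missing.add(connection.from);
            }
            if (!declaredIds.contains(connection.to)) {
                missing.add(connection.to);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Set<String> declared = new HashSet<>(Arrays.asList(
                "random-sentence-source", "splitter-processor",
                "word-count-processor", "console-printer-processor"));
        List<Connection> connections = Arrays.asList(
                new Connection("random-sentence-source", "splitter-processor"),
                new Connection("splitter-processor", "word-count-processor"),
                // Deliberate typo: "print" instead of "printer".
                new Connection("word-count-processor", "console-print-processor"));
        System.out.println(undeclaredEndpoints(declared, connections));
    }
}
```

A one-word mismatch between a declared id and a connection target is easy to miss by eye, which is why a mechanical check is useful.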
Steps for Action
Fabric
Implementation at Ola
Fabric At Ola
Fabric At Ola continued...
Artifact Registration View
Fabric At Ola continued...
Topology Creation View
Fabric At Ola continued...
Created Topology View
Fabric At Ola continued...
One click deployment
Fabric At Ola continued...
Marathon App
Fabric
Numbers
Fabric At Ola Stats
Ola currently receives ~2.5 million events per second from its end users (the driver and customer
apps) as well as from internally generated events. Multiple real-time use cases stem from these
events, including:
● Fraud detection and prevention
● Just-in-time notifications
● Security alerts
● Real-time reporting
● Generating user specific offers
Fabric has been in production at Ola for 10 months and powers these applications, in addition to
acting as a raw event ingestion and pub-sub system.
Fabric At Ola Stats continued...
Key Stats -
● Event streams handled: 375+
● Number of live topologies: 160+
● Ingestion rate: ~2.5 million events per second on 10 nodes
● Node config: c4.8xlarge machines
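For a sense of scale, the quoted ingestion figures work out as follows. The per-node number assumes the load is spread evenly across the 10 nodes, and the per-day number assumes the rate is sustained around the clock. Neither assumption is stated in the slides, and `ThroughputMath` is a hypothetical helper.

```java
/** Back-of-the-envelope arithmetic on the quoted ingestion figures. */
final class ThroughputMath {
    private ThroughputMath() { }

    /** Events per second handled by each node, assuming an even spread. */
    static long perNode(long eventsPerSecond, int nodes) {
        return eventsPerSecond / nodes;
    }

    /** Events per day, assuming the rate is sustained for all 86,400 seconds. */
    static long perDay(long eventsPerSecond) {
        return eventsPerSecond * 86_400L;
    }

    public static void main(String[] args) {
        long rate = 2_500_000L;
        System.out.println(perNode(rate, 10)); // 250000
        System.out.println(perDay(rate));      // 216000000000
    }
}
```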
Fabric Summary Points
1. Developed in Java.
2. Highly scalable, with guaranteed availability
3. Reliable - Framework level guarantees against message loss, support for replay, multiple
sources and complex tuple trees
4. Event batching is supported at the core level.
5. Source level event partitioning used as unit for scalability.
6. Uses capabilities provided by Docker to ensure strong application isolation
7. On the fly topology creation and deployment by dynamically assembling topologies using
components directly from artifactory
8. Inbuilt support for custom metrics and custom code level healthchecks to catch application
failures right when they happen
9. Easy development and deployment
And many more...
Links
Fabric was recently open-sourced on GitHub.
● Github link: https://github.com/olacabs/fabric
● Documentation: https://github.com/olacabs/fabric/blob/develop/README.md
Please Contribute…!
Thank You!
Shashank Gautam
Sathish Kumar KS

Fabric - Realtime stream processing framework

  • 1. Fabric Real-time stream processing framework Shashank Gautam Sathish Kumar KS
  • 2. What is Fabric? Fabric is a scalable, practical and reliable real-time stream processing framework designed for easy operability and extension. Fabric is proven to work very well for: ● High velocity multi-destination event ingestion with guaranteed persistence. ● Rules/Filter based real-time triggers for advertising/broadcast ● Online Fraud detection ● Real-time pattern matching ● Streaming analytics
  • 3. The Problem ● Primary motivation ○ Streaming millions of messages per second ○ Connectivity to different source - Kafka, MySql etc ○ Write to different targets - DB, Queue, API or publish to other high level applications ○ Near real time ● Desirable properties from the framework ○ High Throughput - support of batching of events ○ Data sanity - Avoiding datasets which makes no sense ○ Make data available for other applications to consume ○ Scalability and Data Reliability ○ Provide easy development and deployment ○ Resource effectiveness
  • 5. Fabric Compute and Executor Fabric Compute Framework ● Computation pipeline setup ● Batch event processing ● Event passing among components ● Acknowledgements
  • 6. Fabric Compute and Executor continued...
  • 7. Fabric Compute and Executor continued... Fabric-executor Responsible for : ● Launching, monitoring and managing deployed computations ● 1:1 relation between 1 instance of computation : fabric executor process ● Fabric executor is single JVM process within a docker container
  • 8. Fabric Terminologies ● Compute Framework ○ Realtime event processing framework ○ Core event orchestration ○ Perform user-defined operations ● EventSet ○ Collection(of configurable size) of events ○ Basic transmission unit within the computation ● Computation/Topology ○ Pipeline for data flow using fabric components created by user ○ Components can be of two types, Source and Processor
  • 9. Fabric Terminologies continued... ● Source ○ Sources event sets into the computation ○ Manages the Qos of the events ingested into the computation ● Processor ○ Performs computation on an incoming event set ○ Emits an outgoing event set ○ Types: ■ Streaming Processor: Streaming Processor is triggered whenever and event set is sent to the processor. ■ Scheduled Processor: Scheduled Processor is triggered whenever a fixed period of time elapses in a periodic fashion.
  • 10. Management And Deployment Fabric Manager ● Dropwizard Web Service and runs inside a docker container ● Provides APIs to register components - sources and processors ● Provides APIs to perform CRUD on computations ● Management APIs to deploy, scale, get, delete computations ● Application resource exposes APIs for deployment related operations of computations. ● Deployment Env: Marathon and Mesos Sample Resources ● Components. eg: POST /v1/components ○ Other APIs - get, search, register etc ● Computation. eg: POST /v1/computations/{tenant} ○ Other APIs - get, search, update, deactivate etc ● Application. eg: POST /v1/applications/{tenant}/{computation_name} ○ Other APIs - get, scale, suspend etc
  • 13. Create Components using maven archetype Maven archetype command - mvn archetype:generate -DarchetypeGroupId=com.olacabs.fabric -DarchetypeArtifactId=fabric- processor-archetype -DarchetypeVersion=1.0.0-SNAPSHOT -DartifactId=<artifact_id_of_your_project> - DgroupId=<group_id_of_your_project> -DinteractiveMode=ture Example - mvn archetype:generate -DarchetypeGroupId=com.olacabs.fabric -DarchetypeArtifactId=fabric- processor-archetype -DarchetypeVersion=1.0.0-SNAPSHOT -DartifactId=fabric-my-processor - DgroupId=com.olacabs.fabric -DinteractiveMode=ture What it does - Creates the pom project for the processor with all the updated version of compute and other related jars. Creates boilerplate code, with example, for scheduled and stream processor. You can modify the example java file as per your need.
  • 14. Sample Fabric Source /** * A Sample Source Implementation which generates * Random sentences. */ @Source(namespace = "global", name = "random-sentence-source", version = "0.1", description = "Sample source", cpu = 0.1, memory = 64,requiredProperties = {}, optionalProperties = {"randomGeneratorSeed"}) public class RandomSentenceSource implements PipelineSource { Random random; String[] sentences = { "A quick brown fox jumped over the lazy dog", "Life is what happens to you when you are busy making other plans" . . . . . . }; @Override public void initialize(final String instanceName,final Properties global,final Properties local, final ProcessingContext processingContext, final ComponentMetadata componentMetadata) throws Exception { int seed = ComponentPropertyReader.readInteger(local, global, "randomGeneratorSeed", instanceName, componentMetadata, 42); random = new Random(seed); }
  • 15. Sample Fabric Source continued... @Override public RawEventBundle getNewEvents() { return RawEventBundle.builder().events( getSentences(5).stream().map(sentence -> Event.builder() .id(random.nextInt()) .data(sentence.toLowerCase()) .build()) .collect(Collectors.toCollection(ArrayList::new))) .meta(Collections.emptyMap()) .partitionId(Integer.MAX_VALUE) .transactionId(Integer.MAX_VALUE) .build(); } private List<String> getSentences(int n) { List<String> listOfSentences = new ArrayList<>(); for (int i = 0; i < n; i++) { listOfSentences.add(sentences[random.nextInt(sentences.length)]); } return listOfSentences; } }
  • 16. Sample Fabric Processor /** * A sample Processor implementation which * Gets the data (sentences) and splits based on delim. */ @Processor(namespace = "global", name = "splitter-processor", version = "0.1", cpu = 0.1, memory = 32, description = "A processor that splits sentences by a given delimiter", processorType = ProcessorType.EVENT_DRIVEN, requiredProperties = {}, optionalProperties = {"delimiter"}) public class SplitterProcessor extends StreamingProcessor { private String delimiter; @Override public void initialize(final String instanceName, final Properties global, final Properties local, final ComponentMetadata componentMetadata) throws InitializationException { delimiter = ComponentPropertyReader.readString(local, global, "delimiter", instanceName, componentMetadata, ","); }
  • 17. Sample Fabric Processor continued...
          @Override
          protected EventSet consume(final ProcessingContext processingContext,
                                     final EventSet eventSet) throws ProcessingException {
              List<Event> events = new ArrayList<>();
              eventSet.getEvents().stream()
                      .forEach(event -> {
                          String sentence = (String) event.getData();
                          String[] words = sentence.split(delimiter);
                          events.add(Event.builder()
                                  .data(words)
                                  .id(Integer.MAX_VALUE)
                                  .properties(Collections.emptyMap())
                                  .build());
                      });
              return EventSet.eventFromEventBuilder()
                      .partitionId(eventSet.getPartitionId())
                      .events(events)
                      .build();
          }

          @Override
          public void destroy() {
              // do some cleanup if necessary
          }
      }
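      One detail of the `consume` method is worth a note: Java's `String.split` treats its argument as a regular expression, so a configured delimiter such as `|` or `.` would not be matched literally. The following is a minimal sketch in plain Java (no Fabric types) of a literal split; the class and method names `LiteralSplitSketch` and `splitLiteral` are hypothetical and not part of Fabric.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Fabric: splits on the delimiter as literal text.
class LiteralSplitSketch {

    // Pattern.quote escapes regex metacharacters so "|" or "." match literally.
    static String[] splitLiteral(String sentence, String delimiter) {
        return sentence.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        String[] words = splitLiteral("a|b|c", "|");
        System.out.println(String.join(",", words)); // prints "a,b,c"
    }
}
```

      Whether a processor should accept regular expressions or literal delimiters is a design choice; quoting is the safer default when the delimiter comes from user-supplied properties.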
  • 18. Sample Computation / Topology
      A sample topology that:
      ● Selects a random sentence from an in-memory list
      ● Splits the sentence on a delimiter
      ● Counts the words
      ● Prints the counts on the console
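      The four steps of this topology can be sketched in plain Java, without any Fabric types, to show the data flow that the framework wires together. This is only an illustration under stated assumptions: the class `WordCountPipelineSketch` and its methods are hypothetical, and in Fabric each step would be a separate source or processor component connected through the topology spec.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;
import java.util.regex.Pattern;

// Hypothetical stand-in for the topology: source -> splitter -> counter -> printer.
class WordCountPipelineSketch {

    static final String[] SENTENCES = {
        "A quick brown fox jumped over the lazy dog",
        "Life is what happens to you when you are busy making other plans"
    };

    // Source step: pick a random sentence from the in-memory list.
    static String randomSentence(Random random) {
        return SENTENCES[random.nextInt(SENTENCES.length)];
    }

    // Splitter step: lower-case the sentence and split on a literal delimiter.
    static List<String> split(String sentence, String delimiter) {
        return Arrays.asList(sentence.toLowerCase().split(Pattern.quote(delimiter)));
    }

    // Counter step: add each word to the running counts.
    static Map<String, Integer> count(List<String> words, Map<String, Integer> counts) {
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Random random = new Random(42);
        Map<String, Integer> counts = new TreeMap<>();
        for (int i = 0; i < 5; i++) {
            count(split(randomSentence(random), " "), counts);
        }
        // Printer step: write the counts to the console.
        counts.forEach((word, n) -> System.out.println(word + " = " + n));
    }
}
```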
  • 19. Sample Computation / Topology Spec continued...
      {
        "name": "word-count-print-topology",
        "sources": [
          {
            "id": "random-sentence-source",
            "meta": {
              // … meta for source
            },
            "properties": {
              //.. properties for source
            }
          }
        ],
        "processors": [
          {
            "id": "splitter-processor",
            "meta": {
              // … meta for processor
            },
            "properties": {
              //.. properties for processor
            }
          },
          {
            "id": "word-count-processor",
            "meta": {
              // … meta for processor
            },
            "properties": {
              //.. properties for processor
            }
          },
          {
            "id": "console-printer-processor",
            "meta": {
              // … meta for processor
            },
            "properties": {
              //.. properties for processor
            }
          }
        ],
        "connections": [
          { "fromType": "SOURCE", "from": "random-sentence-source", "to": "splitter-processor" },
          { "fromType": "PROCESSOR", "from": "splitter-processor", "to": "word-count-processor" },
          { "fromType": "PROCESSOR", "from": "word-count-processor", "to": "console-printer-processor" }
        ],
        "properties": {
          // … global properties
        }
      }
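      A spec like this is easy to get subtly wrong, for example by mistyping a component id in a connection. The sketch below shows one way such a check could look before deployment. It is not Fabric's own validation: the class `ConnectionCheckSketch`, the `Connection` type and the method `danglingEndpoints` are hypothetical, and it assumes (based only on the sample spec) that every `to` endpoint names a processor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical pre-deployment check, not part of Fabric.
class ConnectionCheckSketch {

    // Mirrors one entry of the "connections" array in the spec.
    static final class Connection {
        final String fromType;
        final String from;
        final String to;

        Connection(String fromType, String from, String to) {
            this.fromType = fromType;
            this.from = from;
            this.to = to;
        }
    }

    // Returns one message per endpoint that does not match a declared component id.
    static List<String> danglingEndpoints(Set<String> sourceIds, Set<String> processorIds,
                                          List<Connection> connections) {
        List<String> problems = new ArrayList<>();
        for (Connection c : connections) {
            Set<String> fromIds = "SOURCE".equals(c.fromType) ? sourceIds : processorIds;
            if (!fromIds.contains(c.from)) {
                problems.add("unknown 'from' id: " + c.from);
            }
            // Assumption: connections always end at a processor.
            if (!processorIds.contains(c.to)) {
                problems.add("unknown 'to' id: " + c.to);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        Set<String> sources = Set.of("random-sentence-source");
        Set<String> processors = Set.of("splitter-processor", "word-count-processor",
                "console-printer-processor");
        List<Connection> connections = List.of(
                new Connection("SOURCE", "random-sentence-source", "splitter-processor"),
                new Connection("PROCESSOR", "splitter-processor", "word-count-processor"),
                new Connection("PROCESSOR", "word-count-processor", "console-print-processor"));
        // The last connection has a mistyped target id, so one problem is reported.
        danglingEndpoints(sources, processors, connections).forEach(System.out::println);
    }
}
```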
  • 23. Fabric At Ola continued... Artifact Registration View
  • 24. Fabric At Ola continued... Topology Creation View
  • 25. Fabric At Ola continued... Created Topology View
  • 26. Fabric At Ola continued... One click deployment
  • 27. Fabric At Ola continued... Marathon App
  • 29. Fabric At Ola Stats
      Ola currently receives ~2.5 million events per second from its end users (driver and customer apps) as well as from internally generated events. Multiple real-time use cases stem from these events, including:
      ● Fraud detection and prevention
      ● Just-in-time notifications
      ● Security alerts
      ● Real-time reporting
      ● Generating user-specific offers
      Fabric has been in production at Ola for 10 months and powers these applications, in addition to acting as a raw event ingestion and pub-sub system.
  • 30. Fabric At Ola Stats continued...
      Key stats:
      ● Event streams handled: 375+
      ● Number of live topologies: 160+
      ● Ingestion rate: ~2.5 million events per second on 10 nodes
      ● Node configuration: c4.8xlarge machines
  • 31. Fabric Summary Points
      1. Developed in Java.
      2. Highly scalable, with guaranteed availability.
      3. Reliable: framework-level guarantees against message loss, support for replay, multiple sources and complex tuple trees.
      4. Event batching is supported at the core level.
      5. Source-level event partitioning is used as the unit of scalability.
      6. Uses capabilities provided by Docker to ensure strong application
      7. On-the-fly topology creation and deployment by dynamically assembling topologies from components pulled directly from Artifactory.
      8. Built-in support for custom metrics and custom code-level health checks to catch application failures right when they happen.
      9. Easy development and deployment.
      And many more...
  • 32. Links
      Fabric was recently open-sourced on GitHub.
      ● GitHub link: https://github.com/olacabs/fabric
      ● Documentation: https://github.com/olacabs/fabric/blob/develop/README.md
      Please contribute!