Concord: Simple & Flexible
Stream Processing on Apache Mesos
Shinji Kim
Co-founder, Concord Systems
@concord
@databythebay #datagrid
Overview
•  What is Stream Processing?
•  Today’s Stream Processing
•  Introducing Concord
1. Concepts & API
2. Job Topology Management
3. Operations, Tooling, Performance
4. Message Delivery Guarantees
•  Future Development Plans
What is stream processing?
•  Processing Data in motion
•  Sits between message queues and databases
•  Used for faster:
–  Data enrichment
–  Aggregation
–  Filtering / deduplication
Today’s Stream Processing
•  Faster MapReduce jobs → end up running core business logic on top
–  Fraudulent click detection
–  Real-time budget updates
–  Trigger-based trading
•  Your stream processing jobs are more like microservices
•  Need support for services / application management:
Cluster mgmt, Monitoring, Debuggability
Introducing Concord
Concord is a distributed stream processing framework
built in C++ on top of Apache Mesos, designed for
high-performance, real-time applications that require
flexibility & control.
[Diagram: Data Sources and Data Sinks]
Pub / Sub Operator Model
•  Composable jobs by Metadata
[Diagram: A → words → B]
Metadata(
    Name='A',
    istreams=[],
    ostreams=['words'])
Metadata(
    Name='B',
    istreams=['words', StreamGrouping.GROUP_BY],
    ostreams=[])
[Diagram: C also subscribes to words]
Metadata(
    Name='C',
    istreams=['words', StreamGrouping.SHUFFLE],
    ostreams=[])
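The Metadata declarations above wire operators together purely by stream name. As a rough sketch of what that implies for routing, the plain-Python model below does not use the Concord client library. It assumes that GROUP_BY sends all records with the same key to the same operator instance and that SHUFFLE distributes records without regard to key; the slides do not define either grouping, and round-robin is used for SHUFFLE only to keep the example deterministic. The `topology` table, `subscribers`, `group_by_route` and `make_shuffle_router` are hypothetical names.

```python
import zlib
from itertools import count

# Hypothetical topology table mirroring the slide's Metadata declarations.
topology = {
    "A": {"istreams": [], "ostreams": ["words"]},
    "B": {"istreams": [("words", "GROUP_BY")], "ostreams": []},
    "C": {"istreams": [("words", "SHUFFLE")], "ostreams": []},
}


def subscribers(stream):
    """Return sorted (operator, grouping) pairs that consume `stream`."""
    return sorted(
        (name, grouping)
        for name, meta in topology.items()
        for istream, grouping in meta["istreams"]
        if istream == stream
    )


def group_by_route(key, n_instances):
    """Assumed GROUP_BY behavior: a stable hash keeps each key on one instance."""
    return zlib.crc32(key.encode("utf-8")) % n_instances


def make_shuffle_router(n_instances):
    """Assumed SHUFFLE behavior: key-independent distribution (round-robin here)."""
    counter = count()

    def route(_key):
        return next(counter) % n_instances

    return route


print(subscribers("words"))
```

Adding operator C required no change to A or B: a new consumer only has to name the stream it wants in its own metadata.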
Simple API in Multiple Languages
•  ProcessRecord, ProduceRecord, ProcessTimer
•  GetState, SetState backed by RocksDB
•  API available in Python, Ruby, Go, Java/Scala, C++
Metadata(
    Name='B',
    istreams=['words', StreamGrouping.GROUP_BY],
    ostreams=['wordcount'])
[Diagram: words → B → wordcount]
Key          Value
Corgi        2
Chihuahua    4
Dachshund    5
Useful for multiple teams to consume the same
streaming data in real-time
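To make the word count example concrete, here is a minimal sketch of an operator that keeps a running count per key and publishes it downstream. It is not the Concord client API: `InMemoryContext` is a hypothetical stand-in whose dict replaces the RocksDB-backed state store, and the method names `process_record`, `get_state`, `set_state` and `produce_record` are only modeled on the call names listed on the slide.

```python
class InMemoryContext:
    """Hypothetical stand-in for the framework context; a dict replaces RocksDB."""

    def __init__(self):
        self._state = {}
        self.produced = []

    def get_state(self, key):
        return self._state.get(key)

    def set_state(self, key, value):
        self._state[key] = value

    def produce_record(self, stream, key, value):
        self.produced.append((stream, key, value))


class WordCounter:
    """Counts each key seen on 'words' and emits the running total on 'wordcount'."""

    def process_record(self, ctx, key):
        total = (ctx.get_state(key) or 0) + 1
        ctx.set_state(key, total)
        ctx.produce_record("wordcount", key, total)


ctx = InMemoryContext()
operator = WordCounter()
for word in ["Corgi", "Pug", "Corgi"]:
    operator.process_record(ctx, word)

print(ctx.produced)
```

Because the counts live in keyed state rather than in the operator object, this pattern only gives correct totals if every record for a given key reaches the same instance, which is what a key-based grouping is for.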
Native Integration with Apache Mesos
•  Dynamic resource scheduling
•  Task Isolation
•  Task supervision
•  High Availability
Containerized Execution Environment
•  Horizontal scaling
•  Multi-tenancy
•  Hot code deployment & dynamic topology
[Diagram: Mesos Agent, RocksDB]
Concord is Flexible: Run-time deployment
Concord supports Distributed Tracing
Monitor all operator instances at a glance
Concord supports Transparent Debugging
[2015-11-02 15:36:44.770] [dispatcher_latencies] [info] 127.0.0.1:31000:
traceId: -8816532120874703981,
parentId: 0, id: -6816766813334129096,
p50: 388179us, p95: 519668us, p99: 524812us, p999: 526425us
[2015-11-02 15:37:13.929] [principal_latencies] [info] 127.0.0.1:31001:
traceId: -4811311467074699790,
parentId: -7681059555040553620,
id: -1899872683843643522,
p50: 73355us, p95: 145626us, p99: 210345us, p999: 272018us
[2015-11-02 15:36:43.323] [incoming_throughput] [info] 12288 req in 1045515us. total: 367616 req
[2015-11-02 15:36:30.240] [outgoing_throughput] [info] 100000 req in 4804526us. total: 600000 req
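The throughput lines above report a request count and an elapsed time in microseconds, so a rate can be derived directly. The sketch below assumes the line layout is exactly as shown in the two sample lines; `requests_per_second` is a hypothetical helper, not part of Concord's tooling.

```python
import re

# Matches e.g. "[incoming_throughput] [info] 12288 req in 1045515us"
THROUGHPUT_RE = re.compile(r"\[(\w+_throughput)\] \[info\] (\d+) req in (\d+)us")


def requests_per_second(line):
    """Return (metric_name, requests per second) for one throughput log line."""
    match = THROUGHPUT_RE.search(line)
    if match is None:
        raise ValueError("not a throughput line")
    name = match.group(1)
    requests = int(match.group(2))
    micros = int(match.group(3))
    return name, requests / (micros / 1_000_000)


incoming = ("[2015-11-02 15:36:43.323] [incoming_throughput] [info] "
            "12288 req in 1045515us. total: 367616 req")
outgoing = ("[2015-11-02 15:36:30.240] [outgoing_throughput] [info] "
            "100000 req in 4804526us. total: 600000 req")

for line in (incoming, outgoing):
    name, rate = requests_per_second(line)
    print(f"{name}: {rate:.0f} req/s")
```

For the two sample lines this works out to roughly 11.8K req/s incoming and 20.8K req/s outgoing over their respective measurement windows.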
Concord performs well at scale
•  Word count benchmark (1.13B msgs)
–  Concord: 500K QPS/node at 10ms/event
–  Storm: 16K QPS/node at 100ms/event
–  Spark Streaming: 100K QPS/node at 1s batch window
•  Server log processing (29G server log, ~260M msgs)
–  4 nodes, 8 vCPU, 32GB RAM each
–  Concord: 1M – 1.8M QPS
–  Spark Streaming: 72K – 2M QPS
•  Consistent performance
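As a quick piece of arithmetic on the word count figures above, the sketch below computes the per-node throughput ratios and, as a purely hypothetical illustration, how long a single node would need to drain all 1.13B messages at the quoted rate. The slide does not state how many nodes the benchmark used, and the three systems were measured at different latencies or batch windows, so these numbers are not a like-for-like comparison.

```python
# Figures quoted on the slide for the word count benchmark.
TOTAL_MESSAGES = 1.13e9
QPS_PER_NODE = {
    "Concord": 500_000,
    "Storm": 16_000,
    "Spark Streaming": 100_000,
}


def throughput_ratio(baseline, system="Concord"):
    """Per-node throughput of `system` relative to `baseline`."""
    return QPS_PER_NODE[system] / QPS_PER_NODE[baseline]


def single_node_seconds(system):
    """Hypothetical: seconds for one node to process every message at the quoted rate."""
    return TOTAL_MESSAGES / QPS_PER_NODE[system]


for name in QPS_PER_NODE:
    print(name, f"{single_node_seconds(name):.0f} s on one node")
print("Concord vs Storm:", throughput_ratio("Storm"))
print("Concord vs Spark Streaming:", throughput_ratio("Spark Streaming"))
```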
Concord is designed for Predictability
•  As you scale, JVM reconfiguration and GC pauses are
inevitable (Framework GC vs. Application GC)
•  Cluster abstracted as CPU, Memory, Disk numbers → cluster optimization & overall runtime
•  Fast Compile → Test → Deploy cycle without downtime
Message Delivery Guarantees
Today: Fast > Complete or Perfect
•  Best-effort / at-most-once processing
–  When operator or node crashes, the local cache goes away
–  Automatically retries the failed operator (number of retries is
configurable)
–  Recommends implementing check mechanisms in operators
(e.g., Concord Kafka consumer)
Message Delivery Guarantees
Soon: Fast + Complete > Perfect
•  In development for at-least-once with Kafka
–  Kafka acts as a message bus between operators
–  Kafka replays data from checked offset (data duplication)
Eventually: Fast + Complete + Perfect
•  Transactional datastore in design phase
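The note that replaying from a checked offset causes data duplication can be illustrated with a small simulation. This is a sketch of the general at-least-once pattern, not Concord's or Kafka's implementation: a Python list stands in for a Kafka partition, and `run_with_replay`, `checkpoint_every` and `crash_at` are hypothetical names.

```python
def run_with_replay(log, checkpoint_every, crash_at=None):
    """Deliver records from `log`, optionally crashing before offset `crash_at`,
    then resume from the last checkpointed offset.

    Returns every record handed to the operator, in order, duplicates included.
    """
    delivered = []
    committed = 0  # next offset to read after a restart
    crashed = False
    for offset, record in enumerate(log):
        if offset == crash_at:
            crashed = True  # simulated operator or node crash
            break
        delivered.append(record)
        if (offset + 1) % checkpoint_every == 0:
            committed = offset + 1
    if crashed:
        # Replay starts at the last checkpoint, not at the crash point, so
        # records processed after that checkpoint are delivered a second time.
        delivered.extend(log[committed:])
    return delivered


log = ["a", "b", "c", "d", "e"]
delivered = run_with_replay(log, checkpoint_every=2, crash_at=3)
print(delivered)
```

Here the checkpoint was taken after "b", the crash hit before "d", and so "c" is delivered twice while nothing is lost. An operator running under this guarantee therefore needs to be idempotent or to deduplicate by record identity.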
Future plans
•  “At least once” guarantee support with Kafka
•  DC/OS integration
•  More data source / data sink connector support
•  Higher level DSL
Concord: Simple & Flexible streaming application
framework on Apache Mesos
•  Operator model usable from multiple languages
→ Fast development and iteration for multiple teams using the same data
•  Dynamic topology, run-time deployment and scaling
→ Decoupled development & DevOps work
•  High performance at scale
→ Predictable system for real-time applications
•  Low-latency / Real-time applications:
–  Real-time fraud detection
–  Financial market data processing for real-time risks and triggers
–  Real-time campaign management for real-time bidding (RTB)
Thank You!
Get Started: http://concord.io
shinji@concord.io / @shinjikim
@concord
@databythebay #datagrid

Concord: Simple & Flexible Stream Processing on Apache Mesos: Data By The Bay May 2016

  • 1. Concord: Simple & Flexible Stream Processing on Apache Mesos Shinji Kim Co-founder, Concord Systems @concord @databythebay #datagrid
  • 2. Overview
      • What is Stream Processing?
      • Today’s Stream Processing
      • Introducing Concord
          1. Concepts & API
          2. Job Topology Management
          3. Operations, Tooling, Performance
          4. Message Delivery Guarantees
      • Future Development Plans
  • 3. What is stream processing?
      • Processing data in motion
      • Sits between message queues and databases
      • Used for faster:
          – Data enrichment
          – Aggregation
          – Filtering / deduplication
  • 4. Today’s Stream Processing
      • Faster MapReduce jobs → end up running core business logic on top
          – Fraudulent click detection
          – Real-time budget updates
          – Trigger-based trading
      • Your stream processing jobs are more like microservices
      • Need support for service / application management: cluster management, monitoring, debuggability
  • 5. Introducing Concord
      Concord is a distributed stream processing framework built in C++ on top of Apache Mesos, designed for high-performance, real-time applications that require flexibility & control.
  • 6. Introducing Concord
      (Diagram: Data Sources, Data Sinks)
  • 7. Pub / Sub Operator Model
      • Composable jobs by Metadata
      (Diagram: A → words → B)
        Metadata(
            Name='A',
            istreams=[],
            ostreams=['words'])
        Metadata(
            Name='B',
            istreams=['words', StreamGrouping.GROUP_BY],
            ostreams=[])
  • 8. Pub / Sub Operator Model
      • Composable jobs by Metadata
      (Diagram: A → words → B and C)
        Metadata(
            Name='A',
            istreams=[],
            ostreams=['words'])
        Metadata(
            Name='B',
            istreams=['words', StreamGrouping.GROUP_BY],
            ostreams=[])
        Metadata(
            Name='C',
            istreams=['words', StreamGrouping.SHUFFLE],
            ostreams=[])
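The two stream groupings on slide 8 can be illustrated with a small, self-contained sketch. It does not use the Concord client library: the `Metadata` dataclass, the `Router` class, and the `subscribers` helper are hypothetical stand-ins, and the routing rules are assumptions (GROUP_BY sends equal keys to the same instance via a hash, SHUFFLE spreads records round-robin). The slides do not spell out either behavior.

```python
import itertools
import zlib
from dataclasses import dataclass, field
from enum import Enum


class StreamGrouping(Enum):
    GROUP_BY = "group_by"  # assumption: equal keys always reach the same instance
    SHUFFLE = "shuffle"    # assumption: records are spread across instances


@dataclass
class Metadata:
    """Hypothetical stand-in for the Metadata object shown on the slides."""
    name: str
    istreams: list = field(default_factory=list)  # [(stream name, grouping), ...]
    ostreams: list = field(default_factory=list)  # [stream name, ...]


def subscribers(stream, operators):
    """Return (operator name, grouping) for every operator listening on `stream`."""
    return [(op.name, grouping)
            for op in operators
            for (name, grouping) in op.istreams
            if name == stream]


class Router:
    """Chooses which instance of a subscriber receives a record."""

    def __init__(self, instances):
        self.instances = instances
        self._round_robin = itertools.cycle(range(instances))

    def pick(self, grouping, key):
        if grouping is StreamGrouping.GROUP_BY:
            # A stable hash keeps a given key on one instance.
            return zlib.crc32(key.encode("utf-8")) % self.instances
        return next(self._round_robin)


operators = [
    Metadata("A", ostreams=["words"]),
    Metadata("B", istreams=[("words", StreamGrouping.GROUP_BY)]),
    Metadata("C", istreams=[("words", StreamGrouping.SHUFFLE)]),
]
words_subscribers = subscribers("words", operators)
```

Because operators are wired only by stream name, adding C required no change to A or B, which is the composability the slide points at.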
  • 9. Simple API in Multiple Languages
      • ProcessRecord, ProduceRecord, ProcessTimer
      • GetState, SetState backed by RocksDB
      • API available in Python, Ruby, Go, Java/Scala, C++
      (Diagram: words → B → wordcount)
        Metadata(
            Name='C',
            istreams=['words', StreamGrouping.GROUP_BY],
            ostreams=['wordcount'])
      State table:
        Key          Value
        Corgi        2
        Chihuahua    4
        Dachshund    5
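A word-count operator in the spirit of slide 9 might look like the sketch below. It is not the Concord client API: `InMemoryContext` and `WordCounter` are hypothetical names, the method names only mirror the callbacks listed on the slide, and a plain dict stands in for the RocksDB-backed state store.

```python
class InMemoryContext:
    """Hypothetical context object; a dict replaces the RocksDB-backed store."""

    def __init__(self):
        self.state = {}
        self.produced = []  # (stream, key, value) tuples emitted downstream

    def get_state(self, key):
        return self.state.get(key)

    def set_state(self, key, value):
        self.state[key] = value

    def produce_record(self, stream, key, value):
        self.produced.append((stream, key, value))


class WordCounter:
    """Counts each word seen on 'words' and emits the running total on 'wordcount'."""

    def process_record(self, ctx, key, value):
        count = (ctx.get_state(key) or 0) + 1
        ctx.set_state(key, count)
        ctx.produce_record("wordcount", key, count)


ctx = InMemoryContext()
counter = WordCounter()
for word in ["corgi", "pug", "corgi"]:
    counter.process_record(ctx, word, "")
```

With GROUP_BY on the input stream, every occurrence of a word would reach the same instance, so a per-instance key/value store is enough to hold a correct count.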
  • 10. Useful for multiple teams to consume the same streaming data in real-time
  • 11. Native Integration with Apache Mesos
      • Dynamic resource scheduling
      • Task isolation
      • Task supervision
      • High availability
  • 12. Containerized Execution Environment
      • Horizontal scaling
      • Multi-tenancy
      • Hot code deployment & dynamic topology
      (Diagram: Mesos Agent, RocksDB)
  • 13-16. Concord is Flexible: Run-time deployment
  • 17. Concord supports Distributed Tracing
  • 18. Monitor all operator instances at a glance
  • 19. Concord supports Transparent Debugging
        [2015-11-02 15:36:44.770] [dispatcher_latencies] [info] 127.0.0.1:31000: traceId: -8816532120874703981, parentId: 0, id: -6816766813334129096, p50: 388179us, p95: 519668us, p99: 524812us, p999: 526425us
        [2015-11-02 15:37:13.929] [principal_latencies] [info] 127.0.0.1:31001: traceId: -4811311467074699790, parentId: -7681059555040553620, id: -1899872683843643522, p50: 73355us, p95: 145626us, p99: 210345us, p999: 272018us
        [2015-11-02 15:36:43.323] [incoming_throughput] [info] 12288 req in 1045515us. total: 367616 req
        [2015-11-02 15:36:30.240] [outgoing_throughput] [info] 100000 req in 4804526us. total: 600000 req
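Log lines like the latency entries on slide 19 are regular enough to parse for dashboards or alerts. The sketch below is one way to do it with Python's `re` module. The pattern is inferred only from the two latency samples on the slide, so the field layout is an assumption, and `parse_latency` is a hypothetical helper rather than part of any Concord tooling.

```python
import re

# Pattern inferred from the sample latency lines; not an official log format.
LATENCY_LINE = re.compile(
    r"\[(?P<timestamp>[^\]]+)\] \[(?P<logger>\w+)\] \[(?P<level>\w+)\] "
    r"(?P<address>\S+): traceId: (?P<trace_id>-?\d+), "
    r"parentId: (?P<parent_id>-?\d+), id: (?P<span_id>-?\d+), "
    r"p50: (?P<p50>\d+)us, p95: (?P<p95>\d+)us, "
    r"p99: (?P<p99>\d+)us, p999: (?P<p999>\d+)us"
)

INT_FIELDS = ("trace_id", "parent_id", "span_id", "p50", "p95", "p99", "p999")


def parse_latency(line):
    """Return a dict of fields for a latency line, or None if it does not match."""
    match = LATENCY_LINE.match(line)
    if match is None:
        return None
    record = match.groupdict()
    for name in INT_FIELDS:
        record[name] = int(record[name])  # latencies are in microseconds
    return record


sample = (
    "[2015-11-02 15:36:44.770] [dispatcher_latencies] [info] 127.0.0.1:31000: "
    "traceId: -8816532120874703981, parentId: 0, id: -6816766813334129096, "
    "p50: 388179us, p95: 519668us, p99: 524812us, p999: 526425us"
)
record = parse_latency(sample)
```

The `parentId` of 0 in the sample suggests a root span, so grouping parsed records by `trace_id` and following `parent_id` links would rebuild one request's path across operators.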
  • 20. Concord performs well at scale
      • Word count benchmark (1.13B msgs)
          – Concord: 500K QPS/node at 10ms/event
          – Storm: 16K QPS/node at 100ms/event
          – Spark Streaming: 100K QPS/node at 1s batch window
      • Server log processing (29G server log, ~260M msgs)
          – 4 nodes, 8 vCPU, 32GB RAM each
          – Concord: 1M – 1.8M QPS
          – Spark Streaming: 72K – 2M QPS
      • Consistent performance
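To put the per-node word count rates in perspective, the short calculation below converts them into the time one node would need to work through the 1.13B-message data set. It assumes the quoted rate is sustained for the whole run and ignores startup and I/O limits. The slide does not say how many nodes the benchmark used, so `seconds_to_drain` and its `nodes` parameter are illustrative only.

```python
WORD_COUNT_MESSAGES = 1_130_000_000

# Per-node rates as quoted on the slide (messages per second).
QPS_PER_NODE = {
    "Concord": 500_000,
    "Storm": 16_000,
    "Spark Streaming": 100_000,
}


def seconds_to_drain(messages, qps_per_node, nodes=1):
    """Time to process `messages` at a sustained per-node rate (assumed constant)."""
    return messages / (qps_per_node * nodes)


single_node_seconds = {
    name: seconds_to_drain(WORD_COUNT_MESSAGES, qps)
    for name, qps in QPS_PER_NODE.items()
}
# Concord: 2260 s (about 38 minutes)
# Storm: 70625 s (about 19.6 hours)
# Spark Streaming: 11300 s (about 3.1 hours)
```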
  • 21. Concord is designed for Predictability
      • As you scale, JVM reconfiguration and GC pauses are inevitable (Framework GC vs. Application GC)
      • Cluster abstracted as CPU, Memory, Disk numbers → cluster optimization & overall runtime
      • Fast Compile → Test → Deploy cycle without downtime
  • 22. Message Delivery Guarantees
      Today: Fast > Complete or Perfect
      • Best-effort / at-most-once processing
          – When an operator or node crashes, the local cache goes away
          – Automatically retries the failed operator (number of retries is configurable)
          – Recommends implementing check mechanisms in operators (e.g., Concord Kafka consumer)
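The at-most-once behavior described on slide 22 can be simulated in a few lines. The sketch is a toy model, not Concord's scheduler: `Operator`, `run_at_most_once`, and the `"poison"` record are hypothetical, and "retrying the failed operator" is interpreted as starting a fresh instance with an empty local cache.

```python
class Operator:
    """Counts records in a local cache that does not survive a restart."""

    def __init__(self):
        self.cache = {}

    def process(self, record):
        if record == "poison":  # hypothetical input that crashes the operator
            raise RuntimeError("operator crashed")
        self.cache[record] = self.cache.get(record, 0) + 1


def run_at_most_once(records, max_retries):
    """Deliver each record once; restart the operator on failure, never replay."""
    operator = Operator()
    restarts = 0
    dropped = []
    for record in records:
        try:
            operator.process(record)
        except RuntimeError:
            dropped.append(record)   # the in-flight record is lost
            if restarts == max_retries:
                raise                # retry budget exhausted
            restarts += 1
            operator = Operator()    # fresh instance, empty cache
    return operator.cache, dropped, restarts


cache, dropped, restarts = run_at_most_once(
    ["a", "a", "poison", "a", "b"], max_retries=3
)
```

After the restart the two earlier counts for `"a"` are gone, which is why the slide recommends building check mechanisms into operators that need stronger guarantees.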
  • 23. Message Delivery Guarantees
      Soon: Fast + Complete > Perfect
      • In development: at-least-once with Kafka
          – Kafka acts as a message bus between operators
          – Kafka replays data from the checked offset (data duplication)
      Eventually: Fast + Complete + Perfect
      • Transactional datastore in design phase
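Why replaying from a checked offset produces duplicates can be seen in a small simulation. It uses a Python list in place of a Kafka partition and makes no Kafka client calls. The `consume` function, the checkpoint interval, and the crash point are all hypothetical choices for illustration.

```python
def consume(log, start_offset, checkpoint_every, crash_at=None):
    """Process log entries from start_offset onward.

    Returns (processed offsets, last checkpoint, crashed flag). The checkpoint
    is the next offset to read after a restart.
    """
    processed = []
    checkpoint = start_offset
    for offset in range(start_offset, len(log)):
        if offset == crash_at:
            return processed, checkpoint, True  # crash before handling this offset
        processed.append(offset)
        if (offset + 1) % checkpoint_every == 0:
            checkpoint = offset + 1
    return processed, checkpoint, False


log = ["a", "b", "c", "d", "e", "f", "g"]

# First run: checkpoints after offset 2, processes 3 and 4, then crashes at 5.
first, checkpoint, crashed = consume(log, 0, checkpoint_every=3, crash_at=5)

# Restart: replay from the last checkpoint, so offsets 3 and 4 run again.
second, _, _ = consume(log, checkpoint, checkpoint_every=3)

delivered = first + second
duplicates = sorted(set(first) & set(second))
```

Every offset is processed at least once, and offsets 3 and 4 are processed twice, so downstream operators would need idempotent writes or deduplication until the transactional datastore mentioned on the slide exists.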
  • 24. Future plans
      • “At least once” guarantee support with Kafka
      • DC/OS integration
      • More data source / data sink connector support
      • Higher level DSL
  • 25-30. Concord: Simple & Flexible streaming application framework on Apache Mesos
      • Operator model that you can use from multiple languages
        → Fast development and iteration time for multiple teams using the same data
      • Dynamic topology, run-time deployment and scaling
        → Decoupled development & DevOps work
      • High performance at scale
        → Predictable system for real-time applications
  • 31. Concord: Simple & Flexible streaming application framework on Apache Mesos
      • Low-latency / real-time applications:
          – Real-time fraud detection
          – Financial market data processing for real-time risks and triggers
          – Real-time campaign management for real-time bidding (RTB)
  • 32. Thank You! Get Started: http://concord.io shinji@concord.io / @shinjikim @concord @databythebay #datagrid