Apache Pulsar as a Dual Streaming / Batch Processor
Joe Olson
Senior Manager, Big Data Analytics
Apache Road Show Chicago - May 2019
Agenda
United and the Airline Industry
How Publish – Subscribe Compute Model Presents Opportunity
Apache Pulsar & Apache BookKeeper
Use Case: FAA’s Real Time SWIM Feed
About United Airlines
 1,348 aircraft (779 mainline, 569 regional) with 250+ on order (supply chain)
 158M passengers in 2018 (public-facing web site, mobile app, time / geospatial based inventory, loyalty program, surveys, ancillary sales)
 4,900 daily departures (scheduling, operations, weather, route planning)
 355 airports served, in 48 countries (baggage claim, check-ins)
 88,000 employees worldwide (scheduling, pay)
 Constantly in motion! Future (and past) always changing.
 A data scientist's / data engineer's dream.
Source: https://hub.united.com/corporate-fact-sheet/
Business Goals
 Improve Customer Experience
- How can we reduce friction when booking a reservation? Maneuvering through an airport?
- How can we deliver a consistent message across all channels (mobile app, web site, social media, etc.)?
 Improve Employee Experience
- How can we keep employees better informed of the current situation so they can relay it to the customers?
- What are we learning from our surveys about what the customer base says is / isn’t working?
 Revenue Generation
- What personalized offers can we make to our customers?
- Are our offers competitive with the rest of the industry?
 Improve Operational Reliability
- How can we better prepare for weather or other operational interruptions?
- How can we manage the fleet better and ensure spare parts are where they need to be?
Industry Ideas – Customer Experience
Apache Pulsar – Key Points
 “Apache Pulsar is an open-source distributed pub-sub messaging system originally created at Yahoo and now part of the Apache Software Foundation”
- Designed for low publish latency (< 5 ms) at scale with strong durability guarantees.
- Persistent message storage based on Apache BookKeeper.
- Tiered storage provides an opportunity for batch and stream processing on the same platform.
- Built from the ground up as a multi-tenant system: isolation, quotas, etc.
- Geo-replication designed in – across data centers or geographic regions.
- Pulsar has run in production at Yahoo scale for over 3 years, with millions of messages per second across millions of topics. Can scale to hundreds of nodes.
- Easily deploy lightweight compute logic without a separate stream processing engine.
- REST Admin API for provisioning, administration, tools, and monitoring. Deploy on bare metal or Kubernetes.
Apache Pulsar – Multi Tenancy
 Pulsar was designed from the ground up to be a multi-tenant system. In Pulsar, tenants are the highest administrative unit within a Pulsar instance.
 Capacity is allocated per tenant.
 A namespace is the administrative unit within a tenant. The configuration policies set on a namespace apply to all the topics created in that namespace.
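The tenant / namespace / topic hierarchy shows up directly in Pulsar topic names, which take the form persistent://tenant/namespace/topic. A minimal Python sketch of splitting such a name follows. The helper parse_topic and the tenant, namespace, and topic names in the usage line are hypothetical and only for illustration; they are not part of the Pulsar client.

```python
from typing import NamedTuple


class TopicName(NamedTuple):
    domain: str      # "persistent" or "non-persistent"
    tenant: str      # highest administrative unit
    namespace: str   # administrative unit within the tenant
    topic: str       # topic inside the namespace


def parse_topic(name: str) -> TopicName:
    """Split a fully qualified topic name into its four parts (illustrative helper)."""
    domain, sep, rest = name.partition("://")
    if not sep or domain not in ("persistent", "non-persistent"):
        raise ValueError(f"unexpected topic domain in {name!r}")
    parts = rest.split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected tenant/namespace/topic in {name!r}")
    return TopicName(domain, *parts)


# Hypothetical names: tenant "airline", namespace "operations", topic "swim-tbfm".
parsed = parse_topic("persistent://airline/operations/swim-tbfm")
```

Because policies are set on the namespace, every topic whose name shares the same tenant/namespace prefix inherits the same configuration.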
Apache Pulsar – Subscription Models
 In exclusive mode, only a single consumer is allowed to attach to the subscription.
 In shared (round-robin) mode, multiple consumers can attach to the same subscription. Messages are delivered in a round-robin distribution across consumers, and any given message is delivered to only one consumer. Ordering is not guaranteed.
 In failover mode, multiple consumers can attach to the same subscription. The first consumer will initially be the only one receiving messages. This consumer is called the master consumer.
 When the master consumer disconnects, all (non-acked and subsequent) messages will be delivered to the next consumer in line.
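The three subscription modes can be compared with a small simulation. This is a plain-Python sketch of the delivery rules described above, not the Pulsar client API; the function names, consumer names, and message values are all hypothetical.

```python
from itertools import cycle
from typing import Dict, List


def dispatch(messages: List[str], consumers: List[str], mode: str) -> Dict[str, List[str]]:
    """Return which consumer receives which messages for one subscription."""
    if mode == "exclusive":
        # Only a single consumer may attach.
        if len(consumers) != 1:
            raise ValueError("exclusive mode allows exactly one consumer")
        return {consumers[0]: list(messages)}
    if mode == "shared":
        # Round robin: each message goes to exactly one consumer.
        received: Dict[str, List[str]] = {c: [] for c in consumers}
        for message, consumer in zip(messages, cycle(consumers)):
            received[consumer].append(message)
        return received
    if mode == "failover":
        # The first consumer (the master) receives everything while connected.
        received = {c: [] for c in consumers}
        received[consumers[0]] = list(messages)
        return received
    raise ValueError(f"unknown mode: {mode}")


def failover_handoff(messages: List[str], acked_by_master: int,
                     master: str, standby: str) -> Dict[str, List[str]]:
    """Master disconnects after acking the first N messages; the rest go to the standby."""
    return {master: messages[:acked_by_master], standby: messages[acked_by_master:]}


msgs = ["m0", "m1", "m2", "m3", "m4", "m5"]
shared = dispatch(msgs, ["a", "b", "c"], "shared")
handoff = failover_handoff(msgs, 2, "a", "b")
```

In the shared case each consumer sees only every third message, which is why ordering across the subscription is not guaranteed.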
Apache Pulsar – Reference Architecture
 One or more brokers handle and load-balance incoming messages from producers and dispatch messages to consumers:
- Topic lookup + data transfer
- Messages are dispatched out of a managed ledger cache or, if under load, from persistent storage (BookKeeper)
- Coordination with the local and global metadata stores (ZooKeeper)
 A BookKeeper cluster consisting of one or more bookies handles persistent storage of messages.
 A local ZooKeeper cluster handles coordination tasks within a cluster, and a global ZooKeeper cluster handles coordination instance-wide (geo-replication).
Apache BookKeeper - Key Points
 Apache BookKeeper is a scalable, fault-tolerant, low-latency log storage service that delivers durability and consistency guarantees and provides access to both historic and real-time data.
- The atomic unit is an entry.
- A ledger is a bounded set of entries; a stream is an unbounded set of ledgers.
- Individual servers storing ledgers are called bookies.
- Entries are written to ledgers sequentially and at most once (append-only).
- Each bookie handles fragments of ledgers as part of an ensemble (striping).
[Figure: a stream of ledgers, each made up of entries]
Apache BookKeeper – Reference Architecture
 Two APIs:
- Ledger API – allows direct interaction with ledgers, giving you the most flexibility in working with bookies.
- Log stream API – allows you to interact with streams without dealing with lower-level ledgers.
 Bookies advertise themselves to the ZooKeeper metadata cluster.
Apache BookKeeper – Storage Requirements
 Clients should be able to write and read streams of entries with very low latency (under 5 milliseconds), even when providing strong durability.
 Data storage should be durable, consistent, and fault tolerant.
 The system should enable clients to stream or tail ledgers to propagate data as they’re written.
 The system should be able to store and provide access to both historic and real-time data.
Apache BookKeeper – Durability
 Example: bookies 1-5 are the ensemble for the ledger.
 Entries are striped across the bookies.
 The write quorum in this case is 3 (every entry is written to 3 bookies).
 A write is considered successful when the ack quorum (in this case 2 bookies) has acknowledged the write (fsync).
 Wide variety of options for writing to bookies in the case of system degradation.
 Maximize bandwidth by scaling out bookies.
 Improve latency by tuning the ack quorum.
 Replication supports durability.
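The example above (ensemble of 5, write quorum of 3, ack quorum of 2) can be worked through in a few lines of Python. This is a sketch that assumes a simple round-robin placement, where entry i starts at bookie i mod 5; the function names and bookie names are hypothetical, and real placement may differ.

```python
from typing import List


def write_set(entry_id: int, ensemble: List[str], write_quorum: int) -> List[str]:
    """Bookies that receive a copy of the entry, assuming round-robin striping."""
    size = len(ensemble)
    return [ensemble[(entry_id + offset) % size] for offset in range(write_quorum)]


def write_succeeded(acks_received: int, ack_quorum: int) -> bool:
    """A write is successful once the ack quorum has acknowledged it."""
    return acks_received >= ack_quorum


ensemble = ["bookie1", "bookie2", "bookie3", "bookie4", "bookie5"]
placement = {entry_id: write_set(entry_id, ensemble, write_quorum=3)
             for entry_id in range(5)}
```

With these settings, entry 0 lands on bookies 1-3 and entry 3 wraps around to bookies 4, 5, and 1, so consecutive entries spread their load over the whole ensemble. Lowering the ack quorum lets a write complete without waiting for the slowest of the three replicas.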
Apache BookKeeper – Consistency & Availability
 Consistency for log reads:
- An entry successfully written is immediately readable.
- An entry read once is always readable.
- All entries written previously are also readable.
- The order of records is identical across all readers.
- Consistency is accomplished via LastAddConfirmed (LAC), a variation on two-phase commit.
 Availability:
- A write can be performed as long as there are enough bookies to satisfy the ack quorum.
- A read can be served by any bookie in the cluster that holds a copy of the entry.
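The role of LastAddConfirmed can be illustrated with a toy in-memory ledger. This is a simplified sketch of the idea, not BookKeeper's implementation: the Ledger class and its methods are hypothetical, and a single counter stands in for acknowledgements from separate bookies. The LAC only advances over a contiguous prefix of confirmed entries, and readers never read past it.

```python
from typing import List


class Ledger:
    """Toy ledger that tracks LastAddConfirmed (LAC)."""

    def __init__(self, ack_quorum: int) -> None:
        self.ack_quorum = ack_quorum
        self.entries: List[str] = []
        self.acks: List[int] = []
        self.lac = -1  # no entry confirmed yet

    def add(self, payload: str) -> int:
        """Append an entry and return its id; it is not readable until confirmed."""
        self.entries.append(payload)
        self.acks.append(0)
        return len(self.entries) - 1

    def ack(self, entry_id: int) -> None:
        """Record one bookie acknowledgement and advance the LAC if possible."""
        self.acks[entry_id] += 1
        # The LAC moves only while the next entry in sequence has met the ack quorum.
        while (self.lac + 1 < len(self.entries)
               and self.acks[self.lac + 1] >= self.ack_quorum):
            self.lac += 1

    def read(self, entry_id: int) -> str:
        """Readers may only see entries at or below the LAC."""
        if not 0 <= entry_id <= self.lac:
            raise LookupError(f"entry {entry_id} is not confirmed yet")
        return self.entries[entry_id]


ledger = Ledger(ack_quorum=2)
first = ledger.add("a")
second = ledger.add("b")
ledger.ack(second)
ledger.ack(second)
lac_before = ledger.lac   # entry 1 is acked, but entry 0 is not, so nothing is readable
ledger.ack(first)
ledger.ack(first)
lac_after = ledger.lac    # both entries confirmed in order
```

Because the LAC never skips an unconfirmed entry, every reader sees the same prefix in the same order, which is what the read guarantees above rely on.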
Apache BookKeeper – I/O Isolation
 Three separate I/O paths implemented:
- Write (low latency)
- Tailing read (low latency)
- Catch up read (high latency)
Apache BookKeeper – Data Distribution
 Storage capacity for a single log stream is constrained by the capacity of the cluster, never by a single host.
 No stream rebalancing when capacity is added. New bookies are discovered and become available for writing.
 Replica repair when a failure is detected is efficient because it can read concurrently from multiple hosts.
 All of this is due to segmenting the streams.
Apache Pulsar – Tiered Storage
 Infinite stream – most recent data stored on the broker, rest stored in bookies, as capacity of cluster allows
- Write
- Tailing read
- Catch-up read
Apache Pulsar – Tiered Storage
 Infinite stream
- Offloader: moves segments off the Pulsar cluster and onto commodity storage.
- Can be triggered on time, size, or demand.
 Access
- The broker knows how to read the data back, or you can bypass the bookies and read segments directly.
Apache Pulsar – Bringing It All Together
[Diagram labels: Producer, Subscriber, Segment Reader, Unbounded stream, Bounded stream]
Apache Pulsar – Bringing It All Together
[Diagram labels: Producer, Subscriber, Segment Reader, Unbounded stream, Batch Processing, Stream Processing]
Use Case – Improve Operational Reliability
 SWIM (System Wide Information Management)
- Real-time FAA message feed describing the current and future state of the nation’s managed airspace: traffic, weather, airport operations, etc.
- Publishers (such as airlines) push their operational information to an endpoint.
- Allows subscribers (such as airlines) to consume the feed through a common published message interface.
 Airline needs:
- Connect the information in this feed with their existing operational systems.
• Maintain current state on assets.
- Real-time and historical analytics on this feed – traditional and predictive (ML / AI).
SWIM Overview
[Figure: FAA topics by phase of operation]
Sample SWIM Enroute TBFM Messages
• Sample TBFM messages. This specific flight generated 800 such messages.
{"carrier": "UAL",
"flight number": 376,
"origin": "EWR",
"destination": "LAX",
"flight date": "2019-Mar-19",
"Flight Plan": [{
"event_source": "TMA.ZOB.FAA.GOV",
"event_time": "2019-03-29T16:23:22.659Z",
"event_id": "422",
"tma_id": "C00926",
"Aircraft Id": "UAL376",
"Origin Airport": "EWR",
"Destination Airport": "LAX",
"Flight Plan": "ACTIVE",
"Aircraft Status": "TRACKED",
"Aircraft Type": "B752/L",
"Engine Type": "JET",
"Beacon Code": "2334",
"Flight Plan Speed": "483.0",
"Assigned Requested Altitude": "28000",
"Track Datasource": "ZNY",
"Coordination Fix": "KEWR",
"Coordination Time": "2019-03-29T16:14:00Z",
"Estimated Departure Clearance Status": "FAA",
"Flight Plan Field 10A": "KEWR..COATE.Q436.RAAKK.Q438.RUBYY..MKG..BAE.J36.DUTYS..KG78K..JORDY..OBH..GLL..DBL..CHESZ.Q88.HAKMN.ANJLL4.KLAX/2148",
"TMA Converted Route": "KEWR/0000 COATE/0000 LAAYK/0000 YYOST/0000 DGRAF/0000 KG78K/0000 JORDY/0000 OBH/0000 GLL/0000 DBL/0000 KLAX/0000"}],
"Station Time of Arrival": [{
"event_source": "TMA.ZLA.FAA.GOV",
"event_time": "2019-03-29T20:38:28.148Z",
"event_id": "4664550",
"tma_id": "L03502",
"Meter Fix Name": "CRCUS",
"ETA Outer Meter Arc": "2019-03-29T21:42:45Z",
"ETA Meter Fix": "2019-03-29T21:46:35Z",
"ETA at Display Point": "2019-03-29T21:42:55Z",
"ETA at Scheduling Fix": "2019-03-29T21:42:55Z",
"ETA at Runway": "2019-03-29T21:57:23Z"}]}
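With hundreds of messages per flight, "maintain current state on assets" comes down to keeping only the newest message per key. A minimal Python sketch follows, using field names from the sample above. Keying on tma_id is an assumption made for illustration, the helper names are hypothetical, and the second message is invented test data, not part of the feed sample.

```python
from datetime import datetime
from typing import Dict, Iterable


def parse_time(timestamp: str) -> datetime:
    """Parse the feed's ISO timestamps; the trailing Z is rewritten as a UTC offset."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def latest_runway_eta(messages: Iterable[dict]) -> Dict[str, str]:
    """Keep the most recent 'ETA at Runway' per tma_id, by event_time."""
    newest: Dict[str, dict] = {}
    for message in messages:
        key = message["tma_id"]  # assumption: tma_id identifies the tracked flight
        current = newest.get(key)
        if current is None or parse_time(message["event_time"]) > parse_time(current["event_time"]):
            newest[key] = message
    return {key: message["ETA at Runway"] for key, message in newest.items()}


from_sample = {"tma_id": "L03502",
               "event_time": "2019-03-29T20:38:28.148Z",
               "ETA at Runway": "2019-03-29T21:57:23Z"}
# Hypothetical later update for the same flight (invented values).
later_update = {"tma_id": "L03502",
                "event_time": "2019-03-29T20:40:00.000Z",
                "ETA at Runway": "2019-03-29T21:58:10Z"}

# Messages arrive out of order here; the newer event_time still wins.
state = latest_runway_eta([later_update, from_sample])
```

Comparing on event_time rather than arrival order keeps the state correct when messages are replayed or delivered out of order.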
Architecture - Current State: Point to Point
FAA Systems: Scheduling, Flight Plans, Weather, Airport Operations, Airspace Operations
Airline Systems: Scheduling, Flight Plans, Weather, Airport Operations, Airspace Operations
Architecture - Target State: Pub / Sub
FAA Systems: Scheduling, Flight Plans, Weather, Airport Operations, Airspace Operations
Airline Systems: Scheduling, Flight Plans, Weather, Airport Operations, Airspace Operations
Connected through: Topics (Producer and Subscriber on each side)
Architecture - Target State Considerations
Airline Systems: Scheduling, Flight Plans, Weather, Airport Operations, Airspace Operations
Producer, Subscriber
Connectors: File, JDBC, API
 Connectivity to the operational systems is mostly through file, JDBC, and API interfaces.
 Most of these are not designed for streaming interfaces (yet).
 How do we connect a topic to systems that are not designed to work with streams?
Architecture - Target State Considerations
Airline Systems: Scheduling, Flight Plans, Weather, Airport Operations, Airspace Operations
Producer, Subscriber
Connectors: File, JDBC, API, Segment Reader API
 What if there were both batch and streaming interfaces?
 Use the batch interface until more sophisticated streaming interfaces come online.
 An API written around the segment reader can help to close the last mile.
 Treat as batch when needed, treat as stream when needed.
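The "batch when needed, stream when needed" idea can be sketched with a toy segmented topic. This is an illustration of the concept in plain Python, not Pulsar's segment reader; the Segment and Topic classes, method names, and segment size are all hypothetical. Sealed segments are exposed as bounded batches (for file or JDBC style consumers), while the same data can also be read as one continuous stream.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Segment:
    entries: List[str] = field(default_factory=list)
    sealed: bool = False  # a sealed segment is immutable and safe to read as a batch


class Topic:
    """Toy topic stored as a sequence of fixed-size segments."""

    def __init__(self, segment_size: int) -> None:
        self.segment_size = segment_size
        self.segments: List[Segment] = [Segment()]

    def publish(self, message: str) -> None:
        """Append to the open segment; seal it and roll over when it is full."""
        segment = self.segments[-1]
        segment.entries.append(message)
        if len(segment.entries) == self.segment_size:
            segment.sealed = True
            self.segments.append(Segment())

    def read_batches(self) -> List[List[str]]:
        """Bounded view: one batch per sealed segment."""
        return [list(s.entries) for s in self.segments if s.sealed]

    def read_stream(self, start: int = 0) -> Iterator[str]:
        """Unbounded view: every message in publish order, from a given position."""
        position = 0
        for segment in self.segments:
            for entry in segment.entries:
                if position >= start:
                    yield entry
                position += 1


topic = Topic(segment_size=2)
for i in range(5):
    topic.publish(f"m{i}")

batches = topic.read_batches()           # only the two sealed segments
streamed = list(topic.read_stream())     # all five messages, including the open tail
```

A batch consumer can process each sealed segment as a unit and load it into a file or database, while a streaming consumer follows the tail; both read the same stored data.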
Apache Communities
Apache Pulsar
 Twitter: @apache_pulsar
 WeChat: ApachePulsar
 Mailing Lists
- dev@pulsar.apache.org
- user@pulsar.apache.org
 Slack
- https://apache-pulsar.slack.com
 Localization
- http://crowdin.com/project/apache-pulsar
 GitHub
- https://github.com/apache/pulsar
Apache BookKeeper
 Twitter: @asfbookkeeper
 Mailing Lists
- dev@bookkeeper.apache.org
- user@bookkeeper.apache.org
- issues@bookkeeper.apache.org
 Slack
- http://apachebookkeeper.slack.com/
 GitHub
- https://github.com/apache/bookkeeper
Thank You!
We’re hiring!
- Data Engineers
- Data Scientists

(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
 
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
 
CHEAP Call Girls in Rabindra Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Rabindra Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Rabindra Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Rabindra Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 

Apache Pulsar as a Dual Stream / Batch Processor

  • 1. Apache Pulsar as a Dual Streaming / Batch Processor. Joe Olson, Senior Manager, Big Data Analytics. Apache Road Show Chicago, May 2019
  • 2. Agenda
      - United and the Airline Industry
      - How Publish – Subscribe Compute Model Presents Opportunity
      - Apache Pulsar & Apache BookKeeper
      - Use Case: FAA’s Real Time SWIM Feed
  • 3. About United Airlines
      - 1,348 aircraft (779 mainline, 569 regional) with 250+ on order (supply chain)
      - 158M passengers in 2018 (public-facing web site, mobile app, time / geospatial based inventory, loyalty program, surveys, ancillary sales)
      - 4,900 daily departures (scheduling, operations, weather, route planning)
      - 355 airports served, in 48 countries (baggage claim, check-ins)
      - 88,000 employees worldwide (scheduling, pay)
      - Constantly in motion! Future (and past) always changing.
      - A data scientist / data engineer dream.
      Source: https://hub.united.com/corporate-fact-sheet/
  • 4. Business Goals
      - Improve Customer Experience
          - How can we reduce friction when booking a reservation? When maneuvering through an airport?
          - How can we deliver a consistent message across all channels (mobile app, web site, social media, etc.)?
      - Improve Employee Experience
          - How can we keep employees better informed of the current situation so they can relay it to the customers?
          - What are we learning from our surveys about what the customer base says is / isn’t working?
      - Revenue Generation
          - What personalized offers can we make to our customers?
          - Are our offers competitive with the rest of the industry?
      - Improve Operational Reliability
          - How can we better prepare for weather or other operational interruptions?
          - How can we manage the fleet better and ensure spare parts are where they need to be?
  • 5. Industry Ideas – Customer Experience
  • 6. Apache Pulsar – Key Points
      - "Apache Pulsar is an open-source distributed pub-sub messaging system originally created at Yahoo and now part of the Apache Software Foundation"
          - Designed for low publish latency (< 5 ms) at scale with strong durability guarantees.
          - Persistent message storage based on Apache BookKeeper.
          - Tiered storage provides opportunity for batch and stream processing in the same platform.
          - Built from the ground up as a multi-tenant system: isolation, quotas, etc.
          - Geo-replication designed in, across data centers or geographic regions.
          - Pulsar has run in production at Yahoo scale for over 3 years, with millions of messages per second across millions of topics. Can scale to hundreds of nodes.
          - Easily deploy lightweight compute logic without a separate stream processing engine.
          - REST Admin API for provisioning, administration, tools and monitoring. Deploy on bare metal or Kubernetes.
  • 7. Apache Pulsar – Multi Tenancy
      - Pulsar was designed from the ground up to be a multi-tenant system. In Pulsar, tenants are the highest administrative unit within a Pulsar instance.
      - Capacity is allocated to a tenant.
      - A namespace is the administrative unit within a tenant. The configuration policies set on a namespace apply to all the topics created in that namespace.
  • 8. Apache Pulsar – Subscription Models
      - In exclusive mode, only a single consumer is allowed to attach to the subscription.
      - In shared or round robin mode, multiple consumers can attach to the same subscription. Messages are delivered in a round robin distribution across consumers, and any given message is delivered to only one consumer. Ordering is not guaranteed.
      - In failover mode, multiple consumers can attach to the same subscription. The first consumer will initially be the only one receiving messages. This consumer is called the master consumer.
      - When the master consumer disconnects, all (non-acked and subsequent) messages will be delivered to the next consumer in line.
  • 9. Apache Pulsar – Reference Architecture
      - One or more brokers handle and load balance incoming messages from producers and dispatch messages to consumers.
          - Topic lookup + data transfer.
          - Messages are dispatched out of a managed ledger cache or, if under load, from persistent storage (BookKeeper).
          - Coordination with the local and global metadata stores (ZooKeeper).
      - A BookKeeper cluster consisting of one or more bookies handles persistent storage of messages.
      - A local ZooKeeper cluster handles coordination tasks within a cluster, and a global ZooKeeper cluster handles instance-wide coordination (geo-replication).
  • 10. Apache BookKeeper – Key Points
      - Apache BookKeeper is a scalable, fault-tolerant, low-latency log storage service that delivers durability and consistency guarantees and can provide access to both historic and real-time data.
          - The atomic unit is an entry.
          - A ledger is a bounded set of entries; a stream is an unbounded set of ledgers.
          - Individual servers storing ledgers are called bookies.
          - Entries are written to ledgers sequentially and at most once (append-only).
          - Each bookie handles fragments of ledgers as part of an ensemble (striping).
      (Diagram: a stream of ledgers, each made up of entries)
  • 11. Apache BookKeeper – Reference Architecture
      - Two APIs:
          - Ledger API: allows direct interaction with ledgers, giving you the most flexibility in working with bookies.
          - Log stream API: allows you to interact with streams without dealing with lower-level ledgers.
      - Bookies advertise themselves to the ZooKeeper metadata cluster.
  • 12. Apache BookKeeper – Storage Requirements
      - Clients should be able to write and read streams of entries with very low latency (under 5 milliseconds), even when providing strong durability.
      - Data storage should be durable, consistent, and fault tolerant.
      - The system should enable clients to stream or tail ledgers to propagate data as they’re written.
      - The system should be able to store and provide access to both historic and real-time data.
  • 13. Apache BookKeeper – Durability
      - Example: bookies 1-5 are the ensemble for the ledger.
      - Entries are striped across the bookies.
      - Write quorum in this case is 3 (all entries are written to 3 bookies).
      - A write is considered successful when the ack quorum (in this case 2) successfully acknowledges the write (fsync).
      - A wide variety of ways to write to bookies in the case of system degradation.
      - Maximize bandwidth by scaling out bookies.
      - Improve latency by tuning the ack quorum.
      - Replication supports durability.
  • 14. Apache BookKeeper – Consistency & Availability
      - Consistency for log reads:
          - An entry successfully written is immediately readable.
          - An entry read once is always readable.
          - All entries written previously are also readable.
          - The order of records is identical across all readers.
          - Consistency is accomplished via LastAddConfirmed (LAC), a spin on a two-phase commit.
      - Availability:
          - A write can be performed as long as there are enough bookies to satisfy the ack quorum.
          - A read can be performed by any bookie in the cluster.
  • 15. Apache BookKeeper – I/O Isolation
      - Three separate I/O paths implemented:
          - Write (low latency)
          - Tailing read (low latency)
          - Catch-up read (high latency)
      (Diagram labels: Write, Read, Read, Read)
  • 16. Apache BookKeeper – Data Distribution
      - Storage capacity for a single log stream is constrained by the capacity of the cluster, never by a single host.
      - No stream rebalancing when capacity is added. New bookies will be discovered and made available for writing.
      - Replica repair when a failure is detected is efficient because it can be done concurrently from multiple hosts.
      - All due to segmenting the streams.
  • 17. Apache Pulsar – Tiered Storage
      (Diagram labels: Broker, Bookies, Infinite Stream)
      - Infinite stream: most recent data stored on the broker, rest stored in bookies, as capacity of cluster allows
          - Write
          - Tailing Read
          - Catchup Read
  • 18. Apache Pulsar – Tiered Storage
      - Infinite stream
          - Offloader: move segments off the Pulsar cluster and onto commodity storage.
          - Can be triggered on time, size, or demand.
      - Access
          - Broker knows how to read data back, or bypass bookies and read segments directly.
  • 19. Apache Pulsar – Bringing It All Together
      (Diagram labels: Producer, Subscriber, Segment Reader, Unbounded stream, Bounded stream)
  • 20. Apache Pulsar – Bringing It All Together
      (Diagram labels: Producer, Subscriber, Segment Reader, Unbounded stream, Batch Processing, Stream Processing)
  • 21. Use Case – Improve Operational Reliability
      - SWIM (System Wide Information Management)
          - Real-time FAA message feed describing the current and future state of the nation’s managed airspace: traffic, weather, airport operations, etc.
          - Publishers (such as airlines) push their operational information to an endpoint.
          - Allows subscribers (such as airlines) to consume on a common published message interface.
      - Airline needs:
          - Connect the information in this feed with their existing operational systems.
              • Maintain current state on assets.
          - Real-time and historical analytics on this feed, both traditional and predictive (ML / AI).
  • 22. SWIM Overview
      (Diagram labels: Phase of operation, FAA Topic)
  • 23. Sample SWIM Enroute TBFM Messages
      Sample TBFM messages. This specific flight generated 800 such messages.
      {"carrier": "UAL", "flight number": 376, "origin": "EWR", "destination": "LAX", "flight date": "2019-Mar-19"}
      "Flight Plan": [{ "event_source": "TMA.ZOB.FAA.GOV", "event_time": "2019-03-29T16:23:22.659Z", "event_id": "422", "tma_id": "C00926", "Aircraft Id": "UAL376", "Origin Airport": "EWR", "Destination Airport": "LAX", "Flight Plan": "ACTIVE", "Aircraft Status": "TRACKED", "Aircraft Type": "B752/L", "Engine Type": "JET", "Beacon Code": "2334", "Flight Plan Speed": "483.0", "Assigned Requested Altitude": "28000", "Track Datasource": "ZNY", "Coordination Fix": "KEWR", "Coordination Time": "2019-03-29T16:14:00Z", "Estimated Departure Clearance Status": "FAA", "Flight Plan Field 10A": "KEWR..COATE.Q436.RAAKK.Q438.RUBYY..MKG..BAE.J36.DUTYS..KG78K..JORDY..OBH..GLL..DBL..CHESZ.Q88.HAKMN.ANJLL4.KLAX/2148", "TMA Converted Route": "KEWR/0000 COATE/0000 LAAYK/0000 YYOST/0000 DGRAF/0000 KG78K/0000 JORDY/0000 OBH/0000 GLL/0000 DBL/0000 KLAX/0000"}]
      "Station Time of Arrival": [{ "event_source": "TMA.ZLA.FAA.GOV", "event_time": "2019-03-29T20:38:28.148Z", "event_id": "4664550", "tma_id": "L03502", "Meter Fix Name": "CRCUS", "ETA Outer Meter Arc": "2019-03-29T21:42:45Z", "ETA Meter Fix": "2019-03-29T21:46:35Z", "ETA at Display Point": "2019-03-29T21:42:55Z", "ETA at Scheduling Fix": "2019-03-29T21:42:55Z", "ETA at Runway": "2019-03-29T21:57:23Z"}],
  • 24. Architecture - Current State: Point to Point
      (Diagram labels: FAA Systems: Scheduling, Flight Plans, Weather, Airport Operations, Airspace Operations; Airline Systems: Scheduling, Flight Plans, Weather, Airport Operations, Airspace Operations)
  • 25. Architecture - Target State: Pub / Sub
      (Diagram labels: FAA Systems: Scheduling, Flight Plans, Weather, Airport Operations, Airspace Operations; Airline Systems: Scheduling, Flight Plans, Weather, Airport Operations, Airspace Operations; Producer, Subscriber, Topics, Producer, Subscriber)
  • 26. Architecture - Target State Considerations
      (Diagram labels: Airline Systems: Scheduling, Flight Plans, Weather, Airport Operations, Airspace Operations; Producer, Subscriber, File Connector, JDBC Connector, API Connector)
      - Connectivity to the operational systems is mostly through file, JDBC, and API interfaces.
      - Most of these are not designed for streaming interfaces (yet).
      - How do we connect a topic with systems that are not designed to work with streams?
  • 27. Architecture - Target State Considerations
      (Diagram labels: Airline Systems: Scheduling, Flight Plans, Weather, Airport Operations, Airspace Operations; Producer, Subscriber, File Connector, JDBC Connector, API Connector, Segment Reader API)
      - What if there were both batch and streaming interfaces?
      - Use the batch interface until more sophisticated streaming interfaces come online.
      - An API written around the segment reader can help to close the last mile.
      - Treat as batch when needed, treat as stream when needed.
  • 28. Apache Communities
      - Apache Pulsar
          - Twitter: @apache_pulsar
          - Wechat: ApachePulsar
          - Mailing Lists: dev@pulsar.apache.org, user@pulsar.apache.org
          - Slack: https://apache-pulsar.slack.com
          - Localization: http://crowdin.com/project/apache-pulsar
          - Github: https://github.com/apache/pulsar
      - Apache BookKeeper
          - Twitter: @asfbookkeeper
          - Mailing Lists: dev@bookkeeper.apache.org, user@bookkeeper.apache.org, issues@bookkeeper.apache.org
          - Slack: http://apachebookkeeper.slack.com/
          - Github: https://github.com/apache/bookkeeper
  • 29. Thank You! We’re hiring!
      - Data Engineers
      - Data Scientists
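
The example on the "Apache BookKeeper – Durability" slide (ensemble of 5 bookies, write quorum 3, ack quorum 2) can be sketched in a few lines of Python. This is an illustrative model only, not BookKeeper client code: the round-robin placement rule and the function names `write_set` and `is_write_successful` are assumptions made for the sketch; the slide only states that entries are striped across the ensemble.

```python
from typing import List

ENSEMBLE_SIZE = 5  # bookies 1-5 form the ensemble for the ledger
WRITE_QUORUM = 3   # every entry is written to 3 bookies
ACK_QUORUM = 2     # a write succeeds once 2 bookies acknowledge it


def write_set(entry_id: int, ensemble_size: int, write_quorum: int) -> List[int]:
    """Return the 0-based bookie indexes that store an entry.

    Assumption: entries are striped round-robin, so consecutive entries
    start on consecutive bookies and wrap around the ensemble.
    """
    return [(entry_id + i) % ensemble_size for i in range(write_quorum)]


def is_write_successful(acks: int, ack_quorum: int) -> bool:
    """A write is considered successful once the ack quorum has responded."""
    return acks >= ack_quorum


if __name__ == "__main__":
    for entry_id in range(5):
        bookies = [b + 1 for b in write_set(entry_id, ENSEMBLE_SIZE, WRITE_QUORUM)]
        print(f"entry {entry_id} -> bookies {bookies}")
```

Under this assumed placement, each bookie stores three of every five entries, which is why adding bookies raises aggregate bandwidth, while lowering the ack quorum reduces the number of responses a writer waits for.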
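
The three subscription modes on the "Apache Pulsar – Subscription Models" slide differ only in how messages are assigned to consumers. The toy dispatcher below makes that difference concrete. It is a pure-Python model, not the Pulsar client API; the function names and the `master_acked` parameter are hypothetical, and the failover case assumes the master disconnects after acknowledging a fixed number of messages.

```python
from itertools import cycle
from typing import Dict, List


def dispatch_exclusive(messages: List[str], consumers: List[str]) -> Dict[str, List[str]]:
    """Exclusive: only a single consumer may attach to the subscription."""
    if len(consumers) != 1:
        raise ValueError("exclusive mode allows exactly one consumer")
    return {consumers[0]: list(messages)}


def dispatch_shared(messages: List[str], consumers: List[str]) -> Dict[str, List[str]]:
    """Shared: messages go round robin; each message reaches exactly one consumer."""
    assignment: Dict[str, List[str]] = {c: [] for c in consumers}
    for message, consumer in zip(messages, cycle(consumers)):
        assignment[consumer].append(message)
    return assignment


def dispatch_failover(messages: List[str], consumers: List[str],
                      master_acked: int) -> Dict[str, List[str]]:
    """Failover: the first consumer (master) receives everything until it
    disconnects; non-acked and subsequent messages go to the next in line.

    Assumption: the master disconnects after acknowledging `master_acked` messages.
    """
    master, standby = consumers[0], consumers[1]
    return {master: messages[:master_acked], standby: messages[master_acked:]}


if __name__ == "__main__":
    msgs = [f"m{i}" for i in range(5)]
    print(dispatch_shared(msgs, ["a", "b"]))
    print(dispatch_failover(msgs, ["a", "b"], master_acked=2))
```

In the shared case each consumer sees only part of the stream, which is why ordering is not guaranteed across consumers; in the failover case a single consumer at a time sees the stream in order.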
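
The "Bringing It All Together" slides contrast an unbounded stream (read by a subscriber) with bounded segments (read by a segment reader). The sketch below models that idea with an in-memory segmented log. It is a conceptual illustration under stated assumptions, not the Pulsar or BookKeeper API: the `SegmentedLog` class, its method names, and the fixed segment size are all hypothetical.

```python
from typing import Iterator, List


class SegmentedLog:
    """Append-only log that seals a segment every `segment_size` entries."""

    def __init__(self, segment_size: int) -> None:
        self.segment_size = segment_size
        self.sealed: List[List[str]] = []  # closed, immutable segments
        self.open: List[str] = []          # segment currently being written

    def append(self, entry: str) -> None:
        self.open.append(entry)
        if len(self.open) == self.segment_size:
            self.sealed.append(self.open)
            self.open = []

    def read_segment(self, index: int) -> List[str]:
        """Batch view: a sealed segment is a bounded data set."""
        return list(self.sealed[index])

    def read_from(self, position: int) -> Iterator[str]:
        """Stream view: entries from `position` onward, in write order,
        as far as the log has been written so far."""
        entries = [e for segment in self.sealed for e in segment] + self.open
        yield from entries[position:]


if __name__ == "__main__":
    log = SegmentedLog(segment_size=3)
    for i in range(7):
        log.append(f"e{i}")
    print(log.read_segment(1))      # batch job over one sealed segment
    print(list(log.read_from(5)))   # subscriber catching up from position 5
```

The same stored entries serve both access patterns: a batch job can process sealed segments independently of one another, while a streaming consumer follows the log in order.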
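
As a small illustration of the analytics mentioned in the use case, the snippet below parses a subset of the "Station Time of Arrival" sample message and computes the time between the meter fix and the runway. The field names and timestamps are taken from the sample on the slide; the helper `parse_utc` and the choice of metric are assumptions made for this sketch.

```python
import json
from datetime import datetime, timezone

# Subset of the sample "Station Time of Arrival" message from the slide.
SAMPLE = """
{
  "event_source": "TMA.ZLA.FAA.GOV",
  "event_time": "2019-03-29T20:38:28.148Z",
  "tma_id": "L03502",
  "Meter Fix Name": "CRCUS",
  "ETA Meter Fix": "2019-03-29T21:46:35Z",
  "ETA at Runway": "2019-03-29T21:57:23Z"
}
"""


def parse_utc(timestamp: str) -> datetime:
    """Parse the UTC timestamps used in the sample, with or without milliseconds."""
    fmt = "%Y-%m-%dT%H:%M:%S.%fZ" if "." in timestamp else "%Y-%m-%dT%H:%M:%SZ"
    return datetime.strptime(timestamp, fmt).replace(tzinfo=timezone.utc)


message = json.loads(SAMPLE)
fix_to_runway = parse_utc(message["ETA at Runway"]) - parse_utc(message["ETA Meter Fix"])

if __name__ == "__main__":
    print(f"{message['Meter Fix Name']} to runway: {fix_to_runway}")
```

The same function could be applied to each message as it arrives from a topic, or to a stored batch of messages for historical analysis.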