Building an Asynchronous Application Framework with Python and Pulsar
FEBRUARY 9 2022 Pulsar Summit
Zac Bentley
Lead Site Reliability Engineer
Boston, MA
2022 © Klaviyo Confidential
01 The Problem
02 What We Built
03 Challenges
04 What Worked Well
05 What’s Next?
Segmentation, Reviews, Retail POS, Social, Surveys, Referrals, Logistics, Shipping, Customer service, Loyalty, On-site personalization, Forms, Ecommerce, Order confirmation, SMS, Email
Existing Architecture
Problems
Reliability Scalability Ownership/Process Architectural
Reliability
RabbitMQ has reliability issues when pushed too hard.
“Backpressure will find you”
Deep queues behave poorly.
Lots of outages and firefighting.
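The "backpressure will find you" point can be illustrated with a bounded in-process queue. This is only a minimal sketch using Python's asyncio, not a model of RabbitMQ or Pulsar internals; the names `produce`, `consume`, and `run_demo` are hypothetical. When the queue is full, the producer is suspended until the consumer frees a slot, so the limit is felt at the publisher instead of building up as an unbounded backlog.

```python
import asyncio


async def produce(queue: asyncio.Queue, count: int, blocked: list) -> None:
    """Publish `count` items, recording each time the full queue makes us wait."""
    for item in range(count):
        if queue.full():
            # The bounded queue is full: put() below suspends this producer
            # until the consumer frees a slot. That suspension is backpressure.
            blocked.append(item)
        await queue.put(item)


async def consume(queue: asyncio.Queue, count: int, seen: list) -> None:
    """Drain `count` items, yielding between items like a slow worker."""
    for _ in range(count):
        seen.append(await queue.get())
        await asyncio.sleep(0)  # stand-in for real work; lets the producer run


async def run_demo(count: int = 20, depth: int = 4):
    """Run one producer and one consumer over a queue of bounded depth."""
    queue = asyncio.Queue(maxsize=depth)
    blocked, seen = [], []
    await asyncio.gather(
        produce(queue, count, blocked),
        consume(queue, count, seen),
    )
    return blocked, seen
```

Calling `asyncio.run(run_demo())` returns the items at which the producer had to wait and the items the consumer saw, in order.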
Scalability
Scaling RabbitMQ is intrusive: application code has to be aware of topology changes at every level.
Geometry changes are painful.
Scale-out doesn’t bring reliability/redundancy benefits.
Ownership/Process
Individual team ownership is expensive in:
- Roadmap time.
- Hiring/onboarding capacity.
- Coordination.
Per-team ownership creates redundant expertise.
Architectural
Celery is pretty hostile to SOA.
Ordered consuming: not possible.
Processing more than one message at a time: not possible.
Pub/sub: difficult.
Replay/introspection: not possible.
Existing API: Producers

from app.tasks import mytask

# Synchronous call:
mytask("arg1", "arg2", kwarg1=SomeObject())

# Asynchronous call:
mytask.apply_async(args=("arg1", "arg2"), kwargs={"kwarg1": SomeObject()})

Existing API: Consumer Workload Declaration

@celery.task(acks_late=True)
def mytask(arg1, arg2, kwarg1=None):
    ...

@celery.task(acks_late=True)
def mytask2(*args, **kwargs):
    ...
02 What We Built
1. Platform Services: a team
2. Pulsar: a broker deployment
3. StreamNative: a support relationship
4. Chariot: an asynchronous application framework
ORM for Pulsar Interactions

for tenant in Tenant.search(name="standalone"):
    if tenant.allowed_clusters == ["standalone"]:
        ns = Namespace(
            tenant=tenant,
            name="mynamespace",
            acknowledged_quota=AcknowledgedMessageQuota(age=timedelta(minutes=10)),
        )
        ns.create()
        topic = Topic(
            namespace=ns,
            name="mytopic"
        )
        topic.create()
        subscription = Subscription(
            topic=topic,
            name="mysubscription",
            type=SubscriptionType.KeyShared,
        )
        subscription.create()
        assert Subscription.get(name="mysubscription") == subscription
        consumer = subscription.consumer(name="myconsumer").connect()
        while True:
            message = consumer.receive()
            consumer.acknowledge(message)
Declarative API for Schema Management & Migrations

from klaviyo_schema.registry.teamname.data.payload_pb2 import PayloadProto

my_topic = ChariotTopic(
    name="demo",
    durability=Durability.DURABILITY_REDUNDANT,
    max_message_size="1kb",
    max_producers=100,
    max_consumers=10,
    publish_rate_limits=(
        RateLimit(
            messages=1000,
            period="1m",
            actions=[RateLimitAction.RATE_LIMIT_ACTION_BLOCK],
        ),
    ),
    thresholds=(
        Threshold(
            kind=ThresholdKind.THRESHOLD_KIND_UNACKNOWLEDGED,
            size="200mb",
            actions=[ThresholdAction.THRESHOLD_FAIL_PUBLISH],
        ),
    ),
    consumer_groups=(ConsumerGroup(name="demo-consumer-group", type=SubscriptionType.KeyShared),),
    payload=RegisteredPayloadFromClass(payload_class=PayloadProto),
)
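The slides do not say how the declared publish rate limit is enforced. As an illustration of what a "block" action on a messages-per-period limit could mean, here is a minimal fixed-window sketch. `FixedWindowLimiter` and `delay_before_publish` are hypothetical names, the clock is passed in explicitly, and nothing here is taken from Chariot or Pulsar.

```python
from dataclasses import dataclass


@dataclass
class FixedWindowLimiter:
    """Allow at most `messages` publishes per `period_s` seconds (hypothetical)."""

    messages: int
    period_s: float
    window_start: float = 0.0
    used: int = 0

    def delay_before_publish(self, now: float) -> float:
        """Return 0.0 if a publish may proceed at `now`, else seconds to block."""
        if now - self.window_start >= self.period_s:
            # A new window begins: reset the counter.
            self.window_start = now
            self.used = 0
        if self.used < self.messages:
            self.used += 1
            return 0.0
        # Budget exhausted: the caller should block until the window ends.
        return self.window_start + self.period_s - now
```

A publisher would sleep for the returned delay and then ask again; a "fail publish" action would raise instead of returning a delay.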
New API: Producers

from klaviyo_schema.registry.teamname.data.payload_pb2 import PayloadProto
from klaviyo_schema.registry.teamname.topics import my_topic

await my_topic.send(PayloadProto(...))

New API: Consumer Workload Declaration

class DemoExecutor(AsynchronousExecutor):
    @lifecycle_method(timeout=timedelta(seconds=10))
    async def on_executor_shutdown_requested(self): ...

    @lifecycle_method(timeout=timedelta(seconds=10))
    async def on_executor_shutdown(self): ...

    @lifecycle_method(timeout=timedelta(seconds=10))
    async def on_executor_startup(self): ...

    @lifecycle_method(timeout=timedelta(seconds=10))
    async def on_message_batch(self, messages: Sequence[PayloadProto]):
        for idx, msg in enumerate(messages):
            if idx % 2 == 0:
                await self.chariot_ack(msg)
            else:
                await self.chariot_reject(msg)
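Chariot's `lifecycle_method` decorator is not shown in the slides. One plausible reading of its `timeout` argument is a per-call deadline on each hook. The sketch below is a hypothetical stand-in built on `asyncio.wait_for`, with an invented `SlowExecutor` class, and should not be read as the framework's implementation.

```python
import asyncio
import functools
from datetime import timedelta


def lifecycle_method(timeout: timedelta):
    """Hypothetical stand-in: cancel the wrapped coroutine after `timeout`."""

    def decorate(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            # wait_for cancels the hook and raises TimeoutError on expiry.
            return await asyncio.wait_for(
                func(*args, **kwargs), timeout.total_seconds()
            )

        return wrapper

    return decorate


class SlowExecutor:
    @lifecycle_method(timeout=timedelta(milliseconds=10))
    async def on_executor_startup(self):
        await asyncio.sleep(1)  # deliberately exceeds the 10 ms budget

    @lifecycle_method(timeout=timedelta(seconds=1))
    async def on_executor_shutdown(self):
        return "stopped"


async def run_demo():
    """Call both hooks and report which one hit its deadline."""
    executor = SlowExecutor()
    try:
        await executor.on_executor_startup()
        startup = "started"
    except asyncio.TimeoutError:
        startup = "timed out"
    return startup, await executor.on_executor_shutdown()
```

Bounding every hook this way keeps one stuck startup or shutdown routine from hanging the whole worker.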
Back-Of-Queue Retries

class DemoExecutor(AsynchronousExecutor):
    @lifecycle_method(timeout=timedelta(seconds=10))
    @requeue_retry(
        batch_predicate=retry_on_exception_type(
            RetryException, retry_log_level=logging.INFO
        ),
        message_predicate=retry_until_approximate_attempt_count(10),
        delay=wait_exponential(max=timedelta(seconds=5)) + timedelta(seconds=1),
    )
    async def on_message_batch(self, messages: Sequence[PayloadProto]):
        raise RetryException("Expected retry")
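The delay expression above combines a capped exponential wait with a constant offset. The exact semantics of Chariot's `wait_exponential` are not given in the slides, so the following is a sketch under stated assumptions: base 2, zero-indexed attempts, a 5-second cap, and a 1-second offset. `requeue_delay` and `should_retry` are hypothetical helper names.

```python
from datetime import timedelta


def requeue_delay(
    attempt: int,
    cap: timedelta = timedelta(seconds=5),
    offset: timedelta = timedelta(seconds=1),
) -> timedelta:
    """Exponential delay (assumed base 2) capped at `cap`, plus `offset`."""
    exponential = timedelta(seconds=2 ** attempt)
    return min(exponential, cap) + offset


def should_retry(attempt: int, max_attempts: int = 10) -> bool:
    """Stop requeueing once the (approximate) attempt budget is used up."""
    return attempt < max_attempts
```

Under these assumptions the first four delays are 2, 3, 5, and 6 seconds, after which the delay stays at 6 seconds until the attempt budget runs out.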
Custom Batching and “Steering” for Parallel Execution without Reordering

~> chariot worker start --topic demo --consumer-group democg --parallel 10 \
    --start-executors-lazily --executor-class app.executors.demo:DemoExecutor \
    --message-batch-assignment-behavior AnyKeyToAnyExecutor

~> chariot worker start --topic demo --consumer-group democg --parallel 10 \
    --start-executors-lazily --executor-class app.executors.demo:DemoExecutor \
    --message-batch-assignment-behavior NoOverlapBestEffortKeyExecutorAffinity \
    --message-batch-flush-after-items 1000 --message-batch-flush-after-time 10sec
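How the key-affinity assignment behavior works internally is not described in the slides. A common way to get parallelism without reordering is to route every message with the same key to the same executor, so per-key order is preserved while different keys run concurrently. The sketch below assumes that approach; `executor_for_key` and `steer` are hypothetical names, and messages are modeled as `(key, payload)` pairs.

```python
import zlib
from collections import defaultdict


def executor_for_key(key: str, parallel: int) -> int:
    """Map a message key to a stable executor index."""
    # crc32 is stable across processes, unlike Python's randomized hash().
    return zlib.crc32(key.encode("utf-8")) % parallel


def steer(messages, parallel: int):
    """Group (key, payload) pairs per executor, preserving arrival order."""
    batches = defaultdict(list)
    for key, payload in messages:
        batches[executor_for_key(key, parallel)].append((key, payload))
    return dict(batches)
```

With this routing no key is ever split across two executors, which is the property that lets batches be processed in parallel without reordering messages for any single key.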
Solutions
Reliability
To become a user is to express the enforced maximum workload you’ll run.
Pulsar’s redundancy helps weather outages.
Deep backlogs are usable because reads aren’t always writes.
Scalability
The “CEO” (Central Expert Owner) can scale out Pulsar to respond to demand.
Teams express scalability needs in the form of elevated rate limits or partition counts.
Consultation with the community and StreamNative is invaluable.
Ownership/Process
Teams own producers/consumers.
Teams submit their contracts, in the form of schema PRs, to the broker owners.
Schema changes and backwards compatibility aren’t simple, but they are now predictable.
Architectural
Many new patterns are now on the table:
- Pub-sub
- Ordered consume
- Batched consumption + out-of-order acks
- Deduplication/debouncing
Reading topics at rest improves visibility.
Async interaction with the same stream from multiple codebases is now possible.
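As one example of the deduplication/debouncing pattern combined with batched consumption, a consumer could keep only the latest message per key in a batch and acknowledge the superseded ones without processing them. This is a sketch of the general idea only; `debounce` is a hypothetical helper and messages are modeled as `(key, payload)` pairs.

```python
def debounce(batch):
    """Split a batch into the latest payload per key and the superseded ones."""
    latest = {}
    superseded = []
    for key, payload in batch:
        if key in latest:
            # An older message for this key is made redundant by a newer one.
            superseded.append((key, latest[key]))
        latest[key] = payload  # dict keeps each key at its first-seen position
    return list(latest.items()), superseded
```

The kept messages are processed and acknowledged as usual, while the superseded ones can be acknowledged immediately, which is where out-of-order acks become useful.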
03 Challenges
● Distribution as a library/framework rather than an application
● Python/C++ Pulsar client maturity
● Combining advanced broker features surfaced bugs
● Forking consumer daemons + threaded clients + async/await style is a costly combination
● Expectation management
● The “gap ledger”
● Management API quality
04 What Worked Well
Process:
● Support from above
● Managed rollout speed
● Solving 2025’s problems, not 2022’s
● “Steel-thread” style focus on specific use-cases
● Willingness to bring work in-house and start fresh where it made sense
Technology:
● Declarative schemas for messages and dataflows
● Schema registry as code rather than a SPOF
● Managed Pulsar allows us to learn with less pain
● Isolating user code from consumer code improves reliability
05 What’s Next?
Near Term:
● Manage internal adoption
● Scale to meet annual shopping holidays’ needs
● Start work on a “publish gateway” for connection pooling, circuit breaking, etc.
Long Term:
● Online schema changes
● Key-local state
● Complex workflow support
● Make our work available to the community
klaviyo.com/careers
zac@klaviyo.com

Weitere ähnliche Inhalte

Ähnlich wie Building an Asynchronous Application Framework with Python and Pulsar - Pulsar Summit SF 2022

Cloud world forum talk 062515
Cloud world forum talk 062515Cloud world forum talk 062515
Cloud world forum talk 062515Ajay Dankar
 
Applying Code Customizations to Magento 2
Applying Code Customizations to Magento 2 Applying Code Customizations to Magento 2
Applying Code Customizations to Magento 2 Igor Miniailo
 
Confluent Partner Tech Talk with SVA
Confluent Partner Tech Talk with SVAConfluent Partner Tech Talk with SVA
Confluent Partner Tech Talk with SVAconfluent
 
Cloud 12 08 V2
Cloud 12 08 V2Cloud 12 08 V2
Cloud 12 08 V2Pini Cohen
 
Primatics Financial - Parallel, High Throughput Risk Calculations On The Cloud
Primatics Financial - Parallel, High Throughput Risk Calculations On The CloudPrimatics Financial - Parallel, High Throughput Risk Calculations On The Cloud
Primatics Financial - Parallel, High Throughput Risk Calculations On The CloudAmnon Raviv
 
End User Computing at CloudHesive.pptx
End User Computing at CloudHesive.pptxEnd User Computing at CloudHesive.pptx
End User Computing at CloudHesive.pptxCloudHesive
 
Container Technologies and Transformational value
Container Technologies and Transformational valueContainer Technologies and Transformational value
Container Technologies and Transformational valueMihai Criveti
 
Microservice Builder: A Microservice DevOps Pipeline for Rapid Delivery and P...
Microservice Builder: A Microservice DevOps Pipeline for Rapid Delivery and P...Microservice Builder: A Microservice DevOps Pipeline for Rapid Delivery and P...
Microservice Builder: A Microservice DevOps Pipeline for Rapid Delivery and P...David Currie
 
Cloud Computing Certification
Cloud Computing CertificationCloud Computing Certification
Cloud Computing CertificationVskills
 
5 Years Of Building SaaS On AWS
5 Years Of Building SaaS On AWS5 Years Of Building SaaS On AWS
5 Years Of Building SaaS On AWSChristian Beedgen
 
Cloud_controllers_public_webinar_aug31_v1.pptx
Cloud_controllers_public_webinar_aug31_v1.pptxCloud_controllers_public_webinar_aug31_v1.pptx
Cloud_controllers_public_webinar_aug31_v1.pptxAvi Networks
 
Creating your Hybrid Cloud with AWS -Technical 201
Creating your Hybrid Cloud with AWS -Technical 201Creating your Hybrid Cloud with AWS -Technical 201
Creating your Hybrid Cloud with AWS -Technical 201Amazon Web Services
 
Moving existing apps to the cloud
 Moving existing apps to the cloud Moving existing apps to the cloud
Moving existing apps to the cloudRam Maddali
 
How to move to the cloud, get it right, stay secure and not cost a fortune
How to move to the cloud, get it right, stay secure and not cost a fortuneHow to move to the cloud, get it right, stay secure and not cost a fortune
How to move to the cloud, get it right, stay secure and not cost a fortuneCorecom Consulting
 
System Z Cloud Atlanta
System Z Cloud AtlantaSystem Z Cloud Atlanta
System Z Cloud AtlantaAndrea McManus
 
Should That Be a Microservice ?
Should That Be a Microservice ?Should That Be a Microservice ?
Should That Be a Microservice ?Rohit Kelapure
 
Webinar: Accelerate Your Cloud Business With CloudHealth
Webinar: Accelerate Your Cloud Business With CloudHealthWebinar: Accelerate Your Cloud Business With CloudHealth
Webinar: Accelerate Your Cloud Business With CloudHealthCloudHealth by VMware
 
Accenture: ACIC Rome & Red Hat
Accenture: ACIC Rome & Red HatAccenture: ACIC Rome & Red Hat
Accenture: ACIC Rome & Red HatAccenture Italia
 

Ähnlich wie Building an Asynchronous Application Framework with Python and Pulsar - Pulsar Summit SF 2022 (20)

Damodar_TIBCO
Damodar_TIBCODamodar_TIBCO
Damodar_TIBCO
 
Cloud world forum talk 062515
Cloud world forum talk 062515Cloud world forum talk 062515
Cloud world forum talk 062515
 
Applying Code Customizations to Magento 2
Applying Code Customizations to Magento 2 Applying Code Customizations to Magento 2
Applying Code Customizations to Magento 2
 
Confluent Partner Tech Talk with SVA
Confluent Partner Tech Talk with SVAConfluent Partner Tech Talk with SVA
Confluent Partner Tech Talk with SVA
 
Cloud 12 08 V2
Cloud 12 08 V2Cloud 12 08 V2
Cloud 12 08 V2
 
Primatics Financial - Parallel, High Throughput Risk Calculations On The Cloud
Primatics Financial - Parallel, High Throughput Risk Calculations On The CloudPrimatics Financial - Parallel, High Throughput Risk Calculations On The Cloud
Primatics Financial - Parallel, High Throughput Risk Calculations On The Cloud
 
End User Computing at CloudHesive.pptx
End User Computing at CloudHesive.pptxEnd User Computing at CloudHesive.pptx
End User Computing at CloudHesive.pptx
 
Container Technologies and Transformational value
Container Technologies and Transformational valueContainer Technologies and Transformational value
Container Technologies and Transformational value
 
Microservice Builder: A Microservice DevOps Pipeline for Rapid Delivery and P...
Microservice Builder: A Microservice DevOps Pipeline for Rapid Delivery and P...Microservice Builder: A Microservice DevOps Pipeline for Rapid Delivery and P...
Microservice Builder: A Microservice DevOps Pipeline for Rapid Delivery and P...
 
Cloud Computing Certification
Cloud Computing CertificationCloud Computing Certification
Cloud Computing Certification
 
5 Years Of Building SaaS On AWS
5 Years Of Building SaaS On AWS5 Years Of Building SaaS On AWS
5 Years Of Building SaaS On AWS
 
Cloud_controllers_public_webinar_aug31_v1.pptx
Cloud_controllers_public_webinar_aug31_v1.pptxCloud_controllers_public_webinar_aug31_v1.pptx
Cloud_controllers_public_webinar_aug31_v1.pptx
 
Creating your Hybrid Cloud with AWS -Technical 201
Creating your Hybrid Cloud with AWS -Technical 201Creating your Hybrid Cloud with AWS -Technical 201
Creating your Hybrid Cloud with AWS -Technical 201
 
Moving existing apps to the cloud
 Moving existing apps to the cloud Moving existing apps to the cloud
Moving existing apps to the cloud
 
How to move to the cloud, get it right, stay secure and not cost a fortune
How to move to the cloud, get it right, stay secure and not cost a fortuneHow to move to the cloud, get it right, stay secure and not cost a fortune
How to move to the cloud, get it right, stay secure and not cost a fortune
 
System Z Cloud Atlanta
System Z Cloud AtlantaSystem Z Cloud Atlanta
System Z Cloud Atlanta
 
Should That Be a Microservice ?
Should That Be a Microservice ?Should That Be a Microservice ?
Should That Be a Microservice ?
 
The Best IBM Bluemix Training From myTectra in Bangalore
The Best IBM Bluemix Training From myTectra in BangaloreThe Best IBM Bluemix Training From myTectra in Bangalore
The Best IBM Bluemix Training From myTectra in Bangalore
 
Webinar: Accelerate Your Cloud Business With CloudHealth
Webinar: Accelerate Your Cloud Business With CloudHealthWebinar: Accelerate Your Cloud Business With CloudHealth
Webinar: Accelerate Your Cloud Business With CloudHealth
 
Accenture: ACIC Rome & Red Hat
Accenture: ACIC Rome & Red HatAccenture: ACIC Rome & Red Hat
Accenture: ACIC Rome & Red Hat
 

Mehr von StreamNative

Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022StreamNative
 
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...StreamNative
 
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...StreamNative
 
Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...StreamNative
 
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022StreamNative
 
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022StreamNative
 
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...StreamNative
 
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...StreamNative
 
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022StreamNative
 
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...StreamNative
 
Understanding Broker Load Balancing - Pulsar Summit SF 2022
Understanding Broker Load Balancing - Pulsar Summit SF 2022Understanding Broker Load Balancing - Pulsar Summit SF 2022
Understanding Broker Load Balancing - Pulsar Summit SF 2022StreamNative
 
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022StreamNative
 
Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022StreamNative
 
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022StreamNative
 
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022StreamNative
 
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022StreamNative
 
Welcome and Opening Remarks - Pulsar Summit SF 2022
Welcome and Opening Remarks - Pulsar Summit SF 2022Welcome and Opening Remarks - Pulsar Summit SF 2022
Welcome and Opening Remarks - Pulsar Summit SF 2022StreamNative
 
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...StreamNative
 
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...StreamNative
 
Improvements Made in KoP 2.9.0 - Pulsar Summit Asia 2021
Improvements Made in KoP 2.9.0  - Pulsar Summit Asia 2021Improvements Made in KoP 2.9.0  - Pulsar Summit Asia 2021
Improvements Made in KoP 2.9.0 - Pulsar Summit Asia 2021StreamNative
 

Mehr von StreamNative (20)

Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
 
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
 
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
 
Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...
 
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
 
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
 
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
 
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
 
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
 
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
 
Understanding Broker Load Balancing - Pulsar Summit SF 2022
Understanding Broker Load Balancing - Pulsar Summit SF 2022Understanding Broker Load Balancing - Pulsar Summit SF 2022
Understanding Broker Load Balancing - Pulsar Summit SF 2022
 
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
 
Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022
 
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
 
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
 
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
 
Welcome and Opening Remarks - Pulsar Summit SF 2022
Welcome and Opening Remarks - Pulsar Summit SF 2022Welcome and Opening Remarks - Pulsar Summit SF 2022
Welcome and Opening Remarks - Pulsar Summit SF 2022
 
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
 
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
 
Improvements Made in KoP 2.9.0 - Pulsar Summit Asia 2021
Improvements Made in KoP 2.9.0  - Pulsar Summit Asia 2021Improvements Made in KoP 2.9.0  - Pulsar Summit Asia 2021
Improvements Made in KoP 2.9.0 - Pulsar Summit Asia 2021
 

Kürzlich hochgeladen

AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Building an Asynchronous Application Framework with Python and Pulsar - Pulsar Summit SF 2022

  • 1. Building an Asynchronous Application Framework with Python and Pulsar. February 9, 2022, Pulsar Summit. Zac Bentley, Lead Site Reliability Engineer, Boston, MA.
  • 2. 2022 © Klaviyo Confidential. 01 The Problem; 02 What We Built; 03 Challenges; 04 What Worked Well; 05 What’s Next?
  • 3. Segmentation, Reviews, Retail POS, Social, Surveys, Referrals, Logistics, Shipping, Customer service, Loyalty, On site personalization, Forms, Ecommerce, Order confirmation, SMS, Email
  • 5. Problems: Reliability, Scalability, Ownership/Process, Architectural
  • 6. Problems: Reliability. RabbitMQ has reliability issues when pushed too hard. “Backpressure will find you.” Deep queues behave poorly. Lots of outages and firefighting.
  • 7. Problems: Scalability. Scaling RabbitMQ is intrusive: application code has to be aware of topology changes at every level. Geometry changes are painful. Scale-out doesn’t bring reliability/redundancy benefits.
  • 8. Problems: Ownership/Process. Individual team ownership is expensive in roadmap time, hiring/onboarding capacity, and coordination. Per-team ownership creates redundant expertise.
  • 9. Problems: Architectural. Celery is pretty hostile to SOA. Ordered consuming: not possible. Processing more than one message at a time: not possible. Pub/sub: difficult. Replay/introspection: not possible.
  • 10. Existing API: Producers

      from app.tasks import mytask

      # Synchronous call:
      mytask("arg1", "arg2", kwarg1=SomeObject())

      # Asynchronous call:
      mytask.apply_async(args=("arg1", "arg2"), kwargs={"kwarg1": SomeObject()})

    Existing API: Consumer Workload Declaration

      @celery.task(acks_late=True)
      def mytask(arg1, arg2, kwarg1=None):
          ...

      @celery.task(acks_late=True)
      def mytask2(*args, **kwargs):
          ...
  • 11. Problems: Reliability, Scalability, Ownership/Process, Architectural
  • 12. 02 What We Built
      1. Platform Services: a team
      2. Pulsar: a broker deployment
      3. StreamNative: a support relationship
      4. Chariot: an asynchronous application framework
  • 13. ORM for Pulsar Interactions

      for tenant in Tenant.search(name="standalone"):
          if tenant.allowed_clusters == ["standalone"]:
              ns = Namespace(
                  tenant=tenant,
                  name="mynamespace",
                  acknowledged_quota=AcknowledgedMessageQuota(age=timedelta(minutes=10)),
              )
              ns.create()
              topic = Topic(namespace=ns, name="mytopic")
              topic.create()
              subscription = Subscription(
                  topic=topic,
                  name="mysubscription",
                  type=SubscriptionType.KeyShared,
              )
              subscription.create()
              assert Subscription.get(name="mysubscription") == subscription
              consumer = subscription.consumer(name="myconsumer").connect()
              while True:
                  message = consumer.receive()
                  consumer.acknowledge(message)
  • 14. Declarative API for Schema Management & Migrations

      from klaviyo_schema.registry.teamname.data.payload_pb2 import PayloadProto

      my_topic = ChariotTopic(
          name="demo",
          durability=Durability.DURABILITY_REDUNDANT,
          max_message_size="1kb",
          max_producers=100,
          max_consumers=10,
          publish_rate_limits=(
              RateLimit(
                  messages=1000,
                  period="1m",
                  actions=[RateLimitAction.RATE_LIMIT_ACTION_BLOCK],
              ),
          ),
          thresholds=(
              Threshold(
                  kind=ThresholdKind.THRESHOLD_KIND_UNACKNOWLEDGED,
                  size="200mb",
                  actions=[ThresholdAction.THRESHOLD_FAIL_PUBLISH],
              ),
          ),
          consumer_groups=(
              ConsumerGroup(name="demo-consumer-group", type=SubscriptionType.KeyShared),
          ),
          payload=RegisteredPayloadFromClass(payload_class=PayloadProto),
      )
  • 15. Existing API: Producers

      from app.tasks import mytask

      # Synchronous call:
      mytask("arg1", "arg2", kwarg1=SomeObject())

      # Asynchronous call:
      mytask.apply_async(args=("arg1", "arg2"), kwargs={"kwarg1": SomeObject()})

    Existing API: Consumer Workload Declaration

      @celery.task(acks_late=True)
      def mytask(arg1, arg2, kwarg1=None):
          ...

      @celery.task(acks_late=True)
      def mytask2(*args, **kwargs):
          ...
  • 16. New API: Producers

      from klaviyo_schema.registry.teamname.data.payload_pb2 import PayloadProto
      from klaviyo_schema.registry.teamname.topics import my_topic

      await my_topic.send(PayloadProto(...))

    New API: Consumer Workload Declaration

      class DemoExecutor(AsynchronousExecutor):
          @lifecycle_method(timeout=timedelta(seconds=10))
          async def on_executor_shutdown_requested(self):
              ...

          @lifecycle_method(timeout=timedelta(seconds=10))
          async def on_executor_shutdown(self):
              ...

          @lifecycle_method(timeout=timedelta(seconds=10))
          async def on_executor_startup(self):
              ...

          @lifecycle_method(timeout=timedelta(seconds=10))
          async def on_message_batch(self, messages: Sequence[PayloadProto]):
              for idx, msg in enumerate(messages):
                  if idx % 2 == 0:
                      await self.chariot_ack(msg)
                  else:
                      await self.chariot_reject(msg)
  • 17. Back-Of-Queue Retries

      class DemoExecutor(AsynchronousExecutor):
          @lifecycle_method(timeout=timedelta(seconds=10))
          @requeue_retry(
              batch_predicate=retry_on_exception_type(
                  RetryException, retry_log_level=logging.INFO
              ),
              message_predicate=retry_until_approximate_attempt_count(10),
              delay=wait_exponential(max=timedelta(seconds=5)) + timedelta(seconds=1),
          )
          async def on_message_batch(self, messages: Sequence[PayloadProto]):
              raise RetryException("Expected retry")

    Custom Batching and “Steering” for Parallel Execution without Reordering

      ~> chariot worker start --topic demo --consumer-group democg --parallel 10 \
           --start-executors-lazily --executor-class app.executors.demo:DemoExecutor \
           --message-batch-assignment-behavior AnyKeyToAnyExecutor

      ~> chariot worker start --topic demo --consumer-group democg --parallel 10 \
           --start-executors-lazily --executor-class app.executors.demo:DemoExecutor \
           --message-batch-assignment-behavior NoOverlapBestEffortKeyExecutorAffinity \
           --message-batch-flush-after-items 1000 --message-batch-flush-after-time 10sec
  • 18. Problems and Solutions: Reliability, Scalability, Ownership/Process, Architectural
  • 19. Solutions: Reliability. To become a user is to express the enforced maximum workload you’ll run. Pulsar’s redundancy helps weather outages. Deep backlogs are usable because reads aren’t always writes.
  • 20. Solutions: Scalability. The “CEO” (Central Expert Owner) can scale out Pulsar to respond to demand. Teams express scalability needs in the form of elevated rate limits or partition counts. Consultation with the community and StreamNative is invaluable.
  • 21. Solutions: Ownership/Process. Teams own producers/consumers. Teams submit their contracts, in the form of schema PRs, to the broker owners. Schema changes and backwards compatibility aren’t simple, but they are now predictable.
  • 22. Solutions: Architectural. Many new patterns are now on the table:
      - Pub-sub
      - Ordered consume
      - Batched consumption + out-of-order acks
      - Deduplication/debouncing
    Reading topics at rest improves visibility. Async interaction with the same stream from multiple codebases is now possible.
  • 23. 03 Challenges
      ● Distribution as a library/framework rather than an application
      ● Python/C++ Pulsar client maturity
      ● Combining advanced broker features surfaced bugs
      ● Forking consumer daemons + threaded clients + async/await style is a costly combination
      ● Expectation management
      ● The “gap ledger”
      ● Management API quality
  • 24. 04 What Worked Well
    Process:
      ● Support from above
      ● Managed rollout speed
      ● Solving 2025’s problems, not 2022’s
      ● “Steel-thread” style focus on specific use-cases
      ● Willingness to commit to bringing work in-house and starting fresh where it made sense
    Technology:
      ● Declarative schemas for messages and dataflows
      ● Schema registry as code rather than a SPOF
      ● Managed Pulsar allows us to learn with less pain
      ● Isolating user code from consumer code improves reliability
  • 25. 05 What’s Next?
    Near Term:
      ● Manage internal adoption
      ● Scale to meet annual shopping holidays’ needs
      ● Start work on a “publish gateway” for connection pooling, circuit breaking, etc.
    Long Term:
      ● Online schema changes
      ● Key-local state
      ● Complex workflow support
      ● Make our work available to the community
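
Slide 17 configures back-of-queue retries with `wait_exponential(max=timedelta(seconds=5)) + timedelta(seconds=1)` and an approximate attempt limit of 10. The slides do not say how Chariot computes these values. As a rough sketch only, assuming a delay that doubles from a hypothetical 100 ms base, is capped at 5 seconds, and then has the fixed 1 second added (the names `requeue_delay` and `should_retry` are illustrative, not Chariot's API):

```python
from datetime import timedelta


def requeue_delay(
    attempt: int,
    base: timedelta = timedelta(milliseconds=100),  # assumed base, not from the slides
    cap: timedelta = timedelta(seconds=5),          # mirrors max=timedelta(seconds=5)
    fixed: timedelta = timedelta(seconds=1),        # mirrors + timedelta(seconds=1)
) -> timedelta:
    """Return the delay before a message is re-queued for the given attempt (1-based)."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    # Bound the exponent so very large attempt counts cannot overflow timedelta.
    exponent = min(attempt - 1, 32)
    exponential = min(base * (2 ** exponent), cap)
    return exponential + fixed


def should_retry(attempt: int, max_attempts: int = 10) -> bool:
    """Assumed stop rule: give up once the attempt count reaches the limit."""
    return attempt < max_attempts


if __name__ == "__main__":
    for attempt in range(1, 9):
        print(attempt, requeue_delay(attempt), should_retry(attempt))
```

Under these assumptions the delay grows from 1.1 seconds on the first attempt and levels off at 6 seconds once the exponential part reaches its cap.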
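
The `NoOverlapBestEffortKeyExecutorAffinity` option on slide 17 suggests that messages sharing a key are steered to the same executor, so that parallel workers do not reorder them. Chariot's actual assignment logic is not shown in the slides. The following is only a minimal sketch of the general key-affinity idea, using hypothetical names (`executor_for_key`, `steer`) and an assumed message shape of `(key, payload)` tuples:

```python
import hashlib
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Message = Tuple[str, str]  # (key, payload); an assumed shape for illustration


def executor_for_key(key: str, executor_count: int) -> int:
    """Map a message key to a stable executor index using a hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % executor_count


def steer(messages: Iterable[Message], executor_count: int) -> Dict[int, List[Message]]:
    """Group messages into per-executor batches, preserving arrival order per key."""
    batches: Dict[int, List[Message]] = defaultdict(list)
    for key, payload in messages:
        batches[executor_for_key(key, executor_count)].append((key, payload))
    return dict(batches)


if __name__ == "__main__":
    incoming = [("a", "1"), ("b", "1"), ("a", "2"), ("c", "1"), ("a", "3")]
    for executor, batch in sorted(steer(incoming, executor_count=4).items()):
        print(executor, batch)
```

Because every message for a given key lands in exactly one batch, each executor can process its batch in parallel with the others while still seeing that key's messages in order. Different keys may share an executor, which is harmless for ordering.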