SlideShare ist ein Scribd-Unternehmen logo
1 von 26
Downloaden Sie, um offline zu lesen
Building an
Asynchronous
Application Framework
with Python and Pulsar
FEBRUARY 9 2022 Pulsar Summit
Zac Bentley
Lead Site Reliability Engineer
Boston, MA
2
2022 © Klaviyo Confidential
The Problem
What We Built
Challenges
What Worked Well
What’s Next?
01
02
03
04
05
3
2022 © Klaviyo Confidential
Segmentation
Reviews
Retail POS
Social
Surveys
Referrals
Logistics
Shipping
Customer service
Loyalty
On site
personalization
Forms
Ecommerce
Order confirmation
SMS
Email
Existing Architecture
5
2022 © Klaviyo Confidential
Problems
Reliability Scalability Ownership/Process Architectural
6
2022 © Klaviyo Confidential
Problems
Reliability
RabbitMQ has reliability issues when
pushed too hard.
“Backpressure will find you”
Deep queues behave poorly.
Lots of outages and firefighting.
Scalability Ownership/Process Architectural
7
2022 © Klaviyo Confidential
Problems
Reliability Scalability
Scaling RabbitMQ is intrusive:
application code has to be aware of
topology changes at every level.
Geometry changes are painful.
Scale-out doesn’t bring
reliability/redundancy benefits.
Ownership/Process Architectural
8
2022 © Klaviyo Confidential
Problems
Reliability Scalability Ownership/Process
Individual team ownership is
expensive in:
- Roadmap time.
- Hiring/onboarding capacity.
- Coordination.
Per-team ownership creates
redundant expertise.
Architectural
9
2022 © Klaviyo Confidential
Problems
Reliability Scalability Ownership/Process Architectural
Celery is pretty hostile to SOA.
Ordered consuming: not possible.
Processing more >1 message at a time:
not possible.
Pub/sub: difficult.
Replay/introspection: not possible.
Existing API: Producers
from app.tasks import mytask
# Synchronous call:
mytask("arg1", "arg2", kwarg1=SomeObject())
# Asynchronous call:
mytask.apply_async(args=("arg1", "arg2"), kwargs={"kwarg1": SomeObject()})
@celery.task(acks_late=True)
def mytask(arg1, arg2, kwarg1=None):
...
@celery.task(acks_late=True)
def mytask2(*args, **kwargs):
...
Existing API: Consumer Workload Declaration
11
2022 © Klaviyo Confidential
Problems
Reliability Scalability Ownership/Process Architectural
02 What We Built
1. Platform Services: a team
2. Pulsar: a broker deployment
3. StreamNative: a support relationship
4. Chariot: an asynchronous application framework
ORM for Pulsar Interactions
for tenant in Tenant.search(name="standalone"):
if tenant.allowed_clusters == ["standalone"]:
ns = Namespace(
tenant=tenant,
name="mynamespace",
acknowledged_quota=AcknowledgedMessageQuota(age=timedelta(minutes=10)),
)
ns.create()
topic = Topic(
namespace=ns,
name="mytopic"
)
topic.create()
subscription = Subscription(
topic=topic,
name="mysubscription",
type=SubscriptionType.KeyShared,
)
subscription.create()
assert Subscription.get(name="mysubscription") == subscription
consumer = subscription.consumer(name="myconsumer").connect()
while True:
message = consumer.receive()
consumer.acknowledge(message)
Declarative API for Schema Management & Migrations
from klaviyo_schema.registry.teamname.data.payload_pb2 import PayloadProto
my_topic = ChariotTopic(
name="demo",
durability=Durability.DURABILITY_REDUNDANT,
max_message_size="1kb",
max_producers=100,
max_consumers=10,
publish_rate_limits=(
RateLimit(
messages=1000,
period="1m",
actions=[RateLimitAction.RATE_LIMIT_ACTION_BLOCK],
),
),
thresholds=(
Threshold(
kind=ThresholdKind.THRESHOLD_KIND_UNACKNOWLEDGED,
size="200mb",
actions=[ThresholdAction.THRESHOLD_FAIL_PUBLISH],
),
),
consumer_groups=(ConsumerGroup(name="demo-consumer-group", type=SubscriptionType.KeyShared),),
payload=RegisteredPayloadFromClass(payload_class=PayloadProto),
)
Existing API: Producers
from app.tasks import mytask
# Synchronous call:
mytask("arg1", "arg2", kwarg1=SomeObject())
# Asynchronous call:
mytask.apply_async(args=("arg1", "arg2"), kwargs={"kwarg1": SomeObject()})
@celery.task(acks_late=True)
def mytask(arg1, arg2, kwarg1=None):
...
@celery.task(acks_late=True)
def mytask2(*args, **kwargs):
...
Existing API: Consumer Workload Declaration
New API: Producers
class DemoExecutor(AsynchronousExecutor):
@lifecycle_method(timeout=timedelta(seconds=10))
async def on_executor_shutdown_requested(self): ...
@lifecycle_method(timeout=timedelta(seconds=10))
async def on_executor_shutdown(self): ...
@lifecycle_method(timeout=timedelta(seconds=10))
async def on_executor_startup(self): ...
@lifecycle_method(timeout=timedelta(seconds=10))
async def on_message_batch(self, messages: Sequence[PayloadProto]):
for idx, msg in enumerate(messages):
if idx % 2 == 0:
await self.chariot_ack(msg)
else:
await self.chariot_reject(msg)
from klaviyo_schema.registry.teamname.data.payload_pb2 import PayloadProto
from klaviyo_schema.registry.teamname.topics import my_topic
await my_topic.send(PayloadProto(...))
New API: Consumer Workload Declaration
Back-Of-Queue Retries
class DemoExecutor(AsynchronousExecutor):
@lifecycle_method(timeout=timedelta(seconds=10))
@requeue_retry(
batch_predicate=retry_on_exception_type(
RetryException, retry_log_level=logging.INFO
),
message_predicate=retry_until_approximate_attempt_count(10),
delay=wait_exponential(max=timedelta(seconds=5)) + timedelta(seconds=1),
)
async def on_message_batch(self, messages: Sequence[PayloadProto]):
raise RetryException("Expected retry")
~> chariot worker start --topic demo --consumer-group democg --parallel 10 
--start-executors-lazily --executor-class app.executors.demo:DemoExecutor 
--message-batch-assignment-behavior AnyKeyToAnyExecutor
~> chariot worker start --topic demo --consumer-group democg --parallel 10 
--start-executors-lazily --executor-class app.executors.demo:DemoExecutor 
--message-batch-assignment-behavior NoOverlapBestEffortKeyExecutorAffinity 
--message-batch-flush-after-items 1000 --message-batch-flush-after-time 10sec
Custom Batching and “Steering” for Parallel Execution without Reordering
18
2022 © Klaviyo Confidential
Problems Solutions
Reliability Scalability Ownership/Process Architectural
19
2022 © Klaviyo Confidential
Solutions
Reliability
To become a user is to express the
enforced maximum workload you’ll
run.
Pulsar’s redundancy helps weather
outages.
Deep backlogs are usable because
reads aren’t always writes.
Scalability Ownership/Process Architectural
20
2022 © Klaviyo Confidential
Solutions
Reliability Scalability
The “CEO” (Central Expert Owner)
can scale out pulsar to respond to
demand.
Teams express scalability need in the
form of elevated rate limits or partition
counts.
Consultation with the community and
StreamNative is invaluable.
Ownership/Process Architectural
21
2022 © Klaviyo Confidential
Solutions
Reliability Scalability Ownership/Process
Teams own producers/consumers.
Teams submit their contracts, in the
form of schema PRs, to the broker
owners.
Schema changes and backwards
compatibility aren’t simple but they
are now predictable.
Architectural
22
2022 © Klaviyo Confidential
Solutions
Reliability Scalability Ownership/Process Architectural
Many new patterns are now on the table:
- Pub-sub
- Ordered consume
- Batched consumption +
out-of-order acks
- Deduplication/debouncing
Reading topics at rest improves visibility.
Async interaction with the same stream
from multiple codebases is now possible.
03 Challenges
● Distribution as a library/framework
rather than an application
● Python/C++ Pulsar client maturity
● Combining advanced broker features
surfaced bugs
● Forking consumer daemons +
threaded clients + async/await style
is a costly combination
● Expectation management
● The “gap ledger”
● Management API quality
04 What Worked Well
Process:
● Support from above
● Managed rollout speed
● Solving 2025’s problems, not 2022’s
● “Steel-thread” style focus on specific
use-cases
● Willingness to commit to bring work
in-house and start fresh where it
made sense
Technology:
● Declarative schemas for messages
and dataflows
● Schema registry as code rather than
a SPOF
● Managed Pulsar allows us to learn
with less pain
● Isolating user code from consumer
code improves reliability
05 What’s Next?
Near Term:
● Manage internal adoption
● Scale to meet annual shopping
holidays’ needs
● Start work on a “publish gateway” for
connection pooling, circuit breaking,
etc.
Long Term:
● Online schema changes
● Key-local state
● Complex workflow support
● Make our work available to the
community
klaviyo.com/careers
zac@klaviyo.com

Weitere ähnliche Inhalte

Was ist angesagt?

The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022
Kai Wähner
 
Data power Performance Tuning
Data power Performance TuningData power Performance Tuning
Data power Performance Tuning
KINGSHUK MAJUMDER
 

Was ist angesagt? (20)

Camunda BPM 7.2: Performance and Scalability (English)
Camunda BPM 7.2: Performance and Scalability (English)Camunda BPM 7.2: Performance and Scalability (English)
Camunda BPM 7.2: Performance and Scalability (English)
 
The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022
 
YugaByte DB Internals - Storage Engine and Transactions
YugaByte DB Internals - Storage Engine and Transactions YugaByte DB Internals - Storage Engine and Transactions
YugaByte DB Internals - Storage Engine and Transactions
 
Integrating Apache Kafka and Elastic Using the Connect Framework
Integrating Apache Kafka and Elastic Using the Connect FrameworkIntegrating Apache Kafka and Elastic Using the Connect Framework
Integrating Apache Kafka and Elastic Using the Connect Framework
 
Data power Performance Tuning
Data power Performance TuningData power Performance Tuning
Data power Performance Tuning
 
Developing real-time data pipelines with Spring and Kafka
Developing real-time data pipelines with Spring and KafkaDeveloping real-time data pipelines with Spring and Kafka
Developing real-time data pipelines with Spring and Kafka
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafka
 
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache KafkaReal-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka Streams
 
Putting Kafka Into Overdrive
Putting Kafka Into OverdrivePutting Kafka Into Overdrive
Putting Kafka Into Overdrive
 
An Introduction to Confluent Cloud: Apache Kafka as a Service
An Introduction to Confluent Cloud: Apache Kafka as a ServiceAn Introduction to Confluent Cloud: Apache Kafka as a Service
An Introduction to Confluent Cloud: Apache Kafka as a Service
 
Envoy and Kafka
Envoy and KafkaEnvoy and Kafka
Envoy and Kafka
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream Processing
 
Transforming Financial Services with Event Streaming Data
Transforming Financial Services with Event Streaming DataTransforming Financial Services with Event Streaming Data
Transforming Financial Services with Event Streaming Data
 
An Introduction to MongoDB Ops Manager
An Introduction to MongoDB Ops ManagerAn Introduction to MongoDB Ops Manager
An Introduction to MongoDB Ops Manager
 
MongoDB
MongoDBMongoDB
MongoDB
 
Financial Event Sourcing at Enterprise Scale
Financial Event Sourcing at Enterprise ScaleFinancial Event Sourcing at Enterprise Scale
Financial Event Sourcing at Enterprise Scale
 
Apache Kafka in Financial Services - Use Cases and Architectures
Apache Kafka in Financial Services - Use Cases and ArchitecturesApache Kafka in Financial Services - Use Cases and Architectures
Apache Kafka in Financial Services - Use Cases and Architectures
 
[135] 오픈소스 데이터베이스, 은행 서비스에 첫발을 내밀다.
[135] 오픈소스 데이터베이스, 은행 서비스에 첫발을 내밀다.[135] 오픈소스 데이터베이스, 은행 서비스에 첫발을 내밀다.
[135] 오픈소스 데이터베이스, 은행 서비스에 첫발을 내밀다.
 
Tomcat 마이그레이션 도전하기 (Jins Choi)
Tomcat 마이그레이션 도전하기 (Jins Choi)Tomcat 마이그레이션 도전하기 (Jins Choi)
Tomcat 마이그레이션 도전하기 (Jins Choi)
 

Ähnlich wie Building an Asynchronous Application Framework with Python and Pulsar - Pulsar Summit SF 2022

Primatics Financial - Parallel, High Throughput Risk Calculations On The Cloud
Primatics Financial - Parallel, High Throughput Risk Calculations On The CloudPrimatics Financial - Parallel, High Throughput Risk Calculations On The Cloud
Primatics Financial - Parallel, High Throughput Risk Calculations On The Cloud
Amnon Raviv
 

Ähnlich wie Building an Asynchronous Application Framework with Python and Pulsar - Pulsar Summit SF 2022 (20)

Architecting Microservices in .Net
Architecting Microservices in .NetArchitecting Microservices in .Net
Architecting Microservices in .Net
 
The Future of Service Mesh
The Future of Service MeshThe Future of Service Mesh
The Future of Service Mesh
 
Contino Webinar - Migrating your Trading Workloads to the Cloud
Contino Webinar -  Migrating your Trading Workloads to the CloudContino Webinar -  Migrating your Trading Workloads to the Cloud
Contino Webinar - Migrating your Trading Workloads to the Cloud
 
CNCF On-Demand Webinar_ LitmusChaos Project Updates.pdf
CNCF On-Demand Webinar_ LitmusChaos Project Updates.pdfCNCF On-Demand Webinar_ LitmusChaos Project Updates.pdf
CNCF On-Demand Webinar_ LitmusChaos Project Updates.pdf
 
Cloud monitoring overview
Cloud monitoring overviewCloud monitoring overview
Cloud monitoring overview
 
Mohideen Khader-122316
Mohideen Khader-122316Mohideen Khader-122316
Mohideen Khader-122316
 
External should that be a microservice
External should that be a microserviceExternal should that be a microservice
External should that be a microservice
 
7. introduction
7. introduction7. introduction
7. introduction
 
Damodar_TIBCO
Damodar_TIBCODamodar_TIBCO
Damodar_TIBCO
 
Cloud world forum talk 062515
Cloud world forum talk 062515Cloud world forum talk 062515
Cloud world forum talk 062515
 
Applying Code Customizations to Magento 2
Applying Code Customizations to Magento 2 Applying Code Customizations to Magento 2
Applying Code Customizations to Magento 2
 
Confluent Partner Tech Talk with SVA
Confluent Partner Tech Talk with SVAConfluent Partner Tech Talk with SVA
Confluent Partner Tech Talk with SVA
 
Cloud 12 08 V2
Cloud 12 08 V2Cloud 12 08 V2
Cloud 12 08 V2
 
Primatics Financial - Parallel, High Throughput Risk Calculations On The Cloud
Primatics Financial - Parallel, High Throughput Risk Calculations On The CloudPrimatics Financial - Parallel, High Throughput Risk Calculations On The Cloud
Primatics Financial - Parallel, High Throughput Risk Calculations On The Cloud
 
End User Computing at CloudHesive.pptx
End User Computing at CloudHesive.pptxEnd User Computing at CloudHesive.pptx
End User Computing at CloudHesive.pptx
 
Container Technologies and Transformational value
Container Technologies and Transformational valueContainer Technologies and Transformational value
Container Technologies and Transformational value
 
Microservice Builder: A Microservice DevOps Pipeline for Rapid Delivery and P...
Microservice Builder: A Microservice DevOps Pipeline for Rapid Delivery and P...Microservice Builder: A Microservice DevOps Pipeline for Rapid Delivery and P...
Microservice Builder: A Microservice DevOps Pipeline for Rapid Delivery and P...
 
Cloud Computing Certification
Cloud Computing CertificationCloud Computing Certification
Cloud Computing Certification
 
5 Years Of Building SaaS On AWS
5 Years Of Building SaaS On AWS5 Years Of Building SaaS On AWS
5 Years Of Building SaaS On AWS
 
Cloud_controllers_public_webinar_aug31_v1.pptx
Cloud_controllers_public_webinar_aug31_v1.pptxCloud_controllers_public_webinar_aug31_v1.pptx
Cloud_controllers_public_webinar_aug31_v1.pptx
 

Mehr von StreamNative

Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
StreamNative
 

Mehr von StreamNative (20)

Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
 
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
 
Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...
 
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
 
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
 
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
 
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
 
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
 
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
 
Understanding Broker Load Balancing - Pulsar Summit SF 2022
Understanding Broker Load Balancing - Pulsar Summit SF 2022Understanding Broker Load Balancing - Pulsar Summit SF 2022
Understanding Broker Load Balancing - Pulsar Summit SF 2022
 
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
 
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
 
Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022
 
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
 
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
 
Welcome and Opening Remarks - Pulsar Summit SF 2022
Welcome and Opening Remarks - Pulsar Summit SF 2022Welcome and Opening Remarks - Pulsar Summit SF 2022
Welcome and Opening Remarks - Pulsar Summit SF 2022
 
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
 
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
 
Improvements Made in KoP 2.9.0 - Pulsar Summit Asia 2021
Improvements Made in KoP 2.9.0  - Pulsar Summit Asia 2021Improvements Made in KoP 2.9.0  - Pulsar Summit Asia 2021
Improvements Made in KoP 2.9.0 - Pulsar Summit Asia 2021
 
Pulsar in the Lakehouse: Overview of Apache Pulsar and Delta Lake Connector -...
Pulsar in the Lakehouse: Overview of Apache Pulsar and Delta Lake Connector -...Pulsar in the Lakehouse: Overview of Apache Pulsar and Delta Lake Connector -...
Pulsar in the Lakehouse: Overview of Apache Pulsar and Delta Lake Connector -...
 

Kürzlich hochgeladen

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Kürzlich hochgeladen (20)

Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 

Building an Asynchronous Application Framework with Python and Pulsar - Pulsar Summit SF 2022

  • 1. Building an Asynchronous Application Framework with Python and Pulsar FEBRUARY 9 2022 Pulsar Summit Zac Bentley Lead Site Reliability Engineer Boston, MA
  • 2. 2 2022 © Klaviyo Confidential The Problem What We Built Challenges What Worked Well What’s Next? 01 02 03 04 05
  • 3. 3 2022 © Klaviyo Confidential Segmentation Reviews Retail POS Social Surveys Referrals Logistics Shipping Customer service Loyalty On site personalization Forms Ecommerce Order confirmation SMS Email
  • 5. 5 2022 © Klaviyo Confidential Problems Reliability Scalability Ownership/Process Architectural
  • 6. 6 2022 © Klaviyo Confidential Problems Reliability RabbitMQ has reliability issues when pushed too hard. “Backpressure will find you” Deep queues behave poorly. Lots of outages and firefighting. Scalability Ownership/Process Architectural
  • 7. 7 2022 © Klaviyo Confidential Problems Reliability Scalability Scaling RabbitMQ is intrusive: application code has to be aware of topology changes at every level. Geometry changes are painful. Scale-out doesn’t bring reliability/redundancy benefits. Ownership/Process Architectural
  • 8. 8 2022 © Klaviyo Confidential Problems Reliability Scalability Ownership/Process Individual team ownership is expensive in: - Roadmap time. - Hiring/onboarding capacity. - Coordination. Per-team ownership creates redundant expertise. Architectural
  • 9. 9 2022 © Klaviyo Confidential Problems Reliability Scalability Ownership/Process Architectural Celery is pretty hostile to SOA. Ordered consuming: not possible. Processing more >1 message at a time: not possible. Pub/sub: difficult. Replay/introspection: not possible.
  • 10. Existing API: Producers from app.tasks import mytask # Synchronous call: mytask("arg1", "arg2", kwarg1=SomeObject()) # Asynchronous call: mytask.apply_async(args=("arg1", "arg2"), kwargs={"kwarg1": SomeObject()}) @celery.task(acks_late=True) def mytask(arg1, arg2, kwarg1=None): ... @celery.task(acks_late=True) def mytask2(*args, **kwargs): ... Existing API: Consumer Workload Declaration
  • 11. 11 2022 © Klaviyo Confidential Problems Reliability Scalability Ownership/Process Architectural
  • 12. 02 What We Built 1. Platform Services: a team 2. Pulsar: a broker deployment 3. StreamNative: a support relationship 4. Chariot: an asynchronous application framework
  • 13. ORM for Pulsar Interactions for tenant in Tenant.search(name="standalone"): if tenant.allowed_clusters == ["standalone"]: ns = Namespace( tenant=tenant, name="mynamespace", acknowledged_quota=AcknowledgedMessageQuota(age=timedelta(minutes=10)), ) ns.create() topic = Topic( namespace=ns, name="mytopic" ) topic.create() subscription = Subscription( topic=topic, name="mysubscription", type=SubscriptionType.KeyShared, ) subscription.create() assert Subscription.get(name="mysubscription") == subscription consumer = subscription.consumer(name="myconsumer").connect() while True: message = consumer.receive() consumer.acknowledge(message)
  • 14. Declarative API for Schema Management & Migrations from klaviyo_schema.registry.teamname.data.payload_pb2 import PayloadProto my_topic = ChariotTopic( name="demo", durability=Durability.DURABILITY_REDUNDANT, max_message_size="1kb", max_producers=100, max_consumers=10, publish_rate_limits=( RateLimit( messages=1000, period="1m", actions=[RateLimitAction.RATE_LIMIT_ACTION_BLOCK], ), ), thresholds=( Threshold( kind=ThresholdKind.THRESHOLD_KIND_UNACKNOWLEDGED, size="200mb", actions=[ThresholdAction.THRESHOLD_FAIL_PUBLISH], ), ), consumer_groups=(ConsumerGroup(name="demo-consumer-group", type=SubscriptionType.KeyShared),), payload=RegisteredPayloadFromClass(payload_class=PayloadProto), )
  • 15. Existing API: Producers from app.tasks import mytask # Synchronous call: mytask("arg1", "arg2", kwarg1=SomeObject()) # Asynchronous call: mytask.apply_async(args=("arg1", "arg2"), kwargs={"kwarg1": SomeObject()}) @celery.task(acks_late=True) def mytask(arg1, arg2, kwarg1=None): ... @celery.task(acks_late=True) def mytask2(*args, **kwargs): ... Existing API: Consumer Workload Declaration
  • 16. New API: Producers class DemoExecutor(AsynchronousExecutor): @lifecycle_method(timeout=timedelta(seconds=10)) async def on_executor_shutdown_requested(self): ... @lifecycle_method(timeout=timedelta(seconds=10)) async def on_executor_shutdown(self): ... @lifecycle_method(timeout=timedelta(seconds=10)) async def on_executor_startup(self): ... @lifecycle_method(timeout=timedelta(seconds=10)) async def on_message_batch(self, messages: Sequence[PayloadProto]): for idx, msg in enumerate(messages): if idx % 2 == 0: await self.chariot_ack(msg) else: await self.chariot_reject(msg) from klaviyo_schema.registry.teamname.data.payload_pb2 import PayloadProto from klaviyo_schema.registry.teamname.topics import my_topic await my_topic.send(PayloadProto(...)) New API: Consumer Workload Declaration
  • 17. Back-Of-Queue Retries class DemoExecutor(AsynchronousExecutor): @lifecycle_method(timeout=timedelta(seconds=10)) @requeue_retry( batch_predicate=retry_on_exception_type( RetryException, retry_log_level=logging.INFO ), message_predicate=retry_until_approximate_attempt_count(10), delay=wait_exponential(max=timedelta(seconds=5)) + timedelta(seconds=1), ) async def on_message_batch(self, messages: Sequence[PayloadProto]): raise RetryException("Expected retry") ~> chariot worker start --topic demo --consumer-group democg --parallel 10 --start-executors-lazily --executor-class app.executors.demo:DemoExecutor --message-batch-assignment-behavior AnyKeyToAnyExecutor ~> chariot worker start --topic demo --consumer-group democg --parallel 10 --start-executors-lazily --executor-class app.executors.demo:DemoExecutor --message-batch-assignment-behavior NoOverlapBestEffortKeyExecutorAffinity --message-batch-flush-after-items 1000 --message-batch-flush-after-time 10sec Custom Batching and “Steering” for Parallel Execution without Reordering
  • 18. 18 2022 © Klaviyo Confidential Problems Solutions Reliability Scalability Ownership/Process Architectural
  • 19. 19 2022 © Klaviyo Confidential Solutions Reliability To become a user is to express the enforced maximum workload you’ll run. Pulsar’s redundancy helps weather outages. Deep backlogs are usable because reads aren’t always writes. Scalability Ownership/Process Architectural
  • 20. 20 2022 © Klaviyo Confidential Solutions Reliability Scalability The “CEO” (Central Expert Owner) can scale out pulsar to respond to demand. Teams express scalability need in the form of elevated rate limits or partition counts. Consultation with the community and StreamNative is invaluable. Ownership/Process Architectural
  • 21. 21 2022 © Klaviyo Confidential Solutions Reliability Scalability Ownership/Process Teams own producers/consumers. Teams submit their contracts, in the form of schema PRs, to the broker owners. Schema changes and backwards compatibility aren’t simple but they are now predictable. Architectural
  • 22. 22 2022 © Klaviyo Confidential Solutions Reliability Scalability Ownership/Process Architectural Many new patterns are now on the table: - Pub-sub - Ordered consume - Batched consumption + out-of-order acks - Deduplication/debouncing Reading topics at rest improves visibility. Async interaction with the same stream from multiple codebases is now possible.
  • 23. 03 Challenges ● Distribution as a library/framework rather than an application ● Python/C++ Pulsar client maturity ● Combining advanced broker features surfaced bugs ● Forking consumer daemons + threaded clients + async/await style is a costly combination ● Expectation management ● The “gap ledger” ● Management API quality
  • 24. 04 What Worked Well Process: ● Support from above ● Managed rollout speed ● Solving 2025’s problems, not 2022’s ● “Steel-thread” style focus on specific use-cases ● Willingness to commit to bring work in-house and start fresh where it made sense Technology: ● Declarative schemas for messages and dataflows ● Schema registry as code rather than a SPOF ● Managed Pulsar allows us to learn with less pain ● Isolating user code from consumer code improves reliability
  • 25. 05 What’s Next? Near Term: ● Manage internal adoption ● Scale to meet annual shopping holidays’ needs ● Start work on a “publish gateway” for connection pooling, circuit breaking, etc. Long Term: ● Online schema changes ● Key-local state ● Complex workflow support ● Make our work available to the community