In this talk, we will go over the configuration knobs and virtual isolation features that let multiple clients run in their assigned resource space without interfering with one another.
At Nutanix, we run a single physical Pulsar cluster shared by multiple use cases and applications. The biggest challenge in such a setup is an unintended denial of service for all clients just because one client breached its quota and exceeded load expectations.
In traditional distributed apps, one would avoid this by running multiple physical clusters, which is easier on day 1, but the operational complexity on day 2 or 3 can very quickly overwhelm a small team. On top of its multi-tenant architecture, Pulsar provides many features to restrict usage by different clients, ranging from simply configured quotas, TTLs, and retention to more advanced features such as namespace isolation, failure domains, anti-affinity for namespaces, and bookie groups with affinity between them.
In this talk, we will first present the available options, followed by our experience with them. As a bonus, we will add the story of how we migrated our whole cluster and what we learned along the way.
2. Who are We?
● Leading Cloud Data Platform team
● Loves distributed systems, open source
● Data geek (data stores, streaming, analytics, etc.)
● Pulsar & MySQL contributor
● Developer by passion
● Loves distributed systems, streaming platforms
● Pulsar enthusiast
Shivji Jha, SME for Pulsar, Nutanix
Sourabh Agrawal, Pulsar Ninja, Nutanix
3. Catalogue
● Quick Intro: Apache Pulsar
● Pulsar Isolation
○ What
○ Why
○ How
● Isolating by clusters
● Isolating brokers
● Isolating bookies
● Demo
4. About Pulsar
Apache Pulsar is a cloud-native, distributed, open-source pub-sub messaging and streaming platform. Originally developed at Yahoo, it was open-sourced in 2016 and later contributed to the Apache Software Foundation.
It provides:
● Stateless brokers
● Horizontal scalability
● Isolated reads and writes
● Multi-tenancy
● Geo-replication
● Active community support
6. What is Pulsar Isolation?
● Scale the storage and serving layers independently.
● Separate resources (brokers & bookies) for specific use cases.
● Scale resources up/down as needed for specific namespaces while keeping the rest of the cluster untouched.
● Localized resource allocation.
7. Why Pulsar Isolation?
● Avoid sharing resources across teams
○ Prevent unexpected consequences.
● Dedicated resources for a namespace.
● Allocate localized resources in a multi-region setup.
● Scale nodes up/down according to the load on a namespace.
● More secure (permissions managed at the namespace level)
● Easy setup
8. Isolation Categories
● Hard isolation
○ Different physical Pulsar clusters (or nodes) for your isolation units
■ Failure domains
■ Anti-affinity groups
■ Broker isolation using a namespace isolation policy
■ Bookie isolation using affinity groups
● Soft isolation
○ Same physical cluster, but restrictions on tenants / namespaces by configuration
■ Disk quotas
■ Throttling
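The soft-isolation knobs above map to per-namespace policies set with pulsar-admin. The following is a minimal sketch, assuming a hypothetical namespace my-tenant/my-namespace and limits picked purely for illustration. By default the script only prints the commands; set PULSAR_ADMIN=./pulsar-admin to run them against a real cluster.

```shell
# Dry run by default: print each command instead of executing it.
PULSAR_ADMIN="${PULSAR_ADMIN:-echo pulsar-admin}"

apply_soft_limits() {
  ns="$1"   # tenant/namespace (the value passed below is hypothetical)
  # Disk quota: hold producers once the backlog limit (10G here) is reached.
  $PULSAR_ADMIN namespaces set-backlog-quota "$ns" \
    --limit 10G --policy producer_request_hold
  # Throttle publishing: 1000 msg/s and 1 MiB/s (illustrative values).
  $PULSAR_ADMIN namespaces set-publish-rate "$ns" \
    --msg-publish-rate 1000 --byte-publish-rate 1048576
  # Throttle dispatch to consumers: 1000 msg and 1 MiB per 1-second period.
  $PULSAR_ADMIN namespaces set-dispatch-rate "$ns" \
    --msg-dispatch-rate 1000 --byte-dispatch-rate 1048576 \
    --dispatch-rate-period 1
}

apply_soft_limits my-tenant/my-namespace
```

The other backlog policies (producer_exception, consumer_backlog_eviction) trade producer availability against data retention differently, so the right choice depends on the use case.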
23. Bookie Affinity Groups
● Isolate storage at the namespace level
● Leverage data durability configs in isolated bookies
● Leverage ZkIsolatedBookieEnsemblePlacementPolicy
● Scale bookies for a namespace
https://streamnative.io/uploads/images/blogs/isolation-3.png
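The data durability configs mentioned above are the per-namespace BookKeeper persistence settings: ensemble size, write quorum, and ack quorum. A sketch, assuming a hypothetical namespace and a 3/3/2 layout chosen only for illustration; the command is printed rather than executed unless PULSAR_ADMIN points at a real pulsar-admin binary.

```shell
# Dry run by default: print the command instead of executing it.
PULSAR_ADMIN="${PULSAR_ADMIN:-echo pulsar-admin}"

set_durability() {
  ns="$1"; ensemble="$2"; write_quorum="$3"; ack_quorum="$4"
  # BookKeeper requires ensemble >= write quorum >= ack quorum.
  if [ "$ensemble" -lt "$write_quorum" ] || [ "$write_quorum" -lt "$ack_quorum" ]; then
    echo "invalid: need ensemble >= write quorum >= ack quorum" >&2
    return 1
  fi
  $PULSAR_ADMIN namespaces set-persistence "$ns" \
    --bookkeeper-ensemble "$ensemble" \
    --bookkeeper-write-quorum "$write_quorum" \
    --bookkeeper-ack-quorum "$ack_quorum" \
    --ml-mark-delete-max-rate 0
}

# Hypothetical namespace; 3 bookies per ledger, 3 copies, 2 acks to confirm.
set_durability my-tenant/my-namespace 3 3 2
```

With bookie isolation in place, the namespace's bookie group has to contain at least as many bookies as the ensemble size for ledgers to be placed entirely inside that group.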
24. Bookie Isolation using Affinity Group
Setup Bookie Rack
./pulsar-admin bookies set-bookie-rack --bookie 127.0.0.1:3181 --hostname 127.0.0.1:3181 --group group1 --rack rack1
Bookie Affinity Group
Configure Bookie Group
./pulsar-admin namespaces set-bookie-affinity-group public/default --primary-group group1
Configure Bookie Group with secondary
./pulsar-admin namespaces set-bookie-affinity-group <tenant/namespace> --primary-group <primary-group-name> --secondary-group <secondary-group-name>
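After applying the rack and affinity-group settings, it can help to read them back. A small sketch using the same bookie and namespace as the slide; it prints the commands by default and only talks to a cluster when PULSAR_ADMIN is set to a real pulsar-admin binary.

```shell
# Dry run by default: print each command instead of executing it.
PULSAR_ADMIN="${PULSAR_ADMIN:-echo pulsar-admin}"

# List the rack and group assigned to each bookie.
$PULSAR_ADMIN bookies racks-placement
# Show the primary/secondary bookie groups bound to the namespace.
$PULSAR_ADMIN namespaces get-bookie-affinity-group public/default
```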
25. Scaling Up/Down
Broker
● When scaling up brokers, update the isolation groups to add the new brokers as primary or secondary brokers for the required namespaces.
● When scaling down, make sure the existing isolation groups still have enough brokers.
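Adding a new broker to an isolation group amounts to re-applying the namespace isolation policy with an updated broker list. A minimal sketch, assuming hypothetical cluster, policy, namespace, and broker names (none of them come from the talk) and illustrative failover parameters; the command is only printed unless PULSAR_ADMIN points at a real pulsar-admin binary.

```shell
# Dry run by default: print the command instead of executing it.
PULSAR_ADMIN="${PULSAR_ADMIN:-echo pulsar-admin}"

# Re-apply an isolation policy with an updated broker list.
# --primary/--secondary take comma-separated broker-name regexes.
set_broker_isolation() {
  cluster="$1"; policy="$2"; ns="$3"; primary="$4"; secondary="$5"
  $PULSAR_ADMIN ns-isolation-policy set \
    --auto-failover-policy-type min_available \
    --auto-failover-policy-params min_limit=1,usage_threshold=80 \
    --namespaces "$ns" \
    --primary "$primary" \
    --secondary "$secondary" \
    "$cluster" "$policy"
}

# Hypothetical scale-up: broker-3 was just added to the primary set.
set_broker_isolation my-cluster team-a-policy my-tenant/my-namespace \
  'broker-1.*,broker-2.*,broker-3.*' 'broker-9.*'
```

Before scaling down, the same command can be re-run with the departing broker removed from the regex list, after checking that the remaining primary brokers still satisfy min_limit.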
Bookie
● When scaling up bookies, update the bookie affinity group to add the new bookies to the primary or secondary group for the required namespaces.
● When scaling down, make sure the existing affinity groups still have enough bookies.
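The two bookie rules can be sketched as follows. Scale-up registers the new bookie under an existing group; the scale-down check assumes that "enough bookies" means at least the namespace's ensemble size, which is our reading rather than a rule stated in the slides. The new bookie address, the rack name, and the helper can_scale_down are hypothetical.

```shell
# Dry run by default: print the command instead of executing it.
PULSAR_ADMIN="${PULSAR_ADMIN:-echo pulsar-admin}"

# Scale up: put a newly added bookie (hypothetical address) into group1.
$PULSAR_ADMIN bookies set-bookie-rack --bookie 127.0.0.2:3181 \
  --hostname 127.0.0.2:3181 --group group1 --rack rack2

# Scale down: succeeds only if the group keeps at least `ensemble` bookies
# after `remove` of its `size` bookies are taken away.
can_scale_down() {
  size="$1"; remove="$2"; ensemble="$3"
  [ $((size - remove)) -ge "$ensemble" ]
}

if can_scale_down 4 1 3; then
  echo "safe to remove 1 bookie from a group of 4 (ensemble size 3)"
else
  echo "not safe: group would drop below the ensemble size"
fi
```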