We examine real-world architectural patterns that use Apache Pulsar to automate the creation of function and pub/sub flows, improving operational scalability and ease of management. We cover CI/CD automation patterns and show how we leverage streaming data to build a self-service platform that automates the provisioning of new users. We also demonstrate how entire function flows can be created through patterns and configuration alone, enabling non-developers to build them simply by changing configurations. Together, these patterns take the automation of managing Pulsar to a whole new level. CI/CD architectures for on-prem, GCP, and AWS users are all covered.
Pulsar Architectural Patterns for CI/CD Automation and Self-Service
1. Pulsar Architectural Patterns for CI/CD
Every pattern shown here has been developed and implemented with my team at Overstock
Email: dbost@overstock.com
Twitter: DevinBost
LinkedIn: https://www.linkedin.com/in/devinbost/
By Devin Bost, Senior Data Engineer at Overstock
Data-Driven CI/CD Automation for Pulsar Function Flows and Pub/Sub
Includes on-prem, AWS, and GCP architectures
25. You might need to manually satisfy the contract at first, until you can get to where the data originates
26.
27.
28. {
  "type": "function",
  "artifactPathOrUrl": "http://path-to-artifact/example-ignite-function-1.0.1-20200125.003935-3-jar-with-dependencies.jar",
  "tenant": "exampleTenant",
  "namespace": "exampleNamespace",
  "name": "exampleIgniteFunction-backfill",
  "className": "com.yourcompany.pulsar.functions.ExampleIgniteFunction",
  "userConfig": {
    "username": "igniteUser",
    "password": "exampleHashedPass",
    "cache_name": "example-ignite-cache-backfill",
    "hosts_with_ports": "igniteserver1.domain.com:10800,igniteserver2.domain.com:10800,igniteserver3.domain.com:10800,igniteserver4.domain.com:10800"
  },
  "inputs": [
    "persistent://feeds/exampleProject/data-to-dump-into-ignite-backfill"
  ],
  "output": "persistent://exampleTenant/exampleNamespace/data-enriched-from-ignite-backfill",
  "logTopic": "persistent://public/default/function-log-topic-backfill"
}
29. Using the Java Admin API to consume from a Pulsar topic
Pulsar REST Admin API
Consumer/Producer
{
  "type": "function",
  "artifactPathOrUrl": "http://path-to-artifact/example-ignite-function-1.0.1-20200125.003935-3-jar-with-dependencies.jar",
  "tenant": "exampleTenant",
  "namespace": "exampleNamespace",
  "name": "exampleIgniteFunction",
  "className": "com.yourcompany.pulsar.functions.ExampleIgniteFunction",
  "userConfig": {
    "username": "igniteUser",
    "password": "exampleHashedPass",
    "cache_name": "example-ignite-cache",
    "hosts_with_ports": "igniteserver1.domain.com:10800,igniteserver2.domain.com:10800,igniteserver3.domain.com:10800,igniteserver4.domain.com:10800"
  },
  "inputs": [
    "persistent://feeds/exampleProject/data-to-dump-into-ignite"
  ],
  "output": "persistent://exampleTenant/exampleNamespace/data-enriched-from-ignite",
  "logTopic": "persistent://public/default/function-log-topic"
}
Pulsar Brokers (via Java Admin API)
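To make the original approach concrete, here is a minimal Java sketch of deploying the config above through the Pulsar Java Admin API. The service URL and artifact path are placeholders, and in the real flow the values would come from the consumed JSON message rather than being hard-coded.

import java.util.List;
import java.util.Map;
import org.apache.pulsar.client.admin.PulsarAdmin;
import org.apache.pulsar.common.functions.FunctionConfig;

public class FastDeployViaAdminApi {
    public static void main(String[] args) throws Exception {
        // Placeholder broker admin URL; in the real flow this comes from configuration.
        try (PulsarAdmin admin = PulsarAdmin.builder()
                .serviceHttpUrl("http://pulsar-broker.domain.com:8080")
                .build()) {

            // Mirror the fields of the Fast Deploy JSON contract shown above.
            FunctionConfig config = new FunctionConfig();
            config.setTenant("exampleTenant");
            config.setNamespace("exampleNamespace");
            config.setName("exampleIgniteFunction");
            config.setClassName("com.yourcompany.pulsar.functions.ExampleIgniteFunction");
            config.setInputs(List.of("persistent://feeds/exampleProject/data-to-dump-into-ignite"));
            config.setOutput("persistent://exampleTenant/exampleNamespace/data-enriched-from-ignite");
            config.setLogTopic("persistent://public/default/function-log-topic");
            config.setUserConfig(Map.of(
                    "username", "igniteUser",
                    "password", "exampleHashedPass",
                    "cache_name", "example-ignite-cache"));

            // The brokers download the artifact from this URL at deploy time,
            // so they must have network access to it.
            admin.functions().createFunctionWithUrl(config,
                    "http://path-to-artifact/example-ignite-function-1.0.1-jar-with-dependencies.jar");
        }
    }
}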
30. More direct, faster, cleaner, and half the code volume
Pulsar REST Admin API
Consumer/Producer
(Same example Fast Deploy config as on slide 29.)
Pulsar Brokers
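Our production implementation of this direct-REST approach is in Go; for consistency with the other examples here, below is a hedged Java sketch of the same idea. The endpoint and multipart field names follow the Pulsar Functions REST API as I understand it (POST /admin/v3/functions/{tenant}/{namespace}/{name} with "functionConfig" and "url" parts), so verify them against your broker version.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FastDeployViaRest {
    public static void main(String[] args) throws Exception {
        // Assumed endpoint shape; placeholder host.
        String endpoint = "http://pulsar-broker.domain.com:8080/admin/v3/functions/"
                + "exampleTenant/exampleNamespace/exampleIgniteFunction";

        // The config JSON is passed through almost unchanged, which is what let us
        // keep the data contract flexible without adding code.
        String functionConfig = "{\"tenant\":\"exampleTenant\",\"namespace\":\"exampleNamespace\","
                + "\"name\":\"exampleIgniteFunction\","
                + "\"className\":\"com.yourcompany.pulsar.functions.ExampleIgniteFunction\","
                + "\"inputs\":[\"persistent://feeds/exampleProject/data-to-dump-into-ignite\"],"
                + "\"output\":\"persistent://exampleTenant/exampleNamespace/data-enriched-from-ignite\"}";
        String artifactUrl = "http://path-to-artifact/example-ignite-function-1.0.1-jar-with-dependencies.jar";

        // Build a multipart/form-data body by hand: one part for the config JSON,
        // one for the artifact URL the brokers will download from.
        String boundary = "fastDeployBoundary";
        String body = "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"functionConfig\"\r\n"
                + "Content-Type: application/json\r\n\r\n" + functionConfig + "\r\n"
                + "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"url\"\r\n\r\n" + artifactUrl + "\r\n"
                + "--" + boundary + "--\r\n";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(endpoint))
                .header("Content-Type", "multipart/form-data; boundary=" + boundary)
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}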
31. Higher-availability option
Consumer/Producer
Consumer/Producer
Consumer/Producer
Pulsar REST Admin API
(Same example Fast Deploy config as on slide 29.)
Pulsar Brokers
via Java Admin API
via Java Admin API
via Java Admin API
32. Fast-deploy
Pulsar REST Admin API
(Same example Fast Deploy config as on slide 29.)
Pulsar Brokers
Or, as a Pulsar function
33.
34. The Router Function
Router’s Function Config specifies a key in the message, such as “environment”, along with a tenant and namespace name.
The router then gets the value of this key in the message and creates a destination topic name from the value.
Creates /ops/deployment-automation/[environment]
36. The Router Function
Router’s Function Config specifies a key in the message, such as “environment”, along with a tenant and namespace name.
The router then gets the value of this key in the message and creates a destination topic name from the value.
From the message below, the router creates:
/ops/deployment-automation/test
and routes the message there
37. The Router Function
Router’s Function Config specifies a key in the message, such as “environment”, along with a tenant and namespace name.
The router then gets the value of this key in the message and creates a destination topic name from the value.
Creates /ops/deployment-automation/[environment]
38. The Router Function
Router’s Function Config specifies a key in the message, such as “environment”, along with a tenant and namespace name.
The router then gets the value of this key in the message and creates a destination topic name from the value.
Creates /ops/deployment-automation/[generator-type]
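The deck doesn't show the router's source, but here is a minimal Java sketch of what such a Pulsar Function could look like. The userConfig key names (routingKey, destinationTenant, destinationNamespace) are invented for illustration.

import org.apache.pulsar.client.api.Schema;
import org.apache.pulsar.functions.api.Context;
import org.apache.pulsar.functions.api.Function;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;

public class RouterFunction implements Function<String, Void> {
    @Override
    public Void process(String input, Context context) throws Exception {
        // Key to match on (e.g. "environment"), plus the destination tenant and
        // namespace, all supplied through the function's userConfig at deploy time.
        String key = (String) context.getUserConfigValue("routingKey").orElse("environment");
        String tenant = (String) context.getUserConfigValue("destinationTenant").orElse("ops");
        String namespace = (String) context.getUserConfigValue("destinationNamespace")
                .orElse("deployment-automation");

        // Look up the configured key in the incoming JSON message.
        JsonObject message = JsonParser.parseString(input).getAsJsonObject();
        String value = message.get(key).getAsString(); // e.g. "test"

        // Route to e.g. persistent://ops/deployment-automation/test; Pulsar creates
        // the topic under the given tenant and namespace if it doesn't exist.
        String destination = String.format("persistent://%s/%s/%s", tenant, namespace, value);
        context.newOutputMessage(destination, Schema.STRING).value(input).sendAsync();
        return null;
    }
}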
39. {
  "environment": "test",
  "configs": [{
    "type": "function",
    "artifactPathOrUrl": "http://repo-name/project-name/example-ignite-function-1.0.1-3-jar-with-dependencies.jar",
    "tenant": "exampleTenant",
    "namespace": "exampleNamespace",
    "name": "exampleIgniteFunction",
    "className": "com.yourcompany.pulsar.functions.ExampleIgniteFunction",
    "inputs": [
      "persistent://exampleTenant/exampleNamespace/data-to-dump-into-ignite"
    ],
    "output": "persistent://exampleTenant/exampleNamespace/data-enriched-from-ignite",
    "logTopic": "persistent://exampleTenant/exampleNamespace/data-enriched-from-ignite-log"
  },
  {
    "type": "function",
    "artifactPathOrUrl": "http://repo-name/project-name/example-filter-function-1.0.0-7-jar-with-dependencies.jar",
    "tenant": "exampleTenant",
    "namespace": "exampleNamespace",
    "name": "exampleFilterFunction",
    "className": "com.yourcompany.pulsar.functions.ExampleFilterFunction",
    "inputs": [
      "persistent://feeds/exampleProject/raw-data"
    ],
    "output": "persistent://exampleTenant/exampleNamespace/data-to-dump-into-ignite",
    "logTopic": "persistent://exampleTenant/exampleNamespace/data-to-dump-into-ignite-log"
  }
  ]
}
40. Synchronous Artifact Download/Upload (1) (2)
Push for real-time updates
Pull to get all data
UI Tool
Server-Sent Events (SSEs)
Artifact URL + identifying metadata from build tool
Keep track of configs here
Note: In this option, you must use the UI to merge the artifact with the configs.
Ensure brokers can download from where you store the artifact!
41. Server-Sent Events (SSEs)
UI Tool
Synchronous Artifact Download/Upload (1) (2)
Query to get all places where the artifact has been used. Enrich the JSON with this data.
Update configs to use new artifact
(1) Update configs in CouchDB by writing as staged
Once staged configs are approved, push into test or prod environments
Synchronously stage changes in DB. (Add to stage set.) (2)
Push for real-time updates
Pull to get all data
Artifact URL + identifying metadata from build tool
Keep track of configs here
Ensure brokers can download from where you store the artifact!
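As a rough illustration of the staging step, here is a hedged Java sketch that writes a config document into CouchDB as staged via CouchDB's HTTP API (PUT /{db}/{docId}). The database name, document ID, and document fields (staged, stageSet) are all hypothetical.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class StageConfigInCouchDb {
    public static void main(String[] args) throws Exception {
        // Hypothetical document: a Fast Deploy config written as "staged" so the
        // UI can review it before it is committed to test or prod.
        String stagedDoc = "{\"type\":\"function\",\"name\":\"exampleIgniteFunction\","
                + "\"staged\":true,\"stageSet\":\"artifact-1.0.1-rollout\"}";

        // CouchDB's HTTP API: PUT /{db}/{docId} creates or updates a document.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://couchdb.domain.com:5984/deploy-configs/exampleIgniteFunction-staged"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(stagedDoc))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body()); // 201 on create
    }
}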
42. Server-Sent Events (SSEs)
UI Tool
Synchronous Artifact Download/Upload (1) (2)
Query to get all places where the artifact has been used. Enrich the JSON with this data.
Update configs to use new artifact
(1) Update configs in CouchDB by writing as staged
Synchronously stage changes in DB. (Add to stage set.) (2)
Push for real-time updates
Pass command
Synchronously execute CouchDB command
Be careful to avoid creating security risks with how you implement this
e.g. "merge-stage-sets", "commit-staged-to-test", "commit-staged-to-prod", "un-stage", "rollback", "get-all-data", etc. (in a JSON object with any additional parameters)
(1) (2) Return result
Artifact URL + identifying metadata from build tool
Keep track of configs here
Ensure brokers can download from where you store the artifact!
43. Build System Storage
Get our artifact URL (and any necessary metadata, if applicable)
WebHook Filter/Transform
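Slide 46 notes that this Filter/Transform step was best done in Scala; for consistency with the other examples here, below is a hedged Java sketch of the same idea as a Pulsar Function that trims a webhook payload down to the artifact URL plus identifying metadata. The payload field names are hypothetical and depend on your build tool's webhook.

import org.apache.pulsar.functions.api.Context;
import org.apache.pulsar.functions.api.Function;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;

public class ArtifactMetadataFilter implements Function<String, String> {
    @Override
    public String process(String webhookPayload, Context context) {
        JsonObject payload = JsonParser.parseString(webhookPayload).getAsJsonObject();

        // Keep only what the deployment flow needs: a download URL plus enough
        // metadata to uniquely identify the artifact. The input field names
        // ("assetUrl", "repositoryName", "version") are hypothetical.
        JsonObject filtered = new JsonObject();
        filtered.addProperty("artifactPathOrUrl", payload.get("assetUrl").getAsString());
        filtered.addProperty("artifactName", payload.get("repositoryName").getAsString());
        filtered.addProperty("version", payload.get("version").getAsString());
        return filtered.toString(); // emitted to the function's output topic
    }
}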
44. Build System Storage
Build/storage data
Get our artifact URL (and any necessary metadata, if applicable)
AWS CodePipeline S3
GitHub Webhook (1)
(2) Passes metadata and reference to S3 artifact
Pulsar Beam or equivalent HTTP Endpoint for Pulsar
Pulsar Brokers
Granting access to download artifacts in S3
Write JSON to Pulsar
45. GitHub Webhook (1)
(2) Passes metadata and reference to S3 artifact
Pulsar Beam or equivalent HTTP Endpoint for Pulsar
Pulsar Brokers
Granting access to download artifacts in S3
Write JSON to Pulsar
GCP Cloud Build
GCP IAM (1)
Build System Storage
Build/storage data
Get our artifact URL (and any necessary metadata, if applicable)
46. Filter/Transform
This was best done in Scala
You could do the download asynchronously at a different point in the flow, but then you will need to ensure it's fully downloaded before pushing the deployment from the UI
Synchronous Artifact Download/Upload (1) (2)
Security checking logic, such as package vulnerability checks
Option 1 - Basic function CI/CD flow
Push for real-time updates
Pull to get all data
Deploy to test / Deploy to prod
fast-deploy-go
Test Pulsar REST Admin API / Prod Pulsar REST Admin API
fast-deploy-go
Router
UI Tool
Server-Sent Events (SSEs)
WebHook
Download artifact to store in CouchDB
Keep track of configs here
47. Deploy to test / Deploy to prod
fast-deploy-go
Test Pulsar REST Admin API / Prod Pulsar REST Admin API
fast-deploy-go
Router
Server-Sent Events (SSEs)
UI Tool
You could do the download asynchronously at a different point in the flow, but then you will need to ensure it's fully downloaded before pushing the deployment from the UI
Synchronous Artifact Download/Upload (1) (2)
Query to get all places where the artifact has been used. Enrich the JSON with this data.
Update configs to use new artifact
(1) Update configs in CouchDB by writing as staged
Once staged configs are approved, push into test or prod environments
Synchronously stage changes in DB. (Add to stage set.) (2)
Push for real-time updates
Pull to get all data
Filter/Transform
This was best done in Scala
WebHook
Download artifact to store in CouchDB
Keep track of configs here
48. Deploy to test / Deploy to prod
fast-deploy-go
Test Pulsar REST Admin API / Prod Pulsar REST Admin API
fast-deploy-go
Router
Server-Sent Events (SSEs)
UI Tool
You could do the download asynchronously at a different point in the flow, but then you will need to ensure it's fully downloaded before pushing the deployment from the UI
Synchronous Artifact Download/Upload (1) (2)
Query to get all places where the artifact has been used. Enrich the JSON with this data.
Update configs to use new artifact
(1) Update configs in CouchDB by writing as staged
Synchronously stage changes in DB. (Add to stage set.) (2)
Push for real-time updates
Pass command
Synchronously execute CouchDB command
Be careful to avoid creating security risks with how you implement this
e.g. "merge-stage-sets", "commit-staged-to-test", "commit-staged-to-prod", "un-stage", "rollback", "get-all-data", etc. (in a JSON object with any additional parameters)
(1) (2) Return result
Filter/Transform
This was best done in Scala
WebHook
Download artifact to store in CouchDB
Keep track of configs here
50. User
Request new topic for SNOW Request feed
Request datasource
Approval Gate
ACL approver DataEng
Saves back to SNOW table (workflow is triggered on write)
Generate function configs
Generate role configs
Generate token configs
Generate source tap function configs
Generate validation tap function configs
Generate passthrough function configs
SNOW = ServiceNow
Fast-Deploy
Report functions deployed for topic
Role Generator
Report roles created for topic
Token Generator
Report tokens created for topic
Flink keyBy request ID, window with 60-second timeout
Save configs of what was created
Add into single JSON array of function configs
Router
SNOW Request
Could be modified to use a custom UI instead
Populates template for configs for request ID
Be sure to pass the request ID with each JSON object to allow all configs to be joined to the user request after deployment!
Note: One request ID represents all configs produced by this template
Router removes the routing envelope since it won't be needed downstream
Note: We created the token generator as a producer/consumer due to the lack of an available API to generate tokens. So we needed to use the Pulsar CLI, which meant that we needed a disk location to save the token.
Check if all required objects were created or if anything is missing. Report any problems to DataEng. Else, notify the user that their topic is ready and provide them with the tokens and connection details.
Notification function that sends Email, UI, and/or Slack notifications.
51. Request new topic for SNOW Request feed
Request datasource
Approval Gate
ACL approver DataEng
Saves back to SNOW table (workflow is triggered on write)
SNOW = ServiceNow
SNOW Request
Could be modified to use a custom UI instead
User
54. Generate function configs
Generate role configs
Generate token configs
Generate source tap function configs
Generate validation tap function configs
Generate passthrough function configs
Add into single JSON array of function configs
Populates template for configs for request ID
Be sure to pass the request ID with each JSON object to allow all configs to be joined to the user request after deployment!
Note: One request ID represents all configs produced by this template
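A minimal sketch of the template-population idea: every config the template produces is stamped with the request ID so the downstream join can tie it back to the originating request. The class, helper, and field values here are all illustrative.

import com.google.gson.JsonArray;
import com.google.gson.JsonObject;

public class ConfigTemplateGenerator {

    // Hypothetical template helper: builds one function config, stamped with the request ID.
    static JsonObject functionConfig(String requestId, String tenant, String namespace,
                                     String topic, String role) {
        JsonObject config = new JsonObject();
        config.addProperty("type", "function");
        config.addProperty("requestId", requestId); // join key used downstream
        config.addProperty("tenant", tenant);
        config.addProperty("namespace", namespace);
        config.addProperty("name", topic + "-" + role); // e.g. requested-topic-passthrough
        return config;
    }

    public static void main(String[] args) {
        String requestId = "req-12345"; // one ID covers every config this template produces
        JsonArray configs = new JsonArray();
        for (String role : new String[] {"passthrough", "source-tap", "validation-tap"}) {
            configs.add(functionConfig(requestId, "exampleTenant", "exampleNamespace",
                    "requested-topic", role));
        }
        System.out.println(configs); // single JSON array of function configs, sent downstream
    }
}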
55. Fast-Deploy
Report functions deployed for topic
Role Generator
Report roles created for topic
Token Generator
Report tokens created for topic
Flink keyBy request ID, window with 60-second timeout
Router
Router removes the routing envelope since it won't be needed downstream
Note: We created the token generator as a producer/consumer due to the lack of an available API to generate tokens. So we needed to use the Pulsar CLI, which meant that we needed a disk location to save the token.
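A hedged Flink sketch of the join step, assuming a simple report-event shape (the real schema isn't shown). The deck specifies a 60-second window keyed by request ID; a processing-time tumbling window stands in for whatever window semantics the real job uses.

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class ProvisioningJoinJob {

    // Hypothetical report event: one per object (function, role, token) created for a request.
    public static class Report {
        public String requestId;
        public String created; // e.g. "function:passthrough"
        public Report() {}
        public Report(String requestId, String created) {
            this.requestId = requestId;
            this.created = created;
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Stand-in for the union of the function/role/token report topics.
        DataStream<Report> reports = env.fromElements(
                new Report("req-12345", "function:passthrough"),
                new Report("req-12345", "role:producer-role"),
                new Report("req-12345", "token:producer-token"));

        reports
                .keyBy(report -> report.requestId)                           // join on the request ID
                .window(TumblingProcessingTimeWindows.of(Time.seconds(60))) // 60-second timeout
                .reduce((a, b) -> new Report(a.requestId, a.created + "," + b.created))
                .print(); // downstream: check completeness, save, and notify

        env.execute("provisioning-join");
    }
}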
56. Save configs of what was created
Check if all required objects were created or if anything is missing. Report any problems to DataEng. Else, notify the user that their topic is ready and provide them with the tokens and connection details.
Notification function that sends Email, UI, and/or Slack notifications.
57. (The complete provisioning flow again, combining the pieces from slides 50-56: the request UI and approval gate, config generation from templates, the router, the Fast-Deploy/role/token generators, the Flink join on request ID, and the save, validation, and notification functions.)
58. Why Streaming and Pulsar – Ammunition for the Business Case: https://www.youtube.com/watch?v=qsz-FruOGoo&feature=youtu.be
Performance Architecture Deep Dive: https://streamnative.io/whitepaper/taking-a-deep-dive-into-apache-pulsar-architecture-for-performance-tuning/
How Pulsar works: https://jack-vanlightly.com/blog/2018/10/2/understanding-how-apache-pulsar-works
2020 Apache Pulsar User Survey: https://streamnative.io/whitepaper/sn-apache-pulsar-user-survey-report-2020/
Basics of Pulsar architecture: https://www.youtube.com/watch?v=vlU9UegYab8&feature=youtu.be
Common Pulsar Architectural Patterns: https://www.youtube.com/watch?v=pmaCG1SHAW8&feature=youtu.be (my most popular video yet!)
You can learn more about Pulsar Beam here: https://kafkaesque.io/introducing-pulsar-beam-http-for-apache-pulsar/
62. Pulsar Architectural Patterns for CI/CD
Every pattern shown here has been developed and implemented with my team at Overstock
Email: dbost@overstock.com
Twitter: DevinBost
LinkedIn: https://www.linkedin.com/in/devinbost/
By Devin Bost, Senior Data Engineer at Overstock
Data-Driven CI/CD Automation for Pulsar Function Flows and Pub/Sub
Includes on-prem, AWS, and GCP architectures
Editor's Notes
Hi, I’m Devin Bost, and I’m going to be talking about Pulsar architectural patterns for Continuous Integration and Continuous Deployment for Pulsar Function flows, as well as for Pub/Sub. I also want to give a shout out to my amazing team at Overstock. They have contributed to many of these patterns.
I’ll start by talking about our journey
Then I’ll jump into the architecture
Of the architecture, we will start with pulsar functions.
And, then we will cover pub/sub.
Clunky and slow but gets the job done.
That breaks down when you have lots of spectators who are wanting functions deployed
So, we tried bash.
But, that was slow.
And, dealing with the output was a mess.
So, we wrapped it in Python.
But, that wasn’t much better, so we went back to the drawing board.
Then, the epiphany came.
The idea is to leverage Pulsar itself along with a data contract. This leads us to data-driven deployment and modular design.
Here is a data contract for our Pulsar deployment application. We call it Fast Deploy. In this contract, we’re telling Fast Deploy exactly what we want to deploy. In this case, this message is passed to Fast Deploy to deploy a function that connects to Apache Ignite for enriching incoming data from a cache. If you’ve ever deployed a Pulsar function, most of these parameters should look familiar to you.
The idea is simple. You have reusable functions, and by leveraging data to drive your deployments, you can easily build things simply by constructing the required data and pushing it through the deployment process.
That enables things like drag-and-drop building of function flows and other kinds of patterns. The idea is that as long as you can produce the data you need for your deployment parameters, you can deploy all you want.
This also enables doing things like creating a backfill path for a flow with a single click.
If you want more information on backfills, I have a link to a previous presentation at the end of this presentation that covers that.
Typically, you want to build flows by looking at the end from the beginning. You want to start from the endpoint and determine what’s required to satisfy your final data contract. Then, you can build incremental layers of automation, each of which satisfies your next data contract.
If you build the other way, from the start to the end, though it might seem more intuitive, you can spend a lot of time trying to get somewhere only to reach a dead end. That wastes time and forces you back to the drawing board to find a clearer path. Also, since you have no deliverable until you reach the end, there's no incremental value you can deliver, and if you go long enough without anything to show, you run the risk of the project getting killed before you actually reach completion.
So, you start from the endpoint.
And before you get too far, you need to look at the data source as well to ensure you don’t drift while you’re building your pipes.
At first, you might need to manually satisfy this contract by producing messages to it directly, but that’s okay. You have an immediate deliverable.
You then build pieces incrementally that get you closer and closer to your originating source of data.
Until you finally get to the source of where your data is originated.
So, at a high level, this is what we're going to cover in this part of the presentation. We build our application or function, filter the metadata on the build artifacts, put it in a local store, push it to our gatekeeping application, and then, when approved, we push it into Pulsar.
The key thing about using data to drive Pulsar deployments is you can easily vary the properties to change the behavior.
As an example of the value of using this approach of building function flows from data, we can easily append “-backfill” to the functions required, such as this one, to create our backfill path. If you want to learn more about backfills, I talk about those in another video, and I put the link to it at the end of this presentation.
Now, in the history of Fast Deploy, we started with the Pulsar Java Admin API. It got the job done, but it was very clunky and hard to maintain.
By moving to Go and hitting the Pulsar REST API directly, we cut our code volume in half, even after adding functionality to translate the config JSON directly into the HTTP payload, which allowed us to provide a more flexible Pulsar data contract without needing to add more code.
Making this highly available is not hard. You just need more containers.
And now that Go functions are a little more production ready, we could even move it to a Pulsar function to eliminate the dependency on Docker if we wanted to.
Upstream from Fast Deploy, the next piece in our pipeline is a router instance. The router is a reusable function that's useful for many different use cases. Basically, it takes incoming messages and, based on a value in each message, routes the message to a corresponding topic. In this case, we use it to route our message to different environments. I'll show you how this works.
As a side note, making this work can involve some networking complexity because you want to make sure that you can deploy to your production environment safely without opening up your entire production network to potential security threats.
Here’s what the router’s Fast Deploy config looks like. (This is the config we pass to Fast Deploy to provision an instance of the router.)
When we provision the router, we give it this userConfig that provides a key. For every message that comes through the router, it looks for a JSON property with the name of this key. In this case, it’s going to look up the environment property. Then, it gets the value of that property from the JSON object and sends the object to that topic.
The router allows us to send the message where it needs to go. Optionally, we can strip the routing envelope from the message before sending it further downstream to the specified recipient.
The router then inspects each message it’s passed, this being one example. In the incoming message, the router looks for the key that we specified in the config, and if it finds that key, it retrieves that value and uses it to route the message.
In Pulsar, if the topic doesn't exist, it creates it with the specified tenant and namespace. So, in this case, it's going to route the object to "ops/deployment-automation/[environment]", where [environment] is substituted with the value from the incoming object.
If we wanted to provision a router that matched on a different key, such as “generator-type,” we would provision it with this config instead.
In this case, our router would send both of these configs as a single JSON array downstream to fast-deploy for the test environment.
These can be part of a staging set.
Now that we understand how fast-deploy and the router work, let’s look at our options for getting data to Fast-Deploy.
The key is that we need some way of keeping track of our fast-deploy configs and providing a gating mechanism for their deployment. This ideally should be done through a UI, though with some creativity, you might be able to find another way to do so if a UI is out of the scope of what’s feasible for your team.
Aside from having a place to store your configs, if you use CouchDB, you can also store your artifact along with the config. This is nice because you get CouchDB's versioning feature, and if you don't have control over the artifact lifecycle where your build tool stores your artifacts, you can put them here to ensure they will always be available when Pulsar brokers need to download the artifact for your functions.
In the case where you have reusable functions and want to update all running instances of that function when you update your code, we can first query to get the list of those configs and then create a staging set. Then, from the UI, we can approve or modify the staging set before deploying it to our target environments.
Another benefit of having the UI is you can control where you want to deploy to. So, we can push the updates to test and then determine if anything broke before pushing them to prod.
Now, if we want to completely decouple our UI from our database, we can do that by putting Pulsar in between them. This can be nice if you want to be able to easily swap out your database technology without breaking existing integrations because you can just stop the Pulsar function, swap out the database, run a batch migration, and then turn the Pulsar functions back on, and nobody will even notice. This decoupling is a best practice in general and should be used whenever possible as long as you can afford the extra latency and dev time.
Now, we can look at ways of getting data into the staging flow that we just covered.
In this case, our build tool is Jenkins. Jenkins is storing the artifacts in Sonatype Nexus. From Nexus, we can leverage a webhook that hits Pulsar Beam, which produces to a Pulsar topic. Since we get a lot of data from the webhook, we need to filter that down to just the metadata that describes our artifact and provides a URL we can use to download it. You want to be sure that you can obtain the metadata required to uniquely identify your function so it won't be confused with others. If you can't get that directly from the build tool, you may need to find another way to get that information into the flow to satisfy this requirement. However, be sure not to hard-code implementation details into your function, since that could make reusability more difficult. If you need to provide something specific to your function, pass it through a function config.
Also, if you need to perform security analysis on your artifacts, there are usually ways to do that in this part of the pipeline.
Now, since many of you are on the cloud, here’s what this would look like in AWS.
Here’s what this part of the pipeline might look like if using a GCP-based build pipeline, which is almost identical to the AWS version.
What about Pub/Sub?
I promised that I’d talk about automation for the pub/sub case, so let’s walk through that architecture as well.
Here’s a high-level diagram, but there’s a lot going on here, so we’re going to walk through it in parts.
So, in this case, we're using ServiceNow to process requests for new topics. However, you could use any application or UI that enables users to request new topics. The key here is that there's a security gating mechanism that allows the request to be approved, ensuring that authorization is required to access new topics.
After the request is approved, it sends a message to Pulsar through Pulsar Beam. Pulsar Beam allows RESTful web requests to produce Pulsar messages, so we’re leveraging that here. There’s a link to the repo for Pulsar Beam at the end of this presentation.
Regardless, if you aren't using ServiceNow, you can use any equivalent application that provides a webhook you can use to send this web request.
If you’re operating on the cloud, you can use a Lambda or Cloud Function to hit Pulsar with the request.
The data contract we send to Pulsar is very simple, but you can easily add more metadata if you want.
The backfill parameter simply specifies whether they want a backfill path created for them; creating a backfill is our default since it's a best practice.
We then pass that message through a set of templates to construct our passthrough functions and taps. A link to my video on passthrough functions and taps is at the end of this presentation.
After the message is passed to Pulsar, we use the information in the message to construct a set of messages from templates, each of which is written to the downstream Pulsar topic. We will route these messages to separate topics for constructing the passthrough functions, tap functions, roles, and tokens.
In each of the messages we send downstream, we need to include a request ID that allows us to uniquely join each message back to the originating request. I’ll show you why this ID is important in the next slide.
As the messages we generated in the previous function come into this router, the router sends them to the appropriate places. We could have done this routing in the previous function, but we get better separation of concerns by using a router here.
After the functions, roles, and tokens are created, on success, the responsible producers write to topics that are ingested by a Flink job. The Flink job joins the messages on the request ID and passes the joined result downstream. The joined message set allows us to validate what was created within the timeout period specified by the Flink job’s window size.
After the Flink job, we save the information about what was created and then check if anything was missed. If something was missed, it gets reported to the team responsible for managing the Pulsar infrastructure so they can investigate.
If the job is successful, the resulting information is reported to the owning team. They can then look up the required information.
In terms of how to handle the token information in a secure manner, there are a number of ways that can be done if you don't have a UI you can post it to securely, so that's out of scope for this presentation.
Here’s the entire flow again. We’ve got the UI for making the request, the approval gate, the function for creating the config messages, the router, the generators, the joiner, the function that saves the progress, the validation function, and the notification function.
Here are some additional resources that you should check out.
My promised video that covers backfills, tap functions, and passthrough functions is this one.
The Pulsar Beam library that I referenced earlier can be accessed here.
I’ll be uploading this PowerPoint to SlideShare immediately after my presentation, and I’ll put a link to it on my LinkedIn profile and in the recording of this presentation.
Any questions?
If you want more information or have additional questions, please reach out to me!