Indeed is deliberately transforming its monolithic applications into microservices. Moving monoliths from on-premises to a hybrid architecture is a non-trivial endeavor. It is, as we know, a marathon and not a sprint: rather than refactoring all of our applications at once, we progress incrementally toward resilience in the cloud.
By partnering with Confluent, we were able to methodically migrate many of our workloads, both critical and non-critical, primarily using Kafka and a data-domain-driven approach. In this talk, you will learn:
1. How to piece together complex puzzles when you only have bits of information
2. What questions to ask to prioritize feature improvements
3. How to quantify impact
4. How to let your vendor know what is valuable
With over 20 years of experience working with various databases and datastores, I will share real examples of successes, failures, and lessons learned while working with Confluent Cloud, covering:
- Implementation strategies
- Short- and long-term value, both technical and business
- Methodical approaches to forming roadmaps
If you’re in discussions about engineering platforms at your organization, this talk is for you. If you are a data-driven engineering organization with solid leadership behind sound decisions, join us for this talk and let’s have a discussion.
2. Indeed
1. Indeed is the #1 job site in the world: 60 countries, 30 languages, 250M unique monthly visitors¹, 225M resumes, 700M+ total ratings and reviews
2. Headquartered in Austin, Texas. Subsidiary of Recruit Holdings Co., Ltd. 15+ office locations across the globe, including “Remote” offices everywhere
3. A metasearch engine for job listings, including company career web pages and recruiting firms
¹ ComScore, Total Visits, March 2021
5. Data driven, a core value
● Start with “I don’t know”
● Rely on data
● Experiment
● Measure
● Rely on experimental evidence
● Qualitative metrics
● Quantitative metrics
● Make better decisions
“If we can measure it, we can improve it” - Donal McMahon, VP Data Science
6. Platform first
Segmentation: Society, Job Seekers, Employers
A lot of possibilities when the platform comes first!
1. What segment are we focusing on?
2. What products?
3. What priorities?
4. Which features?
5. What resources do we need?
6. How big are the deployments?
7. How should we scale them?
7. Developer first
Self serve → continuous delivery → push on green*
● Self service
○ Infrastructure as a Service
○ Environment as a Service
● Canary deploys
● Automation
● CI/CD
● Coexist with legacy code
● Built-in observability
● Security
Break things faster (but not in prod)
* Push on Green
8. Use cases
We crawl job sites/posts and:
1. Obtain a normalized form
2. Enrich the gathered data
3. Aggregate the jobs
We want data access to be:
1. Simple
2. Fast
3. Comprehensive
4. Relevant
9. Use cases
We have change data capture pipelines that batch and share data between different:
● Domains
● Integrations
● Data stores
● Exchanges
● Warehouses
● Analytics
10. Use cases
We microbatch events in streams:
1. Stateful processing
2. Enrichment
3. Distributed processing
4. Rolling deployments
5. Persistent state stores
6. Faster recovery
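As an illustration of the microbatch pattern with a plain Kafka consumer, here is a minimal sketch; the topic name, group id, batch size, and process() body are all hypothetical:

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class MicrobatchConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "microbatch-demo");     // hypothetical group id
        props.put("enable.auto.commit", "false");     // commit only after a batch is processed
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("job-events")); // hypothetical topic
            List<ConsumerRecord<String, String>> batch = new ArrayList<>();
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                records.forEach(batch::add);
                if (batch.size() >= 1000) {            // batch size is arbitrary
                    process(batch);                    // stateful processing / enrichment
                    consumer.commitSync();             // offsets mark the recovery point
                    batch.clear();
                }
            }
        }
    }

    private static void process(List<ConsumerRecord<String, String>> batch) {
        // enrichment / aggregation happens here
    }
}
```

Committing offsets only after a batch is processed is what makes recovery fast: on restart, the consumer resumes from the last committed batch boundary.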
11. Scale
On average, we process:
● > 25,000 distinct event types
● > 4.5 trillion events per month
● ~ 4.5 PB uncompressed per month
(via Logrepo, Indeed’s log repository)
12. Why Kafka?
An open-source streaming platform for:
● High performance, low latency data pipelines
● Stream analytics
● Business critical applications
● Data integration
13. Why Kafka?
An open-source streaming platform: a distributed, replicated, structured commit log that is append only, durable, and immutable.
Logs are the heart of any data driven system:
● Write ahead log
● Transaction log
● Recovery log
● 2 phase commit, 3 phase commit
● Audit log
Ecosystem: Kafka Streams, Kafka Connect, Replication/MirrorMaker, Stream Governance
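The commit-log framing is not just a metaphor. Because the log is append-only and immutable, consumers can rewind and replay history, which is what makes it usable as a recovery or audit log. A minimal replay sketch, with a hypothetical topic name and group id:

```java
import java.time.Duration;
import java.util.Collection;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ReplayFromBeginning {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "replay-demo");         // hypothetical group id
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(List.of("audit-events"), new ConsumerRebalanceListener() {
            @Override public void onPartitionsRevoked(Collection<TopicPartition> parts) {}
            @Override public void onPartitionsAssigned(Collection<TopicPartition> parts) {
                consumer.seekToBeginning(parts); // the log is immutable: history is still there
            }
        });
        while (true) {
            consumer.poll(Duration.ofSeconds(1)).forEach(r ->
                System.out.printf("offset=%d key=%s value=%s%n", r.offset(), r.key(), r.value()));
        }
    }
}
```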
15. We started with SOA
● Standard deployables, libraries and frameworks
● Separate control and data plane
● Separate deployment groups (for each datacenter, site and environment)
● Instances (daemons) in multiple environments
● Integrations to frontend
Projects → deployment groups → instances → frontend
Result: repeatable maintenance, procedures, deployments… and some monoliths
16. What kind of Monoliths?
1. A single process serving all functionality of a system
2. Modular processes serving particular functions, coupled together into a system
3. Third party systems
Code that changes together stays together. Synchronous calls. Encapsulation.
17. Did we just make modular Monoliths?
Tightly coupled in:
1. Services
2. Implementation
3. Time (temporal coupling)
4. Deployment
5. Domain
Kirsten Westeinde’s talk is very insightful
18. Service mesh
A sidecar that guarantees delivery of requests in the complex topology of services that comprises a modern, cloud native application.
It separates the service’s application logic from its communication logic.
Pattern: Service Mesh, Phil Calçado
19. Some free benefits
● Service discovery - figure out the IP addresses and ports of instances of a service
● Load balancing - when you have multiple instances of a service, distribute requests among them
● Circuit breaking - stop sending requests to an unhealthy instance
● Rate limiting - only allow clients a certain number of requests over a certain period of time
● Authentication - use API keys and allow lists
● Encryption - TLS across the wire
“Service Mesh? It’s a proxy” - Kafka and The Service Mesh - Gwen Shapira
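These behaviors come for free from the mesh sidecar rather than from application code, but as a concept sketch, circuit breaking boils down to a small state machine. The thresholds and class below are illustrative, not any particular mesh’s implementation:

```java
import java.time.Duration;
import java.time.Instant;

// Illustrative circuit breaker: after too many consecutive failures, stop
// calling the unhealthy instance for a cool-down period, then probe again.
public class CircuitBreaker {
    private static final int FAILURE_THRESHOLD = 5;                    // arbitrary
    private static final Duration COOL_DOWN = Duration.ofSeconds(30);  // arbitrary

    private int consecutiveFailures = 0;
    private Instant openedAt = null;

    public synchronized boolean allowRequest() {
        if (openedAt == null) return true;                  // closed: traffic flows
        if (Instant.now().isAfter(openedAt.plus(COOL_DOWN))) {
            openedAt = null;                                // half-open: let a probe through
            consecutiveFailures = 0;
            return true;
        }
        return false;                                       // open: fail fast
    }

    public synchronized void recordSuccess() { consecutiveFailures = 0; }

    public synchronized void recordFailure() {
        if (++consecutiveFailures >= FAILURE_THRESHOLD) {
            openedAt = Instant.now();                       // trip the breaker
        }
    }
}
```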
20. Modern data flows are seams in the mesh
There’s data hidden in that mesh!
● Streaming platform
● Decentralized
● SaaS
● Governance
○ Discovery
○ Lineage
○ Policy
○ Classification
○ Observability
21. Sharing data better
We want systems to be responsive, resilient, elastic and message driven - The Reactive Manifesto
● Events are notifications; messages are data sent to an addressable recipient
● Bounded contexts publish to and consume from data domain event stores, bridging legacy apps and the cloud (ETL to cloud)
● Patterns: event sourcing, CQRS + Kafka Streams, CDC + event streaming, queues
We use Confluent Cloud, a fully managed Kafka platform.
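One way to read the CQRS + event streaming pattern above: the write side publishes domain events to a topic, and each read side projects them into its own view. A minimal read-side projection sketch; the topic, group id, and in-memory view are hypothetical:

```java
import java.time.Duration;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

// Read-side projection: the datastore is just a view of the event log.
public class JobStatusProjection {
    private final Map<String, String> latestStatus = new ConcurrentHashMap<>();

    public void run() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "job-status-view");         // hypothetical group id
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("job-domain-events")); // hypothetical topic
            while (true) {
                consumer.poll(Duration.ofMillis(500)).forEach(event ->
                    // Apply each event to the view; replaying the topic
                    // from offset zero rebuilds the view from scratch.
                    latestStatus.put(event.key(), event.value()));
            }
        }
    }
}
```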
22. Event driven
You said Kafka is a:
- Queue
- Database
Do we still need schemas? Do we still need common structures?
● A monolith is a legitimate solution
● You can have a centralized database
● It’s ok to have many databases
Beware synchronous coupling between the APIs of loosely coupled services interacting with a shared database: our datastores are instead views of events in a durable, immutable event log.
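We still need schemas. With a schema registry, schemas are checked at serialization time rather than discovered at consumption time. A sketch of the producer-side wiring using Confluent’s Avro serializer; the registry URL, topic, and record builder are placeholders:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class SchemaAwareProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        // Confluent's Avro serializer registers/validates schemas on serialize
        props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "https://<SR_ENDPOINT>"); // placeholder

        try (KafkaProducer<String, Object> producer = new KafkaProducer<>(props)) {
            // the value must be an Avro record conforming to the registered schema
            producer.send(new ProducerRecord<>("job-events", "job-123", avroJobEvent()));
        }
    }

    private static Object avroJobEvent() {
        // build an org.apache.avro.generic.GenericRecord here (omitted)
        return null;
    }
}
```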
23. An old reminder about pipes…
Doug McIlroy, the inventor of Unix pipes, states in the Unix philosophy*:
(i) Make each program do one thing well
(ii) Expect the output of every program to become the input to another
* M. D. McIlroy, E. N. Pinson, and B. A. Tague. “Unix Time-Sharing System: Foreword”. The Bell System Technical Journal, Bell Laboratories, 1978, 57 (6, part 2), p. 1902.
Pipe → Filter → Pipe → Filter → Pipe → Filter
24. Smart endpoints, dumb pipes
● Collect events between domains
● Integrate
○ Asynchronous routing
○ Across regions and datacenters
○ Protocols: gRPC, HTTP/2, REST
○ Language agnostic
○ APIs for each service
○ SLO per API
○ Schema adherence
● Business logic or machine learning
Kafka protocol filter for Envoy
25. Bridge to Cloud
Winter 2021 gave us added opportunities for platformization: “reduce dependency on Texas as a single site for serving critical production applications”. Yes, this happened. It was not all fun and games and Kubernetes…
● Measure throughput and latency: kafka-producer-perf-test, kafka-consumer-perf-test
● Specs: cluster type, CKUs, single AZ vs. multi AZ, network type
● Migration strategies: lift, tinker and shift; active/active with consumer offset sync; MirrorMaker*, Confluent Replicator, Cluster Linking
We migrated 116 business critical clusters, in multiple environments across 7 different regions, to Confluent Cloud.
26. Migration Strategies
Local mirroring (on-premise producers → cloud consumers):
1. Produce to both on-premise and cloud
2. Migrate consumers over time to cloud
3. Run Mirror Maker at each local region
Aggregation mirroring (cloud producers → on-premise consumers):
1. Produce to both on-premise and cloud
2. Migrate consumers over time to cloud
3. Run Mirror Maker at the aggregation region
Mirror Maker considerations:
1. Deduplication
2. Aggregation approach
3. Compression
a. Zstd for best compression
b. Lz4 for throughput
4. Partitioning
5. Enrichment/filtering
6. Ingress/egress
Consume remote, produce local (best practice), rather than consume local, produce remote.
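Step 1 of either strategy, producing to both on-premise and cloud, can be as simple as a dual-write wrapper kept around only for the migration window. A sketch, with both bootstrap endpoints supplied by the caller and the compression choice mirroring the notes above:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

// Dual-write during migration: every record goes to the on-premise cluster
// and to Confluent Cloud until all consumers have moved.
public class DualWriteProducer implements AutoCloseable {
    private final KafkaProducer<String, String> onPrem;
    private final KafkaProducer<String, String> cloud;

    public DualWriteProducer(String onPremBootstrap, String cloudBootstrap) {
        this.onPrem = newProducer(onPremBootstrap);
        this.cloud = newProducer(cloudBootstrap);
    }

    public void send(String topic, String key, String value) {
        ProducerRecord<String, String> record = new ProducerRecord<>(topic, key, value);
        onPrem.send(record);
        cloud.send(record);   // consumers migrate at their own pace
    }

    private static KafkaProducer<String, String> newProducer(String bootstrap) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrap);
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        props.put("compression.type", "zstd"); // zstd for best compression (lz4 for throughput)
        return new KafkaProducer<>(props);
    }

    @Override
    public void close() {
        onPrem.close();
        cloud.close();
    }
}
```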
27. Kafka Streams
Example topology: filter & join → flat map (see the sketch below)
1. Topology
2. Stateful processing
a. Aggregation
b. Joins
c. Windowing
d. Transformation
3. Persistent state stores
a. Assignment
b. Standby replicas
c. Membership
4. Performance tuning
a. Consumer sizing
b. RocksDB
c. Session timeout
Don’t do Kafka Streams without:
- A partitioning strategy
- Schema management
- A strategy to manage state
- Considering scalability
- Considering recovery lag
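A minimal sketch of such a topology, combining a filter, a table join, a flat map, and a stateful count backed by a persistent state store; the topic names, application id, and join logic are hypothetical:

```java
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.Produced;

public class JobEventTopology {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "job-event-enricher"); // hypothetical
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();

        // Filter & join: drop empty events, enrich with a table of job metadata
        KTable<String, String> jobMetadata = builder.table("job-metadata"); // hypothetical topic
        KStream<String, String> enriched = builder.<String, String>stream("raw-job-events")
            .filter((jobId, payload) -> payload != null && !payload.isEmpty())
            .join(jobMetadata, (payload, metadata) -> payload + "|" + metadata);

        // Flat map: fan one enriched event out into multiple downstream records
        enriched.flatMapValues(value -> Arrays.asList(value.split("\\|")))
                .to("enriched-job-events");

        // Stateful count backed by a persistent (RocksDB) state store,
        // recoverable from its changelog topic after a restart
        enriched.groupByKey()
                .count(Materialized.as("job-event-counts"))
                .toStream()
                .to("job-event-counts-by-key", Produced.with(Serdes.String(), Serdes.Long()));

        new KafkaStreams(builder.build(), props).start();
    }
}
```

Note that the stream-table join only works if both topics are co-partitioned on the same key, which is why a partitioning strategy belongs on the "don't start without" list above.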
30. Trust and Security
● Network security
○ Private network
● Transport
○ TLS v1.2+
○ SASL_SSL
● Data protection
○ Encryption at rest
● Access
○ API keys for applications
○ SAML/SSO for users
○ OAuth/OIDC
● Threat protection
○ Access control and authorization
○ Audit logs
○ Compliance
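For application access with API keys over SASL_SSL, the client-side wiring is a handful of properties. A sketch; the endpoint, API key, and secret are placeholders:

```java
import java.util.Properties;

public class CloudClientSecurity {
    // Security settings shared by producers and consumers talking to
    // Confluent Cloud; <BROKER>, <API_KEY>, <API_SECRET> are placeholders.
    public static Properties secureProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "<BROKER>:9092");
        props.put("security.protocol", "SASL_SSL"); // TLS in transit + SASL authentication
        props.put("sasl.mechanism", "PLAIN");       // API key/secret presented as PLAIN credentials
        props.put("sasl.jaas.config",
            "org.apache.kafka.common.security.plain.PlainLoginModule required "
                + "username=\"<API_KEY>\" password=\"<API_SECRET>\";");
        props.put("ssl.endpoint.identification.algorithm", "https"); // verify broker hostname
        return props;
    }
}
```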
31. Data as a product
Producers and consumers share data as a product through APIs, event stores, CDC, and streams, with attention to data quality, mining, management, and scale.
What do data scientists do?
- Build predictive models
- Machine learning
- Data clustering
- Trend detection
32. Discovery and Governance
● Cluster Linking: data mesh, data sharing, disaster recovery, geo-replication, edge aggregation
● Schema Registry
● Stream Governance: data catalog, data lineage, data policies, security
33. We are running a marathon
● Official Terraform provider
● Security: OAuth and OIDC
● Kafka Streams enhancements
● Observability improvements
● Professional Services
● Stream Governance
34. Building strategic partnership
● Quarterly business reviews
● Deep dive sessions
● Regular office hours
● Support syncup hours
● Professional Services
● Routine audits
● Forecasting growth
● Technical architecture reviews