Haggai Philip Zagury gave a presentation about planning data pipelines at scale. He discussed challenges like unpredictable scale, increasing complexity as services grow, and high network traffic from logs. His proposed solutions included using managed cloud services to reduce complexity, optimizing data storage for read/write performance, using multi-tenancy to track costs, and leveraging container orchestration tools like Kubernetes to simplify management of compute resources. He also advocated for measuring performance and costs to continuously improve efficiency.
1. FULLSTACK TECH RADAR DAY
The #2nd_half
Scaling to the #next2
Haggai Philip Zagury, DevOps Group & Tech Lead
2. FULLSTACK TECH RADAR DAY
Who we are
Tikal helps ISVs in Israel & abroad with their technological
challenges.
Our Engineers are Fullstack Developers with expertise in Android,
DevOps, Java, JS, Python, ML
We are passionate about technology and specialise in
Open Source technologies.
Our Tech and Group leaders help establish new software teams & enhance
existing ones with innovative & creative thinking.
https://www.meetup.com/full-stack-developer-il/
3. FULLSTACK TECH RADAR DAY
Haggai Philip Zagury - DevOps Group & Tech lead
My ideology of open thinking and open techniques is driven by
Open Source technologies and by the collaborative manner that
defines my M.O.
My solution-driven approach is strongly based on hands-on experience
and a deep understanding of operating systems, application stacks,
software languages, networking, the cloud in general and, more and
more today, Cloud Native solutions.
8. FULLSTACK TECH RADAR DAY
Planning a data pipeline (with constraints)
● An event driven application
● High Resolution Video Streams from multiple locations
● Plan for 10s of simultaneous sources …
● Data enrichment pipelines
● Output an HD display in near real-time (± 8 sec)
Digest Enrich
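The constraints above can be sketched as a two-stage pipeline with a latency check. This is a minimal illustration only: the Event type, the digest and enrich bodies, and the reading of "± 8 sec" as an 8-second budget from capture to display are all assumptions, not details from the talk.

```python
import time
from dataclasses import dataclass, field

# Assumption: the slide's "near real-time (± 8 sec)" is read as an 8 second budget.
LATENCY_BUDGET_SEC = 8.0


@dataclass
class Event:
    source_id: str       # hypothetical: one id per video source
    captured_at: float   # timestamp (seconds) when the event was captured
    payload: dict = field(default_factory=dict)


def digest(event: Event) -> Event:
    """Placeholder digest stage: mark the raw event as ingested."""
    event.payload["digested"] = True
    return event


def enrich(event: Event) -> Event:
    """Placeholder enrichment stage: attach derived metadata."""
    event.payload["enriched"] = True
    return event


def within_budget(event: Event, now: float) -> bool:
    """True if the event can still be displayed inside the latency budget."""
    return (now - event.captured_at) <= LATENCY_BUDGET_SEC


def run_pipeline(events, clock=time.time):
    """Run digest -> enrich and split the results into on-time and late events."""
    on_time, late = [], []
    for event in events:
        result = enrich(digest(event))
        (on_time if within_budget(result, clock()) else late).append(result)
    return on_time, late
```

Passing the clock in as a parameter keeps the budget check testable without real video sources.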
9. FULLSTACK TECH RADAR DAY
@scale - point #1
● The board emphasises a prerequisite: that
we have unlimited resources to
complete the task …
● Which isn’t the case in real life …
10. FULLSTACK TECH RADAR DAY
@scale - point #2
● If you have a consistent, measurable unit,
you can plan your journey to the next
square by either:
● Being more efficient
● Doubling* your compute power
(*not accurate; will explain why later)
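To make the "next square" idea concrete, here is a small worked example. It assumes the board in question is the classic wheat-and-chessboard setup, where each square needs twice the units of the previous one; the slide itself notes that plain doubling is not an accurate model, so treat this as an upper-level illustration only.

```python
def units_on_square(square: int) -> int:
    """Units required on a given square when each square doubles the last (square 1 = 1 unit)."""
    if square < 1:
        raise ValueError("squares are numbered from 1")
    return 2 ** (square - 1)


def total_through(square: int) -> int:
    """Cumulative units needed for squares 1..square (geometric series sum)."""
    if square < 1:
        raise ValueError("squares are numbered from 1")
    return 2 ** square - 1


first_half_total = total_through(32)     # everything spent on squares 1..32
first_square_of_2nd_half = units_on_square(33)
```

Under this assumption, square 33 alone needs more units than squares 1 to 32 combined, which is why the second half of the board is where planning starts to matter.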
11. FULLSTACK TECH RADAR DAY
Planning a data pipeline - starting point
● The scale is not predictable …
● As you grow, so do your supporting
services …
● Dealing with debug logs in a section
of a cluster means 6 times the network
traffic
12. FULLSTACK TECH RADAR DAY
Planning a data pipeline - reality
● The scale is not predictable …
● As you grow, so do your supporting
services …
● Dealing with debug logs in a section
of a cluster means 6 times the network
traffic
● Deployment complexity increases …
14. FULLSTACK TECH RADAR DAY
The cloud is your friend
● Managed Services
● Solving Complexity of operation
● Increases/Decreases Price (in most
cases)
● Pay as You go …
● Polyglottism / “Architecture captivity“
17. FULLSTACK TECH RADAR DAY
Solution
How to store data for optimised r/w procedures
55,000 read requests per second
Hitting S3 Limit on >=10 simultaneous streams
# Complexitynext2
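One common way to work with S3 request limits is to spread objects across several key prefixes. AWS documents a rate of at least 5,500 GET/HEAD requests per second per partitioned prefix, so the 55,000 figure on the slide corresponds to ten prefixes; the slide does not say whether this is how the number was reached or how the limit was handled. The sketch below is an assumption-laden illustration: the key layout, shard count and function names are hypothetical.

```python
import hashlib

# Documented S3 baseline: at least 5,500 GET/HEAD requests per second per prefix.
S3_GET_LIMIT_PER_PREFIX = 5_500


def prefixes_needed(target_reads_per_sec: int) -> int:
    """Smallest number of prefixes whose combined read limit covers the target rate."""
    return -(-target_reads_per_sec // S3_GET_LIMIT_PER_PREFIX)  # ceiling division


def sharded_key(stream_id: str, object_name: str, shard_count: int) -> str:
    """Build an object key with a stable, hash-derived shard prefix (hypothetical layout)."""
    digest = hashlib.sha256(f"{stream_id}/{object_name}".encode()).hexdigest()
    shard = int(digest, 16) % shard_count
    return f"{shard:02d}/{stream_id}/{object_name}"
```

Because the shard is derived from a hash of the key, readers can recompute the full key without a lookup table, and objects from a single stream are spread over all prefixes rather than piling onto one.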
28. FULLSTACK TECH RADAR DAY
Resource Tags are your friend !
● Tagging resources
● Putting price tags on operations !
● How much did pipeline1 cost ?
● Can I do it faster ? [ Time ]
● Each enrichment service has its own
cluster, optimised for compute unit sizes
staging
stream
infra
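The "how much did pipeline1 cost?" question can be answered once billing line items carry tags. The sketch below assumes a simplified record shape (a dict with "cost_usd" and "tags"); real cloud billing exports have their own schemas, and the tag key "pipeline" is a hypothetical example.

```python
from collections import defaultdict


def cost_by_tag(line_items, tag_key):
    """Sum cost per value of one tag; items missing the tag land under 'untagged'."""
    totals = defaultdict(float)
    for item in line_items:
        tag_value = item.get("tags", {}).get(tag_key, "untagged")
        totals[tag_value] += item["cost_usd"]
    return dict(totals)


# Hypothetical line items, for illustration only.
example_items = [
    {"resource": "node-a", "cost_usd": 1.5, "tags": {"pipeline": "pipeline1", "env": "staging"}},
    {"resource": "node-b", "cost_usd": 2.25, "tags": {"pipeline": "pipeline1", "env": "stream"}},
    {"resource": "node-c", "cost_usd": 4.0, "tags": {"pipeline": "pipeline2", "env": "infra"}},
    {"resource": "disk-x", "cost_usd": 0.5},
]
per_pipeline = cost_by_tag(example_items, "pipeline")
```

The "untagged" bucket is deliberate: it makes resources that escaped tagging visible instead of silently dropping their cost.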
30. FULLSTACK TECH RADAR DAY
#2nd_half or # issues …next2
Justin Ryan
https://qconsf.com/sf2019/presentation/scaling-patterns-netflixs-edge
Scaling Patterns for Netflix's Edge
31. FULLSTACK TECH RADAR DAY
#2nd_half or # issues …next2
https://www.infoq.com/presentations/efficiency-better-software-faster/
Todd Montgomery
Todd Montgomery explores the everyday things that those with an eye to
performance and efficiency do that can be leveraged by anyone to build better software faster.
36. FULLSTACK TECH RADAR DAY
The #2nd half/ # solution(s)next2
● What is the 2nd half of the table problem(s)?
● ***** real-life scenario !
● Scalability -> Capacity planning in the “Container Orchestration Era”
● Kubernetes (in the past, Marathon)
● FaaS (e.g. AWS Lambda)
● Nomad
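Capacity planning on an orchestrator often starts with simple arithmetic on resource requests. The sketch below is a rough estimate under stated assumptions: every pod has the same CPU and memory request, and node capacity is fully allocatable (real clusters reserve resources for the system and for per-node daemons). All names and numbers are hypothetical.

```python
import math


def pods_per_node(node_cpu_m: int, node_mem_mib: int, pod_cpu_m: int, pod_mem_mib: int) -> int:
    """How many identical pods fit on one node, limited by the scarcer resource."""
    return min(node_cpu_m // pod_cpu_m, node_mem_mib // pod_mem_mib)


def nodes_needed(streams: int, pods_per_stream: int,
                 node_cpu_m: int, node_mem_mib: int,
                 pod_cpu_m: int, pod_mem_mib: int) -> int:
    """Nodes required to run all pods for the given number of streams."""
    fit = pods_per_node(node_cpu_m, node_mem_mib, pod_cpu_m, pod_mem_mib)
    if fit == 0:
        raise ValueError("a single pod does not fit on this node size")
    return math.ceil(streams * pods_per_stream / fit)
```

With a consistent unit such as "pods per stream", the effect of doubling the number of streams on the node count can be read off directly.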
37. FULLSTACK TECH RADAR DAY
Lego for the masses
● Solves complexity (I know, debatable …)
● One size fits all
background image by @hagzag
42. FULLSTACK TECH RADAR DAY
Time Days Minutes Seconds Milliseconds
Cost
Agility
Scalability
https://www.slideshare.net/BryanMcAninch/the-faas-and-the-curious-86874211
43. FULLSTACK TECH RADAR DAY
Make the best of what you have ?
● Be more accurate / precise in what you do !
● Measuring techniques (that scale)
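As one possible starting point for measuring, the sketch below records wall-clock duration per pipeline stage. The StageTimer class and stage names are hypothetical; it keeps samples in memory, so at real scale the numbers would more likely be exported to a metrics system than held in a list.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Collects duration samples per named stage."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self.samples = defaultdict(list)

    @contextmanager
    def measure(self, stage: str):
        """Time the enclosed block and record it, even if the block raises."""
        start = self._clock()
        try:
            yield
        finally:
            self.samples[stage].append(self._clock() - start)

    def summary(self) -> dict:
        """Per-stage count, total and worst-case duration in seconds."""
        return {
            stage: {"count": len(values), "total_sec": sum(values), "max_sec": max(values)}
            for stage, values in self.samples.items()
        }
```

Usage is a matter of wrapping each stage call in `with timer.measure("enrich"):` and reading `timer.summary()` afterwards; combined with cost tags, this gives both sides of the time versus cost question.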