SlideShare ist ein Scribd-Unternehmen logo
1 von 45
Scaling Data Infrastructure
@ Spotify
matti@spotify.com
kalvans@spotify.com
InfoQ.com: News & Community Site
• 750,000 unique visitors/month
• Published in 4 languages (English, Chinese, Japanese and Brazilian
Portuguese)
• Post content from our QCon conferences
• News 15-20 / week
• Articles 3-4 / week
• Presentations (videos) 12-15 / week
• Interviews 2-3 / week
• Books 1 / month
Watch the video with slide
synchronization on InfoQ.com!
https://www.infoq.com/presentations/
spotify-data-infrastructure
Purpose of QCon
- to empower software development by facilitating the spread of
knowledge and innovation
Strategy
- practitioner-driven conference designed for YOU: influencers of
change and innovation in your teams
- speakers and topics driving the evolution and innovation
- connecting and catalyzing the influencers and innovators
Highlights
- attended by more than 12,000 delegates since 2007
- held in 9 cities worldwide
Presented at QCon San Francisco
www.qconsf.com
Mārtiņš Kalvāns
kalvans@spotify.com
Matti Pehrs
matti@spotify.com
Agenda
1. Data at Spotify
2. Summer of 2015
3. Challenges & Victory
○ Datamon
○ Styx
○ GABO
Spotify big-data context
● Over 100 million monthly active users
● Over 30 million song
● Over 2 billion playlists
● Active in 60 markets
Data is at the heart of Spotify
In 2007
- Monthly Royalty Report
In 2016
- Monthly Royalty Report
- Weekly Billboard
- Daily reports to partners
- ...
- AB-Testing
- Discover weekly
- Daily Mix
- ...
Our growth in Data
Users
+50 TB/day
+100M Users
Developers
+60 TB/day
+10k M/R jobs
Hadoop
Autonomy & Dependencies
Team A
Team B
Team C
Autonomy & Dependencies
Autonomy & Dependencies
Autonomy & Dependencies
Summer of Incidents
● A strain of incidents
Summer of Incidents
● A strain of incidents
● War-room
Summer of Incidents
● A strain of incidents
● War-room
● Hadoop on it’s knees
Summer of Incidents
● A strain of incidents
● War-room
● Hadoop on it’s knees
● Event Delivery Catch up
Summer of Incidents
● A strain of incidents
● War-room
● Hadoop on it’s knees
● Event Delivery Catch up
● Reprocessing of data
Summer of Incidents
● A strain of incidents
● War-room
● Hadoop on it’s knees
● Event Delivery Catch up
● Reprocessing of data
● Hard to debug data issues
Summer of Incidents
Challenges and the path to victory...
1. Early Warning Datamon - Data monitoring
Challenges and the path to victory...
1. Early Warning Datamon - Data monitoring
2. Debuggability & Control Styx - Scheduling and control
Challenges and the path to victory...
1. Early Warning Datamon - Data monitoring
2. Debuggability & Control Styx - Scheduling and control
3. Automate Capacity GABO - Event Delivery
Challenges and the path to victory...
1. Early Warning Datamon - Data monitoring
2. Debuggability & Control Styx - Scheduling and control
3. Automate Capacity GABO - Event Delivery
Challenges and the path to victory...
Early Warning - Datamon
● Unified view
○ Alignment between teams
● Ownership
○ Clear ownership of data
● SLA
○ Alert on late data
Early Warning - Datamon
● Define terminology
● Provide metadata language
● Implement a Datamon service
Early Warning - Datamon
1. Early Warning Datamon - Data monitoring
2. Debuggability & Control Styx - Scheduling and control
3. Automate Capacity GABO - Event Delivery
Challenges and the path to victory...
- Execution control
- Self service for data users
- Execution information
- Expose debug information
- Execution isolation
- Docker for data jobs
Debuggability & Control - Styx
The river Styx
● Execution control
○ Centralized execution API
Debuggability & Control - Styx
Debuggability & Control - Styx
● Execution control
○ Centralized execution API
○ Backfilling and reprocessing
● Execution control
● Execution information
○ Timeline
Debuggability & Control - Styx
Debuggability & Control - Styx
● Execution control
● Execution information
○ Timeline
○ Google Cloud Logging
Debuggability & Control - Styx
● Execution control
● Execution information
● Execution isolation
○ Docker
1. Early Warning Datamon - Data monitoring
2. Debuggability & Control Styx - Scheduling and control
3. Automate Capacity GABO - Event Delivery
Challenges and the path to victory...
● Complex and manual config
Automate Capacity - GABO/Event Delivery
● Complex and manual config
● Pubsub & Dataflow streaming
Automate Capacity - GABO/Event Delivery
● Complex and manual config
● Pubsub & Dataflow streaming
● Pubsubs at scale
Automate Capacity - GABO/Event Delivery
● Complex and manual config
● Pubsub & Dataflow streaming
● Pubsubs at scale
● Dataflow streaming
Automate Capacity - GABO/Event Delivery
● Complex and manual config
● Pubsub & Dataflow streaming
● Pubsubs at scale
● Dataflow streaming :-(
● 2 micro services + 1 Map/Reduce job
Automate Capacity - GABO/Event Delivery
● Complex and manual config
● Pubsub & Dataflow streaming
● Pubsubs at scale
● Dataflow streaming :-(
● 2 micro services + 1 Map/Reduce job
● Autoscaling & The Stuffer
Automate Capacity - GABO/Event Delivery
● Handles at least 10x our load
● Darkloading
● Autoscale everything
● Self service
GABO - WIP
● Make sure you have the right
tools to deal with data incidents
○ Make sure you have time to
implement the tools you need
● Remember that your capacity
model can fail at larger scale
○ Keep track of your scale and
Automate, automate, automate...
Summary
Thank you!
kalvans@spotify.com
matti@spotify.com
Want to join the band?
http://spoti.fi/jobs
Watch the video with slide
synchronization on InfoQ.com!
https://www.infoq.com/presentations/
spotify-data-infrastructure

Weitere ähnliche Inhalte

Andere mochten auch

Growing up with agile - how the Spotify 'model' has evolved
Growing up with agile - how the Spotify 'model' has evolved Growing up with agile - how the Spotify 'model' has evolved
Growing up with agile - how the Spotify 'model' has evolved Peter Antman
 
How Spotify Builds Products (Organization. Architecture, Autonomy, Accountabi...
How Spotify Builds Products (Organization. Architecture, Autonomy, Accountabi...How Spotify Builds Products (Organization. Architecture, Autonomy, Accountabi...
How Spotify Builds Products (Organization. Architecture, Autonomy, Accountabi...Kevin Goldsmith
 
Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...
Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...
Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...Big Data Spain
 
JavaScript @ Spotify (Felipe Ribeiro Technology Stream)
JavaScript @ Spotify (Felipe Ribeiro Technology Stream)JavaScript @ Spotify (Felipe Ribeiro Technology Stream)
JavaScript @ Spotify (Felipe Ribeiro Technology Stream)IT Arena
 
Elastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @DatadogElastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @DatadogC4Media
 
Evolution of Spotify's ad architecture (Qcon 2016 Shanghai)
Evolution of Spotify's ad architecture (Qcon 2016 Shanghai)Evolution of Spotify's ad architecture (Qcon 2016 Shanghai)
Evolution of Spotify's ad architecture (Qcon 2016 Shanghai)Kinshuk Mishra
 
Playlists at Spotify - Using Cassandra to store version controlled objects
Playlists at Spotify - Using Cassandra to store version controlled objectsPlaylists at Spotify - Using Cassandra to store version controlled objects
Playlists at Spotify - Using Cassandra to store version controlled objectsJimmy Mårdell
 
Africa DevOps Day 2015
Africa DevOps Day 2015Africa DevOps Day 2015
Africa DevOps Day 2015Danielle Jabin
 
Docker at Spotify - Dockercon14
Docker at Spotify - Dockercon14Docker at Spotify - Dockercon14
Docker at Spotify - Dockercon14dotCloud
 
How the economist with cloud BI and Looker have improved data-driven decision...
How the economist with cloud BI and Looker have improved data-driven decision...How the economist with cloud BI and Looker have improved data-driven decision...
How the economist with cloud BI and Looker have improved data-driven decision...Looker
 
Music Personalization At Spotify
Music Personalization At SpotifyMusic Personalization At Spotify
Music Personalization At SpotifyVidhya Murali
 
How data drives spotify
How data drives spotifyHow data drives spotify
How data drives spotifyAli Sarrafi
 
Spotify Business Case
Spotify Business CaseSpotify Business Case
Spotify Business CaseDavid Gorgan
 
Empowering Engineering Talent - an update from Spotify
Empowering Engineering Talent - an update from SpotifyEmpowering Engineering Talent - an update from Spotify
Empowering Engineering Talent - an update from SpotifyKevin Goldsmith
 
How Apache Drives Music Recommendations At Spotify
How Apache Drives Music Recommendations At SpotifyHow Apache Drives Music Recommendations At Spotify
How Apache Drives Music Recommendations At SpotifyJosh Baer
 
Spotify Company presentation
Spotify Company presentationSpotify Company presentation
Spotify Company presentationalifost
 
Scaling Dropbox
Scaling DropboxScaling Dropbox
Scaling DropboxC4Media
 
Making Better Mistakes Tomorrow
Making Better Mistakes TomorrowMaking Better Mistakes Tomorrow
Making Better Mistakes TomorrowDanielle Jabin
 
How Spotify Payments Creates APIs to Manage Complexity (Horia Jurcut)
How Spotify Payments Creates APIs to Manage Complexity (Horia Jurcut)How Spotify Payments Creates APIs to Manage Complexity (Horia Jurcut)
How Spotify Payments Creates APIs to Manage Complexity (Horia Jurcut)Nordic APIs
 

Andere mochten auch (20)

Growing up with agile - how the Spotify 'model' has evolved
Growing up with agile - how the Spotify 'model' has evolved Growing up with agile - how the Spotify 'model' has evolved
Growing up with agile - how the Spotify 'model' has evolved
 
How Spotify Builds Products (Organization. Architecture, Autonomy, Accountabi...
How Spotify Builds Products (Organization. Architecture, Autonomy, Accountabi...How Spotify Builds Products (Organization. Architecture, Autonomy, Accountabi...
How Spotify Builds Products (Organization. Architecture, Autonomy, Accountabi...
 
Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...
Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...
Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...
 
JavaScript @ Spotify (Felipe Ribeiro Technology Stream)
JavaScript @ Spotify (Felipe Ribeiro Technology Stream)JavaScript @ Spotify (Felipe Ribeiro Technology Stream)
JavaScript @ Spotify (Felipe Ribeiro Technology Stream)
 
Elastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @DatadogElastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @Datadog
 
Spotify Teknikdagarna
Spotify TeknikdagarnaSpotify Teknikdagarna
Spotify Teknikdagarna
 
Evolution of Spotify's ad architecture (Qcon 2016 Shanghai)
Evolution of Spotify's ad architecture (Qcon 2016 Shanghai)Evolution of Spotify's ad architecture (Qcon 2016 Shanghai)
Evolution of Spotify's ad architecture (Qcon 2016 Shanghai)
 
Playlists at Spotify - Using Cassandra to store version controlled objects
Playlists at Spotify - Using Cassandra to store version controlled objectsPlaylists at Spotify - Using Cassandra to store version controlled objects
Playlists at Spotify - Using Cassandra to store version controlled objects
 
Africa DevOps Day 2015
Africa DevOps Day 2015Africa DevOps Day 2015
Africa DevOps Day 2015
 
Docker at Spotify - Dockercon14
Docker at Spotify - Dockercon14Docker at Spotify - Dockercon14
Docker at Spotify - Dockercon14
 
How the economist with cloud BI and Looker have improved data-driven decision...
How the economist with cloud BI and Looker have improved data-driven decision...How the economist with cloud BI and Looker have improved data-driven decision...
How the economist with cloud BI and Looker have improved data-driven decision...
 
Music Personalization At Spotify
Music Personalization At SpotifyMusic Personalization At Spotify
Music Personalization At Spotify
 
How data drives spotify
How data drives spotifyHow data drives spotify
How data drives spotify
 
Spotify Business Case
Spotify Business CaseSpotify Business Case
Spotify Business Case
 
Empowering Engineering Talent - an update from Spotify
Empowering Engineering Talent - an update from SpotifyEmpowering Engineering Talent - an update from Spotify
Empowering Engineering Talent - an update from Spotify
 
How Apache Drives Music Recommendations At Spotify
How Apache Drives Music Recommendations At SpotifyHow Apache Drives Music Recommendations At Spotify
How Apache Drives Music Recommendations At Spotify
 
Spotify Company presentation
Spotify Company presentationSpotify Company presentation
Spotify Company presentation
 
Scaling Dropbox
Scaling DropboxScaling Dropbox
Scaling Dropbox
 
Making Better Mistakes Tomorrow
Making Better Mistakes TomorrowMaking Better Mistakes Tomorrow
Making Better Mistakes Tomorrow
 
How Spotify Payments Creates APIs to Manage Complexity (Horia Jurcut)
How Spotify Payments Creates APIs to Manage Complexity (Horia Jurcut)How Spotify Payments Creates APIs to Manage Complexity (Horia Jurcut)
How Spotify Payments Creates APIs to Manage Complexity (Horia Jurcut)
 

Mehr von C4Media

Streaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live VideoStreaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live VideoC4Media
 
Next Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy MobileNext Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy MobileC4Media
 
Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020C4Media
 
Understand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java ApplicationsUnderstand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java ApplicationsC4Media
 
Kafka Needs No Keeper
Kafka Needs No KeeperKafka Needs No Keeper
Kafka Needs No KeeperC4Media
 
High Performing Teams Act Like Owners
High Performing Teams Act Like OwnersHigh Performing Teams Act Like Owners
High Performing Teams Act Like OwnersC4Media
 
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaDoes Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaC4Media
 
Service Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideService Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideC4Media
 
Shifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDShifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDC4Media
 
CI/CD for Machine Learning
CI/CD for Machine LearningCI/CD for Machine Learning
CI/CD for Machine LearningC4Media
 
Fault Tolerance at Speed
Fault Tolerance at SpeedFault Tolerance at Speed
Fault Tolerance at SpeedC4Media
 
Architectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsArchitectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsC4Media
 
ML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsC4Media
 
Build Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerBuild Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerC4Media
 
User & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleUser & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleC4Media
 
Scaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeScaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeC4Media
 
Make Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereMake Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereC4Media
 
The Talk You've Been Await-ing For
The Talk You've Been Await-ing ForThe Talk You've Been Await-ing For
The Talk You've Been Await-ing ForC4Media
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data EngineeringC4Media
 
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreAutomated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreC4Media
 

Mehr von C4Media (20)

Streaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live VideoStreaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live Video
 
Next Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy MobileNext Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy Mobile
 
Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020
 
Understand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java ApplicationsUnderstand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java Applications
 
Kafka Needs No Keeper
Kafka Needs No KeeperKafka Needs No Keeper
Kafka Needs No Keeper
 
High Performing Teams Act Like Owners
High Performing Teams Act Like OwnersHigh Performing Teams Act Like Owners
High Performing Teams Act Like Owners
 
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaDoes Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
 
Service Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideService Meshes- The Ultimate Guide
Service Meshes- The Ultimate Guide
 
Shifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDShifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CD
 
CI/CD for Machine Learning
CI/CD for Machine LearningCI/CD for Machine Learning
CI/CD for Machine Learning
 
Fault Tolerance at Speed
Fault Tolerance at SpeedFault Tolerance at Speed
Fault Tolerance at Speed
 
Architectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsArchitectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep Systems
 
ML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.js
 
Build Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerBuild Your Own WebAssembly Compiler
Build Your Own WebAssembly Compiler
 
User & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleUser & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix Scale
 
Scaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeScaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's Edge
 
Make Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereMake Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home Everywhere
 
The Talk You've Been Await-ing For
The Talk You've Been Await-ing ForThe Talk You've Been Await-ing For
The Talk You've Been Await-ing For
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data Engineering
 
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreAutomated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
 

Kürzlich hochgeladen

Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...apidays
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusZilliz
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 

Kürzlich hochgeladen (20)

Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 

Scaling the Data Infrastructure @Spotify

  • 1. Scaling Data Infrastructure @ Spotify matti@spotify.com kalvans@spotify.com
  • 2. InfoQ.com: News & Community Site • 750,000 unique visitors/month • Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese) • Post content from our QCon conferences • News 15-20 / week • Articles 3-4 / week • Presentations (videos) 12-15 / week • Interviews 2-3 / week • Books 1 / month Watch the video with slide synchronization on InfoQ.com! https://www.infoq.com/presentations/ spotify-data-infrastructure
  • 3. Purpose of QCon - to empower software development by facilitating the spread of knowledge and innovation Strategy - practitioner-driven conference designed for YOU: influencers of change and innovation in your teams - speakers and topics driving the evolution and innovation - connecting and catalyzing the influencers and innovators Highlights - attended by more than 12,000 delegates since 2007 - held in 9 cities worldwide Presented at QCon San Francisco www.qconsf.com
  • 5. Agenda 1. Data at Spotify 2. Summer of 2015 3. Challenges & Victory ○ Datamon ○ Styx ○ GABO
  • 6. Spotify big-data context ● Over 100 million monthly active users ● Over 30 million song ● Over 2 billion playlists ● Active in 60 markets
  • 7. Data is at the heart of Spotify In 2007 - Monthly Royalty Report In 2016 - Monthly Royalty Report - Weekly Billboard - Daily reports to partners - ... - AB-Testing - Discover weekly - Daily Mix - ...
  • 8. Our growth in Data Users +50 TB/day +100M Users Developers +60 TB/day +10k M/R jobs
  • 14. ● A strain of incidents Summer of Incidents
  • 15. ● A strain of incidents ● War-room Summer of Incidents
  • 16. ● A strain of incidents ● War-room ● Hadoop on it’s knees Summer of Incidents
  • 17. ● A strain of incidents ● War-room ● Hadoop on it’s knees ● Event Delivery Catch up Summer of Incidents
  • 18. ● A strain of incidents ● War-room ● Hadoop on it’s knees ● Event Delivery Catch up ● Reprocessing of data Summer of Incidents
  • 19. ● A strain of incidents ● War-room ● Hadoop on it’s knees ● Event Delivery Catch up ● Reprocessing of data ● Hard to debug data issues Summer of Incidents
  • 20. Challenges and the path to victory...
  • 21. 1. Early Warning Datamon - Data monitoring Challenges and the path to victory...
  • 22. 1. Early Warning Datamon - Data monitoring 2. Debuggability & Control Styx - Scheduling and control Challenges and the path to victory...
  • 23. 1. Early Warning Datamon - Data monitoring 2. Debuggability & Control Styx - Scheduling and control 3. Automate Capacity GABO - Event Delivery Challenges and the path to victory...
  • 24. 1. Early Warning Datamon - Data monitoring 2. Debuggability & Control Styx - Scheduling and control 3. Automate Capacity GABO - Event Delivery Challenges and the path to victory...
  • 25. Early Warning - Datamon
  • 26. ● Unified view ○ Alignment between teams ● Ownership ○ Clear ownership of data ● SLA ○ Alert on late data Early Warning - Datamon
  • 27. ● Define terminology ● Provide metadata language ● Implement a Datamon service Early Warning - Datamon
  • 28. 1. Early Warning Datamon - Data monitoring 2. Debuggability & Control Styx - Scheduling and control 3. Automate Capacity GABO - Event Delivery Challenges and the path to victory...
  • 29. - Execution control - Self service for data users - Execution information - Expose debug information - Execution isolation - Docker for data jobs Debuggability & Control - Styx The river Styx
  • 30. ● Execution control ○ Centralized execution API Debuggability & Control - Styx
  • 31. Debuggability & Control - Styx ● Execution control ○ Centralized execution API ○ Backfilling and reprocessing
  • 32. ● Execution control ● Execution information ○ Timeline Debuggability & Control - Styx
  • 33. Debuggability & Control - Styx ● Execution control ● Execution information ○ Timeline ○ Google Cloud Logging
  • 34. Debuggability & Control - Styx ● Execution control ● Execution information ● Execution isolation ○ Docker
  • 35. 1. Early Warning Datamon - Data monitoring 2. Debuggability & Control Styx - Scheduling and control 3. Automate Capacity GABO - Event Delivery Challenges and the path to victory...
  • 36. ● Complex and manual config Automate Capacity - GABO/Event Delivery
  • 37. ● Complex and manual config ● Pubsub & Dataflow streaming Automate Capacity - GABO/Event Delivery
  • 38. ● Complex and manual config ● Pubsub & Dataflow streaming ● Pubsubs at scale Automate Capacity - GABO/Event Delivery
  • 39. ● Complex and manual config ● Pubsub & Dataflow streaming ● Pubsubs at scale ● Dataflow streaming Automate Capacity - GABO/Event Delivery
  • 40. ● Complex and manual config ● Pubsub & Dataflow streaming ● Pubsubs at scale ● Dataflow streaming :-( ● 2 micro services + 1 Map/Reduce job Automate Capacity - GABO/Event Delivery
  • 41. ● Complex and manual config ● Pubsub & Dataflow streaming ● Pubsubs at scale ● Dataflow streaming :-( ● 2 micro services + 1 Map/Reduce job ● Autoscaling & The Stuffer Automate Capacity - GABO/Event Delivery
  • 42. ● Handles at least 10x our load ● Darkloading ● Autoscale everything ● Self service GABO - WIP
  • 43. ● Make sure you have the right tools to deal with data incidents ○ Make sure you have time to implement the tools you need ● Remember that your capacity model can fail at larger scale ○ Keep track of your scale and Automate, automate, automate... Summary
  • 44. Thank you! kalvans@spotify.com matti@spotify.com Want to join the band? http://spoti.fi/jobs
  • 45. Watch the video with slide synchronization on InfoQ.com! https://www.infoq.com/presentations/ spotify-data-infrastructure