Pressing play



                                        Niklas Gustavsson
                                               ngn@spotify.com
                                                    @protocol7

Tuesday, April 17, 12
Who am I?
      • ngn@spotify.com
      • @protocol7
      • Spotify backend dev based in Göteborg
      • Mainly from a JVM background, working on
        various stuff over the years
      • Apache Software Foundation member




What’s Spotify all about?
      •       A big catalogue, tons of music
      •       Available everywhere
      •       Great user experience
      •       More convenient than piracy
      •       Fast, reliable, always available
      •       Scalable for many, many users
      •       Ad-supported or paid-for service




Pressing play


Where’s Spotify?
      • Let’s start the client, but where should it connect
        to?




Aside: SRV records
      • Example SRV records
      name                                  TTL class type prio weight port host
      _spotify-mac-client._tcp.spotify.com. 242 IN    SRV  10   8      4070 C8.spotify.com.
      _spotify-mac-client._tcp.spotify.com. 242 IN    SRV  10   16     4070 C4.spotify.com.




      • GeoDNS used
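
To make the column layout of the example concrete, here is a minimal sketch that parses such a zone-file style line into its fields. The `SrvRecord` type and `parse_srv` helper are hypothetical names introduced for illustration only; they are not part of any Spotify client or DNS library.

```python
from typing import NamedTuple


class SrvRecord(NamedTuple):
    """Fields of one SRV line (hypothetical helper type for this sketch)."""
    name: str
    ttl: int
    rclass: str
    priority: int
    weight: int
    port: int
    host: str


def parse_srv(line: str) -> SrvRecord:
    """Parse 'name TTL class SRV priority weight port host' into a record."""
    name, ttl, rclass, rtype, priority, weight, port, host = line.split()
    if rtype != "SRV":
        raise ValueError(f"not an SRV record: {rtype}")
    return SrvRecord(name, int(ttl), rclass, int(priority), int(weight), int(port), host)


# The two records shown on the slide.
records = [
    parse_srv("_spotify-mac-client._tcp.spotify.com. 242 IN SRV 10 8 4070 C8.spotify.com."),
    parse_srv("_spotify-mac-client._tcp.spotify.com. 242 IN SRV 10 16 4070 C4.spotify.com."),
]
```

Both records share priority 10, so a client would choose between them by weight.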




What does that record really point to?
      • accesspoint (AP)
      • Handles authentication state, logging, routing,
        rate limiting and much more
      • Protocol between client and AP uses a single,
        encrypted multiplexed socket over TCP
      • Written in C++




Find something to play
      • Let’s search




Services
      • Probably close to 100 backend services, most
        small, handling a single task
      • UNIX philosophy
      • Many are autonomous
      • Deployed on commodity servers
      • Always redundant




Services
      • Mostly written in Python, a few in Java and C
      • Storage optimized for each service, mostly
        PostgreSQL, Cassandra and Tokyo Cabinet
      • Many services use in-memory caching, for example
        /dev/shm or memcached
      • Usually a small daemon, talking HTTP or Hermes
        • We have our own supervisor, which keeps services
           running




Aside: Hermes
      •       ZeroMQ for transport, protobuf for envelope and payload
      •       HTTP-like verbs and caching
      •       Request-reply and publish/subscribe
      •       Very performant and introspectable




How does the accesspoint find search?
      • Everything has an SRV DNS record:
        • One record with same name for each service
          instance
        • Clients resolve to find servers providing that
          service
        • Lowest priority record is chosen with weighted
          shuffle
        • Clients retry other instances in case of failures
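
The selection and retry steps above can be sketched roughly as follows. This is a simplified weighted shuffle in the spirit of RFC 2782, not Spotify's actual resolver code; the tuple layout, `order_candidates`, `first_reachable` and the `try_connect` callback are all assumptions made for the example.

```python
import random
from typing import Callable, List, Optional, Tuple

# (priority, weight, port, host); this tuple layout is an assumption of the sketch.
Record = Tuple[int, int, int, str]


def order_candidates(records: List[Record],
                     rng: Optional[random.Random] = None) -> List[Record]:
    """Order records by ascending priority, weighted-shuffling each priority group."""
    rng = rng or random.Random()
    ordered: List[Record] = []
    for priority in sorted({r[0] for r in records}):
        group = [r for r in records if r[0] == priority]
        while group:
            total = sum(r[1] for r in group)
            if total == 0:
                # All remaining weights are zero: pick uniformly.
                index = rng.randrange(len(group))
            else:
                # Pick a record with probability proportional to its weight.
                target = rng.random() * total
                running = 0
                for index, record in enumerate(group):
                    running += record[1]
                    if running > target:
                        break
            ordered.append(group.pop(index))
    return ordered


def first_reachable(records: List[Record],
                    try_connect: Callable[[str, int], bool],
                    rng: Optional[random.Random] = None) -> Optional[Tuple[str, int]]:
    """Try instances in the chosen order, moving on to the next one on failure."""
    for _priority, _weight, port, host in order_candidates(records, rng):
        if try_connect(host, port):
            return host, port
    return None
```

With the two example records (weights 8 and 16), the second host would be tried first about two thirds of the time; if it fails, the client falls through to the other instance.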




Read-only services
      •       Stateless
      •       Writes are hard
      •       Simple to scale, just add more servers
      •       Services can be restarted as needed
      •       Indexes prefabricated, distributed to live servers




Read-write services
      • User generated content, e.g. playlists
      • Hard to ensure consistency of data across instances

      Solutions:
      • Eventual consistency:
         • Reads of just written data not guaranteed to be up-to-date
      • Locking, atomic operations
          • Creating globally unique keys, e.g. usernames
          • Transactions, e.g. billing


Sharding
      • Some services use Dynamo-inspired DHTs
        • Each request has a key
        • Each service node is responsible for a range of
          hash keys
        • Data is distributed among service nodes
        • Redundancy is ensured by writing to a replica
          node
        • Data must be transitioned when the ring changes
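
As an illustration of the idea, here is a minimal consistent-hashing ring. It is a sketch under stated assumptions, not the DHT Spotify runs: the `HashRing` class, the SHA-256 based hash, one ring position per node and the choice of the next node on the ring as replica are all made up for the example.

```python
import bisect
import hashlib
from typing import List


class HashRing:
    """Toy hash ring: each node owns the key range ending at its ring position."""

    def __init__(self, nodes: List[str], replicas: int = 2) -> None:
        if not nodes:
            raise ValueError("ring needs at least one node")
        self.replicas = replicas
        self._ring = sorted((self._hash(node), node) for node in nodes)
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(value: str) -> int:
        # First 8 bytes of SHA-256 as an integer ring position.
        return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")

    def nodes_for(self, key: str) -> List[str]:
        """Return the primary node for key, followed by its replica node(s)."""
        start = bisect.bisect_right(self._points, self._hash(key)) % len(self._ring)
        count = min(self.replicas, len(self._ring))
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(count)]


ring = HashRing(["node-a", "node-b", "node-c"])
primary, replica = ring.nodes_for("user:alice")
```

When a node joins, only the keys in the range it takes over change owner, which is the data that has to be transitioned; all other keys stay where they are.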




DHT example




search
      • Java service
      • Lucene storage
        • New index published daily
      • Doesn’t store any metadata itself; returns a list
        of identifiers

      • (Search suggestions are served from a separate
        service, optimized for speed)




Metadata services
      •       Multiple read-only services
      •       60 GB indices
      •       Responds to metadata requests
      •       Decorates metadata onto other service responses
              • We’re most likely moving away from this model




Another aside: How does stuff get into Spotify?
      • >15 million tracks, we can’t maintain all that
        ourselves
      • Ingest audio, images and metadata from labels
        • Receive, transform, transcode, merge
      • All ends up in a metadata database from which
        indices are generated and distributed to services




The Kent bug
      • Much of the metadata lacks identifiers which
        leaves us with heuristics.




Play


Audio encodings and files
      • Spotify supports multiple audio encodings
        • Ogg Vorbis at 96 kbps (-q2), 160 kbps (-q5) and
            320 kbps (-q9)
        • MP3 at 320 kbps (downloads)
      • For each track, a file for each encoding/bitrate is
        listed in the returned metadata
      • The client picks an appropriate file
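
How the client chooses is not spelled out on the slide. One plausible policy, sketched here purely as an assumption, is to take the highest bitrate of the wanted codec that fits under a cap and to fall back to the lowest bitrate otherwise. `AudioFile`, `pick_file`, the file ids and the `max_kbps` cap are hypothetical names and values for this example.

```python
from typing import List, NamedTuple, Optional


class AudioFile(NamedTuple):
    """One entry from the track metadata (hypothetical shape)."""
    file_id: str
    codec: str
    bitrate_kbps: int


def pick_file(files: List[AudioFile], codec: str = "vorbis",
              max_kbps: int = 160) -> Optional[AudioFile]:
    """Pick the best file of the given codec at or below max_kbps."""
    candidates = [f for f in files if f.codec == codec]
    if not candidates:
        return None
    within_cap = [f for f in candidates if f.bitrate_kbps <= max_kbps]
    if within_cap:
        return max(within_cap, key=lambda f: f.bitrate_kbps)
    # Nothing fits under the cap: fall back to the smallest available file.
    return min(candidates, key=lambda f: f.bitrate_kbps)


files = [
    AudioFile("f1", "vorbis", 96),
    AudioFile("f2", "vorbis", 160),
    AudioFile("f3", "vorbis", 320),
    AudioFile("f4", "mp3", 320),
]
choice = pick_file(files, max_kbps=160)
```

A real client would presumably also weigh user settings and network conditions; the cap stands in for all of that here.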




Get the audio data
      • The client now must fetch the actual audio data
      • Latency kills




Cache
      •       Player caches tracks it has played
      •       Caches are large (56% are over 5 GB)
      •       Least Recently Used policy for cache eviction
      •       50% of data comes from local cache
      •       Cached files are served in P2P overlay
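
A least-recently-used eviction policy with a size budget can be sketched like this. It is an in-memory toy, assuming whole tracks held as byte strings; the real client caches to disk, and the `TrackCache` class and its method names are invented for the example.

```python
from collections import OrderedDict
from typing import Optional


class TrackCache:
    """Size-bounded LRU cache: the least recently used track is evicted first."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._entries: "OrderedDict[str, bytes]" = OrderedDict()  # oldest first

    def get(self, track_id: str) -> Optional[bytes]:
        data = self._entries.get(track_id)
        if data is not None:
            # Playing a track makes it the most recently used entry.
            self._entries.move_to_end(track_id)
        return data

    def put(self, track_id: str, data: bytes) -> None:
        if track_id in self._entries:
            self.used_bytes -= len(self._entries.pop(track_id))
        self._entries[track_id] = data
        self.used_bytes += len(data)
        # Evict from the least recently used end until we fit the budget again.
        while self.used_bytes > self.capacity_bytes and self._entries:
            _, evicted = self._entries.popitem(last=False)
            self.used_bytes -= len(evicted)


cache = TrackCache(capacity_bytes=10)
cache.put("track-a", b"aaaa")
cache.put("track-b", b"bbbb")
cache.get("track-a")           # track-a is now more recent than track-b
cache.put("track-c", b"cccc")  # over budget, so track-b is evicted
```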




Streaming
      • Request first piece from Spotify storage
      • Meanwhile, search peer-to-peer (P2P) for
        remainder
      • Switch back and forth between Spotify storage
        and peers as needed
      • Towards end of a track, start prefetching next one




P2P
      • All peers are equal (no supernodes)
      • A user only downloads data she needs
      • A tracker service keeps a list of peers for each track
      • P2P network becomes (weakly) clustered by
        interest
      • Oblivious to network architecture
      • Does not enforce fairness
      • Mobile clients do not participate in P2P



                        http://www.csc.kth.se/~gkreitz/spotify/kreitz-spotify_kth11.pdf
Success!




Yet another aside: Hadoop
      • We run analysis using Hadoop which feeds back
        into the previously described process, e.g. track
        popularity is used for weighting search results and
        toplists




Development at Spotify
      • Uses almost exclusively open source software
        • Git, Debian, Munin, Zabbix, Puppet, TeamCity...
      • Developers use whatever development tools they are
        comfortable with
      • Scrum or Kanban in three-week iterations
      • DevOps heavy. Freaking awesome ops
      • Monitor and measure all the things!




Development at Spotify
      •        Development hubs in Stockholm, Göteborg and NYC
      •        All in all, >220 people in tech
      •        Very talented team
      •        Hackdays and system owner days in each iteration
      •        Hangs out on IRC
      •        Growing and hiring




Languages at Spotify




Questions?



Thank you

                           Want to work at Spotify?
                        http://www.spotify.com/jobs/


