Learn how Spotify uses Puppet to manage the large and growing number of servers used to stream music to millions of users. The presenter will also give an introduction to other technologies used to power Spotify.
Erik Dalén
System Engineer, Spotify
Erik is a system engineer within the site reliability engineering team at Spotify with a focus on Puppet and automation. He is also a community contributor to Puppet and author of the puppetdbquery tool. He can be found on IRC and GitHub as dalen.
2. ● Over 24 million monthly active users
● Launched in 28 countries
● Over 20 million songs
● More than 1 billion playlists
Growing quickly
Spotify
3. System Engineer in Site Reliability Engineering at Spotify
Operational system owner for Puppet, the playlist system and Cassandra
Community contributor to Puppet
whoami
4. ● More than 450 changes per month
● 220 committers to our Puppet git repository
● 325 Puppet modules
Code review by SRE team using Gerrit
Puppet users for 3 years
5. Puppet Infrastructure
● Roughly 5500 nodes
● 3 different Puppet installations
● Each with their own CA and PuppetDB
● One or more puppetmasters per data centre
● Run using cron
6. Git branch == Puppet environment
Everyone can push to private branches and run Puppet against those
Code review is mandatory to push to the “production” branch
Puppet Infrastructure
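As a rough illustration of the branch-to-environment idea, here is a minimal Ruby sketch. The talk does not say how branch names are turned into environment names, so both helpers are hypothetical. It assumes environment names may only contain lowercase letters, digits and underscores, and that each environment lives in its own directory under /etc/puppet/environments (the path that appears in the puppet ls output later in the deck).

```ruby
# Hypothetical helper: derive a Puppet environment name from a Git branch name.
# Assumption: environment names may only contain lowercase letters, digits
# and underscores, so every other character is replaced with "_".
def environment_for_branch(branch)
  name = branch.downcase.gsub(/[^a-z0-9_]/, '_')
  raise ArgumentError, 'empty branch name' if name.empty?
  name
end

# Hypothetical layout: one directory per environment under a base path.
def environment_dir(branch, base = '/etc/puppet/environments')
  File.join(base, environment_for_branch(branch))
end

puts environment_dir('production')
puts environment_dir('dalen/fix-cassandra')
```

A private branch checked out this way could then be tested by running the agent against the matching environment, while "production" stays protected by code review.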
7. Built as Debian Packages
Deployed using Puppet
Backend services
8. ● Client connections are proxied through the access points
● Most other backend services are stateless
● Storage in Cassandra, PostgreSQL or Tokyo Cabinet
Architecture overview
[Diagram: Accesspoint; Service 1 with DB; Service 2 with DB; Service 3]
9. ● Puppet module for the service
● Deployed and tested in test environment
● Hardware requested from SRE team and service deployed in production
Backend service deployment
10. Using SRV records to discover services
The Puppet module dalen-dnsquery can be used to look them up from inside Puppet manifests.
Service Discovery
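To make SRV-based discovery concrete, here is a minimal sketch using Ruby's standard resolv library rather than the dalen-dnsquery module itself. The Endpoint struct, the helper names and the example hostnames are assumptions for illustration. The ordering is also a simplification: RFC 2782 describes a weighted random choice among records of equal priority, while this sketch simply prefers the higher weight.

```ruby
require 'resolv'

# Simple value object so the ordering logic can be used without a DNS lookup.
Endpoint = Struct.new(:host, :port, :priority, :weight)

# Look up SRV records for a name such as "_service._tcp.example.com"
# (a hypothetical name) and convert them to Endpoint values.
def lookup_srv(name, resolver = Resolv::DNS.new)
  resolver.getresources(name, Resolv::DNS::Resource::IN::SRV).map do |rr|
    Endpoint.new(rr.target.to_s, rr.port, rr.priority, rr.weight)
  end
end

# Lowest priority value first; within one priority, higher weight first.
def order_endpoints(endpoints)
  endpoints.sort_by { |e| [e.priority, -e.weight] }
end

# Made-up sample data, so the sketch runs without network access.
sample = [
  Endpoint.new('b.example.com', 4000, 20, 10),
  Endpoint.new('a.example.com', 4000, 10, 5),
  Endpoint.new('c.example.com', 4000, 10, 50)
]
order_endpoints(sample).each { |e| puts "#{e.host}:#{e.port}" }
```

A client would call lookup_srv with the service's SRV name and try the endpoints in the returned order.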
11. At the moment using different ENCs in different Puppet installations
Will be switching to using Hiera for node classification
Node Classification
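For readers unfamiliar with ENCs (external node classifiers): an ENC is an executable that Puppet calls with a node's certname and that prints a YAML document naming the node's classes and, optionally, its environment. The sketch below shows only the shape of that output. The hostname patterns and class names are invented for illustration and are not Spotify's actual classification rules.

```ruby
require 'yaml'

# Hypothetical role table: hostname pattern => classes to apply.
ROLES = {
  /\Acassandra\d+\./ => ['cassandra'],
  /\Aap\d+\./        => ['accesspoint']
}.freeze

# Build the hash an ENC would print as YAML for the given certname.
def classify(certname)
  classes = ROLES.select { |pattern, _| certname =~ pattern }.values.flatten
  { 'classes' => classes, 'environment' => 'production' }
end

# A real ENC would read the certname from ARGV[0]; a fixed sample is used here.
puts classify('cassandra1.lon.example.com').to_yaml
```

Moving this kind of mapping into Hiera would replace the script with data files, which is the switch the slide describes.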
13. 25 pull requests from Spotify merged to core Puppet and Facter in the last 12 months.
Many improvements to puppetlabs modules sent upstream.
Puppet contributions
15. Finding nodes using dalen-puppetdbquery:
$ puppet query nodes 'Class[Cassandra]{version="1.1"} and site=lon'
$ puppet query nodes 'processorcount > 16 and manufacturer ~ "Dell.*"'
github.com/dalen/puppet-puppetdbquery
Querying PuppetDB
16. Use PuppetDB as a backend to the DataMapper ORM
Node.get('foo.example.com').facts.each do |fact|
  puts "#{fact.name}: #{fact.value}"
end
dm-puppetdb-adapter
17. A Puppet face to list files managed by Puppet
# puppet ls /etc/systemd/system
nagios-nrpe-server.service
declared in /etc/puppet/environments/production/modules/systemd/manifests/unit.pp:15
content from a "content" parameter
puppet ls
18. ● Splitting the repo out
● Remove SRE review requirement on large parts
● Support testing using Vagrant
● Building images using a masterless puppet apply
The future
19. Consists of a node terminus and a forge implementation
Builds a per-node environment dynamically on demand
Will be open sourced Real Soon Now™
Spikor