Niko Kurtti talks about the challenges Shopify saw in moving from a traditional host-based infrastructure to a cloud native one, moving not only their core app to Kubernetes but also hundreds of other apps at the same time. He focuses on the cluster tooling solutions they've built, such as controllers, cluster creators, and deploy tools. Filmed at qconnewyork.com.
Niko Kurtti is a production engineer at Shopify. He started out as a software developer building web apps in Java, but has since fallen in love with container technologies. He was part of the effort to roll out Docker in production at Shopify in 2014 and still works in the same domain; today his focus is Shopify’s internal PaaS based on k8s.
Watch the video with slide synchronization on InfoQ.com:
https://www.infoq.com/presentations/shopify-kubernetes
3. Presented at QCon New York
www.qconnewyork.com
9. • Manual / Artisanal processes
• Slow things/processes that make people wait
• Rusty knobs that don’t work when needed
• Wobbly things that don’t work first-time, every-time
Things that won’t scale
10. • Tested infrastructure
• Automation that works as expected, every time
• Give devs ability to self-serve with safety
• Train people to be experts in the systems they operate
Things that will scale
19. • Best traction of the open source projects
• Platform agnostic
• One of the most extendable solutions
• Written in Go
• Offered as a service in Google Cloud
Why Kubernetes?
21. • How to specify your app’s runtime
• How to build your app
• How to deploy your app
• How to set up your dependencies
Building blocks of running an application
22. Creating application environment
• Web UI for developers
• Application catalog
• Generation of Kubernetes manifests
• Configures builds and CI
Services DB
• Go app living on clusters
• Creates k8s namespace
• Creates encryption keys
• Service accounts
Groundcontrol
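The slide lists generating Kubernetes manifests and creating a namespace per application. As a rough illustration of what that generation step could look like, here is a minimal sketch that builds a core v1 Namespace object and renders it as JSON. The `buildNamespace` and `render` helpers, the `<app>-<env>` naming scheme, and the label keys are assumptions made for this example; the talk does not show how Services DB or Groundcontrol do this internally.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Metadata and Namespace are hand-written stand-ins for the few fields of a
// core v1 Namespace that this sketch needs. They are not the official
// Kubernetes API types.
type Metadata struct {
	Name   string            `json:"name"`
	Labels map[string]string `json:"labels,omitempty"`
}

type Namespace struct {
	APIVersion string   `json:"apiVersion"`
	Kind       string   `json:"kind"`
	Metadata   Metadata `json:"metadata"`
}

// buildNamespace derives a namespace object for an app and environment.
// The "<app>-<env>" name and the label keys are hypothetical conventions.
func buildNamespace(app, env string) Namespace {
	return Namespace{
		APIVersion: "v1",
		Kind:       "Namespace",
		Metadata: Metadata{
			Name:   app + "-" + env,
			Labels: map[string]string{"app": app, "env": env},
		},
	}
}

// render serialises the object as indented JSON.
func render(ns Namespace) ([]byte, error) {
	return json.MarshalIndent(ns, "", "  ")
}

func main() {
	out, err := render(buildNamespace("shop", "production"))
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Println(string(out))
}
```

Keeping the object construction separate from the rendering makes the generated manifest easy to unit test before anything is sent to a cluster.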
24. • Buildkite acts as coordinator for Pipa
• Pipa agent builds Docker images
• Herokuish, Dockerfile, or custom build pipelines
Buildkite + PIPA
26. • Pass/fail results on deploys
• Pre-deploy for ConfigMap/Secrets
• Protecting namespaces
• Pluggable
kubernetes-deploy
27. • Create DNS records
• Fetch SSL certificates
• Create buckets, databases, services, etc.
• Set user-editable quotas
• Set security rules
• Delete bad nodes
Cloudbuddies
32. • APIs are well documented (if not super stable)
• Client libraries are high quality (at least client-go)
• We can both extend the functionality of existing concepts (deployments, endpoints, etc.) and create our own (CRDs)
• Distributed systems primitives (leader election, latches ...)
• These apps are pure Go, so they are unit testable and are run and deployed like normal apps
Extending k8s
33. An active state reconciliation process
• Watch desired and current state
• Try to mutate current state to match desired state
Kubernetes Controllers
for {
    desired := getDesiredState()
    current := getCurrentState()
    if desired != current {
        // Drive the current state toward the desired state.
        reconc(desired, current)
    }
}
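The loop above is pseudocode: the state types and the three functions are not defined on the slide. A minimal runnable sketch, assuming state is simply a replica count per app name, might look like the following. The `State` type, `diff`, and `reconcile` are hypothetical names for this example. Because Go maps cannot be compared with `!=`, the sketch computes an explicit diff instead.

```go
package main

import (
	"fmt"
	"sort"
)

// State is a hypothetical stand-in for whatever a controller manages:
// here, a replica count per application name.
type State map[string]int

// diff returns the sorted names whose current value differs from the desired
// value, including names that exist on only one side.
func diff(desired, current State) []string {
	var changed []string
	for name, want := range desired {
		if have, ok := current[name]; !ok || have != want {
			changed = append(changed, name)
		}
	}
	for name := range current {
		if _, ok := desired[name]; !ok {
			changed = append(changed, name)
		}
	}
	sort.Strings(changed)
	return changed
}

// reconcile mutates current until it matches desired and returns the number
// of entries it had to change.
func reconcile(desired, current State) int {
	changed := diff(desired, current)
	for _, name := range changed {
		if want, ok := desired[name]; ok {
			current[name] = want
		} else {
			delete(current, name)
		}
	}
	return len(changed)
}

func main() {
	desired := State{"web": 3, "worker": 2}
	current := State{"web": 1, "cron": 1}
	// A real controller loops forever; three passes are enough to show that
	// the loop converges and then has nothing left to do.
	for pass := 1; pass <= 3; pass++ {
		fmt.Printf("pass %d: %d change(s)\n", pass, reconcile(desired, current))
	}
}
```

The important property is idempotence: once current matches desired, further passes change nothing, so the loop is safe to run continuously.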
34. Workflow is always the same
• Authenticate to the cluster
• Create a watcher for events of specified type
• Implement functions to handle ADD/DELETE/UPDATE
• Profit!
Writing a controller
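The watch-and-handle part of that workflow can be sketched without a cluster. The example below uses only the standard library: a channel stands in for the watcher, and `Event`, `Handlers`, and `run` are hypothetical names. Authentication is skipped entirely, and a real controller would more likely use client-go's informers than hand-roll this dispatch.

```go
package main

import "fmt"

// EventType names the kind of change; the values follow the event types used
// by Kubernetes watches.
type EventType string

const (
	Added    EventType = "ADDED"
	Modified EventType = "MODIFIED"
	Deleted  EventType = "DELETED"
)

// Event is a hypothetical, simplified watch event carrying only an object name.
type Event struct {
	Type EventType
	Name string
}

// Handlers groups the three callbacks a controller implements.
type Handlers struct {
	OnAdd    func(name string)
	OnUpdate func(name string)
	OnDelete func(name string)
}

// run dispatches events to the matching handler until the channel is closed
// and returns how many events it handled. Unknown event types are skipped.
func run(events <-chan Event, h Handlers) int {
	handled := 0
	for ev := range events {
		switch ev.Type {
		case Added:
			h.OnAdd(ev.Name)
		case Modified:
			h.OnUpdate(ev.Name)
		case Deleted:
			h.OnDelete(ev.Name)
		default:
			continue
		}
		handled++
	}
	return handled
}

func main() {
	// A buffered channel plays the role of the watcher in this sketch.
	events := make(chan Event, 3)
	events <- Event{Added, "web"}
	events <- Event{Modified, "web"}
	events <- Event{Deleted, "web"}
	close(events)

	n := run(events, Handlers{
		OnAdd:    func(name string) { fmt.Println("add", name) },
		OnUpdate: func(name string) { fmt.Println("update", name) },
		OnDelete: func(name string) { fmt.Println("delete", name) },
	})
	fmt.Println("handled", n)
}
```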
35. • Extend native k8s objects with your own abstractions
• E.g. Memcache, Redis, Mail, MyFancyThingy
• Consumed by your own controllers, which read the configuration parameters and act on them
• Behave just like built-in k8s resources such as Deployment or Service
Custom Resource Definitions
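To make the idea concrete, here is a sketch of how a controller might read the configuration from a custom resource such as the Memcache example on the slide. Everything specific is an assumption for illustration: the `example.com/v1` API group, the `replicas` and `memoryMB` fields, and the `parseMemcache` helper are invented here, and the talk does not show Shopify's actual resource schemas.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Metadata holds the identifying fields every Kubernetes object carries.
type Metadata struct {
	Name      string `json:"name"`
	Namespace string `json:"namespace"`
}

// MemcacheSpec is a hypothetical spec: the parameters a controller would
// consume to provision a cache.
type MemcacheSpec struct {
	Replicas int `json:"replicas"`
	MemoryMB int `json:"memoryMB"`
}

// Memcache is a hypothetical custom resource with the usual object layout:
// apiVersion, kind, metadata, and spec.
type Memcache struct {
	APIVersion string       `json:"apiVersion"`
	Kind       string       `json:"kind"`
	Metadata   Metadata     `json:"metadata"`
	Spec       MemcacheSpec `json:"spec"`
}

// parseMemcache decodes a JSON object and rejects anything that is not a
// Memcache or that asks for fewer than one replica.
func parseMemcache(data []byte) (Memcache, error) {
	var m Memcache
	if err := json.Unmarshal(data, &m); err != nil {
		return m, err
	}
	if m.Kind != "Memcache" {
		return m, fmt.Errorf("unexpected kind %q", m.Kind)
	}
	if m.Spec.Replicas < 1 {
		return m, fmt.Errorf("replicas must be at least 1, got %d", m.Spec.Replicas)
	}
	return m, nil
}

func main() {
	raw := []byte(`{
	  "apiVersion": "example.com/v1",
	  "kind": "Memcache",
	  "metadata": {"name": "sessions", "namespace": "shop-production"},
	  "spec": {"replicas": 2, "memoryMB": 512}
	}`)
	m, err := parseMemcache(raw)
	if err != nil {
		fmt.Println("invalid resource:", err)
		return
	}
	fmt.Printf("provision %d x %d MB memcache for %s/%s\n",
		m.Spec.Replicas, m.Spec.MemoryMB, m.Metadata.Namespace, m.Metadata.Name)
}
```

A controller watching this resource type would call something like `parseMemcache` in its add and update handlers and then reconcile the real infrastructure toward the parsed spec.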
43. "The turn around time to getting an app
running on cloud platform is unreal,
you folks have really nailed it."
44. • How do my builds/deploys/everything work?
• How do I scale?
• How do I debug?
• Is this worth it?
Challenges for developers
45. • Giving up control over underlying infrastructure
• Container-only world and new tooling
• Customising the one platform to fit all needs
• Constant pressure to migrate apps
• Learning
Challenges for SREs
46. • Target hitting, e.g., 80% of use cases
• Create patterns and hide complexity (but don’t restrict)
• Educate
• Get people excited
• Be conscious of vendor lock-in
Takeaways for building your own PaaS
47. • Polishing our tooling
• Making sure our platform keeps scaling and stays stable
• Optimising cost
• Multi cloud
• Service mesh
Future