Ed Schouten from Prodrive Technologies gave a talk on building a Kubernetes cluster for a large organization. He discussed Prodrive's move to using Kubernetes to provide a standardized development environment for all teams. This included setting up a multi-tenant Kubernetes cluster with simplified role-based access control, automatic namespace and network policies for groups, and resource quotas. Future work may include integrating Kubernetes into more of Prodrive's products and open source contributions.
Building a Kubernetes cluster for a large organisation 101
1. Slide 1 of 38
Reference: Meetup Kubernetes
a passion for technology
Building a Kubernetes cluster
for a large organisation 101
Meetup Kubernetes
Ed Schouten
Prodrive Tech Talk
2018-06-27
2. Slide 2 of 38
Outline of today’s talk
Prodrive Technologies
- Who are we and what do we do?
- What do I do?
Kubernetes
- What is it?
Kubernetes at Prodrive Technologies
- Why do we want it?
- What kind of automation have we designed to go along with it?
- What are we going to work on next?
3. Slide 3 of 38
Prodrive Technologies
4. Slide 4 of 38
Ready-to-use products
Technology solutions
Manufacturing services
Design of electronics, software and mechanics
Manufacturing
Added value services
We focus on autonomous growth and a solid preservation of our company culture
5. Slide 5 of 38
Company presence
Prodrive Technologies Netherlands (HQ)
- R&D
- Manufacturing
- Service
- Sales
China
- Sales
- Supply Chain
- Manufacturing
- Service
USA
- Sales
- Manufacturing (Q4-2018)
Germany
- Sales
Israel
- Sales
Expansion of NL facility (Q2-2018)
USA manufacturing (2019)
Suzhou facility
6. Slide 6 of 38
‣ Area- and line-scan cameras
‣ Integrated optics, scintillators, FOPs
‣ VIS, NIR, DUV, LWIR, E-beam
‣ Particle Measurement Systems
‣ PMP (Prodrive Motion Platform)
‣ High Performance Actuators
‣ Various motion drives
‣ Mechatronic systems
‣ Variable frequency drives
‣ Wireless Energy (20kW)
‣ Power Converters (100kW+)
‣ Intelligent Power Distribution
‣ Motion control platforms
‣ Image processing platforms
‣ Industrial PC/server
‣ Home automation systems
‣ Gateways and cloud solutions
‣ Smart Thermostats
‣ Professional Audio/Video Distribution
‣ Controllers and I/O
‣ Displays, touch screens, HMI
‣ Automated production equipment
‣ AGVs
7. Slide 8 of 38
What do I do at Prodrive Technologies?
Part of the IT Services team.
• Responsible for Linux-based infrastructure (source control, builds).
• Involved in shaping the future direction of our internal IT.
Part of the High-End Computing (HEC) program.
• Software architecture for our industrial computing products.
8. Slide 9 of 38
Kubernetes
9. Slide 10 of 38
Kubernetes
Kubernetes is…
• A cluster management/orchestration tool.
• Based on the design of Borg (Google).
• Capable of scheduling Docker containers.
• Open Source: Apache 2.0 licensed.
10. Slide 11 of 38
Kubernetes in a nutshell
11. Slide 12 of 38
Managing a Kubernetes cluster: kubectl
kubectl: command line tool for managing clusters.
• Talks to the API server over a documented REST API.
• Exposes the cluster as objects of different classes.
• Each object has a JSON/YAML configuration/state.
• Easy-to-learn syntax: kubectl ${verb} ${class} ${instance}
• ‘Bottom half’ is also available as a Golang library.
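To make the link between the kubectl syntax and the REST API concrete, here is a minimal Python sketch of how a `${verb} ${class} ${instance}` triple roughly corresponds to an HTTP request. The lookup tables and the `rest_request` helper are hypothetical and cover only a handful of classes and verbs; real kubectl discovers resources from the API server instead of hard-coding them.

```python
# Hypothetical lookup table: API prefix and whether the class is namespaced.
RESOURCES = {
    "nodes": ("/api/v1", False),
    "pods": ("/api/v1", True),
    "services": ("/api/v1", True),
    "deployments": ("/apis/apps/v1", True),
}

# HTTP method used for two simple kubectl verbs.
METHODS = {"get": "GET", "delete": "DELETE"}


def rest_request(verb, resource, name, namespace="default"):
    """Return the (HTTP method, URL path) a kubectl call roughly maps to."""
    prefix, namespaced = RESOURCES[resource]
    scope = f"/namespaces/{namespace}" if namespaced else ""
    return METHODS[verb], f"{prefix}{scope}/{resource}/{name}"


if __name__ == "__main__":
    print(rest_request("get", "deployments", "nginx", "wiki-prod"))
```

Nodes are cluster-scoped, so their path carries no namespace segment, while pods and deployments do.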
12. Slide 13 of 38
Commonly used object classes
• Node: Linux (or Windows) server that is part of the cluster.
• Pod: Unit of work that is scheduled by Kubernetes on a node.
  - Consists of one or more containers. (‘Sidecars’)
• Deployment: Template for starting a fixed number of identical pods.
  - Good for starting stateless web frontends.
• Service: Places a (TCP) load balancer address in front of pods.
• Other interesting ones: StatefulSet, DaemonSet, CronJob.
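As an illustration of how a Deployment and a Service relate, the sketch below builds both objects as plain Python dictionaries that could be serialised to JSON or YAML. The `app` label, the helper names, and the replica count and port in the usage line are assumptions chosen for the example, not taken from the talk.

```python
import json


def deployment(name, image, replicas):
    """Deployment that keeps `replicas` identical pods running."""
    labels = {"app": name}  # assumed label convention
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


def service(name, port):
    """Service that load-balances TCP traffic to pods labelled app=<name>."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "selector": {"app": name},
            "ports": [{"protocol": "TCP", "port": port}],
        },
    }


if __name__ == "__main__":
    print(json.dumps(deployment("nginx", "nginx:1.15", 3), indent=2))
```

The Service finds its pods purely through the label selector, which is why the pod template and the Service must agree on the label.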
13. Slide 14 of 38
Example invocations of kubectl
• List all nodes (i.e., servers) in a cluster:
$ kubectl get nodes
• Start an Nginx pod that is automatically restarted/migrated:
$ kubectl create deployment nginx --image=nginx:1.15
• Edit the deployment to increase the number of replicas:
$ kubectl edit deployment nginx
• Delete a single pod that is misbehaving (and create a new one):
$ kubectl delete pod nginx-9db896598-pj5lz
14. Slide 15 of 38
Namespaces
Objects in the cluster are partitioned into namespaces.
• No referencing across namespaces.
  - Deployments start pods in the same namespace.
  - Services can only match pods in the same namespace.
• Use case #1: Production vs. development setups.
  - kubectl edit deployment -n wiki-prod nginx
  - kubectl edit deployment -n wiki-dev nginx
  - Reduces risk of accidentally mixing up traffic.
• Use case #2: Multi-tenant clusters.
• Exception: nodes don’t have a namespace.
15. Slide 16 of 38
API server authentication
From outside of the cluster:
• Basic authentication (username & password).
• SSL client certificates.
• OpenID Connect.
From within the cluster:
• Every pod runs under a ServiceAccount.
• Each pod is issued a signed ServiceAccount token, alongside the cluster’s CA certificate.
• Accessible through /var/run/secrets/… inside containers.
16. Slide 17 of 38
API server authorisation
Kubernetes implements Role-Based Access Control (RBAC):
• Subject: either an external user or a ServiceAccount.
• Role: gives a name to a set of rights: ${verb} ${class}.
  - ‘release-pusher’ = {‘edit deployments’, ‘get pods’}.
• RoleBinding: grants a subject access to a role in a namespace.
  - Grant ‘jenkins’ the role ‘release-pusher’ in namespace ‘wiki-prod’.
• ClusterRoleBinding: grants a subject access to a role in all namespaces.
  - Grant ‘prometheus’ the role ‘node-viewer’.
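The ‘release-pusher’ example can be sketched as two RBAC objects, built here as Python dictionaries. Two assumptions are baked in: ‘edit’ is not an RBAC verb, so it is approximated with get, patch and update, and ‘jenkins’ is modelled as a ServiceAccount in the same namespace, which the slide does not specify.

```python
RBAC_GROUP = "rbac.authorization.k8s.io"


def release_pusher_role(namespace):
    """Role naming the rights 'edit deployments' and 'get pods'."""
    return {
        "apiVersion": f"{RBAC_GROUP}/v1",
        "kind": "Role",
        "metadata": {"name": "release-pusher", "namespace": namespace},
        "rules": [
            # Assumption: 'edit' is approximated as get + patch + update.
            {"apiGroups": ["apps"], "resources": ["deployments"],
             "verbs": ["get", "patch", "update"]},
            {"apiGroups": [""], "resources": ["pods"], "verbs": ["get"]},
        ],
    }


def bind_role(namespace, role_name, service_account):
    """RoleBinding granting a ServiceAccount a role within one namespace."""
    return {
        "apiVersion": f"{RBAC_GROUP}/v1",
        "kind": "RoleBinding",
        "metadata": {"name": f"{service_account}-{role_name}",
                     "namespace": namespace},
        "subjects": [{"kind": "ServiceAccount", "name": service_account,
                      "namespace": namespace}],
        "roleRef": {"apiGroup": RBAC_GROUP, "kind": "Role", "name": role_name},
    }


if __name__ == "__main__":
    print(bind_role("wiki-prod", "release-pusher", "jenkins"))
```

The binding only names the role; the rights themselves live in the Role, so one role can be bound to many subjects.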
17. Slide 18 of 38
Authorisation class diagram
18. Slide 19 of 38
Kubernetes at Prodrive Technologies
19. Slide 20 of 38
The existing Prodrive development infrastructure
A whole fleet of systems that people currently log in to over SSH/X11/….
• Every project has different requirements for development tools.
• Many tickets for IT to install software.
• Separate systems to deal with conflicting version requirements.
• Strong imbalance in system utilisation.
• Lack of reproducible build environments when reviving old projects.
20. Slide 21 of 38
The future Prodrive development infrastructure?
1. IT creates a Docker image called ‘Prodrivian’.
- Debian with common Prodrive configuration on top.
2. Development team inherits and creates a custom Docker image.
- Adds project-specific tools (cross compilers, commercial tools) on top.
3. Development team fires up containers on Kubernetes.
- For interactive use.
- Through CI systems for automated builds.
21. Slide 22 of 38
Crux
Goal:
• Develop a centralised cluster usable by all developers.
• For batch and interactive use, but also for running services.
Problem:
• Lots of material on using single-tenant Kubernetes clusters.
• Some material on multi-tenant clusters.
• No material on easy, maintainable multi-tenant clusters.
22. Slide 23 of 38
Authorisation
Problem: Full Kubernetes RBAC is too hard to get right.
Solution: Reduce it to a simplified authorisation model.
• Every employee has a personal namespace.
  - Rights for user == rights of namespace’s default service account.
  - kubectl on your laptop behaves the same way as on the cluster.
• Employees can create groups.
  - Every group automatically gets a namespace in the cluster.
  - Group members have read/write access to the namespace.
  - Transitive: edsch ➜ it-services-linux ➜ wiki-{prod,qa,dev}
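The transitive rule can be read as a reachability problem on a membership graph. The sketch below is one possible way to compute the namespaces a user ends up with, assuming a hypothetical `memberships` mapping from each member (user or group) to the groups it belongs to. It is not the CorpDB implementation, which the talk does not show.

```python
def accessible_namespaces(user, memberships):
    """Return the personal namespace plus every group namespace reachable
    by following memberships transitively."""
    seen = set()
    stack = [user]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guards against cycles between groups
        seen.add(current)
        stack.extend(memberships.get(current, ()))
    return seen


if __name__ == "__main__":
    # Hypothetical data mirroring the example chain from the slide.
    memberships = {
        "edsch": ["it-services-linux"],
        "it-services-linux": ["wiki-prod", "wiki-qa", "wiki-dev"],
    }
    print(sorted(accessible_namespaces("edsch", memberships)))
```

Because every user and every group maps to a namespace of the same name, the set of reachable names doubles as the set of accessible namespaces.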
23. Slide 24 of 38
Group management tool
24. Slide 25 of 38
RBAC configuration for a group
$ kubectl describe clusterrole corpdb:group-membership
…
configmaps              [create delete get list patch update watch]
cronjobs.batch          [create delete get list patch update watch]
deployments.apps        [create delete get list patch update watch]
deployments.extensions  [create delete get list patch update watch]
…
$ kubectl describe rolebinding -n wiki-prod corpdb:group-membership
…
User            https://login.prodrive-technologies.com/#edsch
ServiceAccount  default  edsch
…
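The describe output suggests that each group member appears twice in the binding: once as an external user and once as the default ServiceAccount of their personal namespace. A sketch of how such a RoleBinding could be generated follows. The `group_role_binding` helper is hypothetical, and the user-name prefix is copied from the output above.

```python
RBAC_GROUP = "rbac.authorization.k8s.io"
USER_PREFIX = "https://login.prodrive-technologies.com/#"


def group_role_binding(namespace, members):
    """RoleBinding giving each member, and the default ServiceAccount of
    each member's personal namespace, the group-membership ClusterRole."""
    subjects = []
    for member in members:
        subjects.append({"kind": "User", "apiGroup": RBAC_GROUP,
                         "name": USER_PREFIX + member})
        subjects.append({"kind": "ServiceAccount", "name": "default",
                         "namespace": member})
    return {
        "apiVersion": f"{RBAC_GROUP}/v1",
        "kind": "RoleBinding",
        "metadata": {"name": "corpdb:group-membership",
                     "namespace": namespace},
        "subjects": subjects,
        "roleRef": {"apiGroup": RBAC_GROUP, "kind": "ClusterRole",
                    "name": "corpdb:group-membership"},
    }


if __name__ == "__main__":
    print(group_role_binding("wiki-prod", ["edsch"]))
```

Binding a ClusterRole through a namespaced RoleBinding lets one shared rule set be reused while still limiting its effect to the group's namespace.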
25. Slide 26 of 38
Network policies
Problem: By default, all pods can connect to each other.
• Ideally, should be restricted by having auth at the application level (e.g., let everything use credentials or SSL client/server certs).
• Too complex to realise at our scale for now.
• (Interesting project: Istio)
Solution: Also use groups to automatically set up in-cluster firewalling.
• Transitive: edsch ➜ it-services-linux ➜ wiki-{prod,qa,dev}
26. Slide 27 of 38
Network policy for a group
$ kubectl describe namespace edsch
Labels: …
        corpdb-can-access-namespace-wiki-prod=true
        …
$ kubectl describe networkpolicy -n wiki-prod corpdb-allow-members-ingress
Spec:
  Allowing ingress traffic:
    To Port: <any> (traffic allowed to all ports)
    From NamespaceSelector: corpdb-can-access-namespace-wiki-prod=true
…
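A NetworkPolicy that would produce output like the above can be sketched as follows. The label and policy names come from the describe output. The empty `podSelector`, which selects every pod in the namespace, is an assumption, since that part of the output is elided.

```python
def member_ingress_policy(namespace):
    """NetworkPolicy admitting ingress from namespaces labelled as members."""
    label = f"corpdb-can-access-namespace-{namespace}"
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "corpdb-allow-members-ingress",
                     "namespace": namespace},
        "spec": {
            "podSelector": {},  # assumption: applies to all pods
            "policyTypes": ["Ingress"],
            # No 'ports' key, so traffic to any port is allowed.
            "ingress": [{"from": [
                {"namespaceSelector": {"matchLabels": {label: "true"}}},
            ]}],
        },
    }


if __name__ == "__main__":
    print(member_ingress_policy("wiki-prod"))
```

With this shape, granting or revoking network access is only a matter of adding or removing a label on the member's namespace. The policy itself never changes.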
27. Slide 28 of 38
Resource quotas #1
Problem: Nodes in cluster become overloaded due to containers consuming too much CPU & memory.
Solution: Configure resource limits.
• Give employees a ‘freebie quota’.
• For group namespaces, have a resource allocation procedure.
28. Slide 29 of 38
Resource quota for a namespace
$ kubectl describe resourcequota -n edsch corpdb
Name:       corpdb
Namespace:  edsch
Resource         Used  Hard
--------         ----  ----
requests.cpu     0     1
requests.memory  0     1Gi
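To show what the ‘Used’ and ‘Hard’ columns mean in practice, here is a small sketch of the arithmetic behind a quota check. It parses only the quantity forms that appear in this talk (plain integers or decimals, the `m` suffix, and `Ki`/`Mi`/`Gi`). The helper names are hypothetical and this is not the API server's actual implementation.

```python
MEMORY_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def cpu_millis(quantity):
    """Convert a CPU quantity such as '1' or '250m' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


def memory_bytes(quantity):
    """Convert a memory quantity such as '16Mi' or '1Gi' to bytes."""
    for suffix, factor in MEMORY_UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[:-len(suffix)]) * factor
    return int(quantity)


def fits_quota(used, requested, hard):
    """True if used + requested stays within the hard limits."""
    cpu = cpu_millis(used["requests.cpu"]) + cpu_millis(requested["requests.cpu"])
    mem = (memory_bytes(used["requests.memory"])
           + memory_bytes(requested["requests.memory"]))
    return (cpu <= cpu_millis(hard["requests.cpu"])
            and mem <= memory_bytes(hard["requests.memory"]))


if __name__ == "__main__":
    hard = {"requests.cpu": "1", "requests.memory": "1Gi"}
    used = {"requests.cpu": "0", "requests.memory": "0"}
    print(fits_quota(used, {"requests.cpu": "500m",
                            "requests.memory": "512Mi"}, hard))
```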
29. Slide 30 of 38
Resource quotas #2
Problem: Providing resource limits for containers is optional.
Omitting them will create containers that are exempt from quotas.
Solution: Automatically inject resource limits for such containers.
• The built-in LimitRanger admission controller can do this.
30. Slide 31 of 38
LimitRange configuration
$ kubectl describe limitrange -n edsch corpdb
Name:       corpdb
Namespace:  edsch
Type       Resource  Min   Max  Default Request  Default Limit  Max Limit/Request Ratio
----       --------  ---   ---  ---------------  -------------  -----------------------
Container  memory    16Mi  -    16Mi             64Mi           4
Container  cpu       10m   -    10m              40m            4
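The effect of these defaults on a container that declares no resources can be sketched as below. This is a simplified model of the defaulting step only. The real LimitRanger also enforces the minimum and the limit/request ratio and has extra rules when only one of the two values is given. The `apply_defaults` helper is hypothetical.

```python
# Defaults taken from the LimitRange shown above.
DEFAULT_REQUEST = {"memory": "16Mi", "cpu": "10m"}
DEFAULT_LIMIT = {"memory": "64Mi", "cpu": "40m"}


def apply_defaults(container):
    """Return a copy of the container spec with missing requests and
    limits filled in; explicitly set values are left untouched."""
    resources = container.get("resources", {})
    requests = {**DEFAULT_REQUEST, **resources.get("requests", {})}
    limits = {**DEFAULT_LIMIT, **resources.get("limits", {})}
    return {**container, "resources": {"requests": requests, "limits": limits}}


if __name__ == "__main__":
    print(apply_defaults({"name": "nginx", "image": "nginx:1.15"}))
```

Once every container carries a request, it is counted against the namespace's ResourceQuota like any other.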
31. Slide 32 of 38
Preventing security foot-shooting
Problem: We want to allow people to create services that are available inside the cluster, but usually not ones that are public.
• Type ‘ClusterIP’ vs. ‘LoadBalancer’.
• Kubernetes RBAC is too weak to solve this.
Solution: Set up a ValidatingAdmissionWebhook.
• Lets the API server send ‘kubectl create service’ requests through a helper process using HTTP calls.
• Helper process can accept/reject the request.
• Groups that should create load balancers can be whitelisted.
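The decision logic of such a helper process could look roughly like the sketch below. It takes a decoded AdmissionReview request body and returns the response body. The HTTP server and TLS setup are omitted, and the whitelist is modelled as a plain set of namespace names, which is an assumption about how groups would be identified.

```python
def review_service(admission_review, whitelisted_namespaces):
    """Accept a Service unless it is of type LoadBalancer and lives in a
    namespace that has not been whitelisted."""
    request = admission_review["request"]
    spec = request["object"].get("spec", {})
    service_type = spec.get("type", "ClusterIP")  # ClusterIP is the default
    allowed = (service_type != "LoadBalancer"
               or request["namespace"] in whitelisted_namespaces)
    response = {"uid": request["uid"], "allowed": allowed}
    if not allowed:
        response["status"] = {
            "message": "LoadBalancer services are not permitted in "
                       f"namespace {request['namespace']}",
        }
    return {
        "apiVersion": admission_review["apiVersion"],
        "kind": "AdmissionReview",
        "response": response,
    }


if __name__ == "__main__":
    review = {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "request": {"uid": "1234", "namespace": "edsch",
                    "object": {"spec": {"type": "LoadBalancer"}}},
    }
    print(review_service(review, {"wiki-prod"}))
```

The response must echo the request's `uid` so the API server can match the verdict to the pending request.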
33. Slide 34 of 38
External access
Problem: How can users easily configure kubectl on their system?
• Setting it up initially (API server hostname, default namespace).
• Obtaining a temporary access token.
Solution: Build a ‘kubeaccess’ web page.
• Generates commands that can be copy-pasted to a terminal.
• Generates OpenID Connect tokens with 20 hour validity on the fly.
• Relies on an OpenID Identity Provider on the same dataset.
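The commands such a page generates might resemble the output of the sketch below, which uses the standard `kubectl config` subcommands. The cluster name `prodrive`, the server URL in the usage example, and the choice of the personal namespace as the default are assumptions. The actual kubeaccess output is not shown in the talk.

```python
import shlex


def kubectl_setup_commands(server, user, token, cluster="prodrive"):
    """Return copy-pasteable commands that point kubectl at the cluster
    and default it to the user's personal namespace."""
    q = shlex.quote  # protects values against shell interpretation
    return [
        f"kubectl config set-cluster {q(cluster)} --server={q(server)}",
        f"kubectl config set-credentials {q(user)} --token={q(token)}",
        f"kubectl config set-context {q(cluster)} --cluster={q(cluster)} "
        f"--user={q(user)} --namespace={q(user)}",
        f"kubectl config use-context {q(cluster)}",
    ]


if __name__ == "__main__":
    # Hypothetical server URL and token, for illustration only.
    for line in kubectl_setup_commands(
            "https://k8s.example.com:6443", "edsch", "TOKEN"):
        print(line)
```

Since the token expires, only the `set-credentials` line needs to be rerun on later visits.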
34. Slide 35 of 38
Kubeaccess web page
35. Slide 36 of 38
CorpDB overview
36. Slide 37 of 38
Future work
• Distributed storage: Ceph/Rook.
  - Extend CorpDB to track storage resource limits as well.
  - Prodrive globalisation effort: automatically set up replicated storage?
• Continuous Build/Integration: Let Bamboo spawn pods.
  - Atlassian offers a per-build-container extension.
• Integrate Kubernetes into High-End Computing products.
  - Scale of computing needed in industrial automation is increasing.
• Work along with the Open Source community.
  - Contribute in both directions (e.g., Traefik OIDC support).
37. Slide 38 of 38
Help!
• Lots of awesome projects at Prodrive.
• Nice company culture.
• Chat with us after this talk!
• https://prodrive-technologies.com/careers/
38. Slide 39 of 38
Prodrive Technologies
T +31 40 2676200
E contact@prodrive-technologies.com
I www.prodrive-technologies.com