SlideShare ist ein Scribd-Unternehmen logo
1 von 25
Downloaden Sie, um offline zu lesen
Clustree production on GKE
Romain Vrignaud - VP Engineering - romain@clustree.com
We
• Full python microservices (~30 / env)
• Elasticsearch
• REST API for synchronous calls
• Rabbitmq for asynchronous calls
Clustree stack
• 12 factor
• Git commit as docker tag
• docker-compose and kubernetes
• develop branch vs master branch
Engineering practices
• ~ 12 people in tech team
• Developers in charge of their app up to production
• Infrastructure team provides:
• Tools
• Guidelines
• Expertise
Organization
• Why GKE ?
• GKE Cluster (15 nodes)
• ~280 pods
• 200GB / 225 GB of memory allocated
• Inside Kubernetes
• All stateless applications for all environments
• All stateful applications for integration environments
• Outside Kubernetes
• staging / production stateful app
• Infrastructure
• Spark
Infrastructure
• Namespaces to isolate environments
• RC everywhere (even for single pods)
• Service discovery
• Secrets
• Volumes
• Jobs
Features used
Production tooling
• 1 heapster with Google Cloud Monitoring sink (not used)
• 1 heapster with an influxdb sink
• telegraf
• prometheus inputs for all nodes
• custom python script to gather cluster wide metrics
• 1 telegraf instance running on each node
• Influxdb 0.10.x and Grafana
Metrics
• 1 fluentd per node to push to Google Cloud Logging (not used)
• 200 MB per node
• 1 logstash per node to push to elasticsearch
• 500 MB per node …
• kubernetes plugin (container name, namespaces, pod, rc, etc..)
• interlaced logs => structured logs !
• OOM pattern detection (ram limits are difficult to find !)
Logging system
• Auto healing cluster
• Pods hooks + nagios + consul + consul-template => failed
• Sentry
• Still need to decide push vs pull monitoring
• pull : prometheus
• push : kapacitor / watcher
• Google Cloud Monitoring ?
• How to monitor kubernetes events ?
Monitoring
• Migration 1.0 -> 1.1 : DNS discovery outage (#18171)
• Loss of 1/3 of node cluster (… yesterday) (#13346)
• Volumes (#14642)
• Memory pressure on nodes
A handful of issues
# references github issues numbers
• private services access outside cluster (#14545)
• No public IP from public Load Balancers
• IAM
• network isolation
• kubectl exec (timeout / TERM) (#12179, #13585)
• node resizing on GKE
A few painful points
• Spawn a new environment in a few minutes (to test a new
feature)
• Super easy rolling-upgrade and rollback
• Fully declarative infrastructure
But a lot of joy !
• Ubernetes
• ScheduledJob
• PetSet
• HPA on custom metrics
• Network policy
The future is bright
• Google-container ML
• Slack
• Github (Read the proposals !)
• GKE support
Great community
• So much to do / discover / learn but really exciting !
• Docker evolutions are way less important for us than kubernetes new
features
• Kubernetes is really a powerful abstraction and enable team autonomy
and velocity
• Still a young project / ecosystem but evolving really quickly
Conclusion
Thank you
• Logstash kubernetes plugin : https://github.com/vaijab/logstash-filter-kubernetes
• Network policy : http://www.projectcalico.org/a-sneak-peek-at-kubernetes-policy/
• Kubernetes proposals : https://github.com/kubernetes/kubernetes/tree/master/docs/proposals
References

Weitere ähnliche Inhalte

Was ist angesagt?

Understanding Kubernetes
Understanding KubernetesUnderstanding Kubernetes
Understanding KubernetesTu Pham
 
Kubernetes Architecture and Introduction
Kubernetes Architecture and IntroductionKubernetes Architecture and Introduction
Kubernetes Architecture and IntroductionStefan Schimanski
 
Kubernetes in 30 minutes (2017/03/10)
Kubernetes in 30 minutes (2017/03/10)Kubernetes in 30 minutes (2017/03/10)
Kubernetes in 30 minutes (2017/03/10)lestrrat
 
Scaling Microservices with Kubernetes
Scaling Microservices with KubernetesScaling Microservices with Kubernetes
Scaling Microservices with KubernetesDeivid Hahn Fração
 
Kubernetes intro public - kubernetes meetup 4-21-2015
Kubernetes intro   public - kubernetes meetup 4-21-2015Kubernetes intro   public - kubernetes meetup 4-21-2015
Kubernetes intro public - kubernetes meetup 4-21-2015Rohit Jnagal
 
Kubernetes Architecture and Introduction – Paris Kubernetes Meetup
Kubernetes Architecture and Introduction – Paris Kubernetes MeetupKubernetes Architecture and Introduction – Paris Kubernetes Meetup
Kubernetes Architecture and Introduction – Paris Kubernetes MeetupStefan Schimanski
 
Microservices , Docker , CI/CD , Kubernetes Seminar - Sri Lanka
Microservices , Docker , CI/CD , Kubernetes Seminar - Sri Lanka Microservices , Docker , CI/CD , Kubernetes Seminar - Sri Lanka
Microservices , Docker , CI/CD , Kubernetes Seminar - Sri Lanka Mario Ishara Fernando
 
Docker Madison, Introduction to Kubernetes
Docker Madison, Introduction to KubernetesDocker Madison, Introduction to Kubernetes
Docker Madison, Introduction to KubernetesTimothy St. Clair
 
Kubernetes Introduction
Kubernetes IntroductionKubernetes Introduction
Kubernetes IntroductionEric Gustafson
 
Kubernetes Walk Through from Technical View
Kubernetes Walk Through from Technical ViewKubernetes Walk Through from Technical View
Kubernetes Walk Through from Technical ViewLei (Harry) Zhang
 
Kubernetes automation in production
Kubernetes automation in productionKubernetes automation in production
Kubernetes automation in productionPaul Bakker
 
Kubernetes architecture
Kubernetes architectureKubernetes architecture
Kubernetes architectureJanakiram MSV
 
Kubernetes meetup 101
Kubernetes meetup 101Kubernetes meetup 101
Kubernetes meetup 101Jakir Patel
 
Tectonic Summit 2016: Kubernetes 1.5 and Beyond
Tectonic Summit 2016: Kubernetes 1.5 and BeyondTectonic Summit 2016: Kubernetes 1.5 and Beyond
Tectonic Summit 2016: Kubernetes 1.5 and BeyondCoreOS
 
Platform Orchestration with Kubernetes and Docker
Platform Orchestration with Kubernetes and DockerPlatform Orchestration with Kubernetes and Docker
Platform Orchestration with Kubernetes and DockerJulian Strobl
 
Building Clustered Applications with Kubernetes and Docker
Building Clustered Applications with Kubernetes and DockerBuilding Clustered Applications with Kubernetes and Docker
Building Clustered Applications with Kubernetes and DockerSteve Watt
 
Introduction kubernetes 2017_12_24
Introduction kubernetes 2017_12_24Introduction kubernetes 2017_12_24
Introduction kubernetes 2017_12_24Sam Zheng
 

Was ist angesagt? (20)

Understanding Kubernetes
Understanding KubernetesUnderstanding Kubernetes
Understanding Kubernetes
 
Kubernetes Architecture and Introduction
Kubernetes Architecture and IntroductionKubernetes Architecture and Introduction
Kubernetes Architecture and Introduction
 
Kubernetes Basics
Kubernetes BasicsKubernetes Basics
Kubernetes Basics
 
Kubernetes 101
Kubernetes 101Kubernetes 101
Kubernetes 101
 
Kubernetes in 30 minutes (2017/03/10)
Kubernetes in 30 minutes (2017/03/10)Kubernetes in 30 minutes (2017/03/10)
Kubernetes in 30 minutes (2017/03/10)
 
Scaling Microservices with Kubernetes
Scaling Microservices with KubernetesScaling Microservices with Kubernetes
Scaling Microservices with Kubernetes
 
Kubernetes intro public - kubernetes meetup 4-21-2015
Kubernetes intro   public - kubernetes meetup 4-21-2015Kubernetes intro   public - kubernetes meetup 4-21-2015
Kubernetes intro public - kubernetes meetup 4-21-2015
 
Kubernetes Architecture and Introduction – Paris Kubernetes Meetup
Kubernetes Architecture and Introduction – Paris Kubernetes MeetupKubernetes Architecture and Introduction – Paris Kubernetes Meetup
Kubernetes Architecture and Introduction – Paris Kubernetes Meetup
 
Microservices , Docker , CI/CD , Kubernetes Seminar - Sri Lanka
Microservices , Docker , CI/CD , Kubernetes Seminar - Sri Lanka Microservices , Docker , CI/CD , Kubernetes Seminar - Sri Lanka
Microservices , Docker , CI/CD , Kubernetes Seminar - Sri Lanka
 
Docker Madison, Introduction to Kubernetes
Docker Madison, Introduction to KubernetesDocker Madison, Introduction to Kubernetes
Docker Madison, Introduction to Kubernetes
 
Introduction to Kubernetes
Introduction to KubernetesIntroduction to Kubernetes
Introduction to Kubernetes
 
Kubernetes Introduction
Kubernetes IntroductionKubernetes Introduction
Kubernetes Introduction
 
Kubernetes Walk Through from Technical View
Kubernetes Walk Through from Technical ViewKubernetes Walk Through from Technical View
Kubernetes Walk Through from Technical View
 
Kubernetes automation in production
Kubernetes automation in productionKubernetes automation in production
Kubernetes automation in production
 
Kubernetes architecture
Kubernetes architectureKubernetes architecture
Kubernetes architecture
 
Kubernetes meetup 101
Kubernetes meetup 101Kubernetes meetup 101
Kubernetes meetup 101
 
Tectonic Summit 2016: Kubernetes 1.5 and Beyond
Tectonic Summit 2016: Kubernetes 1.5 and BeyondTectonic Summit 2016: Kubernetes 1.5 and Beyond
Tectonic Summit 2016: Kubernetes 1.5 and Beyond
 
Platform Orchestration with Kubernetes and Docker
Platform Orchestration with Kubernetes and DockerPlatform Orchestration with Kubernetes and Docker
Platform Orchestration with Kubernetes and Docker
 
Building Clustered Applications with Kubernetes and Docker
Building Clustered Applications with Kubernetes and DockerBuilding Clustered Applications with Kubernetes and Docker
Building Clustered Applications with Kubernetes and Docker
 
Introduction kubernetes 2017_12_24
Introduction kubernetes 2017_12_24Introduction kubernetes 2017_12_24
Introduction kubernetes 2017_12_24
 

Ähnlich wie Clustree production on GKE with 280 pods across 15 nodes

Kubernetes Manchester - 6th December 2018
Kubernetes Manchester - 6th December 2018Kubernetes Manchester - 6th December 2018
Kubernetes Manchester - 6th December 2018David Stockton
 
Kubernetes at NU.nl (Kubernetes meetup 2019-09-05)
Kubernetes at NU.nl   (Kubernetes meetup 2019-09-05)Kubernetes at NU.nl   (Kubernetes meetup 2019-09-05)
Kubernetes at NU.nl (Kubernetes meetup 2019-09-05)Tibo Beijen
 
Sergey Dzyuban "To Build My Own Cloud with Blackjack…"
Sergey Dzyuban "To Build My Own Cloud with Blackjack…"Sergey Dzyuban "To Build My Own Cloud with Blackjack…"
Sergey Dzyuban "To Build My Own Cloud with Blackjack…"Fwdays
 
SaltConf14 - Eric johnson, Google - Orchestrating Google Compute Engine with ...
SaltConf14 - Eric johnson, Google - Orchestrating Google Compute Engine with ...SaltConf14 - Eric johnson, Google - Orchestrating Google Compute Engine with ...
SaltConf14 - Eric johnson, Google - Orchestrating Google Compute Engine with ...SaltStack
 
Kubernetes for HCL Connections Component Pack - Build or Buy?
Kubernetes for HCL Connections Component Pack - Build or Buy?Kubernetes for HCL Connections Component Pack - Build or Buy?
Kubernetes for HCL Connections Component Pack - Build or Buy?Martin Schmidt
 
Engage 2020 - Kubernetes for HCL Connections Component Pack - Build or Buy?
Engage 2020 - Kubernetes for HCL Connections Component Pack - Build or Buy?Engage 2020 - Kubernetes for HCL Connections Component Pack - Build or Buy?
Engage 2020 - Kubernetes for HCL Connections Component Pack - Build or Buy?panagenda
 
Monitoring kubernetes across data center and cloud
Monitoring kubernetes across data center and cloudMonitoring kubernetes across data center and cloud
Monitoring kubernetes across data center and cloudDatadog
 
Load balancing in the SRE way
Load balancing in the SRE wayLoad balancing in the SRE way
Load balancing in the SRE wayShawn Zhu
 
Container orchestration and microservices world
Container orchestration and microservices worldContainer orchestration and microservices world
Container orchestration and microservices worldKarol Chrapek
 
The Kubernetes Operator Pattern - ContainerConf Nov 2017
The Kubernetes Operator Pattern - ContainerConf Nov 2017The Kubernetes Operator Pattern - ContainerConf Nov 2017
The Kubernetes Operator Pattern - ContainerConf Nov 2017Jakob Karalus
 
The Operator Pattern - Managing Stateful Services in Kubernetes
The Operator Pattern - Managing Stateful Services in KubernetesThe Operator Pattern - Managing Stateful Services in Kubernetes
The Operator Pattern - Managing Stateful Services in KubernetesQAware GmbH
 
John adams talk cloudy
John adams   talk cloudyJohn adams   talk cloudy
John adams talk cloudyJohn Adams
 
How to Puppetize Google Cloud Platform - PuppetConf 2014
How to Puppetize Google Cloud Platform - PuppetConf 2014How to Puppetize Google Cloud Platform - PuppetConf 2014
How to Puppetize Google Cloud Platform - PuppetConf 2014Puppet
 
Docker in Production: How RightScale Delivers Cloud Applications
Docker in Production: How RightScale Delivers Cloud ApplicationsDocker in Production: How RightScale Delivers Cloud Applications
Docker in Production: How RightScale Delivers Cloud ApplicationsRightScale
 
Lessons learned with kubernetes in production at PlayPass
Lessons learned with kubernetes in productionat PlayPassLessons learned with kubernetes in productionat PlayPass
Lessons learned with kubernetes in production at PlayPassPeter Vandenabeele
 
Cloud native applications
Cloud native applicationsCloud native applications
Cloud native applicationsreallavalamp
 
Queick: A Simple Job Queue System for Python
Queick: A Simple Job Queue System for PythonQueick: A Simple Job Queue System for Python
Queick: A Simple Job Queue System for PythonRyota Suenaga
 

Ähnlich wie Clustree production on GKE with 280 pods across 15 nodes (20)

Kubernetes Manchester - 6th December 2018
Kubernetes Manchester - 6th December 2018Kubernetes Manchester - 6th December 2018
Kubernetes Manchester - 6th December 2018
 
Kubernetes at NU.nl (Kubernetes meetup 2019-09-05)
Kubernetes at NU.nl   (Kubernetes meetup 2019-09-05)Kubernetes at NU.nl   (Kubernetes meetup 2019-09-05)
Kubernetes at NU.nl (Kubernetes meetup 2019-09-05)
 
Sergey Dzyuban "To Build My Own Cloud with Blackjack…"
Sergey Dzyuban "To Build My Own Cloud with Blackjack…"Sergey Dzyuban "To Build My Own Cloud with Blackjack…"
Sergey Dzyuban "To Build My Own Cloud with Blackjack…"
 
Vinetalk: The missing piece for cluster managers to enable accelerator sharing
Vinetalk: The missing piece for cluster managers to enable accelerator sharingVinetalk: The missing piece for cluster managers to enable accelerator sharing
Vinetalk: The missing piece for cluster managers to enable accelerator sharing
 
SaltConf14 - Eric johnson, Google - Orchestrating Google Compute Engine with ...
SaltConf14 - Eric johnson, Google - Orchestrating Google Compute Engine with ...SaltConf14 - Eric johnson, Google - Orchestrating Google Compute Engine with ...
SaltConf14 - Eric johnson, Google - Orchestrating Google Compute Engine with ...
 
Kubernetes for HCL Connections Component Pack - Build or Buy?
Kubernetes for HCL Connections Component Pack - Build or Buy?Kubernetes for HCL Connections Component Pack - Build or Buy?
Kubernetes for HCL Connections Component Pack - Build or Buy?
 
Engage 2020 - Kubernetes for HCL Connections Component Pack - Build or Buy?
Engage 2020 - Kubernetes for HCL Connections Component Pack - Build or Buy?Engage 2020 - Kubernetes for HCL Connections Component Pack - Build or Buy?
Engage 2020 - Kubernetes for HCL Connections Component Pack - Build or Buy?
 
Monitoring kubernetes across data center and cloud
Monitoring kubernetes across data center and cloudMonitoring kubernetes across data center and cloud
Monitoring kubernetes across data center and cloud
 
Load balancing in the SRE way
Load balancing in the SRE wayLoad balancing in the SRE way
Load balancing in the SRE way
 
Container orchestration and microservices world
Container orchestration and microservices worldContainer orchestration and microservices world
Container orchestration and microservices world
 
The Kubernetes Operator Pattern - ContainerConf Nov 2017
The Kubernetes Operator Pattern - ContainerConf Nov 2017The Kubernetes Operator Pattern - ContainerConf Nov 2017
The Kubernetes Operator Pattern - ContainerConf Nov 2017
 
The Operator Pattern - Managing Stateful Services in Kubernetes
The Operator Pattern - Managing Stateful Services in KubernetesThe Operator Pattern - Managing Stateful Services in Kubernetes
The Operator Pattern - Managing Stateful Services in Kubernetes
 
John adams talk cloudy
John adams   talk cloudyJohn adams   talk cloudy
John adams talk cloudy
 
How to Puppetize Google Cloud Platform - PuppetConf 2014
How to Puppetize Google Cloud Platform - PuppetConf 2014How to Puppetize Google Cloud Platform - PuppetConf 2014
How to Puppetize Google Cloud Platform - PuppetConf 2014
 
Docker in Production: How RightScale Delivers Cloud Applications
Docker in Production: How RightScale Delivers Cloud ApplicationsDocker in Production: How RightScale Delivers Cloud Applications
Docker in Production: How RightScale Delivers Cloud Applications
 
Lessons learned with kubernetes in production at PlayPass
Lessons learned with kubernetes in productionat PlayPassLessons learned with kubernetes in productionat PlayPass
Lessons learned with kubernetes in production at PlayPass
 
Cloud native applications
Cloud native applicationsCloud native applications
Cloud native applications
 
Ansible docker
Ansible dockerAnsible docker
Ansible docker
 
Queick: A Simple Job Queue System for Python
Queick: A Simple Job Queue System for PythonQueick: A Simple Job Queue System for Python
Queick: A Simple Job Queue System for Python
 
Moby KubeCon 2017
Moby KubeCon 2017Moby KubeCon 2017
Moby KubeCon 2017
 

Kürzlich hochgeladen

Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...Health
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Steffen Staab
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsAndolasoft Inc
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 

Kürzlich hochgeladen (20)

Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 

Clustree production on GKE with 280 pods across 15 nodes

  • 1. Clustree production on GKE Romain Vrignaud - VP Engineering - romain@clustree.com
  • 2. We
  • 3. • Full python microservices (~30 / env) • Elasticsearch • REST API for synchronous calls • Rabbitmq for asynchronous calls Clustree stack
  • 4.
  • 5. • 12 factor • Git commit as docker tag • docker-compose and kubernetes • develop branch vs master branch Engineering practices
  • 6.
  • 7. • ~ 12 people in tech team • Developers in charge of their app up to production • Infrastructure team provides: • Tools • Guidelines • Expertise Organization
  • 8. • Why GKE ? • GKE Cluster (15 nodes) • ~280 pods • 200GB / 225 GB of memory allocated • Inside Kubernetes • All stateless applications for all environments • All stateful applications for integration environments • Outside Kubernetes • staging / production stateful app • Infrastructure • Spark Infrastructure
  • 9.
  • 10. • Namespaces to isolate environments • RC everywhere (even for single pods) • Service discovery • Secrets • Volumes • Jobs Features used
  • 12. • 1 heapster with Google Cloud Monitoring sink (not used) • 1 heapster with an influxdb sink • telegraf • prometheus inputs for all nodes • custom python script to gather cluster wide metrics • 1 telegraf instance running on each node • Influxdb 0.10.x and Grafana Metrics
  • 13.
  • 14.
  • 15. • 1 fluentd per node to push to Google Cloud Logging (not used) • 200 MB per node • 1 logstash per node to push to elasticsearch • 500 MB per node … • kubernetes plugin (container name, namespaces, pod, rc, etc..) • interlaced logs => structured logs ! • OOM pattern detection (ram limits are difficult to find !) Logging system
  • 16. • Auto healing cluster • Pods hooks + nagios + consul + consul-template => failed • Sentry • Still need to decide push vs pull monitoring • pull : prometheus • push : kapacitor / watcher • Google Cloud Monitoring ? • How to monitor kubernetes events ? Monitoring
  • 17.
  • 18. • Migration 1.0 -> 1.1 : DNS discovery outage (#18171) • Loss of 1/3 of node cluster (… yesterday) (#13346) • Volumes (#14642) • Memory pressure on nodes A handful of issues # references github issues numbers
  • 19. • private services access outside cluster (#14545) • No public IP from public Load Balancers • IAM • network isolation • kubectl exec (timeout / TERM) (#12179, #13585) • node resizing on GKE A few painful points
  • 20. • Spawn a new environment in a few minutes (to test a new feature) • Super easy rolling-upgrade and rollback • Fully declarative infrastructure But a lot of joy !
  • 21. • Ubernetes • ScheduledJob • PetSet • HPA on custom metrics • Network policy The future is bright
  • 22. • Google-container ML • Slack • Github (Read the proposals !) • GKE support Great community
  • 23. • So much to do / discover / learn but really exciting ! • Docker evolutions are way less important for us than kubernetes new features • Kubernetes is really a powerful abstraction and enable team autonomy and velocity • Still a young project / ecosystem but evolving really quickly Conclusion
  • 25. • Logstash kubernetes plugin : https://github.com/vaijab/logstash-filter-kubernetes • Network policy : http://www.projectcalico.org/a-sneak-peek-at-kubernetes-policy/ • Kubernetes proposals : https://github.com/kubernetes/kubernetes/tree/master/docs/proposals References