Truemotion Adventures in Containerization


  1. Adventures in Containerization
  2. Ryan Hunter
      ● SRE Lead @ TrueMotion
        ○ First Backend Engineer 3 years ago
        ○ Moved to operations in search of new challenges!
      ● I’m an automation fanatic!
      ● When I’m not working to make on-call a thing of the past, I enjoy:
        ○ Diving
        ○ Hiking
        ○ Building drones and other useless contraptions in my basement
  3. Infrastructure Evolves with a Company: Servers as... Pets, Cattle, A Herd
  4. Why did we Switch to Containers? (Chart: November 2016, 3.05%)
  5. Why did we Switch to Containers?
      Where we started...
      ● Debian based deploys
        ○ Great so long as all your dependencies were in debians too
      ● Ansible: Build the server from scratch
        ○ External dependency hell
      ● Neither flexible nor reliable
      ● Minimum provisioning size was too large
      Where we wanted to go...
      ● A more flexible build artifact
      ● Decouple instance size from application software
      ● A common, preloaded AMI could be used to run all (most) services
  6. What did Docker give us?
      ● A flexible, portable runtime artifact
        ■ Described runtime requirements
        ■ Memory/CPU requirements
      ● An ecosystem of tools to manage, version, and develop these containers
  7. What Docker didn’t give us
      ● Really nice match for stateless services
      ● Stateful containers ARE possible, but they significantly complicate scheduling
  8. What Docker didn’t give us
      ● How do you … these containers?
        ○ schedule
        ○ provision
        ○ discover (and monitor)
        ○ configure
  9. Schedule
  10. Scheduling
  11. Scheduling - Why ECS?
      ● Very basic (unopinionated)
      ● Amazon Support
      ● Amazon platform integration
        ○ IAM Roles
        ○ Cloudformation
  12. Provision
  13. Provision
  14. Provision - Why Cloudformation
      ● Well integrated with AWS
      ● We can provision both Docker containers and infrastructure in one template (because we use ECS)
      ● AWS Supported
      ● Parameter Validation
  15. Provision - Why Cloudformation
      ● Each service is deployed via a Cloudformation stack
      (Diagram stages: Develop, Build, Package, Deploy. Diagram labels: Application Code, Dependencies, Docker Container, Lambda Code, Lambda Zip Package, Cloudformation Template, Stamp Template, Versioned Cloudformation Template, Deployed Cloudformation Stack)
  16. Provision - Why Cloudformation
      ● Each service pushes a template with a name and a version to S3
      ● That template has all the application dependencies hardcoded (Docker container version, Lambdas, etc.)
      ● Each environment has its own repo containing a deploy.yaml

      stacks:
        - name: prod
          template: prod-env
          region: us-east-1
          version: prod
          parameters:
            EIPList: <redacted>
            EnvCIDR: 16
            EnvMaturity: prod
            PagerDutyKey: {{ pagerduty_key }}
            RDSPassword: {{ rds_password }}
        - name: prod-etl
          template: dw-etl
          region: us-east-1
          version: "92"
          parameters:
            DesiredInstanceCount: 6
            EnvironmentName: prod
            EnvMaturity: prod
            ...
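
A stack entry like the ones on slide 16 maps fairly directly onto the parameter list that the CloudFormation API accepts. The following is a minimal sketch, assuming the deploy.yaml has already been parsed into plain dicts. `to_cfn_parameters` and `template_key` are hypothetical helpers, and the S3 key layout `<template>/<version>.json` is an assumption; the talk does not describe how templates are laid out in S3.

```python
def to_cfn_parameters(stack):
    """Convert a deploy.yaml stack entry into the ParameterKey/ParameterValue
    list shape used by CloudFormation's CreateStack and UpdateStack calls."""
    return [
        {"ParameterKey": key, "ParameterValue": str(value)}
        for key, value in sorted(stack.get("parameters", {}).items())
    ]


def template_key(stack):
    # ASSUMPTION: templates are stored as "<template>/<version>.json".
    return "{}/{}.json".format(stack["template"], stack["version"])


# The prod-etl entry from the slide, with the elided parameters left out.
stack = {
    "name": "prod-etl",
    "template": "dw-etl",
    "region": "us-east-1",
    "version": "92",
    "parameters": {
        "DesiredInstanceCount": 6,
        "EnvironmentName": "prod",
        "EnvMaturity": "prod",
    },
}

params = to_cfn_parameters(stack)
```

Every value is stringified because CloudFormation parameter values are passed as strings, even for numeric parameter types.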
  17. Discover (and monitor)
  18. Discover (and monitor)
      ● We use Registrator to join new containers to Consul
      ● Custom version that supports services without exposed ports
      ● Load balancers (internal and external) are configured via Consul to route traffic to the appropriate container
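
To illustrate how a load balancer can be driven from Consul, here is a minimal sketch that turns a response from Consul's `/v1/health/service/<name>` endpoint into an Nginx upstream block. The talk does not say how the Nginx configuration is actually generated, so this is an assumption. The sample payload, the addresses, and the `public_api` upstream name are made up for the example.

```python
import json


def healthy_upstreams(health_response):
    """Extract host:port pairs from a Consul health-service response."""
    upstreams = []
    for entry in health_response:
        service = entry["Service"]
        # Consul leaves Service.Address empty when the service did not
        # register its own address; fall back to the node address.
        host = service.get("Address") or entry["Node"]["Address"]
        upstreams.append("{}:{}".format(host, service["Port"]))
    return sorted(upstreams)


def render_upstream(name, upstreams):
    """Render an Nginx upstream block for the given backends."""
    lines = ["upstream {} {{".format(name)]
    lines += ["    server {};".format(u) for u in upstreams]
    lines.append("}")
    return "\n".join(lines)


# Sample payload trimmed to the fields used above (values are invented).
sample = json.loads("""
[
  {"Node": {"Address": "10.0.1.5"}, "Service": {"Address": "", "Port": 32768}},
  {"Node": {"Address": "10.0.1.6"}, "Service": {"Address": "10.0.1.6", "Port": 32770}}
]
""")

config = render_upstream("public_api", healthy_upstreams(sample))
```

Querying the endpoint with the `passing` flag restricts the response to instances whose health checks pass, which keeps failed containers out of the rendered block.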
  19. Monitor (Is my service up?)
      ● Consul Docker exec health checks are very powerful
      ● Docker also has a new health check API!
      ● Configured via Registrator
      (Diagram labels: Docker Host, Consul Agent, My Service Check, My Service Container, health-check.py)
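
A script like the health-check.py on slide 19 reports its result to Consul through its exit code: 0 is passing, 1 is warning, and anything else is critical. The sketch below is one possible shape for such a script, not the one used in the talk. The URL, the one-second slowness threshold, and the choice to treat 4xx responses as warnings are all assumptions.

```python
import time
import urllib.error
import urllib.request

# Exit codes as interpreted by Consul script and Docker exec checks.
PASSING, WARNING, CRITICAL = 0, 1, 2


def classify(http_status, elapsed_seconds, slow_after=1.0):
    """Map a probe result to a Consul check exit code."""
    if http_status is None or http_status >= 500:
        return CRITICAL
    if http_status >= 400 or elapsed_seconds > slow_after:
        return WARNING
    return PASSING


def probe(url, timeout=2.0):
    """Return (status, elapsed); status is None if the service is unreachable."""
    started = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as err:
        status = err.code
    except OSError:
        # URLError, timeouts and connection failures are all OSError subclasses.
        status = None
    return status, time.monotonic() - started


def main(url="http://localhost:8080/health"):
    # ASSUMPTION: the service exposes a health endpoint at this URL.
    return classify(*probe(url))
```

When installed as the check command, the script would end with `sys.exit(main())` so that Consul sees the computed exit code.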
  20. Monitor (Logging)
      ● Sumo provides a Docker log collector
      ● Wrote a script that fetches containers and assigns source category based on the container type
      ● Runs as a container on each Docker host
      _sourceCategory = <Environment name>/<Service Name>/<Environment Maturity>
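
The source category format on slide 20 is easy to build once the three parts are known for a container. This is a minimal sketch; the label names (`env.name`, `service.name`, `env.maturity`) are hypothetical, since the talk does not say how the script determines the container type.

```python
def source_category(environment, service, maturity):
    """Build <Environment name>/<Service Name>/<Environment Maturity>."""
    parts = [environment, service, maturity]
    if not all(parts):
        raise ValueError("environment, service and maturity are all required")
    return "/".join(parts)


def category_for_container(labels):
    # ASSUMPTION: the three values are read from container labels with
    # these hypothetical names.
    return source_category(
        labels.get("env.name", ""),
        labels.get("service.name", ""),
        labels.get("env.maturity", ""),
    )
```

Raising on a missing part, rather than emitting a partial category, keeps unlabeled containers from silently landing in the wrong log bucket.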
  21. Monitor (Whitebox)
      ● Traffic - Requests per second, trips per second
      ● Errors - Rate of status codes and error logs
      ● Latency - How long does the service take to do a unit of work?
      ● Saturation - How do I know I need to scale out?
      ● Consul Check (is it up?)
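
Three of the signals on slide 21 (traffic, errors, latency) can be captured by wrapping each unit of work. The sketch below is illustrative only and is not the TruMonitor library; `ServiceMetrics` and `unit_of_work` are hypothetical names. Saturation is left out because it depends on the resource being measured.

```python
import time
from contextlib import contextmanager


class ServiceMetrics:
    """In-process counters for traffic, errors and latency."""

    def __init__(self):
        self.requests = 0
        self.errors = 0
        self.latencies = []

    @contextmanager
    def unit_of_work(self):
        """Count one request, time it, and record whether it failed."""
        started = time.monotonic()
        self.requests += 1
        try:
            yield
        except Exception:
            self.errors += 1
            raise
        finally:
            self.latencies.append(time.monotonic() - started)

    def error_rate(self):
        return self.errors / self.requests if self.requests else 0.0
```

A handler would wrap its body in `with metrics.unit_of_work():` and a separate reporter would ship the counters to the monitoring backend.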
  22. Monitor (Whitebox)
      ● We have very similar services
        ■ Webservice (http)
        ■ Data pipeline (etl, trip processing)
      ● TruMonitor library
        ■ Common monitoring tools library
        ■ UNVERSIONED - controversial
  23. Configure
  24. Last Mile Configuration
      ● Cloudformation provides a parameter interface
        ■ Pass on to container via Environment Variables
        ■ AWS infrastructure can be passed in directly
      ● Per Company Configs
        ■ Consul K/V + consul-template

      stacks:
        - name: prod
          template: prod-env
          region: us-east-1
          version: prod
          parameters:
            EIPList: <redacted>
            EnvCIDR: 16
            EnvMaturity: prod
            PagerDutyKey: {{ pagerduty_key }}
            RDSPassword: {{ rds_password }}
            ...
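
On the container side, parameters passed through as environment variables can be read and checked once at startup. This is a minimal sketch; the variable names (`ENV_MATURITY`, `RDS_PASSWORD`, `DESIRED_INSTANCE_COUNT`) are hypothetical mappings of the Cloudformation parameters shown on the slide, not names taken from the talk.

```python
import os


class ConfigError(Exception):
    """Raised when required configuration is absent."""


def load_config(environ=None):
    """Read service settings from environment variables, failing fast."""
    environ = os.environ if environ is None else environ
    required = ("ENV_MATURITY", "RDS_PASSWORD")  # ASSUMPTION: hypothetical names
    missing = [name for name in required if not environ.get(name)]
    if missing:
        raise ConfigError("missing environment variables: " + ", ".join(missing))
    return {
        "maturity": environ["ENV_MATURITY"],
        "rds_password": environ["RDS_PASSWORD"],
        "instance_count": int(environ.get("DESIRED_INSTANCE_COUNT", "1")),
    }
```

Failing at startup means a misconfigured stack shows up as a container that never becomes healthy, rather than as an error on the first request.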
  25. Consul + Consul Template
      ● Great for configs too complex for params
      ● Git2consul will sync configs in VCS with the cluster
      ● Parameter validation matters!
        ■ Wrote SOME test coverage using JSONSchema
      (Diagram labels: Consul Cluster, Publish, Docker Container, Entrypoint, Consul Template, Config File, Exec, Application Process)
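
The point about parameter validation can be illustrated with a small check that runs before a config is published. The talk used JSONSchema; the sketch below is a deliberately simplified standard-library stand-in, not JSONSchema itself, and the config keys (`company`, `trip_timeout_seconds`, `features`) are hypothetical.

```python
# ASSUMPTION: a made-up per-company config shape for illustration.
SCHEMA = {
    "required": ["company", "trip_timeout_seconds"],
    "types": {"company": str, "trip_timeout_seconds": int, "features": list},
}


def validate(config, schema):
    """Return a list of problems; an empty list means the config passed."""
    problems = []
    for key in schema["required"]:
        if key not in config:
            problems.append("missing required key: " + key)
    for key, expected in schema["types"].items():
        if key in config and not isinstance(config[key], expected):
            problems.append("{} should be {}".format(key, expected.__name__))
    return problems
```

Running a check like this in the config repository's tests catches a bad value before git2consul syncs it to the cluster.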
  26. What about secrets storage?
      ● Initially used KMS-encrypted values decrypted with a consul-template plugin
      ● DO NOT write consul-template plugins with blocking/high latency calls
  27. What we did instead
      ● Borrowed from the ansible-vault concept
      ● Encrypted “privates” file inside environment repo
      ● Populate Cloudformation parameters using Jinja2
      ● Works well enough… will not work for per company config values
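
The `{{ pagerduty_key }}` placeholders in the deploy.yaml are filled from the decrypted privates file. The sketch below shows only the substitution step, using a small regular-expression stand-in rather than Jinja2 itself so that it runs with the standard library alone. Decryption is omitted, and the secret value is a dummy.

```python
import re

# Matches simple {{ name }} placeholders only; real Jinja2 supports far more.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template_text, secrets):
    """Substitute {{ name }} placeholders, failing loudly on unknown names."""
    def replace(match):
        name = match.group(1)
        if name not in secrets:
            raise KeyError("no secret named " + name)
        return secrets[name]

    return PLACEHOLDER.sub(replace, template_text)


deploy_line = "PagerDutyKey: {{ pagerduty_key }}"
rendered = render(deploy_line, {"pagerduty_key": "example-key"})
```

Raising on an unknown placeholder mirrors the behavior of Jinja2 with `StrictUndefined`, so a missing secret stops the deploy instead of producing an empty parameter.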
  28. Conclusions
      ● Developer training is hard: example repos work REALLY well
      ● Secrets management requires some forethought
      ● Jenkins Pipelines is very powerful…
      ● Spend time automating the creation and removal of ECS nodes
      ● Auto Scaling a Docker cluster is nuanced!
  29. Want to Help? We’re Hiring!
      ● I’m looking for backend software engineers with a passion for automation
      ● Talk to me!
      ● … or https://gotruemotion.com/careers/
  30. THANK YOU!
  31. Reference
  32. High Level Arch
      (Diagram labels: ECS Scheduler, Consul, Registrator, Public LB (Nginx), Private LB (Nginx), Worker Nodes, Public API, Internal API, ASG)
  33. Today’s Pipeline
      (Diagram labels: Build Scripts, Debian, Pip, Gemfury, Ansible, provision.py, EC2 Instance)
      ● Inflexible
      ● Jobs managed through UI
      ● Restricted versioning convention
      ● Supports only specific distro/version
      ● Pip doesn’t enforce dependencies for crap!
      ● Gemfury goes down!
      ● Instance config is in a separate repo from service code
      ● We can’t version configuration against services
      ● Lots of tight coupling between service roles
      ● Fails a LOT!
      ● Services tied to instance
      ● Instance type for a service defined globally
      ● Manual process to provision instances and other AWS resources
      ● AWS instance provisioning is entirely manual
      ● Difficult to automate
      ● Too easy to create and forget about instances
  34. Cloudformation/Docker Pipeline
      (Diagram labels: Jenkins Pipelines, Docker, CF Template, CF Pipeline, ECS Cluster, EC2 Instance, Environment Config)
      ● Resources defined per service
      ● Configs validated per service
      ● Leverage Docker as a common runtime framework
      ● Build process definition lives in service repo
      ● Common processes can be defined via global library
      ● Use Docker to provide build dependencies
      ● Cloudformation templates are used as the deployment artifact
      ● Environment updates via code review
      ● Tight coupling between resource requirements and resources provisioned
      ● Ability to use spot fleet/spot instances
