Challenges with monolithic software
Long
Build/Test/Release
Cycles
(who broke the build?)
Operations
is a nightmare
(module X is failing,
who’s the owner?)
Difficult to
scale
New releases
take months
Long time to add
new features
Architecture is
hard to maintain
and evolve
Lack of innovation
Frustrated customers
Lack of agility
Challenges with monolithic software
Long
Build/Test/Release
Cycles
(who broke the build?)
Operations
is a nightmare
(module X is failing,
who’s the owner?)
Difficult to
scale
New releases
take months
Long time to add
new features
Architecture is
hard to maintain
and evolve
Lack of innovation
Frustrated customers
Lack of agility
Challenges with monolithic software
Long
Build/Test/Release
Cycles
(who broke the build?)
Operations
is a nightmare
(module X is failing,
who’s the owner?)
Difficult to
scale
New releases
take months
Long time to add
new features
Architecture is
hard to maintain
and evolve
Lack of innovation
Frustrated customers
Lack of agility
“20080219BonMorningDSC_0022B” by Sunphol Sorakul . No alterations other than cropping. https://www.flickr.com/photos/83424882@N00/3483881705/
Image used with permissions under Creative Commons license 2.0, Attribution Generic License (https://creativecommons.org/licenses/by/2.0/)
Monolith development lifecycle
releasetestbuild
delivery pipeline
app
(aka the“monolith”)developers
Photo by Sage Ross. No alterations other than cropping. https://www.flickr.com/photos/ragesoss/2931770125/
Image used with permissions under Creative Commons license 2.0, Attribution Generic License (https://creativecommons.org/licenses/by/2.0/)
Evolving towards microservices
“IMG_1760” by Robert Couse-Baker. No alterations other than cropping. https://www.flickr.com/photos/29233640@N07/14859431605/
Image used with permissions under Creative Commons license 2.0, Attribution Generic License (https://creativecommons.org/licenses/by/2.0/)
“IMG_1760” by Robert Couse-Baker. No alterations other than cropping. https://www.flickr.com/photos/29233640@N07/14859431605/
Image used with permissions under Creative Commons license 2.0, Attribution Generic License (https://creativecommons.org/licenses/by/2.0/)
Services communicate with
each other over the
network
“service-oriented
architecture
composed of
loosely coupled
elements
that have
bounded contexts”
Adrian Cockcroft (former Cloud Architect at Netflix,
now Technology Fellow at Battery Ventures)
“service-oriented
architecture
composed of
loosely coupled
elements
that have
bounded contexts”
Adrian Cockcroft (former Cloud Architect at Netflix,
now Technology Fellow at Battery Ventures)
You can update the services
independently; updating
one service doesn’t require
changing any other services.
“service-oriented
architecture
composed of
loosely coupled
elements
that have
bounded contexts”
Adrian Cockcroft (former Cloud Architect at Netflix,
now Technology Fellow at Battery Ventures)
Self-contained; you can
update the code without
knowing anything about the
internals of other
microservices
“Do one thing, and do it well”
“Swiss Army” by by Jim Pennucci. No alterations other than cropping. https://www.flickr.com/photos/pennuja/5363518281/
Image used with permissions under Creative Commons license 2.0, Attribution Generic License (https://creativecommons.org/licenses/by/2.0/)
“Tools” by Tony Walmsley: No alterations other than cropping. https://www.flickr.com/photos/twalmsley/6825340663/
Image used with permissions under Creative Commons license 2.0, Attribution Generic License (https://creativecommons.org/licenses/by/2.0/)
“Do one thing, and do it well”
= 50 million deployments a year
Thousands of teams
× Microservice architecture
× Continuous delivery
× Multiple environments
(5708 per hour, or every 0.63 second)
Principle 1
Micro-services only rely
on each other’s public API
“Contracts” by NobMouse. No alterations other than cropping.
https://www.flickr.com/photos/nobmouse/4052848608/
Image used with permissions under Creative Commons license 2.0, Attribution Generic
License (https://creativecommons.org/licenses/by/2.0/)
Micro-service A Micro-service B
public API public API
Principle 1: Microservices only rely on each other’s public API
DynamoDB
Micro-service A Micro-service B
public API public API
Principle 1: Microservices only rely on each other’s public API
(Hide Your Data)
DynamoDB
Micro-service A Micro-service B
public API public API
Principle 1: Microservices only rely on each other’s public API
(Hide Your Data)
Nope!
DynamoDB
Micro-service A Micro-service B
public API public API
Principle 1: Microservices only rely on each other’s public API
(Hide Your Data)
DynamoDB
Micro-service A
public API
Principle 1: Microservices only rely on each other’s public API
(Evolve API in backward-compatible way…and
document!)
storeRestaurant (id, name, cuisine)
Version 1.0.0
Micro-service A
public API
Principle 1: Microservices only rely on each other’s public API
(Evolve API in backward-compatible way…and
document!)
storeRestaurant (id, name, cuisine)
Version 1.0.0
storeRestaurant (id, name, cuisine)
storeRestaurant (id, name,
arbitrary_metadata)
addReview (restaurantId, rating, comments)
Version 1.1.0
Micro-service A
public API
Principle 1: Microservices only rely on each other’s public API
(Evolve API in backward-compatible way…and
document!)
storeRestaurant (id, name, cuisine)
Version 1.0.0
storeRestaurant (id, name, cuisine)
storeRestaurant (id, name,
arbitrary_metadata)
addReview (restaurantId, rating, comments)
Version 1.1.0
storeRestaurant (id, name,
arbitrary_metadata)
addReview (restaurantId, rating, comments)
Version 2.0.0
Principle 2
Use the right tool for the
job
“Tools #2” by Juan Pablo Olmo. No alterations other than cropping.
https://www.flickr.com/photos/juanpol/1562101472/
Image used with permissions under Creative Commons license 2.0, Attribution Generic
License (https://creativecommons.org/licenses/by/2.0/)
Principle 2: Use the right tool for the job
(Embrace polyglot persistence)
Micro-service A Micro-service B
public API public API
DynamoDB
Principle 2: Use the right tool for the job
(Embrace polyglot persistence)
Micro-service A Micro-service B
public API public API
DynamoDB
Amazon
Elasticsearch
Service
Principle 2: Use the right tool for the job
(Embrace polyglot persistence)
Micro-service A Micro-service B
public API public API
Amazon
Elasticsearch
Service
RDS
Aurora
Principle 2: Use the right tool for the job
(Embrace polyglot programming frameworks)
Micro-service A Micro-service B
public API public API
Amazon
Elasticsearch
Service
RDS
Aurora
Principle 2: Use the right tool for the job
(Embrace polyglot programming frameworks)
Micro-service A Micro-service B
public API public API
Amazon
Elasticsearch
Service
RDS
Aurora
• Prototype in less than 2 months
• Deployment time: hours minutes
• Each team can now develop its
respective applications
independently
Coursera
13 million users from 190 countries
1,000 courses from 119 institutions
Lambda
automatically
scales
Upload your code
(Java, JavaScript,
Python)
Pay for only the
compute time
you use
(sub-second
metering)
Set up your code to
trigger from other AWS
services, webservice
calls, or app activity
Create a unified
API frontend for
multiple
micro-services
Authenticate and
authorize
requests
Create a unified
API frontend for
multiple
micro-services
Authenticate and
authorize
requests
Handles DDoS
protection and
API throttling
Create a unified
API frontend for
multiple
micro-services
…as well as
monitoring,
logging, rollbacks,
client SDK
generation…
Authenticate and
authorize
requests
Handles DDoS
protection and
API throttling
Highly Scalable
• Inherently scalable
Secure
• API Gateway acts as “front door”
• Can add authN/authZ; or throttle API if needed
• S3 bucket policies
• IAM Roles for Lambda invocations
Cost-efficient
• Only pay for actual microservice usage
Principle 3
Secure Your Services
“security” by Dave Bleasdale. No alterations other than cropping.
https://www.flickr.com/photos/sidelong/3878741556/
Image used with permissions under Creative Commons license 2.0,
Attribution Generic License (https://creativecommons.org/licenses/by/2.0/)
Principle 3: Secure Your Services
• Defense-in-depth
• Network level (e.g. VPC, Security Groups, TLS)
• Server/container-level
• App-level
• IAM policies
• Gateway (“Front door”)
• API Throttling
• Authentication & Authorization
• Client-to-service, as well as service-to-service
• API Gateway: custom Lambda authorizers
• IAM-based Authentication
• Token-based auth (JWT tokens, OAuth 2.0)
• Secrets management
• S3 bucket policies + KMS + IAM
• Open-source tools (e.g. Vault, Keywhiz)
API Gateway
Principle 4
Be a good citizen
within the ecosystem
“Lamington National Park, rainforest” by Jussarian. No alterations other than cropping.
https://www.flickr.com/photos/kerr_at_large/87771074/
Image used with permissions under Creative Commons license 2.0,
Attribution Generic License (https://creativecommons.org/licenses/by/2.0/)
Hey Sally, we need to
call your micro-
service to fetch
restaurants details.
Sure Paul. Which APIs you
need to call? Once I know
better your use cases I’ll give
you permission to register
your service as a client on
our service’s directory entry.
Micro-service A Micro-service B
public API public API
Principle 4: Be a good citizen within the ecosystem
Principle 4: Be a good citizen within the ecosystem
(Have clear SLAs)
Restaurant
Micro-service
15 TPS100 TPS5 TPS20 TPS
Before we let you call
our micro-service we
need to understand
your use case, expected
load (TPS) and accepted
latency
…and many,
many
others!
Distributed monitoring and tracing
• “Is the service meeting its SLA?”
• “Which services were involved in a request?”
• “How did downstream dependencies perform?”
Shared metrics
• e.g. request time, time to first byte
Distributed tracing
• e.g. Zipkin, OpenTracing
User-experience metrics
Principle 4: Be a good citizen within the ecosystem
(Distributed monitoring, logging and tracing)
Principle 5
More than just
technology transformation
“rowing on the river in Bedford” by Matthew Hunt. No alterations other than cropping.
https://www.flickr.com/photos/mattphotos/19189529/
Image used with permissions under Creative Commons license 2.0,
Attribution Generic License (https://creativecommons.org/licenses/by/2.0/)
“Any organization that designs a system will
inevitably produce a design whose structure is
a copy of the organization’s
communication structure.”
Melvin E. Conway, 1967
Conway’s Law
Decentralize governance and data management
Image from Martin Fowler’s article on microservices, at
http://martinfowler.com/articles/microservices.html
No alterations other than cropping.
Permission to reproduce: http://martinfowler.com/faq.html
Decentralize governance and data management
Image from Martin Fowler’s article on microservices, at
http://martinfowler.com/articles/microservices.html
No alterations other than cropping.
Permission to reproduce: http://martinfowler.com/faq.html
Silo’d functional teams silo’d application architectures
Image from Martin Fowler’s article on microservices, at
http://martinfowler.com/articles/microservices.html
No alterations other than cropping.
Permission to reproduce: http://martinfowler.com/faq.html
Cross functional teams self-contained services
Image from Martin Fowler’s article on microservices, at
http://martinfowler.com/articles/microservices.html
No alterations other than cropping.
Permission to reproduce: http://martinfowler.com/faq.html
Cross functional teams self-contained services
Image from Martin Fowler’s article on microservices, at
http://martinfowler.com/articles/microservices.html
No alterations other than cropping.
Permission to reproduce: http://martinfowler.com/faq.html
Non-pizza image from Martin Fowler’s article on microservices, at
http://martinfowler.com/articles/microservices.html
No alterations other than cropping.
Permission to reproduce: http://martinfowler.com/faq.html
Cross functional teams self-contained services
(“Two-pizza teams” at Amazon)
Full ownership
Full accountability
Aligned incentives
“DevOps”
Non-pizza image from Martin Fowler’s article on microservices, at
http://martinfowler.com/articles/microservices.html
No alterations other than cropping.
Permission to reproduce: http://martinfowler.com/faq.html
Cross functional teams self-contained services
(“Two-pizza teams” at Amazon)
Principle 6
Automate Everything
“Robot” by Robin Zebrowski. No alterations other than cropping.
https://www.flickr.com/photos/firepile/438134733/
Image used with permissions under Creative Commons license 2.0,
Attribution Generic License (https://creativecommons.org/licenses/by/2.0/)
Principle 6: Automate everything
AWS
CodeCommit
AWS
CodePipeline
AWS
CodeDeploy
EC2 ELB
Auto
ScalingLambdaECS
DynamoDBRDS ElastiCache SQS SWF
SES SNS
API GatewayCloudWatch Cloud Trail
KinesisElastic
Beanstalk
It’s a journey…
Expect challenges along the way…
• Understanding of business domains
• Coordinating txns across multiple services
• Eventual Consistency
• Service discovery
• Lots of moving parts requires increased
coordination
• Complexity of testing / deploying /
operating a distributed system
• Cultural transformation
Principles of Microservices
1. Rely only on the public API
Hide your data
Document your APIs
Define a versioning strategy
2. Use the right tool for the job
Polygot persistence (data layer)
Polyglot frameworks (app layer)
3. Secure your services
Defense-in-depth
Authentication/authorization
6. Automate everything
Adopt DevOps
4. Be a good citizen within the ecosystem
Have SLAs
Distributed monitoring, logging, tracing
5. More than just technology transformation
Embrace organizational change
Favor small focused dev teams
Benefits of microservices
Rapid
Build/Test/Release
Cycles
Clear ownership and
accountability
Easier to scale
each
individual
micro-service
New releases
take minutes
Short time to add
new features
Easier to
maintain and
evolve system
Faster innovation
Delighted customers
Increased agility
Additional AWS resources:
• Zombie Microservices Workshop:
https://github.com/awslabs/aws-lambda-zombie-workshop
• Serverless Webapp - Reference Architecture:
https://github.com/awslabs/lambda-refarch-webapp
• Microservices with ECS:
https://aws.amazon.com/blogs/compute/using-amazon-
api-gateway-with-microservices-deployed-on-amazon-ecs/
• Serverless Service Discovery:
https://aws.amazon.com/blogs/developer/
serverless-service-discovery-part-1-get-started/
• ECS Service Discovery:
https://aws.amazon.com/blogs/compute/
service-discovery-an-amazon-ecs-reference-architecture/
• Microservices without the Servers
https://aws.amazon.com/blogs/compute/
microservices-without-the-servers
Popular open-source tools:
• Serverless – http://serverless.com
• Apex - http://apex.run/
https://aws.amazon.com/devops/
Additional resources
Remind audience that this is a 200-level session
Cover a lot of ground – we can fill a multi-day workshop
Ask “by show of hands”, how many are familiar with the idea of microservices.
How many have built out microservices?
And we continue to innovate on behalf of our customers. 90-95% of roadmap is based on what customers ask for.
And one of the reasons why customers choose the AWS platform is the speed of innovation, as we build more features that customers want.
But it wasn’t always this way.
But it wasn’t always this way. Amazon used to be built using a monolith architecture.
I don’t know about you, but around the year 2001, we found our own biggest bottleneck at Amazon.
At that time, it became increasingly difficult for our software developers to integrate new features into the Amazon.com page, understand the data that came in through our eCommerce operations and react quickly to customer feedback.
The reason was that the Amazon.com homepage had turned into this large, complex and difficult to maintain thing:
The Monolith.
Obidos, $3.2B in revenue, C++
Over time, the set of features on the Amazon website grew and the Obidos codebase began to sprawl. The binary was multiple gigabytes and took a very long time to compile. So, in the early-mid 2000s, the company began to switch over to Gurupa.Gurupa was a webserver that could only talk to databases and filesystems through BSF (Amazon in-house equivalent of Thrift), and then when these services gave the data to the server, it would assemble them into a webpage using Mason templates, which is basically Perl interspersed with HTML (feels similar to PHP).Over the period of about 2 1/2 years (I think) around 2004-2005, the company migrated from Obidos to Gurupa completely.
So at Amazon, we have this practice of doing the 5 Why’s…asking Why, to get to the root causes of the problem.
And we identified a few root causes.
And at Amazon, where it’s all about the customer, that just is not acceptable.
So at Amazon, one of our leadership principle is Dive Deep. practice of doing the 5 Why’s…asking Why, to get to the root causes of the problem.
And we identified a few root causes.
And what we started to understand as the root cause were two key ones:
- when you're working with a monolithic app, you have many developers all pushing changes through a shared release pipeline
- this causes frictions at many points of the lifecycle
- upfront during development, engineers need to coordinate their changes to make sure they're not making changes that will break someone else's code
- if you want to upgrade a shared library to take advantage of a new feature, you need to convince everyone else to upgrade at the same time – good luck with that
- and if you want to quickly push an important fix for your feature, you still need to merge it in with everyone else's in process changes
- this leads to "merge Fridays", or worse yet "merge weeks", where all the developers have to compile their changes and resolve any conflicts for the next release
- even after development, you also face overhead when you're pushing the changes through the delivery pipeline
- you need to re-build the entire app, run all of the test suites to make sure there are no regressions, and re-deploy the entire app
- to give you an idea of this overhead, Amazon had a central team whose sole job it was to deploy this monolithic app into production
- even if you're just making a one-line change in a tiny piece of code you own, you still need to go through this heavyweight process and wait to catch the next train leaving the station
- for a fast growth company trying to innovate and compete, this overhead and sluggishness was unacceptable
- the monolith became too big to scale efficiently so we made a couple of big changes
- one was architectural, and the other was organizational
Building complex applications is inherently difficult
Teams depend on each other’s libraries + other common libraries
“We’re blocked as we cannot build our software until you hand us the latest version of your library”
Teams depend on each other’s database schema
“We’re afraid the changes we’re making to our database schema will break your code”
Teams depend on each other’s libraries + other common libraries
“We’re blocked as we cannot build our software until you hand us the latest version of your library”
Teams depend on each other’s database schema
“We’re afraid the changes we’re making to our database schema will break your code”
Teams depend on each other’s libraries + other common libraries
“We’re blocked as we cannot build our software until you hand us the latest version of your library”
Teams depend on each other’s database schema
“We’re afraid the changes we’re making to our database schema will break your code”
We knew there had to be a better way.
So around this time, we implemented a company-wide mandate to decompose our monolith and EVOLVE towards microservices.
We knew there had to be a better way.
So around this time, we implemented a company-wide mandate to decompose our monolith and EVOLVE towards microservices.
So what defines a microservice?
You can certainly look it up on Wikipedia, but the succinct definition that I like the most, and that captures the essence of what a microservice is, is the definition by Adrian Cockcroft, one of the luminaries of cloud computing. The concept of bounded contexts comes from the book Domain Driven Design by Eric Evans.
So what defines a microservice?
You can certainly look it up on Wikipedia, but the succinct definition that I like the most, and that captures the essence of what a microservice is, is the definition by Adrian Cockcroft, one of the luminaries of cloud computing. The concept of bounded contexts comes from the book Domain Driven Design by Eric Evans.
So what defines a microservice?
You can certainly look it up on Wikipedia, but the concise definition that I like the most, and that captures the essence of what a microservice is, is the definition by Adrian Cockcroft, one of the luminaries of cloud computing. The concept of bounded contexts comes from the book Domain Driven Design by Eric Evans.
So what defines a microservice?
You can certainly look it up on Wikipedia, but the concise definition that I like the most, and that captures the essence of what a microservice is, is the definition by Adrian Cockcroft, one of the luminaries of cloud computing. The concept of bounded contexts comes from the book Domain Driven Design by Eric Evans.
So what defines a microservice?
You can certainly look it up on Wikipedia, but the concise definition that I like the most, and that captures the essence of what a microservice is, is the definition by Adrian Cockcroft, one of the luminaries of cloud computing. The concept of bounded contexts comes from the book Domain Driven Design by Eric Evans.
So what defines a microservice?
You can certainly look it up on Wikipedia, but the concise definition that I like the most, and that captures the essence of what a microservice is, is the definition by Adrian Cockcroft, one of the luminaries of cloud computing. The concept of bounded contexts comes from the book Domain Driven Design by Eric Evans.
We knew there had to be a better way.
So around this time, we implemented a company-wide mandate to move towards microservices.
We knew there had to be a better way.
So around this time, we implemented a company-wide mandate to move towards microservices.
Anatomy of a microservice
TODO: Change to restaurantAPI
Anatomy of a microservice
TODO: Change to restaurantAPI
Anatomy of a microservice
TODO: Change to restaurantAPI
Anatomy of a microservice
TODO: Change to restaurantAPI
Allows you to avoid software coupling, because you know longer have shared libraries or shared datastores. Microservices just talk to one another via their public APIs.
Web of Microservices
2009
- what does success look like
- there are a lot of ways that you can measure the process, and no one way is perfect
- but here's one data point
- when you have thousands of independent teams
- producing highly-factored microservices
- that are deployed across multiple dev, test, and production environments
- in a continuous delivery process
- you get a lot of deployments
- at Amazon in 2014, we ran over 50M deployments
- that's an average of 1.5 deployments every second
TODO: Summarize the benefits that Nike was able to see
Or to put more simply….
Use the right tool for the job.
Or to put more simply….
Use the right tool for the job.
So with those two core principles as foundation, let’s build a microservice
Go Serverless
Go Serverless
Go Serverless
Go Serverless
Go Serverless
Go Serverless
Go Serverless
Go Serverless
Go Serverless
Go Serverless
Coursera
13 million users from 190 countries
1,000 courses from 119 institutions, everything from Programming in Python to Songwriting.
Coursera uses Amazon EC2 Container Service to manage a microservices -based architecture for its applications.
Now deploy software changes in minutes instead of hours in a resource-isolated environment.
Large monolithic application for processing batch jobs that was difficult to run, deploy, and scale.
A new thread was created whenever a new job needed to be completed, and each job took up different amounts of memory and CPU, continually creating inefficiencies.
A lack of resource isolation allowed memory-limit errors to bring down the entire application.
Each job is created as a container and Amazon ECS schedules the container across the cluster of EC2 instances. ECS handles all the cluster management and container orchestration, and containers provide the necessary resource isolation.
Prototype up and running in <2 months
Speed and agility: Software deployment time went from hours to minutes
Each team can now develop and update its respective applications independently
Operational efficiency: No extra infrastructure engineering time is spent installing software and maintaining a cluster—Amazon ECS handles everything from cluster management to container orchestration.
Go Serverless
Go Serverless
Go Serverless
Go Serverless
“Simplest way to build powerful, dynamic apps in the cloud”
“Simplest way to build powerful, dynamic apps in the cloud”
“Simplest way to build powerful, dynamic apps in the cloud”
“Simplest way to build powerful, dynamic apps in the cloud”
“Simplest way to build powerful, dynamic apps in the cloud”
“Simplest way to build powerful, dynamic apps in the cloud”
Abstracts the implementation so that you can switch from Lambda to EC2 or Combine multiple backends. Similarly you can use mapping templates to unify different versions of your APIs
Network protection is something we do very well and requires hyperscale, you won’t be able to auto-scale to meet an attack, let us do it
Centralize authorization decisions in a policy and remove the concern from the code in your backend, fewer bugs
Abstracts the implementation so that you can switch from Lambda to EC2 or Combine multiple backends. Similarly you can use mapping templates to unify different versions of your APIs
Network protection is something we do very well and requires hyperscale, you won’t be able to auto-scale to meet an attack, let us do it
Centralize authorization decisions in a policy and remove the concern from the code in your backend, fewer bugs
Abstracts the implementation so that you can switch from Lambda to EC2 or Combine multiple backends. Similarly you can use mapping templates to unify different versions of your APIs
Network protection is something we do very well and requires hyperscale, you won’t be able to auto-scale to meet an attack, let us do it
Centralize authorization decisions in a policy and remove the concern from the code in your backend, fewer bugs
Abstracts the implementation so that you can switch from Lambda to EC2 or Combine multiple backends. Similarly you can use mapping templates to unify different versions of your APIs
Network protection is something we do very well and requires hyperscale, you won’t be able to auto-scale to meet an attack, let us do it
Centralize authorization decisions in a policy and remove the concern from the code in your backend, fewer bugs
Demo: Build a microservice using Lambda and API gateway
Emphasize the versioning aspect of microservices
Test the service using Postman
Export out the Swagger definition
Use Swagger tool to generate documentation
Show that you can generate client libraries
Go Serverless
Go Serverless
Go Serverless
Go Serverless
Go Serverless
Demo: Build a microservice using Lambda and API gateway
Emphasize the versioning aspect of microservices
Test the service using Postman
Export out the Swagger definition
Use Swagger tool to generate documentation
Show that you can generate client libraries
Go Serverless
All this promotes agile experimentation
Web of Microservices
Or to put more simply….
Use the right tool for the job.
Or to put more simply….
Use the right tool for the job.
This goes both ways:
Service registry
Service owners have a responsibility – publish your health, monitorClients have a responsibility to do exponential backoff – in case there are errors
This goes both ways:
Service owners have a responsibility – publish your health, monitorClients have a responsibility to do exponential backoff – in case there are errors
If it moves, it should be tracked and logged
Spend more time working on code that analyses the meaning of metrics than code that collects, moves, stores and displays metrics
Reduce key business metric latency to less than the human attention span (~10s)
Validate your measurement system has enough accuracy and precision. Collect histograms of response time
Monitoring systems need to be more available and scalable than the systems (and services) being monitored
Optimise for monitoring distributed, ephemeral, 'cloud-native', containerised microservices
Fit metrics to models in order to understand relationships (this is a new rule)
Or to put more simply….
Use the right tool for the job.
It’s about communication between teams.
So, it’s not only about changing
the software architecture,
but also changing the way
software teams are organized.
- in conjunction with breaking apart the architecture, we also broke apart the organization
- we split up the hierarchical org into small teams
- we called them 2-pizza teams, because if they got larger than you could feed with 2 pizzas, we'd break them up
- in reality, the target number is about 8 people per team, so I personally think the 2 pizza goal is maybe a little too frugal
- another important change that went along with this is cultural
- when we split up the org, we gave the teams full autonomy
- they became small startups that owned every aspect of their service
- they worked directly with their customers (internal or external), set their roadmap, designed their features, wrote the code, ran the tests, deployed to production, and operated it
- if there was pain anywhere in the process they felt it
- operational issue in the middle of the night, the team was paged
- lack of tests breaking customers, the team got a bunch of support tickets
- that motivation ensured the team focused on all aspects of the software lifecycle, broke down any barriers between the phases, and made the process flow as efficiently as possible
- we didn't have this term at the time, but this was the start of our "DevOps" culture
- in conjunction with breaking apart the architecture, we also broke apart the organization
- we split up the hierarchical org into small teams
- we called them 2-pizza teams, because if they got larger than you could feed with 2 pizzas, we'd break them up
- in reality, the target number is about 8 people per team, so I personally think the 2 pizza goal is maybe a little too frugal
- another important change that went along with this is cultural
- when we split up the org, we gave the teams full autonomy
- they became small startups that owned every aspect of their service
- they worked directly with their customers (internal or external), set their roadmap, designed their features, wrote the code, ran the tests, deployed to production, and operated it
- if there was pain anywhere in the process they felt it
- operational issue in the middle of the night, the team was paged
- lack of tests breaking customers, the team got a bunch of support tickets
- that motivation ensured the team focused on all aspects of the software lifecycle, broke down any barriers between the phases, and made the process flow as efficiently as possible
- we didn't have this term at the time, but this was the start of our "DevOps" culture
Or to put more simply….
Use the right tool for the job.
- with these new tools, we completed the puzzle
- the teams were decoupled and they had the tools necessary to efficiently release on their own
- with these new tools, we completed the puzzle
- the teams were decoupled and they had the tools necessary to efficiently release on their own
- with these new tools, we completed the puzzle
- the teams were decoupled and they had the tools necessary to efficiently release on their own
- with these new tools, we completed the puzzle
- the teams were decoupled and they had the tools necessary to efficiently release on their own
- with these new tools, we completed the puzzle
- the teams were decoupled and they had the tools necessary to efficiently release on their own
- with these new tools, we completed the puzzle
- the teams were decoupled and they had the tools necessary to efficiently release on their own
But more important than the technologies are the principles we talked about.
Put this perspective
- Operating a monolith may be easier
- When you can’t innovate quickly
- Understand your domain
So at Amazon, we have this practice of doing the 5 Why’s…asking Why, to get to the root causes of the problem.
And we identified a few root causes.
So at Amazon, we have this practice of doing the 5 Why’s…asking Why, to get to the root causes of the problem.
And we identified a few root causes.
So at Amazon, we have this practice of doing the 5 Why’s…asking Why, to get to the root causes of the problem.
And we identified a few root causes.
There’s no shortage of resources
So at Amazon, we have this practice of doing the 5 Why’s…asking Why, to get to the root causes of the problem.
And we identified a few root causes.