3. Agenda
Introduction to Microservices
Microservices implementation on AWS
API implementation
Functions vs containers
Container based microservices
Serverless microservices (Lambda-based applications)
Service discovery and service mesh
Microservices communication
Orchestration
Data store and distributed data management
Distributed monitoring and tracing
Auditing
5. Microservices
Architectural and Organizational approach to software development
Software is composed of small, independently deployable services that
communicate over well-defined APIs
Services are owned by small autonomous teams
Three common patterns:
• API driven
• Event driven
• Data streaming
6. What microservices are not…
Defined by the number of lines of code
Services that only contain subroutines (a calculation or a validation) are
functions, not services
Services that only expose CRUD operations are RPC calls to a database,
not services
REST / SOAP provide common interfaces for integration but don’t
change the logical responsibility of the implementation
7. Microservices applications on AWS
User interface: Amazon CloudFront, Amazon S3
Microservices: Application Load Balancing, Elastic Container Service, Amazon API Gateway, AWS Lambda
Data store: Amazon Aurora, Amazon DynamoDB, Amazon ElastiCache
9. Challenges in API implementation
• Architecting, deploying, monitoring, continuously improving, and
maintaining an API is a time-consuming task
• Different versions of APIs need to be run
• Different stages of the development cycle
• Authorization is a critical feature, but it is complex to build
• Monetize the ecosystem of third-party developers utilizing the APIs
10. Amazon API Gateway
Create a unified API frontend for multiple microservices
Authenticate and authorize requests to a backend
DDoS protection and throttling for your backend
Throttle, meter, and monetize API usage by third-party developers
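AWS documents API Gateway throttling as a token bucket algorithm. The idea can be sketched as follows. This is only an illustration under stated assumptions: the `TokenBucket` class, its parameters, and the injectable clock are hypothetical and are not API Gateway's implementation.

```python
import time


class TokenBucket:
    """Illustrative token bucket; class and parameter names are hypothetical."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = rate          # tokens added per second (steady-state limit)
        self.burst = burst        # bucket capacity (maximum burst size)
        self.clock = clock        # injectable so the behaviour can be tested
        self.tokens = float(burst)
        self.last = clock()

    def allow(self):
        """Return True if the request may proceed, False if it is throttled."""
        now = self.clock()
        # Refill for the elapsed time, never above the bucket capacity.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A caller would check `bucket.allow()` before forwarding each request and reject the request when it returns False.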
12. Select …
Containers
When you need
• Lower startup latency
• Support for long running compute jobs (> 15 minutes)
• Predictable, high traffic usage
• Persistence of data
When you want
• Complete control of compute environment
Serverless
When you need
• To trigger action on an event
• Support for varying utilization
• Ability to handle unknown demand
When you want to
• Quickly prove business value
• Hand operational complexity (for example, patching, scaling) to AWS
• Make fewer decisions
13. Comparison of operational responsibility
Ordered from more opinionated (AWS Lambda) to less opinionated (EC2)
AWS Lambda: serverless functions
AWS manages
• Data source integrations
• Physical hardware, software, networking, and facilities
• Provisioning
Customer manages
• Application code
AWS Fargate: serverless containers
AWS manages
• Container orchestration, provisioning
• Cluster scaling
• Physical hardware, host OS/kernel, networking, and facilities
Customer manages
• Application code
• Data source integrations
• Security config and updates, network config, management tasks
ECS/EKS: container management as a service
AWS manages
• Container orchestration control plane
• Physical hardware, software, networking, and facilities
Customer manages
• Application code
• Data source integrations
• Work clusters
• Security config and updates, network config, firewall, management tasks
EC2: Infrastructure-as-a-Service
AWS manages
• Physical hardware, software, networking, and facilities
Customer manages
• Application code
• Data source integrations
• Scaling
• Security config and updates, network config, management tasks
• Provisioning, managing scaling and patching of servers
14. What if I can’t decide? Decision Tree…
[Decision-tree diagram. Questions asked:]
• Runtime environment compatible with AWS Lambda (.NET Core, Go, Java, Python, or Node.js)?
• Deployment package size <= 50 MB?
• Desired service runtime <= 15 minutes?
• Unknown demand and below RPS breakeven?
• Inter-container communication* or storage-intensive?
• Desire orchestration portability OR open source fan?
• Are you comfortable managing your own infrastructure?
Possible outcomes: AWS Lambda, AWS Fargate, Amazon ECS, Amazon EKS
16. AWS Fargate
MANAGED BY AWS
No EC2 instances to provision, scale, or manage
ELASTIC
Scale up and down seamlessly. Pay only for what you use
INTEGRATED
With VPC networking, Elastic Load Balancing, IAM permissions, CloudWatch, and more
19. AWS Lambda
Bring your own code
• Node.js
• Java
• Python
• C#
• Go
Simple resource model
• Select power rating from 128 MB to 3 GB
• CPU and network allocated proportionately
Stateless
• Persist data using external storage
• No affinity or access to underlying infrastructure
Flexible use
• Synchronous or asynchronous
• Configurable throttling
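To make the "bring your own code" and "stateless" points concrete, here is a minimal sketch of a Python handler. It assumes the function sits behind an API Gateway proxy-style integration; the function name `handler` and the `name` query parameter are illustrative choices, not part of the slides.

```python
import json


def handler(event, context):
    """Hypothetical handler for an API Gateway proxy-style event."""
    # No state is kept between invocations; anything durable would go to
    # external storage (for example a database), which is omitted here.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

Because the handler is a plain function, it can be exercised locally by calling `handler({"queryStringParameters": {"name": "Ada"}}, None)`.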
21. Service discovery
Challenges
Control application-level communication across:
- Compute environments
- Container orchestrators
- Clusters
Keep configuration synchronized with dynamic state as:
- Communication circuits fail and recover
- Replicas scale out and back under variable load
End-to-end observability
- Logs / metrics / distributed tracing
Needs
• Reliable communication between service nodes
• Ability to control routing through policy
• React autonomously and responsively to dynamically changing state
• Uniform, dependable, noninvasive mechanism for observability
• Decoupled from application code
• Applied in a standardized, declarative, and reliable fashion
22. Service discovery patterns
Server-side service discovery
- Connections are proxied
- Discovery is abstracted away
- Proxy availability and capacity impact the system
- Additional latency
Client-side service discovery
- Clients connect directly to providers
- Fewer components in the system
- Clients must be registry-aware
- Client-side load balancing
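The client-side pattern can be sketched as follows: the client queries a registry itself and balances across the returned instances. The `Registry` and `DiscoveringClient` classes are hypothetical in-memory stand-ins, not a real discovery product.

```python
import itertools


class Registry:
    """Hypothetical in-memory service registry."""

    def __init__(self):
        self._instances = {}

    def register(self, service, address):
        self._instances.setdefault(service, []).append(address)

    def lookup(self, service):
        return list(self._instances.get(service, []))


class DiscoveringClient:
    """Registry-aware client that load-balances round-robin on its own side."""

    def __init__(self, registry, service):
        self._registry = registry
        self._service = service
        self._counter = itertools.count()

    def next_endpoint(self):
        # Look up on every call so added or removed instances are seen.
        instances = self._registry.lookup(self._service)
        if not instances:
            raise LookupError(f"no instances registered for {self._service}")
        return instances[next(self._counter) % len(instances)]
```

With server-side discovery, the lookup and the round-robin choice would instead live in a proxy or load balancer, and the client would only know one stable address.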
23. Service discovery
ECS integrated service discovery
• Creates and manages service names using the Route 53 Auto Naming API
• Names are automatically mapped to a set of DNS records
• Services can be referred to by name
• Health check conditions → only healthy service endpoints are returned
Unified service discovery for services
managed by Kubernetes:
• AWS contributed to the External DNS
project (Kubernetes incubator)
Third-party software
• HashiCorp Consul
• etcd
• Netflix Eureka
• ZooKeeper
24. Define convenient names for all cloud resources
Discover resources with specific attributes
Ensure only healthy resources are discovered
Use highly available DNS and regional API
28. The sidecar proxy pattern
All service-to-service traffic ("east-west") is routed to an out-of-process sidecar proxy
All service ingress/egress traffic flows through the proxy
[Diagram: a task or pod containing a microservice (app logic) and a proxy (discovery, routing, monitoring)]
29. This is what App Mesh does
App Mesh uses the Envoy proxy as its sidecar:
OSS community project
Wide community support, numerous
integrations
Stable and production-proven
Graduated project in Cloud Native
Computing Foundation
Started at Lyft in 2016
30. AWS App Mesh and Cloud Map
Cloud Map: service registry for AWS App Mesh
AWS App Mesh → discoverInstances → instance attributes:
AWS_INSTANCE_IPV4, AWS_INSTANCE_PORT
AVAILABILITY_ZONE, REGION
ECS_SERVICE_NAME, ECS_CLUSTER_NAME
EC2_INSTANCE_ID, ECS_TASK_DEFINITION_FAMILY
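A rough sketch of how a caller might use discoverInstances and the attributes above. It assumes `client` behaves like `boto3.client("servicediscovery")`; the helper name `healthy_endpoints`, the namespace and service names, and the `StubClient` stand-in (used so the sketch runs without AWS access) are all hypothetical.

```python
def healthy_endpoints(client, namespace, service):
    """Return (ip, port) pairs for healthy instances of a Cloud Map service.

    `client` is expected to behave like boto3.client("servicediscovery").
    """
    response = client.discover_instances(
        NamespaceName=namespace,
        ServiceName=service,
        HealthStatus="HEALTHY",
    )
    endpoints = []
    for instance in response.get("Instances", []):
        attrs = instance.get("Attributes", {})
        ip = attrs.get("AWS_INSTANCE_IPV4")
        port = attrs.get("AWS_INSTANCE_PORT")
        if ip and port:
            endpoints.append((ip, int(port)))
    return endpoints


class StubClient:
    """Stand-in for the real client; returns one made-up instance."""

    def discover_instances(self, **kwargs):
        return {"Instances": [{"InstanceId": "i-1", "Attributes": {
            "AWS_INSTANCE_IPV4": "10.0.1.5", "AWS_INSTANCE_PORT": "8080"}}]}
```

Against a real account, the stub would be replaced by a boto3 client created with credentials and a region.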
32. Microservices communication
In a microservices architecture, the different parts of the application
must communicate over the network
RESTful API
• HTTP as a transport layer
• Relies on
• Stateless communication
• Uniform interfaces
• Standard methods
• Amazon API Gateway as a “front door”
• An API object is a group of resources and methods
• A resource is a typed object within the domain of an
API and may have an associated data model or
relationships to other resources
• Each resource can respond to one or more methods
(GET, POST, PUT)
• REST APIs can be deployed to different stages, and
versioned
Message passing
• Services communicate by exchanging
messages via a queue
• Major benefits:
• Service discovery is not necessary
• Services are loosely coupled
• Different AWS services
• Amazon Simple Queue Service and Amazon Simple
Notification Service:
• SQS uses a custom API → code changes are necessary
• Amazon MQ: if existing software is using open
standard APIs and protocols for messaging (JMS,
NMS, AMQP, STOMP, MQTT, and WebSocket)
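The loose coupling that message passing gives can be sketched with an in-process queue. This is only an analogy: `queue.Queue` stands in for a managed queue such as SQS, and the order IDs and the `run_demo` function are made up for illustration.

```python
import queue
import threading


def run_demo():
    orders = queue.Queue()   # stands in for a managed queue such as SQS
    processed = []

    def consumer():
        while True:
            message = orders.get()
            if message is None:      # sentinel: no more messages
                break
            processed.append(f"shipped:{message}")

    worker = threading.Thread(target=consumer)
    worker.start()
    # The producer only knows the queue, not who consumes from it.
    for order_id in ("o-1", "o-2", "o-3"):
        orders.put(order_id)
    orders.put(None)
    worker.join()
    return processed
```

Neither side holds the other's address, which is why service discovery is not needed between them.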
35. Orchestration challenges
• The distributed nature of microservices makes workflows challenging to orchestrate
• Developers may be tempted to add orchestration code directly into services
• This should be avoided as it
• Introduces tighter coupling
• Makes it harder to quickly replace individual services
• Use AWS Step Functions to build applications from
individual components
• Provides a state machine that hides the complexities of service
orchestration (error handling, serialization/parallelization)
• Uses the Amazon States Language (JSON based):
https://states-language.net/spec.html
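A small state machine definition might look like the sketch below, built as a Python dict and serialized to JSON. The field names (StartAt, States, Type, Resource, Retry, Catch, Next, End) come from the Amazon States Language; the workflow, state names, and Lambda ARNs are placeholders invented for illustration.

```python
import json

# Hypothetical order workflow; the Lambda ARNs are placeholders.
definition = {
    "Comment": "Illustrative order workflow",
    "StartAt": "ReserveStock",
    "States": {
        "ReserveStock": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:REGION:ACCOUNT_ID:function:reserve-stock",
            # Error handling lives in the state machine, not in the service.
            "Retry": [{
                "ErrorEquals": ["States.TaskFailed"],
                "IntervalSeconds": 2,
                "MaxAttempts": 3,
                "BackoffRate": 2.0,
            }],
            "Next": "ChargePayment",
        },
        "ChargePayment": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:REGION:ACCOUNT_ID:function:charge-payment",
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "OrderFailed"}],
            "End": True,
        },
        "OrderFailed": {"Type": "Fail", "Error": "OrderFailed",
                        "Cause": "Payment was declined"},
    },
}


def transition_targets(defn):
    """Collect every state name referenced by StartAt, Next, or Catch."""
    targets = {defn["StartAt"]}
    for state in defn["States"].values():
        if "Next" in state:
            targets.add(state["Next"])
        for catcher in state.get("Catch", []):
            targets.add(catcher["Next"])
    return targets


asl_json = json.dumps(definition, indent=2)
```

The `transition_targets` helper is just a local sanity check that every referenced state exists; it is not part of Step Functions.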
38. Centralized database – anti-pattern
Monolithic applications typically have a monolithic data store:
• Difficult to make schema changes
• Technology lock-in
• Vertical scaling
• Single point of failure
[Diagram: user-svc, account-svc, and cart-svc all share a single DB]
39. Decentralized data stores
• Each service chooses its data store technologies
• Low impact schema changes
• Independent scalability
• Data is gated through the service API
[Diagram: user-svc, account-svc, and cart-svc each use their own data store (DynamoDB, RDS, ElastiCache)]
40. Challenges
Transactional integrity
• Use a pessimistic model
• Handle it in the client
• Add a transaction manager / distributed locking service
• Rethink your design
• Use an optimistic model
• Accept eventual consistency
• Retry (if idempotent)
• Fix it later
• Write it off
Aggregation
• Pull: Make the data available via your service API
• Push: To Amazon S3, Amazon CloudWatch, or another service you create
• Pub/Sub: Via Amazon Kinesis or Amazon SQS
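One common way to realize the optimistic model with retries is a versioned conditional write. The sketch below assumes a hypothetical in-memory `VersionedStore`; a real data store would offer its own conditional-write mechanism.

```python
class VersionConflict(Exception):
    pass


class VersionedStore:
    """Hypothetical in-memory store with a per-key version number."""

    def __init__(self):
        self._data = {}   # key -> (version, value)

    def get(self, key):
        return self._data.get(key, (0, None))

    def put(self, key, value, expected_version):
        current_version, _ = self.get(key)
        if current_version != expected_version:
            raise VersionConflict(key)
        self._data[key] = (current_version + 1, value)


def update_with_retry(store, key, transform, attempts=3):
    """Read, transform, and conditionally write; retry on a version conflict."""
    for _ in range(attempts):
        version, value = store.get(key)
        try:
            store.put(key, transform(value), expected_version=version)
            return True
        except VersionConflict:
            continue
    return False
```

The retry is safe here because each attempt re-reads the current value before applying `transform`, so a failed attempt leaves nothing behind.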
41. Transactional integrity
Transactions span multiple microservices
Cannot leverage a single ACID transaction → partial execution
Control logic is needed to compensate for already processed transactions → Saga pattern
AWS Step Functions can implement the Saga execution coordinator
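The core of a Saga coordinator can be sketched in a few lines: run each step, and on failure run the compensations of the steps that already completed, in reverse order. The `run_saga` function is a hypothetical in-process illustration, not how Step Functions is implemented.

```python
def run_saga(steps):
    """Run (action, compensation) pairs; undo completed steps on failure.

    Returns True if every action succeeded, False if the saga was rolled back.
    """
    completed = []
    for action, compensation in steps:
        try:
            action()
        except Exception:
            # Compensate in reverse order for the steps that already ran.
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensation)
    return True
```

For example, with steps "reserve stock", "create order", and a failing "charge payment", the coordinator would call "cancel order" and then "release stock".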
42. Challenges
CAP Theorem
In a distributed system, you can only have two out of the following
three guarantees across a write/read pair:
• Consistency - A read is guaranteed to return the most recent write for a given client
• Availability - A non-failing node will return a reasonable response within a reasonable
amount of time (no error or timeout)
• Partition tolerance - The system will continue to function when network partitions occur
Networks and parts of networks go down frequently
and unexpectedly
You must tolerate partitions in a distributed system, period
43. CAP Theorem
Consistency/Partition tolerance
• Wait for a response from the
partitioned node
• Could result in a timeout error
• Choose Consistency when you
need to guarantee atomic reads
and writes
Availability/Partition tolerance
• Return the most recent version of
the data
• Writes can be processed later
• Choose Availability when some
flexibility on data
synchronization is allowed
• Or when the system needs to
continue to work in the presence of
external errors
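The two choices can be illustrated with a toy replica that is cut off from the primary. The `Replica` class is a deliberately simplified, hypothetical model: the CP replica refuses to answer during a partition, while the AP replica returns the latest version it holds.

```python
class Replica:
    """Toy replica that may be cut off from the primary by a partition."""

    def __init__(self, mode):
        self.mode = mode            # "CP" or "AP"
        self.local_value = None     # last value replicated to this node
        self.partitioned = False

    def replicate(self, value):
        # Updates from the primary only arrive while the network is healthy.
        if not self.partitioned:
            self.local_value = value

    def read(self):
        if self.partitioned and self.mode == "CP":
            # Consistency first: refuse to answer rather than risk stale data.
            raise TimeoutError("cannot confirm latest write during partition")
        # Availability first: answer with the most recent version held locally.
        return self.local_value
```

If a write of "v2" happens while both replicas are partitioned, the AP replica still serves "v1" and the CP replica raises a timeout.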
45. Distributed monitoring
• Use CloudWatch to gain system-wide visibility into:
• resource utilization
• application performance
• operational health
• Alternative option (especially for Amazon EKS): Prometheus
• An open-source monitoring and alerting toolkit
• Often used in combination with Grafana
46. Log analysis
Log analysis with Amazon Elasticsearch Service and Kibana
Log analysis with Amazon Redshift and Amazon QuickSight
Log analysis on Amazon S3
47. Distributed tracing – AWS X-Ray
• AWS X-Ray uses correlation IDs: unique
identifiers attached to all requests and
messages related to a specific event
chain.
• Trace ID is added to HTTP requests in
headers named X-Amzn-Trace-Id
• Via the X-Ray SDK, any microservice
can read, add or update this header
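For illustration, the tracing header value has the form `Root=<trace id>;Parent=<segment id>;Sampled=<0 or 1>`, and a trace ID consists of a version number, the epoch time as 8 hex digits, and a 96-bit random identifier as 24 hex digits. The sketch below generates and parses such values with the standard library only; in practice the X-Ray SDK does this, and the helper names here are hypothetical.

```python
import os
import time


def new_trace_id(now=None):
    """Build an ID in the X-Ray format: version, epoch hex, 96-bit random hex."""
    epoch = int(time.time() if now is None else now)
    return f"1-{epoch:08x}-{os.urandom(12).hex()}"


def parse_trace_header(value):
    """Split 'Root=...;Parent=...;Sampled=1' into a dict of its fields."""
    fields = {}
    for part in value.split(";"):
        key, sep, val = part.strip().partition("=")
        if sep:
            fields[key] = val
    return fields
```

A service that receives a request would read the `Root` field and pass the same value downstream, so that all segments end up in one trace.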
49. Auditing Challenges
• Ensuring visibility of user actions on each service
• Being able to get an overall view across all services at an
organizational level
• Audit both resource access and activities that lead to system
changes
• Changes must be tracked at the individual service level and across services running on the
wider system
• Changes occur frequently in microservices, which makes auditing them even more important
50. AWS CloudTrail
• Tracks changes in microservices
• Enables all API calls made in the AWS Cloud to be logged and
sent to either CloudWatch Logs (real time) or to Amazon S3
(several minutes)
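Once log files arrive in Amazon S3, they can be summarized with a few lines of code. CloudTrail log files are JSON documents with a top-level `Records` array; the sketch below counts calls per caller, service, and API name. The sample records are abbreviated and made up, and `changes_by_actor` is a hypothetical helper.

```python
import json
from collections import Counter

# Abbreviated, made-up sample in the shape of a CloudTrail log file.
sample_log = json.dumps({"Records": [
    {"eventTime": "2020-01-01T10:00:00Z", "eventSource": "lambda.amazonaws.com",
     "eventName": "UpdateFunctionCode",
     "userIdentity": {"arn": "arn:aws:iam::111122223333:user/alice"}},
    {"eventTime": "2020-01-01T10:05:00Z", "eventSource": "ecs.amazonaws.com",
     "eventName": "UpdateService",
     "userIdentity": {"arn": "arn:aws:iam::111122223333:user/bob"}},
    {"eventTime": "2020-01-01T10:07:00Z", "eventSource": "lambda.amazonaws.com",
     "eventName": "UpdateFunctionCode",
     "userIdentity": {"arn": "arn:aws:iam::111122223333:user/alice"}},
]})


def changes_by_actor(log_text):
    """Count (caller ARN, service, API call) triples found in a log file."""
    counts = Counter()
    for record in json.loads(log_text)["Records"]:
        actor = record.get("userIdentity", {}).get("arn", "unknown")
        counts[(actor, record["eventSource"], record["eventName"])] += 1
    return counts
```

Real records carry many more fields (for example `awsRegion`, `sourceIPAddress`, and `requestParameters`), which could be added to the grouping key as needed.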
51. Resource inventory and change management
• CloudTrail and CloudWatch Events: building blocks to track and respond
to infrastructure changes across microservices
• AWS Config rules: to define security policies with specific rules to
automatically detect, track, and alert you to policy violations
54. Simple Bookstore App
The Bookstore App is built on top of the AWS Full-Stack Template, which
provides:
• Foundational services
• Components
• Plumbing
needed to get a basic web application up and running
https://github.com/awslabs/aws-full-stack-template
55. Simple Bookstore App
The app contains multiple services
1. Shopping cart
2. Product search
3. Recommendations
4. Top sellers list
For each of these services, the app makes use of a purpose-built database.