Take a bit of BaaS (Backend as a Service), add a little FaaS (Function as a Service), and the serverless cloud application is done. What sounds so simple in theory brings a challenge or two in practice. The promised benefits, such as time-to-market, auto-scaling, automatic failover, or cost reduction via „pay per use“, unfortunately do not come for free. A suitable architecture is needed. In this workshop we will look at several application scenarios and design suitable architectural approaches for them. Along the way we will of course run into the occasional stumbling block, but that will not stop us.
3. ABOUT ME
Who am I?
• CIO New Technologies
• Enterprise & Mobile
• Author, Speaker, Coach & Mentor
• Snowboard & MTB Enthusiast
• Traveller between the worlds
Lars Röwekamp (a.k.a. @mobileLarson)
#WISSENTEILEN
20. „No server is easier to manage than no server.“
(Werner Vogels, CTO Amazon)
out-of-the-box, self-scaling, cloud-based super-backend
I had a dream ...
Remember
„your“ dreams?
104. #WISSENTEILEN
What do you mean by architecture?
We will use multiple components to design robust architectures for
serverless workloads.
• Compute Layer
• Data Layer
• Messaging & Streaming Layer
• User Management & Identity Layer
• System Monitoring & Deployment
105. #WISSENTEILEN
What do you mean by architecture?
We will use multiple components to design robust architectures for
serverless workloads: Compute Layer
• Serverless Functions (stateless, business logic)
• API Gateway (non-functional, cross-cutting concerns)
• Step Functions (function orchestration via state machine)
106. #WISSENTEILEN
What do you mean by architecture?
We will use multiple components to design robust architectures for
serverless workloads: Data Layer
• RDBMS
• NoSQL (including caching and streaming)
• Object Store (e.g. used by CDN for static content)
• (Elastic)Search Service (search and analytics)
107. #WISSENTEILEN
What do you mean by architecture?
We will use multiple components to design robust architectures for
serverless workloads: Messaging & Streaming Layer
• Notification Service (pub/sub for async event notification)
• Streaming Service (collect and analyze data in real-time)
• ETL Service (capture, transform & load data for near real-time BI)
108. #WISSENTEILEN
What do you mean by architecture?
We will use multiple components to design robust architectures for
serverless workloads: User Management & Identity Layer
• User Management Service (user and user attributes)
• Authentication & Authorization (sign-up, sign-in, OpenID Connect)
• Federated Identity Service (e.g. for Google, Facebook accounts)
109. #WISSENTEILEN
What do you mean by architecture?
We will use multiple components to design robust architectures for
serverless workloads: System Monitoring & Deployment Layer
• System Monitoring (system & custom metrics)
• Distributed Tracing (deep insights for analyzing and debugging)
• Cloud Application Model (infrastructure as code)
112. #WISSENTEILEN
Scenario #1: RESTful Microservice
Characteristics
• you want a secure, easy-to-operate framework that is simple to
replicate and has high levels of resiliency and availability
• you want to log utilization and access patterns to continually
improve the backend to support customer usage
• you are seeking to leverage managed services as much as possible,
which reduces the heavy lifting associated with managing common
platforms, including security and scalability
113. #WISSENTEILEN
Scenario #1: RESTful Microservice
What could possibly go wrong?
• abnormalities, e.g. unexpected / invalid calls
• internal errors, latency, cache misses
• usage pattern evolves over time
• customer location changes
• security attacks (e.g. DoS/DDoS)
115. #WISSENTEILEN
Scenario #2: Mobile Backend
Characteristics
• you want to create a complete serverless architecture
without managing any instances and/or servers
• you want your business logic to be decoupled from your
mobile application as much as possible
• you are looking to provide business functionalities as an
API to optimize development across multiple platforms
116. #WISSENTEILEN
Scenario #2: Mobile Backend
What could possibly go wrong?
• unexpected peaks of workload
• runtime cost explosion
• duplicated or lost events
• security attacks
• high latency
118. #WISSENTEILEN
Scenario #3: Stream Processing
Characteristics
• you want to create a complete serverless architecture without
managing any instances or servers for processing stream data
• you want to use existing libraries to take care of data ingestion
from a data producer perspective
119. #WISSENTEILEN
Scenario #3: Stream Processing
What could possibly go wrong?
• peaks of data to process
• data occurrence and throughput do not match
• processing fails (all or parts)
• processing is slow(er) (... than expected)
• duplicate records (retry? idempotent?)
• runtime cost explosion
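The „duplicate records (retry? idempotent?)“ concern can be illustrated with a minimal Python sketch. It assumes every record carries a unique `id` field and uses an in-memory set as a stand-in for a durable deduplication store; both are assumptions for illustration, not part of the original slides.

```python
from typing import Callable, Iterable, Set


def process_once(records: Iterable[dict],
                 handle: Callable[[dict], None],
                 seen_ids: Set[str]) -> int:
    """Apply `handle` to each record at most once, keyed by record["id"].

    `seen_ids` is an in-memory stand-in (assumption) for a durable store
    that would survive across function invocations.
    """
    processed = 0
    for record in records:
        record_id = record["id"]
        if record_id in seen_ids:
            continue  # duplicate delivery, e.g. after a retry: skip it
        handle(record)
        seen_ids.add(record_id)  # mark as done only after successful handling
        processed += 1
    return processed
```

With this shape, a retried batch that overlaps an earlier one only processes the records that have not been handled yet.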
121. #WISSENTEILEN
Scenario #4: Web Application
Characteristics
• you want a scalable web application that can go global in
minutes with high levels of resilience and availability
• you want a consistent user experience with adequate
response times
• you want to optimize your costs based upon actual user
demand versus paying for idle resources
• ...
122. #WISSENTEILEN
Scenario #4: Web Application
Characteristics
• ...
• you are seeking to leverage managed services as much as
possible, which reduces the heavy lifting associated with
managing common platforms, including security and
scalability
• you want to set up a framework that is easy to set up and
operate, and that you can extend with limited impact later
123. #WISSENTEILEN
Scenario #4: Web Application
What could possibly go wrong?
• security attacks
• static content latency
• personalized SLAs / usage plans
• customer location changes
135. #WISSENTEILEN
Operational Excellence
of a well-architected Serverless Application
„How are you monitoring and responding to anomalies in your
serverless application?“
• collect default metrics
• define and collect custom metrics (ops- and business-centric)
• enable distributed tracing
• define alarms at individual and aggregate level
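One way to collect a custom, business-centric metric is to write it as a structured log line that the monitoring side can pick up. The following Python sketch shows the idea; the field names (`type`, `name`, `dimensions`, ...) and the metric `OrdersPlaced` are hypothetical and not tied to any specific monitoring service.

```python
import json
import time


def metric_line(name: str, value: float, unit: str = "Count", **dimensions: str) -> str:
    """Render one custom metric as a single JSON log line (hypothetical schema)."""
    return json.dumps({
        "type": "metric",
        "name": name,
        "value": value,
        "unit": unit,
        "dimensions": dimensions,
        "timestamp_ms": int(time.time() * 1000),
    }, sort_keys=True)


def handler(event, context=None):
    # Business-centric metric: one order placed by the (hypothetical) checkout service.
    print(metric_line("OrdersPlaced", 1, service="checkout"))
    return {"status": "ok"}
```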
139. #WISSENTEILEN
Operational Excellence
of a well-architected Serverless Application
„How are you evolving your serverless application while minimizing
the impact?“
• separate code from configuration via function env variables
• API Gateway stage variables and/or configuration service
• infrastructure as code templates to enable faster deployment
• prefer separate gateway endpoints, functions, and state machines
per stage over aliases and versions alone
• A/B-Testing and zero-downtime changes via weighted aliases
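Separating code from configuration via function environment variables could look roughly like the sketch below. The variable names `TABLE_NAME`, `STAGE`, and `TIMEOUT_SECONDS` are assumptions chosen for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    table_name: str
    stage: str
    timeout_seconds: float


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read stage-specific configuration from environment variables.

    The variable names are hypothetical; a missing TABLE_NAME raises
    KeyError so that a misconfigured deployment fails fast.
    """
    env = os.environ if env is None else env
    return Settings(
        table_name=env["TABLE_NAME"],
        stage=env.get("STAGE", "dev"),
        timeout_seconds=float(env.get("TIMEOUT_SECONDS", "3")),
    )
```

The same code package can then be deployed to every stage; only the environment differs.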
142. #WISSENTEILEN
Security (Identity & Access Management)
of a well-architected Serverless Application
„How do you authorize and authenticate access to your serverless
API?“
• IAM authorization (e.g. AWS IAM & SDKs)
• API Gateway custom authorizer (for an existing Identity Provider, IdP)
• BaaS based user pools (e.g. AWS Cognito)
143. #WISSENTEILEN
Security (Identity & Access Management)
of a well-architected Serverless Application
„How are you enforcing boundaries as to what cloud services your
serverless functions can access?“
• least-privileged access via specific roles to avoid opening up the
systems for abuse
• small(er) functions with scoped activities
• NOTE: API Gateway API Key feature is not for security but primarily
for consumer‘s usage tracking
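Least-privileged access can be sketched as an IAM-style policy document built in Python, plus a small check that rejects wildcards. The table ARN, account ID, and action list are placeholders, and the wildcard check is a simplistic illustration rather than a complete policy linter.

```python
from typing import List


def table_policy(table_arn: str, actions: List[str]) -> dict:
    """Build a policy that allows only the given actions on one resource."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": sorted(actions),
            "Resource": table_arn,
        }],
    }


def has_wildcards(policy: dict) -> bool:
    """Return True if any statement uses '*' in an action or as the resource."""
    for statement in policy["Statement"]:
        actions = statement["Action"]
        resources = statement["Resource"]
        actions = actions if isinstance(actions, list) else [actions]
        resources = resources if isinstance(resources, list) else [resources]
        if any("*" in a for a in actions) or any(r == "*" for r in resources):
            return True
    return False


# Placeholder ARN for illustration only.
ORDERS_TABLE = "arn:aws:dynamodb:eu-central-1:123456789012:table/orders"
READ_ONLY = table_policy(ORDERS_TABLE, ["dynamodb:GetItem", "dynamodb:Query"])
```

A check like `has_wildcards` could run in the deployment pipeline so that overly broad roles are caught before they reach production.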
144. #WISSENTEILEN
Security (Detective Controls)
of a well-architected Serverless Application
„How are you analyzing serverless application logs?“
• track vulnerabilities
• use log filters to transform log entries into metrics via regex
• create alarms based on application custom metrics
• enable API Gateway logging for single methods* for troubleshooting
• encrypt any data traversing the serverless application
*make certain not to violate compliance requirements
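The idea of a regex-based log filter that turns log entries into metrics can be sketched locally in Python. The log line format (`LEVEL message`) and the alarm threshold are assumptions for illustration; a managed logging service would apply a comparable pattern on its side.

```python
import re
from collections import Counter
from typing import Iterable

# Assumed log format: "<LEVEL> <message>", e.g. "ERROR payment failed".
LOG_PATTERN = re.compile(r"^(?P<level>INFO|WARN|ERROR)\s+(?P<message>.*)$")


def count_levels(lines: Iterable[str]) -> Counter:
    """Count log entries per level; lines that do not match are ignored."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:
            counts[match.group("level")] += 1
    return counts


def should_alarm(counts: Counter, max_errors: int) -> bool:
    """Alarm when the ERROR count exceeds a (hypothetical) threshold."""
    return counts["ERROR"] > max_errors
```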
145. #WISSENTEILEN
Security (Detective Controls)
of a well-architected Serverless Application
„How do you monitor dependency vulnerabilities within your
serverless application?“
• use 3rd party solution (e.g. OWASP Dependency Check)
• integrate into your CI/CD pipeline
146. #WISSENTEILEN
Security (Infrastructure Protection)
of a well-architected Serverless Application
„For VPC access, how are you enforcing networking boundaries as to
what serverless functions can access?“
• configure serverless function for VPC via VPN
• use security groups and Network Access Control Lists (NACL) as
a basis
• use proxies for outbound traffic filtering due to compliance reasons
147. #WISSENTEILEN
Security (Data Protection)
of a well-architected Serverless Application
„How are you protecting sensitive data within your serverless
application?“
• use TLS for all communication
• sensitive data should be protected at all times in all layers
• use encryption in transit and at rest
148. #WISSENTEILEN
Security (Data Protection)
of a well-architected Serverless Application
„What is your strategy on input validation?“
• set up basic API Gateway request validation (JSON + parameters)
• app-specific deep validation via serverless function, framework, ...
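App-specific deep validation inside the function might look like the following sketch. It assumes a proxy-style event with a JSON string in `body` and a response made of `statusCode` and `body`; the order fields `sku` and `quantity` and their limits are hypothetical.

```python
import json


def validate_order(payload) -> list:
    """Return a list of validation errors; an empty list means valid."""
    if not isinstance(payload, dict):
        return ["payload must be a JSON object"]
    errors = []
    sku = payload.get("sku")
    if not isinstance(sku, str) or not sku.strip():
        errors.append("sku must be a non-empty string")
    quantity = payload.get("quantity")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(quantity, bool) or not isinstance(quantity, int) or not 1 <= quantity <= 100:
        errors.append("quantity must be an integer between 1 and 100")
    return errors


def handler(event, context=None):
    try:
        payload = json.loads(event.get("body") or "")
    except json.JSONDecodeError:
        return {"statusCode": 400, "body": json.dumps({"errors": ["body is not valid JSON"]})}
    errors = validate_order(payload)
    if errors:
        return {"statusCode": 400, "body": json.dumps({"errors": errors})}
    return {"statusCode": 200, "body": json.dumps({"accepted": payload["sku"]})}
```

Basic shape checks stay at the gateway; the function only adds rules that depend on business semantics.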
154. #WISSENTEILEN
Reliability
of a well-architected Serverless Application
„Have you considered serverless limits for peak workload?“
• avoid degradation and throttling of services
• monitor usage and set alarms at 80% (e.g. via AWS Trusted Advisor)
• react context-sensitively (e.g. raise limit temporarily vs. throttling)
• differentiate business-critical and non-business-critical functions*
• prefer asynchronous over synchronous communication
*keep max concurrent execution limit in mind
155. #WISSENTEILEN
Reliability
of a well-architected Serverless Application
„How are you regulating access rates to and within your serverless
applications?“
• enable throttling at the API level
• return an appropriate status code, e.g. 429, to consumers
• include predictive limit information in the response header
• issue API keys to consumers for more granular throttling (SLAs)
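Throttling with a 429 response and limit information in the headers can be illustrated with a small token bucket. This is a local sketch of the technique, not how any particular gateway implements it; the `X-RateLimit-Remaining` header name is a common convention assumed here, while `Retry-After` is a standard HTTP header.

```python
import math
import time


class TokenBucket:
    """Token bucket: `rate_per_second` tokens refill up to `capacity`."""

    def __init__(self, rate_per_second: float, capacity: int, now=time.monotonic):
        self.rate = rate_per_second
        self.capacity = capacity
        self.tokens = float(capacity)
        self.now = now  # injectable clock, useful for tests
        self.updated = now()

    def try_acquire(self) -> bool:
        current = self.now()
        elapsed = current - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = current
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

    def seconds_until_next_token(self) -> float:
        return max(0.0, (1 - self.tokens) / self.rate)


def throttle(bucket: TokenBucket) -> dict:
    """Return a response skeleton: 200 with remaining quota, or 429 with Retry-After."""
    if bucket.try_acquire():
        return {"statusCode": 200,
                "headers": {"X-RateLimit-Remaining": str(int(bucket.tokens))}}
    retry_after = max(1, math.ceil(bucket.seconds_until_next_token()))
    return {"statusCode": 429, "headers": {"Retry-After": str(retry_after)}}
```

One bucket per API key would give the more granular, per-consumer throttling mentioned on the slide.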
156. #WISSENTEILEN
Reliability
of a well-architected Serverless Application
„What is your strategy on asynchronous calls and events within your
serverless architecture?“
• use async calls and events as often as possible for decoupling ...
• to avoid time-outs and locked code
• to allow non-blocking I/O
• use external service for timeout handling if sync is needed*
• NOTE: async plus async equals sync
*e.g. step functions
157. #WISSENTEILEN
Reliability
of a well-architected Serverless Application
„What’s your testing strategy for serverless applications?“
• separate logic from infrastructure to allow unit testing
• don’t use mocks for services you can’t control* for integration tests
• perform acceptance or end-to-end tests in a real-life environment
*they may change and may result in unexpected results
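Separating logic from infrastructure could be sketched as a pure function plus a thin handler that receives its data access as a parameter. The discount rule and the `load_order` callable are hypothetical. The in-memory loader in the usage note is meant for unit tests only; integration tests would still run against the real service.

```python
from typing import Callable


def apply_discount(total_cents: int, loyalty_years: int) -> int:
    """Pure business logic: 2% per loyalty year, capped at 10% (made-up rule)."""
    percent = min(loyalty_years * 2, 10)
    return total_cents - total_cents * percent // 100


def make_handler(load_order: Callable[[str], dict]):
    """Wire the infrastructure-facing loader into a thin handler."""
    def handler(event, context=None):
        order = load_order(event["order_id"])
        return {
            "order_id": event["order_id"],
            "total_cents": apply_discount(order["total_cents"], order["loyalty_years"]),
        }
    return handler
```

In a unit test, `make_handler(lambda order_id: {"total_cents": 5000, "loyalty_years": 1})` exercises the handler without any cloud dependency.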
158. #WISSENTEILEN
Reliability
of a well-architected Serverless Application
„How are you building resilience into your serverless application?“
Change Management
• put monitoring metrics in place
• monitor workload to be able to determine abnormalities
• use function and API versioning to be able to rollback
159. #WISSENTEILEN
Reliability
of a well-architected Serverless Application
„How are you building resilience into your serverless application?“
Failure Management
• know default back-off and retry logic of serverless framework
• tune back-off and retry logic to your needs if necessary
• build back-off and retry logic into serverless queries
• leverage error logging and capture log info as a custom metric
• ...
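Building back-off and retry logic into a call can be sketched as exponential back-off with jitter. The defaults below are arbitrary illustration values; `sleep` and `rng` are injectable so the behavior can be tested without waiting.

```python
import random
import time
from typing import Callable, Tuple, Type


def retry_with_backoff(operation: Callable[[], object],
                       max_attempts: int = 3,
                       base_delay: float = 0.1,
                       max_delay: float = 2.0,
                       retryable: Tuple[Type[BaseException], ...] = (Exception,),
                       sleep: Callable[[float], None] = time.sleep,
                       rng: Callable[[], float] = random.random):
    """Call `operation`, retrying on `retryable` errors with jittered back-off.

    In real code, `retryable` should be narrowed to transient errors only.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retryable:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(rng() * ceiling)  # "full jitter": random delay up to the ceiling
```

When the platform already retries on its own, the attempts multiply, so it helps to know the default retry behavior before adding a layer like this.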
160. #WISSENTEILEN
Reliability
of a well-architected Serverless Application
„How are you building resilience into your serverless application?“
Failure Management
• use Dead Letter Queues (DLQ) as dedicated resources
• use step-functions to avoid custom „try-catch“ blocks*
• inspect and handle responses for non-atomic requests (batch-alike)
• use SAGA Pattern to roll back distributed business transactions
*AWS Step Functions, IBM Sequences, Azure Logic Apps
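The core of the SAGA pattern, rolling back a distributed business transaction through compensating actions, can be shown in a few lines of Python. This is a simplified in-process sketch: a real saga would persist its progress and also handle failing compensations, which are left out here.

```python
from typing import Callable, List, Tuple

Step = Tuple[Callable[[], None], Callable[[], None]]  # (action, compensation)


def run_saga(steps: List[Step]) -> None:
    """Run actions in order; on failure, undo completed steps in reverse order."""
    compensations = []
    try:
        for action, compensation in steps:
            action()
            compensations.append(compensation)  # only completed steps are undone
    except Exception:
        for compensation in reversed(compensations):
            compensation()
        raise
```

For an order flow, the steps could be (reserve stock, release stock) and (charge payment, refund payment): if a later step fails, the refund runs first and the stock release second.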
161. #WISSENTEILEN
Serverless Reliability 101
TYPES OF ERROR
• 4xx Client Error:
Can be fixed by developer, e.g.
InvalidParameterValue (400),
ResourceNotFound (404),
RequestTooLarge (413), etc.
• 5xx Server Error:
Most can be fixed by admin,
e.g. EC2 ENI management
errors (502)
RETRY POLICY
• Stream-based event sources:
Automatically retried until data expires
• Asynchronous invocations:
Automatically retried 2 extra times,
then published to dead-letter-queue
• Synchronous invocations:
Invoking app receives an error code
and is responsible for retries
163. #WISSENTEILEN
Performance Efficiency
of a well-architected Serverless Application
„How do you choose the optimal capacity units (memory,
shards, reads/writes per second) within your serverless application?“
• take a data-driven approach to selecting a performant architecture
• gather data on all aspects of the architecture
• review results on a cyclical basis
• make architectural trade-offs if needed (e.g. compression, caching)
• run performance and load tests including upstream services
• fine-tune serverless functions
168. #WISSENTEILEN
Performance Efficiency
of a well-architected Serverless Application
„How have you optimized the performance of your serverless
application?“
• enable API Gateway caching
• enable in-memory DB caching (e.g. DAX)
• avoid full scan operations on NoSQL DBs via indexes
• test performance with an accurately sized sample workload
• leverage global scope within functions to take advantage of
container reuse (e.g. DB Connections, Cloud Service Connections)
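Leveraging global scope for container reuse means creating expensive resources at module level rather than inside the handler. In the sketch below, `connect()` is a hypothetical stand-in for opening a database or cloud service connection.

```python
import itertools

_connection_ids = itertools.count(1)


def connect() -> dict:
    """Stand-in (assumption) for an expensive connection or client setup."""
    return {"connection_id": next(_connection_ids)}


# Module scope: executed once per container at cold start,
# then reused by every invocation that hits the same warm container.
CONNECTION = connect()


def handler(event, context=None):
    # No connect() call here: the handler reuses the module-level connection.
    return {"connection_id": CONNECTION["connection_id"], "echo": event}
```

Two invocations served by the same container report the same `connection_id`, which shows that the setup cost was paid only once.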
169. #WISSENTEILEN
Performance Efficiency
of a well-architected Serverless Application
Is an active container available for this Lambda that isn’t busy
processing another event? (YES / NO, then invocation)
After a new container is created:
• function code package downloaded
• Lambda runtime environment started
171. #WISSENTEILEN
Performance Efficiency
of a well-architected Serverless Application
„How do you decide what components of your serverless application
should be deployed in a VPC?“
• check for cloud-risk data
• check for access to the VPC located components
• avoid VPC whenever possible
173. #WISSENTEILEN
Cost Optimization
of a well-architected Serverless Application
„What is your strategy for deciding the optimal serverless
function memory allocation?“
• fine-tune memory allocation for cost, based on gathered data
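The trade-off behind memory tuning can be made concrete with a bit of arithmetic: compute cost scales with duration times allocated memory. The price per GB-second and the measured durations below are made-up numbers for illustration, and per-request fees are left out.

```python
def monthly_compute_cost(invocations: int,
                         avg_duration_ms: float,
                         memory_mb: int,
                         price_per_gb_second: float) -> float:
    """Compute cost = invocations x duration (s) x memory (GB) x price per GB-second."""
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    return gb_seconds * price_per_gb_second


# Hypothetical rate and measurements, for illustration only.
PRICE = 0.00002
small = monthly_compute_cost(1_000_000, avg_duration_ms=800, memory_mb=512,
                             price_per_gb_second=PRICE)
large = monthly_compute_cost(1_000_000, avg_duration_ms=300, memory_mb=1024,
                             price_per_gb_second=PRICE)
print(f"512 MB: {small:.2f}  1024 MB: {large:.2f}")
```

In this made-up case the larger allocation is cheaper, because doubling the memory cuts the duration by more than half. Whether that holds for a real function has to come from gathered data.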
177. #WISSENTEILEN
Cost Optimization
of a well-architected Serverless Application
„What is your strategy for code logging in your serverless functions?“
• NOTE: logging impacts costs (ingestion and storage)
• remove unnecessary print statements in code
• use log levels and environment variables
• define log retention periods
• export old logs to cost-effective „archive“-storage
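Controlling log volume through log levels and environment variables could look like this sketch, which uses Python's standard `logging` module. The variable name `LOG_LEVEL` is an assumption.

```python
import logging
import os
from typing import Mapping, Optional


def configure_logger(name: str = "app",
                     env: Optional[Mapping[str, str]] = None) -> logging.Logger:
    """Set the logger level from LOG_LEVEL (assumed name); fall back to INFO."""
    env = os.environ if env is None else env
    level_name = env.get("LOG_LEVEL", "INFO").upper()
    level = getattr(logging, level_name, None)
    if not isinstance(level, int):
        level = logging.INFO  # unknown level name: use a safe default
    logger = logging.getLogger(name)
    logger.setLevel(level)
    return logger
```

Production stages can then run at `ERROR` or `WARNING`, while a debugging session switches to `DEBUG` without a code change.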
178. #WISSENTEILEN
Cost Optimization
of a well-architected Serverless Application
„Is your code architecture running unnecessary serverless functions
in order to reduce complexity?“
• use API Gateway service proxy
• prefer direct integration over custom functions
• optimize code for execution time
179. #WISSENTEILEN
Cost Optimization
of a well-architected Serverless Application
„How do you optimize your code to run in the least amount of time
possible?“
• use step functions instead of serverless functions for orchestration
to avoid the serverless function waiting for a resource to become
available*
*pay per state change, not per millisecond
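Moving the waiting out of the function and into the orchestration can be sketched as a state machine definition in the style of the Amazon States Language, written here as a Python dictionary. The state names, the `$.status` field, the 30-second wait, and the function ARNs are placeholders. The small helper only checks that every transition points at a defined state.

```python
import json

# Placeholder ARNs and state names, for illustration only.
POLLING_WORKFLOW = {
    "Comment": "Wait in the state machine instead of inside a running function",
    "StartAt": "SubmitJob",
    "States": {
        "SubmitJob": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:REGION:ACCOUNT_ID:function:submit-job",
            "Next": "WaitForResource",
        },
        "WaitForResource": {"Type": "Wait", "Seconds": 30, "Next": "CheckStatus"},
        "CheckStatus": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:REGION:ACCOUNT_ID:function:check-status",
            "Next": "IsReady",
        },
        "IsReady": {
            "Type": "Choice",
            "Choices": [{"Variable": "$.status", "StringEquals": "READY", "Next": "Done"}],
            "Default": "WaitForResource",
        },
        "Done": {"Type": "Succeed"},
    },
}


def undefined_targets(definition: dict) -> set:
    """Return transition targets that are not defined as states."""
    states = definition["States"]
    targets = {definition["StartAt"]}
    for state in states.values():
        for key in ("Next", "Default"):
            if key in state:
                targets.add(state[key])
        for choice in state.get("Choices", []):
            targets.add(choice["Next"])
    return targets - set(states)


print(json.dumps(POLLING_WORKFLOW, indent=2))
```

Both functions run only for the short submit and check steps; the 30 seconds in between pass in the `Wait` state without a function running.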