Analytics workloads have traditionally been run as batch processing in data warehouse (DWH) and Hadoop environments, with common use cases spanning data lakes, data science, and machine learning (ML). Building serverless, data-driven architectures and streaming solutions with services such as Amazon Kinesis, AWS Lambda, and Amazon Athena can address real-time ingestion, storage, and analytics challenges, and lets you focus on application logic instead of managing infrastructure. Learn design patterns and best practices for serverless stream processing.
48. Solution Overview
[Architecture diagram: HSBC UK solution in EU-West-1. Labeled components: Mainframes; Mapper (EMR, Spark); Kafka; Kinesis Streams; Direct Connect; Customer Preferences (DynamoDB, Lambda, API Gateway); Data Service (Aurora, EMR, DynamoDB, API Gateway, Kinesis Streams); Event Engine (Kinesis Streams, Lambda); Notification Service (push notifications; API Gateway, Kinesis Streams, Lambda); Message Service (API Gateway, DynamoDB, Kinesis Streams, Lambda); Dead Letter Queues; Common Services (SNS, SQS, VPC, CloudWatch, KMS). Data format labels: EBCDIC, AVRO, ASCII, JSON.]
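The format labels on this slide include EBCDIC alongside ASCII and JSON, which suggests that mainframe records are re-encoded somewhere along the path. The slides do not show how this is done, so the following is only a minimal sketch of one such conversion. The code page (`cp037`) and the fixed-width record layout are assumptions made for illustration, not details from the talk.

```python
import json

# Assumption: records are encoded with EBCDIC code page 037.
# The actual mainframe code page may differ.
EBCDIC_CODEC = "cp037"

# Hypothetical fixed-width layout: (field name, width in bytes).
LAYOUT = [("customer_id", 8), ("channel", 5)]


def ebcdic_record_to_json(raw: bytes) -> str:
    """Decode one fixed-width EBCDIC record and return it as a JSON string."""
    # cp037 is a single-byte encoding, so character offsets match byte offsets.
    text = raw.decode(EBCDIC_CODEC)
    fields = {}
    offset = 0
    for name, width in LAYOUT:
        fields[name] = text[offset:offset + width].strip()
        offset += width
    return json.dumps(fields)
```

Calling `ebcdic_record_to_json("C0001234EMAIL".encode("cp037"))` yields a JSON object with the two hypothetical fields, which downstream JSON-based services could then consume.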
49. Lambda and Kinesis Data Streams Lessons Learned
• Increasing the number of Kinesis Data Streams shards may not increase system
performance; batch size matters. Perform load tests.
• Consider the impact of language and VPC usage on Lambda startup time vs. Lambda
execution time
• Java-based functions start slower than Python/Node functions but execute faster
• 3 GB of memory isn’t always fastest for VPC-attached Lambda functions. The optimal memory
allocation for Java-based functions was 1 GB. Consider ENI reuse.
• Consider pre-warming VPC-attached functions to achieve your latency SLA
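One way to act on the load-testing advice is to have the function report how long each batch takes, so that runs with different batch sizes and shard counts can be compared. The sketch below assumes JSON payloads and follows the event shape Lambda uses for Kinesis records (base64-encoded data under `Records[].kinesis.data`). `process` is a hypothetical placeholder for application logic, and the statistics format is an illustration rather than anything shown in the talk.

```python
import base64
import json
import time


def process(payload: dict) -> None:
    """Hypothetical placeholder for the real per-record application logic."""
    pass


def handler(event, context):
    """Process one Kinesis batch and report simple timing statistics."""
    started = time.perf_counter()
    records = event.get("Records", [])

    for record in records:
        # Kinesis record data arrives base64-encoded in the Lambda event.
        data = base64.b64decode(record["kinesis"]["data"])
        # Assumption: each record carries a JSON document.
        process(json.loads(data))

    elapsed_ms = (time.perf_counter() - started) * 1000
    stats = {
        "batch_size": len(records),
        "elapsed_ms": round(elapsed_ms, 2),
        "ms_per_record": round(elapsed_ms / len(records), 3) if records else 0.0,
    }
    # Printed output is captured in the function's CloudWatch logs.
    print(json.dumps(stats))
    return stats
```

Comparing `ms_per_record` across load-test runs gives a rough indication of whether a larger batch size or more shards is actually improving throughput.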
50. Key Takeaways
• Follow the principle of “extract data once and reuse multiple times” to power new
customer experiences
• Generating a repeatable correlation ID from source is critical in a distributed system
• Perform load tests to fine-tune your system and identify choke points
• Know the soft and hard limits of the AWS services you use
• Plan your network architecture to provide service isolation and to support production scale
• Consider how to unify your existing and cloud operating models: logging, monitoring, and
alerting
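The takeaway about a repeatable correlation ID can be illustrated with a deterministic identifier derived from fields of the source record, so that every service computes the same ID for the same event. The talk does not say how its IDs were generated. The sketch below uses name-based UUIDs (version 5) as one possible approach, and the namespace and field names are hypothetical.

```python
import uuid

# Hypothetical namespace for this system's correlation IDs.
CORRELATION_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "events.example.com")


def correlation_id(source_system: str, record_key: str, event_timestamp: str) -> str:
    """Derive a repeatable correlation ID from fields of the source record.

    The same inputs always produce the same ID, so any service that sees
    the record can recompute it without coordination.
    """
    name = "|".join((source_system, record_key, event_timestamp))
    return str(uuid.uuid5(CORRELATION_NAMESPACE, name))
```

Because the ID depends only on the inputs, a replayed or retried event carries the same ID as the original, which helps when tracing it across logs from several services.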