© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T
Searching for patterns:
Log analytics using Amazon ES
Kevin Fallis
Senior Specialist Solutions Architect
AWS – Search Services
ADB205
What is Elasticsearch?
• Distributed search and analytics engine built on Apache Lucene
• Often deployed as part of the “ELK Stack”: Elasticsearch, Logstash, & Kibana
• Easy ingestion and visualization
Source: TechCrunch survey of popular open source software from April ’17
Machine data driving Elasticsearch growth
Machine-generated data is growing 10x faster than business data… Logs, logs, and more logs
• IT & DevOps: Databases, servers, storage, networking
• Increase in IoT and mobile devices: Gaming, sensors, web content
• Cloud-based architectures
Source: insideBigData—The Exponential Growth of Data, February 16, 2017
Popular use cases
• Application log monitoring
• Security event information monitoring
• Data visualization
• Full text search
Amazon Elasticsearch Service
Amazon Elasticsearch Service (Amazon ES) is a fully managed service that makes it easy to deploy, manage, and scale Elasticsearch and Kibana in the AWS Cloud.
Benefits of Amazon ES
• Supports open-source APIs and tools: Drop-in replacement with no need to learn new APIs or skills
• Easy to use: Deploy a production-ready Elasticsearch cluster in minutes
• Scalable: Resize your cluster with a few clicks or a single API call
• Secure: Deploy into your VPC and restrict access using security groups and IAM policies
• Highly available: Replicate across Availability Zones, with monitoring and automated self-healing
• Tightly integrated: Seamless data ingestion, security, auditing, and orchestration
Elasticsearch runs on a cluster of instances
[Diagram labels: AWS Cloud; VPC; Amazon ES domain with data nodes and master nodes; Elastic Load Balancing (ELB); AWS Management Console; AWS Command Line Interface; AWS Tools and SDKs; AWS CloudFormation; AWS Identity and Access Management (IAM); AWS CloudTrail; Amazon CloudWatch; AWS Database Migration Service; Amazon Kinesis Data Firehose; Amazon CloudWatch Logs]
Provides Kibana real-time visualization tool
Build actionable insights
Amazon ES empowers you with the data to understand and intelligently react to your business needs
• End-to-end visibility: Better understanding of customers' behavior to improve user experience and react to demand
• Improve reliability: Increased operational efficiencies by identifying, solving and preventing system failures in real time
• Faster time-to-value: Accelerate time to market with application delivery and performance monitoring
• Security: Improved business confidence with end-to-end monitoring of data, infrastructure, and transactions
Security information and event management (SIEM)
IoT & mobile
Application monitoring & root-cause analysis
Business and web analytics
Case study: Autodesk
Central Log Management System
https://www.youtube.com/watch?v=fSjAfp-uqSs
CHALLENGE
• Highly distributed organization. No consistent way to collect and measure metrics.
• Small ops team.
• Must integrate easily with other AWS services.
• Scale: Accommodate current and future requirements.
• Must be cost effective with no data lock-in.
• TBs of log data to sift through to find and fix issues that impact customers.
SOLUTION
• Unified log data management solution built on AWS. Single interface for log analytics across applications. Annotate log records to enable distributed tracing states.
• Streaming application logs via Kinesis Data Firehose to Amazon S3, Amazon Athena, and Amazon ES.
• 10 i3.4xlarge Amazon ES data nodes – 33 TB. Will grow to 110 TB.
• Kibana, built-in within Amazon ES, for near real-time analytics and dashboards
BENEFITS
• All managed services: “Manage less to gain more.” Focus on developing awesome products.
• Common vocabulary for diagnosing and solving problems. Eliminated silos.
• Scalable and cost-effective – i3s delivering great value per TB.
• Improving customer experience by reducing the time to find and fix customer issues.
How it works
[Diagram: application data and server, application, network, AWS, and other logs flow into an Amazon ES domain with an index; application users, analysts, DevOps, and security teams query it]
1. Send data as JSON via REST APIs
2. Data is indexed: All fields searchable, including nested JSON
3. Queries, via REST APIs, allow fielded matching, Boolean expressions, sorting, and analysis
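The three steps above can be sketched as the requests a client would build. This is only an illustration under stated assumptions: the index name, the `_doc` path segment, the field names, and both helper functions are hypothetical, and nothing is sent over the network.

```python
import json

def index_request(index, doc):
    """Step 1: HTTP method, path, and JSON body that add one document."""
    return "POST", f"/{index}/_doc", json.dumps(doc)

def search_request(index, text, status, size=10):
    """Step 3: fielded match, Boolean filter, and sort in the query DSL."""
    body = {
        "size": size,
        "query": {
            "bool": {
                "must": [{"match": {"message": text}}],      # full-text match on one field
                "filter": [{"term": {"status": status}}],    # exact-value Boolean clause
            }
        },
        "sort": [{"@timestamp": {"order": "desc"}}],         # newest first
    }
    return "GET", f"/{index}/_search", json.dumps(body)

# Hypothetical log event with nested JSON (step 2 makes every field searchable).
event = {
    "@timestamp": "2019-07-01T12:00:00Z",
    "status": 500,
    "message": "upstream timed out",
    "request": {"path": "/cart", "method": "GET"},
}
method, path, body = index_request("logs_2019.07.01", event)
q_method, q_path, q_body = search_request("logs_2019.07.01", "timed out", 500)
```

A real client would send these with an HTTP library and sign the requests if the domain uses IAM-based access control.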
You use the query APIs to retrieve data from Elasticsearch
[Diagram: inside the Amazon ES domain, the query engine produces matches, which pass through scoring & sorting to become ranked results]
The query engine matches requested field values
[Diagram: a query for Field1:value1 and Field2:value2 runs against the logs_11.28.2018 index. Each field (F1, F2) has its own index of values (V1, V2, … Vn), and each document is an ID plus a set of field: value pairs.]
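The per-field matching on this slide can be pictured with a toy in-memory model. It is a simplified sketch, not how Lucene stores data: the document IDs, field names, and both functions are made up for illustration.

```python
from collections import defaultdict

def build_field_indexes(docs):
    """Map field -> value -> set of document IDs (one small index per field)."""
    indexes = defaultdict(lambda: defaultdict(set))
    for doc_id, fields in docs.items():
        for field, value in fields.items():
            indexes[field][value].add(doc_id)
    return indexes

def match_all(indexes, **wanted):
    """Return IDs of documents whose fields equal every requested value."""
    hits = None
    for field, value in wanted.items():
        ids = indexes.get(field, {}).get(value, set())
        # Intersect the ID sets so only documents matching all fields survive.
        hits = set(ids) if hits is None else hits & ids
    return hits if hits is not None else set()

docs = {
    1: {"status": "500", "host": "web-1"},
    2: {"status": "200", "host": "web-1"},
    3: {"status": "500", "host": "web-2"},
}
indexes = build_field_indexes(docs)
result = match_all(indexes, status="500", host="web-1")
```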
You use aggregations to analyze log data
[Diagram: inside the Amazon ES domain, the query engine produces matches that feed the analysis engine (aggregations)]
• Histogram
• Numeric: sum, min., max.
• Terms: bucketing
• Nesting
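The aggregation types listed above can be combined in one search body. The sketch below only builds that body; the field names `status` and `bytes`, the aggregation names, and the function itself are assumptions for illustration.

```python
import json

def status_report_body(bytes_interval=1000):
    """Search body using terms bucketing, nested numeric metrics, and a histogram."""
    return {
        "size": 0,  # return only aggregation results, no matching documents
        "aggs": {
            "by_status": {
                "terms": {"field": "status"},          # terms: one bucket per value
                "aggs": {                              # nesting: metrics inside each bucket
                    "total_bytes": {"sum": {"field": "bytes"}},
                    "min_bytes": {"min": {"field": "bytes"}},
                    "max_bytes": {"max": {"field": "bytes"}},
                },
            },
            "bytes_histogram": {
                "histogram": {"field": "bytes", "interval": bytes_interval}
            },
        },
    }

report_body = status_report_body(500)
report_json = json.dumps(report_body)
```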
Data is stored in an index comprised of shards
[Diagram: all documents (each an ID plus field: value pairs) in an index are split across five shards, each holding about 1/5 of the documents]
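How a document ends up in one of the shards can be sketched as a hash of its routing value (by default the document ID) taken modulo the number of primary shards. Elasticsearch uses its own hash function for this; `zlib.crc32` below is only a stand-in, and the function name and document IDs are hypothetical.

```python
import zlib
from collections import Counter

def shard_for(doc_id: str, num_primary_shards: int = 5) -> int:
    """Pick a shard number in [0, num_primary_shards) for a document ID."""
    # crc32 is a stand-in hash chosen because it is stable across runs.
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards

# Count how many of 1,000 made-up document IDs land on each of five shards.
doc_ids = [f"doc-{n}" for n in range(1000)]
placement = Counter(shard_for(doc_id) for doc_id in doc_ids)
```

Because the shard number depends on the primary shard count, changing that count for an existing index means reindexing the data.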
Shards are primary or replica
[Diagram: an index whose documents (each an ID plus field: value pairs) are held in primary shards, each with a corresponding replica shard]
Elasticsearch distributes shards to data nodes
[Diagram: queries and updates arrive at the data nodes that hold the shards]
Overview of delivering logs to Amazon ES
Collect → Buffer → Aggregate → Store
Log collectors: Popular options
Log collectors: Properties
• Typically read files on a file system
• But can receive events with data from things other than file systems
• Configuration driven
• Can be “lightweight” or “heavyweight”
• Lightweight: Consumes as few system resources as possible
• Written in C, Ruby, or another efficient language
• Agent based: Runs as a service on the OS
• Config-driven
• Heavyweight: Requires a JVM or other execution engine
• Purpose built or leverage “plugins” via configuration
• Can perform data transformation
Log buffers: Popular options
Log buffers: Properties
• Allow you to decouple producers from consumers
• Control the ingest pipeline
• Metered consumption of data from consumer fleets
• Have “data durability”
• Individual events can have a lifecycle outside of Elasticsearch when dealing with sliding windows
• Can allow you to replay events
• Give you options to involve other business functions
– Machine learning
– Big data and analytics
– Data science
• Promote “Lambda” architectures (batch + near-real time)
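The decoupling, metered consumption, and replay properties above can be illustrated with a toy append-only buffer. This is a sketch of the idea only, not of any particular streaming service; the class and its methods are hypothetical.

```python
class LogBuffer:
    """Append-only buffer; each consumer keeps its own read offset."""

    def __init__(self):
        self._events = []

    def put(self, event):
        """Producer side: append an event and return its offset."""
        self._events.append(event)
        return len(self._events) - 1

    def read(self, offset, max_events):
        """Consumer side: return up to max_events from offset, plus the next offset."""
        batch = self._events[offset:offset + max_events]
        return batch, offset + len(batch)

buffer = LogBuffer()
for n in range(5):
    buffer.put({"seq": n})

# Metered consumption: the consumer decides how much to take per call.
first, next_offset = buffer.read(0, 2)
second, next_offset = buffer.read(next_offset, 2)

# Replay: events are still in the buffer, so reading from offset 0 works again.
replayed, _ = buffer.read(0, 5)
```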
Log aggregators: Popular options
Log aggregators: Properties
• Aggregate events into one payload for Amazon ES
• Give you control of the ingest activity
• Allow you to “throttle” the volume of requests to Elasticsearch because:
– Data nodes have limited space in processing queues
– You need to balance query activity with ingest activity
• Use the _bulk API to push JSON formatted, grouped events to Elasticsearch
• Can be “lightweight” or “heavyweight,” just like collectors
• Can act as interim buffers
• Use AWS Auto Scaling to throttle Amazon EC2 or container fleets
• Lambda should leverage “concurrency” setting to throttle indexing
• In some cases, can “fan out” to multiple destinations other than Elasticsearch for additional business value
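Grouping events into one _bulk payload can be sketched as follows. The _bulk API takes newline-delimited JSON: an action line followed by the document, with a trailing newline at the end. The helper names, index name, and batch size are assumptions, and older Elasticsearch versions may also require a document type in the action line or the request path.

```python
import json

def bulk_body(index, events):
    """Build a newline-delimited _bulk payload that indexes every event."""
    lines = []
    for event in events:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(event))                          # document line
    return "\n".join(lines) + "\n"  # the payload must end with a newline

def batches(events, batch_size):
    """Split events into fixed-size groups so each request stays bounded."""
    for start in range(0, len(events), batch_size):
        yield events[start:start + batch_size]

events = [{"message": f"event {n}"} for n in range(5)]
payloads = [bulk_body("logs_2019.07.01", batch) for batch in batches(events, 2)]
```

Keeping the batch size as a parameter gives the aggregator one simple knob for throttling ingest volume.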
Patterns help you build solutions quickly
• Asserted
• Others have done this
• Extensible
• Prescriptive
• Repeatable
• Verifiable
• Natively on AWS if using AWS CloudFormation and AWS Config
VPC option presents challenges to architectures
• Elastic network interfaces (ENIs) are presented to consumers of the Amazon ES domain
• This means all traffic to your domain is private and must be accessed from within the VPC
• ENIs cannot be presented to external services without a proxy, AWS PrivateLink, or VPC peering
• DNS resolution of the endpoint is private
• You cannot present one Amazon ES domain to more than one VPC
• Kibana access via Amazon Cognito must be brokered with a proxy
– NGINX
– Apache
• Amazon Kinesis Data Firehose will eventually support VPC endpoints
Amazon S3 event notifications approach
Amazon Kinesis approach
Data retention is directly proportional to cost
• Do you really need to log it?
• Remove irrelevant fields
– For example, are you really using that user-agent field in your access logs?
• Transform string values into integers
– For example, VPC Flow Logs contain fields called “action” and “status.” You could transform these character fields to enumerations
• Do your customers need longer retention periods?
– Most data is actionable in a “hot” time period
– Consider shorter retention periods unless the business case dictates otherwise
• Use a “forensic cluster” that is populated by manual snapshots as needed
– Audits
– Historical trend analysis
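Dropping unused fields and turning string values into enumerations can be sketched as a small transform applied before indexing. The integer codes, the `user_agent` field name, and the `slim` function are assumptions for illustration; the string values follow the VPC Flow Logs action (ACCEPT, REJECT) and log status (OK, NODATA, SKIPDATA) values.

```python
# Hypothetical integer codes; any stable mapping would do.
ACTION_CODES = {"ACCEPT": 1, "REJECT": 0}
STATUS_CODES = {"OK": 0, "NODATA": 1, "SKIPDATA": 2}
DROP_FIELDS = {"user_agent"}  # fields nobody queries

def slim(event):
    """Return a copy of event without unused fields and with enumerated codes."""
    out = {key: value for key, value in event.items() if key not in DROP_FIELDS}
    if "action" in out:
        out["action"] = ACTION_CODES[out["action"]]
    if "status" in out:
        out["status"] = STATUS_CODES[out["status"]]
    return out

raw = {"action": "ACCEPT", "status": "OK", "user_agent": "curl/7.64", "bytes": 840}
slimmed = slim(raw)
```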
Pattern: Time-based indexes for log analytics
• You use a root string, e.g., logs_.
• Depending on volume, rotate at regular intervals, normally daily.
• Daily indexes simplify index management. Delete the oldest index to create more space on your cluster.
• Use aliases to query aggregate indices.
logs_2019.07.01
logs_2019.07.02
logs_2019.07.03
logs_2019.07.04
logs_2019.07.05
logs_2019.07.06
logs_2019.07.07
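Naming daily indexes and finding the ones that have aged out can be sketched as follows. The retention window and both function names are assumptions; the `logs_` root and the date format follow the index names shown above.

```python
from datetime import date, timedelta

def index_name(day, root="logs_"):
    """Daily index name in the form logs_YYYY.MM.DD."""
    return f"{root}{day:%Y.%m.%d}"

def expired_indexes(existing, today, keep_days, root="logs_"):
    """Names from existing that fall outside the most recent keep_days days."""
    keep = {index_name(today - timedelta(days=n), root) for n in range(keep_days)}
    return sorted(
        name for name in existing if name.startswith(root) and name not in keep
    )

existing = [index_name(date(2019, 7, d)) for d in range(1, 8)]
to_delete = expired_indexes(existing, today=date(2019, 7, 7), keep_days=6)
```

Deleting a whole index is much cheaper than deleting individual documents, which is why the rotation is done per day.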
Going deeper on index management—aliases
Aliases enable you to query multiple indices using a reference name
• Begin by creating a new index that fits the pattern, defined using settings
• Adjust the alias to include the new index name, for example ‘logs_2019.07.08’
• Remove the oldest index from the alias, for example ‘logs_2019.07.01’
• Take a manual snapshot of the oldest index
• Drop the oldest index
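The alias adjustment in these steps maps onto a single request to the _aliases endpoint, which applies its add and remove actions together. The sketch below only builds the request body; the alias name `logs`, the index names, and the function are illustrative assumptions.

```python
import json

def rotate_alias_actions(alias, new_index, oldest_index):
    """Body for POST /_aliases: add the new index and remove the oldest one."""
    return {
        "actions": [
            {"add": {"index": new_index, "alias": alias}},
            {"remove": {"index": oldest_index, "alias": alias}},
        ]
    }

alias_body = rotate_alias_actions("logs", "logs_2019.07.08", "logs_2019.07.01")
alias_json = json.dumps(alias_body)
```

Queries that target the alias keep working throughout, because the alias name never changes while the indexes behind it rotate.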
Forensic cluster pattern
• Amazon CloudWatch Events trigger Lambda, which invokes Curator to manage indices
• Create a schedule in Amazon CloudWatch for the event
• Create a snapshot repository
• AWS Lambda creates a metadata record in Amazon DynamoDB for the snapshot with a state of “starting”
• Lambda calls Curator to manage the indexes via API
• Snapshot is kicked off asynchronously
• Lambda updates the metadata record to a state of “started”
• Another scheduled event checks the snapshot using the _snapshot API to query the status. It should be in a “SUCCESS” status, and you can mark the snapshot “complete”
• Code for error scenarios
• Create a new cluster
• Restore snapshots based on metadata records in Amazon DynamoDB
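The state tracking in this pattern can be sketched without any AWS calls. In the sketch below a plain dictionary stands in for the Amazon DynamoDB table, the “failed” state and all function names are assumptions, and the snapshot state strings are the ones Elasticsearch reports (for example IN_PROGRESS, SUCCESS, FAILED, PARTIAL).

```python
def snapshot_path(repository, snapshot):
    """Path used both to start a snapshot (PUT) and to check it (GET)."""
    return f"/_snapshot/{repository}/{snapshot}"

def next_state(record_state, snapshot_state):
    """Decide the metadata state after polling the snapshot status."""
    if record_state != "started":
        return record_state            # only started snapshots are polled
    if snapshot_state == "SUCCESS":
        return "complete"
    if snapshot_state == "IN_PROGRESS":
        return "started"               # keep waiting for the next scheduled check
    return "failed"                    # FAILED, PARTIAL, or anything unexpected

# A dictionary standing in for the metadata table, keyed by snapshot name.
metadata = {"logs_2019.07.01": "starting"}
metadata["logs_2019.07.01"] = "started"   # after the snapshot is kicked off
metadata["logs_2019.07.01"] = next_state(metadata["logs_2019.07.01"], "SUCCESS")

# Snapshots marked complete are the ones to restore into a new cluster.
restorable = [name for name, state in metadata.items() if state == "complete"]
```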
Wrap up
• Machine-generated data is growing rapidly, driven by DevOps, cloud infrastructure, and IoT
• Logs contain valuable insights: what your users are doing, whether you have bad actors, & what's happening on your devices
• Amazon ES enables ingesting and analyzing logs in real time to provide you with the data and insights you need
Thank you!
Kevin Fallis
kffallis@amazon.com

 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

More from Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS Summit

  • 1. Searching for patterns: Log analytics using Amazon ES (ADB205). Kevin Fallis, Senior Specialist Solutions Architect, AWS – Search Services. AWS Summit. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 2. What is Elasticsearch?
      • Sometimes referred to as the “ELK Stack”: Elasticsearch, Logstash, and Kibana
      • Distributed search and analytics engine built on Apache Lucene
      • Easy ingestion and visualization
      • Source: TechCrunch survey of popular open source software, April 2017
  • 3. Machine data driving Elasticsearch growth
      • Machine-generated data is growing 10x faster than business data: logs, logs, and more logs
      • IT & DevOps: databases, servers, storage, networking
      • Increase in IoT and mobile devices: gaming, sensors, web content
      • Cloud-based architectures
      • Source: insideBigData, “The Exponential Growth of Data,” February 16, 2017
  • 4. Popular use cases
      • Application log monitoring
      • Security event information monitoring
      • Data visualization
      • Full text search
  • 5. Amazon Elasticsearch Service: Amazon Elasticsearch Service (Amazon ES) is a fully managed service that makes it easy to deploy, manage, and scale Elasticsearch and Kibana in the AWS Cloud.
  • 6. Benefits of Amazon ES
      • Easy to use: Deploy a production-ready Elasticsearch cluster in minutes
      • Scalable: Resize your cluster with a few clicks or a single API call
      • Secure: Deploy into your VPC and restrict access using security groups and IAM policies
      • Highly available: Replicate across Availability Zones, with monitoring and automated self-healing
      • Tightly integrated: Seamless data ingestion, security, auditing, and orchestration
      • Supports open-source APIs and tools: Drop-in replacement with no need to learn new APIs or skills
  • 7. Elasticsearch runs on a cluster of instances
      • Diagram: an Amazon ES domain (data nodes and master nodes) inside a VPC in the AWS Cloud
      • Management: AWS Management Console, AWS Command Line Interface, AWS Tools and SDKs, AWS CloudFormation
      • Supporting services: AWS Identity and Access Management (IAM), Elastic Load Balancing (ELB), AWS CloudTrail, Amazon CloudWatch
      • Data sources: AWS Database Migration Service, Amazon Kinesis Data Firehose, Amazon CloudWatch Logs
  • 8. Provides Kibana real-time visualization tool
  • 9. Build actionable insights
      • Amazon ES empowers you with the data to understand and intelligently react to your business needs
      • Use cases: security information and event management (SIEM), IoT & mobile, application monitoring & root-cause analysis, business and web analytics
      • End-to-end visibility: Better understanding of customers' behavior to improve user experience and react to demand
      • Improve reliability: Increased operational efficiencies by identifying, solving, and preventing system failures in real time
      • Faster time-to-value: Accelerate time to market with application delivery and performance monitoring
      • Security: Improved business confidence with end-to-end monitoring of data, infrastructure, and transactions
  • 10. Case study: Autodesk, Central Log Management System (https://www.youtube.com/watch?v=fSjAfp-uqSs)
      • Challenge
          • Highly distributed organization. No consistent way to collect and measure metrics.
          • Small ops team. Must integrate easily with other AWS services.
          • Scale: Accommodate current and future requirements.
          • Must be cost effective with no data lock-in.
          • TBs of log data to sift through to find and fix issues that impact customers.
      • Solution
          • Unified log data management solution built on AWS.
          • Single interface for log analytics across applications.
          • Annotate log records to enable distributed tracing states.
          • Streaming application logs via Kinesis Data Firehose to Amazon S3, Amazon Athena, and Amazon ES.
          • 10 i3.4xlarge Amazon ES data nodes (33 TB). Will grow to 110 TB.
          • Kibana, built into Amazon ES, for near real-time analytics and dashboards.
      • Benefits
          • All managed services: “Manage less to gain more.” Focus on developing awesome products.
          • Common vocabulary for diagnosing and solving problems. Eliminated silos.
          • Scalable and cost-effective: i3s delivering great value per TB.
          • Improving customer experience by reducing the time to find and fix customer issues.
  • 12. How it works
      • Diagram: application data and server, application, network, AWS, and other logs flow into an Amazon ES domain with an index, which is queried by application users, analysts, DevOps, and security
      • 1. Send data as JSON via REST APIs
      • 2. Data is indexed: all fields are searchable, including nested JSON
      • 3. Queries, via REST APIs, allow fielded matching and Boolean expressions, and can include sorting and analysis
  • 13. You use the query APIs to retrieve data from Elasticsearch
      • Diagram labels: Amazon ES domain, query engine, matches, scoring & sorting, ranked results
  • 14. The query engine matches requested field values
      • Diagram labels: query “Field1:value1 Field2:value2”; index logs_11.28.2018; field indexes F1 and F2, each listing values V1, V2, ... Vn; a document with an ID and Field: value pairs
  • 15. You use aggregations to analyze log data
      • Diagram labels: Amazon ES domain, query engine, matches, analysis engine (aggregations)
      • Histogram
      • Numeric: sum, min., max.
      • Terms: bucketing
      • Nesting
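To make the query and aggregation slides more concrete, here is a minimal sketch of a search request body that filters on field values and then buckets the matches with a histogram and a nested terms aggregation. The field names (`status`, `@timestamp`, `host`, `bytes`) and the helper name are assumptions for illustration, not taken from the talk; the body would be sent to the domain's `_search` endpoint.

```python
import json

def build_error_histogram_query(status_code, since="now-1h", bucket="1m"):
    """Return a search body: filter on fields, then bucket the matches over time."""
    return {
        "size": 0,  # only the aggregation results are wanted, not the hits
        "query": {
            "bool": {
                "filter": [
                    {"term": {"status": status_code}},
                    {"range": {"@timestamp": {"gte": since}}},
                ]
            }
        },
        "aggs": {
            "over_time": {
                # Older Elasticsearch versions take "interval"; newer ones
                # use "fixed_interval" or "calendar_interval" instead.
                "date_histogram": {"field": "@timestamp", "interval": bucket},
                "aggs": {
                    # Nested aggregations run inside each time bucket.
                    "top_hosts": {"terms": {"field": "host", "size": 5}},
                    "bytes_sent": {"sum": {"field": "bytes"}},
                },
            }
        },
    }

body = build_error_histogram_query(500)
print(json.dumps(body, indent=2))
```

Using `filter` rather than `must` skips relevance scoring, which is usually what log analytics wants since matches are counted rather than ranked.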
  • 16. Data is stored in an index made up of shards
      • Diagram: all documents (ID plus Field: value pairs) in the index are split across five shards, each holding 1/5
  • 17. Shards are primary or replica
      • Diagram labels: index, primary shards, replica shards, document with an ID and Field: value pairs
  • 18. Elasticsearch distributes shards to data nodes
      • Diagram labels: queries, updates
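The shard slides can be illustrated with a toy model. This is only a sketch under stated assumptions: `zlib.crc32` stands in for Elasticsearch's own routing hash, and the round-robin placement is a simplification of the real allocator. It shows the two ideas from the slides: each document lands on one primary shard, and a replica is kept on a different node from its primary.

```python
import zlib
from collections import Counter

def pick_shard(doc_id, num_primary_shards):
    """Map a document ID to a primary shard number."""
    # zlib.crc32 is only a stand-in; Elasticsearch uses its own hash function.
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards

def place_shards(num_primaries, num_replicas, nodes):
    """Toy round-robin placement that keeps copies of a shard on different nodes."""
    if num_replicas >= len(nodes):
        raise ValueError("need more data nodes than replicas per shard")
    placement = {}
    for shard in range(num_primaries):
        # Copy 0 is the primary; copies 1..num_replicas are its replicas.
        placement[shard] = [nodes[(shard + copy) % len(nodes)]
                            for copy in range(num_replicas + 1)]
    return placement

doc_counts = Counter(pick_shard(f"doc-{i}", 5) for i in range(1000))
layout = place_shards(5, 1, ["node-a", "node-b", "node-c"])
print(sorted(doc_counts.items()))
print(layout)
```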
  • 20. Overview of delivering logs to Amazon ES: Collect, Buffer, Aggregate, Store
  • 21. Log collectors: Popular options
  • 22. Log collectors: Properties
      • Typically read files on a file system, but can receive events with data from things other than file systems
      • Configuration driven
      • Can be “lightweight” or “heavyweight”
      • Lightweight: Consumes as few system resources as possible
          • Written in C, Ruby, or another efficient language
          • Agent based: Runs as a service on the OS
          • Config-driven
      • Heavyweight: Requires a JVM or other execution engine
          • Purpose built or leverage “plugins” via configuration
          • Can perform data transformation
  • 23. Log buffers: Popular options
  • 24. Log buffers: Properties
      • Allow you to decouple producers from consumers
      • Control the ingest pipeline
          • Metered consumption of data from consumer fleets
      • Have “data durability”
          • Individual events can have a lifecycle outside of Elasticsearch when dealing with sliding windows
          • Can allow you to replay events
      • Give you options to involve other business functions
          • Machine learning
          • Big data and analytics
          • Data science
      • Promote “Lambda” architectures (batch + near-real time)
  • 25. Log aggregators: Popular options
  • 26. Log aggregators: Properties
      • Aggregate events into one payload for Amazon ES
      • Give you control of the ingest activity
      • Allow you to “throttle” the volume of requests to Elasticsearch because:
          • Data nodes have limited space in processing queues
          • You need to balance query activity with ingest activity
      • Use the _bulk API to push JSON-formatted, grouped events to Elasticsearch
      • Can be “lightweight” or “heavyweight,” just like forwarders
      • Can act as interim buffers
      • Use AWS Auto Scaling to throttle Amazon EC2 or container fleets
      • Lambda should leverage the “concurrency” setting to throttle indexing
      • In some cases, can “fan out” to multiple destinations other than Elasticsearch for additional business value
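The grouping step an aggregator performs before calling the `_bulk` API can be sketched as follows. This is an illustration rather than any particular aggregator's implementation: the helper names, the sample events, and the batch size are assumptions. The `_bulk` body is newline-delimited JSON, with one action line followed by one document line per event and a trailing newline.

```python
import json

def build_bulk_body(events, index_name):
    """Serialize events into the newline-delimited body the _bulk API expects."""
    lines = []
    for event in events:
        # Action line. Some Elasticsearch versions also require a "_type"
        # here or in the request URL.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(event))
    # The body must end with a newline.
    return "\n".join(lines) + "\n"

def batches(events, batch_size):
    """Yield fixed-size groups so each request stays within what the cluster can queue."""
    for start in range(0, len(events), batch_size):
        yield events[start:start + batch_size]

events = [{"status": 200, "path": f"/item/{i}"} for i in range(5)]
payloads = [build_bulk_body(group, "logs_2019.07.01") for group in batches(events, 2)]
print(payloads[0])
```

Smaller batches sent at a metered rate are one way to apply the throttling described on the slide; the right batch size depends on document size and cluster capacity.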
  • 28. Patterns help you build solutions quickly
      • Asserted
          • Others have done this
      • Extensible
      • Prescriptive
      • Repeatable
      • Verifiable
          • Natively on AWS if using AWS CloudFormation and AWS Config
  • 29. VPC option presents challenges to architectures
      • Elastic Network Interfaces (ENIs) get presented to consumers of the Amazon ES domain
          • This means all traffic to your domain is private and must be accessed from within the VPC
      • ENIs cannot be presented to external services without a proxy, AWS PrivateLink, or VPC peering
      • DNS resolution of the endpoint is private
      • You cannot present one Amazon ES domain to more than one VPC
      • Kibana access via Amazon Cognito must be brokered with a proxy
          • NGINX
          • Apache
      • Amazon Kinesis Data Firehose will eventually support VPC endpoints
  • 30. Amazon S3 event notifications approach
  • 31. Amazon Kinesis approach
  • 33. Data retention is directly proportional to cost
      • Do you really need to log it?
      • Remove irrelevant fields
          • For example, are you really using that user-agent field in your access logs?
      • Transform string values into integers
          • For example, VPC Flow Logs contain fields called “action” and “status.” You could transform these character fields to enumerations
      • Do your customers need larger retention periods?
          • Most data is actionable in a “hot” time period
          • Consider smaller retention periods unless the business case dictates otherwise
      • Use a “forensic cluster” that is populated by manual snapshots as needed
          • Audits
          • Historical trend analysis
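A minimal sketch of the two size-reduction ideas above, dropping unused fields and turning string enumerations into integers before indexing. The numeric codes, the `user_agent` key, and the sample event are assumptions for illustration; only the string values (ACCEPT/REJECT and OK/NODATA/SKIPDATA) come from the VPC Flow Logs format.

```python
# Hypothetical integer codes; any stable mapping would do.
ACTION_CODES = {"ACCEPT": 0, "REJECT": 1}
STATUS_CODES = {"OK": 0, "NODATA": 1, "SKIPDATA": 2}
DROPPED_FIELDS = {"user_agent"}  # fields nobody queries
UNKNOWN = -1  # used when a value is outside the known enumeration

def slim_event(event):
    """Return a copy of the event with unused fields removed and enums encoded."""
    slim = {key: value for key, value in event.items() if key not in DROPPED_FIELDS}
    if "action" in slim:
        slim["action"] = ACTION_CODES.get(slim["action"], UNKNOWN)
    if "status" in slim:
        slim["status"] = STATUS_CODES.get(slim["status"], UNKNOWN)
    return slim

raw = {"srcaddr": "10.0.0.5", "action": "REJECT", "status": "OK",
       "user_agent": "curl/7.64.0"}
print(slim_event(raw))
```

The mapping has to be kept alongside dashboards and queries so that analysts can translate the integers back.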
  • 34. Pattern: Time-based indexes for log analytics
      • You use a root string, e.g., logs_.
      • Depending on volume, rotate at regular intervals, normally daily.
      • Daily indexes simplify index management. Delete the oldest index to create more space on your cluster.
      • Use aliases to query aggregate indices.
      • Example index names: logs_2019.07.01, logs_2019.07.02, logs_2019.07.03, logs_2019.07.04, logs_2019.07.05, logs_2019.07.06, logs_2019.07.07
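The daily naming scheme can be sketched in a few lines. The helper names and the seven-day retention value are assumptions for illustration. Because the date part is zero-padded and ordered year, month, day, index names sort chronologically as plain strings, which makes finding the oldest index simple.

```python
from datetime import date, timedelta

def index_name(day, root="logs_"):
    """Build the daily index name, e.g. logs_2019.07.01."""
    return f"{root}{day:%Y.%m.%d}"

def expired_indexes(existing, today, retention_days, root="logs_"):
    """Return indexes older than the retention window, oldest first."""
    # The oldest index still inside the window; anything sorting before it expires.
    cutoff = index_name(today - timedelta(days=retention_days - 1), root)
    return sorted(name for name in existing
                  if name.startswith(root) and name < cutoff)

today = date(2019, 7, 8)
existing = [index_name(date(2019, 7, d)) for d in range(1, 9)]
print(index_name(today))
print(expired_indexes(existing, today, retention_days=7))
```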
  • 35. Going deeper on index management: aliases
      • Aliases enable you to query multiple indices using a reference name
      • Begin by creating a new index that fits the pattern defined using settings
      • Adjust the alias to include the new index name, for example ‘logs_2019.07.08’
      • Remove the oldest index from the alias, for example ‘logs_2019.07.01’
      • Take a manual snapshot of the oldest index
      • Drop the oldest index
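The alias adjustment in the steps above maps onto a single request to the `_aliases` endpoint, which applies its list of actions atomically. The sketch below only builds the request body; the alias name `logs` and the new index name are hypothetical.

```python
import json

def rotate_alias_actions(alias, new_index, oldest_index):
    """Build an _aliases request body that adds the new index and removes the oldest."""
    return {
        "actions": [
            {"add": {"index": new_index, "alias": alias}},
            {"remove": {"index": oldest_index, "alias": alias}},
        ]
    }

# Hypothetical names: a "logs" alias rolling forward by one day.
alias_body = rotate_alias_actions("logs", "logs_2019.07.08", "logs_2019.07.01")
print(json.dumps(alias_body, indent=2))
```

Because both actions go in one request, queries against the alias never see a moment where the new index is missing or the old one is still included.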
  • 36. Forensic cluster pattern
      • Amazon CloudWatch Events trigger Lambda, which invokes curator to manage indices
      • Create a schedule in Amazon CloudWatch for the event
      • Create a snapshot repository
      • AWS Lambda creates a metadata record in Amazon DynamoDB for the snapshot with a state of “starting”
      • Lambda calls curator to manage the indexes via API
      • Snapshot is kicked off asynchronously
      • Lambda updates the metadata record to a state of “started”
      • Another scheduled event checks the snapshot using the _snapshot API to query the status. It should be in a “SUCCESS” status, and you can mark the snapshot “complete”
      • Code for error scenarios
      • Create a new cluster
      • Restore snapshots based on metadata records in Amazon DynamoDB
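The metadata bookkeeping in this pattern amounts to a small state machine. The sketch below uses an in-memory dict as a stand-in for the DynamoDB table and omits the Lambda, curator, and snapshot calls entirely; the function names and the “failed” state are assumptions, while “starting”, “started”, “complete”, and the “SUCCESS” snapshot status come from the slide.

```python
# Snapshot statuses treated as errors in this sketch.
FAILED_STATES = {"FAILED", "PARTIAL", "INCOMPATIBLE"}

def start_record(table, snapshot_name):
    """Create the metadata record before the snapshot is requested."""
    table[snapshot_name] = {"state": "starting"}

def mark_started(table, snapshot_name):
    """Record that the asynchronous snapshot request was accepted."""
    table[snapshot_name]["state"] = "started"

def reconcile(table, snapshot_name, reported_status):
    """Update the record from the status the scheduled check read back."""
    record = table[snapshot_name]
    if reported_status == "SUCCESS":
        record["state"] = "complete"
    elif reported_status in FAILED_STATES:
        record["state"] = "failed"  # hypothetical state for the error path
    # Any other status (for example IN_PROGRESS) leaves the record as "started".
    return record["state"]

def restorable(table):
    """List snapshots that are safe to restore into a forensic cluster."""
    return sorted(name for name, rec in table.items() if rec["state"] == "complete")

table = {}
for name in ("logs_2019.07.01", "logs_2019.07.02"):
    start_record(table, name)
    mark_started(table, name)
reconcile(table, "logs_2019.07.01", "SUCCESS")
reconcile(table, "logs_2019.07.02", "IN_PROGRESS")
print(restorable(table))
```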
  • 37. Wrap up
      • Machine-generated data is growing rapidly, driven by DevOps, cloud infrastructure, and IoT
      • Logs contain valuable insights: what your users are doing, whether you have bad actors, and what's happening on your devices
      • Amazon ES enables ingesting and analyzing logs in real time to provide you with the data and insights you need
  • 38. Thank you! Kevin Fallis, kffallis@amazon.com