The Elastic Stack as a SIEM
Philly Security Shell 2019
Who Am I?
John Hubbard [@SecHubb]
• Previous SOC Lead @ GlaxoSmithKline
• Certified SANS Instructor
• Author
  • SEC450: Blue Team Fundamentals – Security Analysis and Operations
  • SEC455: SIEM Design & Implementation (Elasticsearch as a SIEM)
• Instructor
  • SEC511: Continuous Monitoring & Security Operations
  • SEC555: SIEM with Tactical Analytics
• Mission: Make life awesome for the blue team
• Data for this talk: https://github.com/SecHubb/SecShell_Demo
What is a SIEM?
• A central log repository that enriches logs and assists threat detection
• Components
• Log Sources
• Log Aggregator
• Log Storage & Indexing
• Search & Viz. Interface + Alerting Engine
Log Sources -> Log Aggregation / Queue -> Log Storage & Indexing -> Search, Visualization, & Alerting
What is the Elastic Stack?
• Open source, real-time search and analytics engine
• Made up of 4 pieces: collection, ingestion, storage, and visualization
History of Elastic
2010
• Created by Shay Banon
• Recipe search engine for his wife in culinary school
• Inspired by Minority Report
2012
• Elastic Co. founded
2019
• Used by Wikipedia, Stack Overflow, GitHub, Netflix, LinkedIn, …
• One of the most popular projects on GitHub
• Iterating versions rapidly with awesome new features
Elastic Stack vs. SIEM
Log Sources -> Log Aggregation / Queue -> Log Storage & Indexing -> Search, Visualization, & Alerting
Elastic Stack as a SIEM
Used for many different use cases
• NOT a SIEM out of the box
• Not in the Magic Quadrant as one
• Can do the things a SIEM does
Gartner's definition of a SIEM:
"supports threat detection and security incident response through the real-time collection and historical analysis of security events from a wide variety of event and contextual data sources. It also supports compliance reporting and incident investigation through analysis of historical data from these sources."
Elasticsearch as a SIEM
• Collects, indexes, and stores high volumes of logs
• Functional visualizations and dashboards
• Reporting and alerting
• Log enrichment through plugins
• Compatible with almost every format
• Log retention settings
• Anomaly detection via machine learning
• RBAC securable
Elastic Stack Overview
Raw Logs -> Log Ingestion & Parsing -> Log Storage -> Search & Visualization
Beats Agents
Lightweight log agents written in Go
• Filebeat
• Winlogbeat
• Packetbeat
• Auditbeat
• Functionbeat
• Journalbeat
• Community Beats
Elasticsearch Architecture
Clusters, Nodes, and Indices
Cluster Node Indices
Index Creation Across Time
Firewall-2018-01 Firewall-2018-02 Firewall-2018-03
IDS-2018-01 IDS-2018-02 IDS-2018-03
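A minimal sketch of how time-based index names like the ones above could be generated. The helper name `monthly_index_name` is hypothetical, and the sketch lowercases the log type because Elasticsearch index names must be lowercase.

```python
from datetime import date


def monthly_index_name(log_type: str, day: date) -> str:
    """Build a name such as 'firewall-2018-01' for the month containing `day`."""
    # Elasticsearch index names must be lowercase, so normalize the log type.
    return f"{log_type.lower()}-{day.year:04d}-{day.month:02d}"


# One index per log type per month, as on the slide.
for log_type in ("Firewall", "IDS"):
    for month in (1, 2, 3):
        print(monthly_index_name(log_type, date(2018, month, 15)))
```

Splitting by log type and time period keeps each index small and makes retention simple: old months can be dropped as whole indices.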
Shards and Documents
Index Shards Documents
Reason 1: Schema on Ingest
Many SIEMs:
Schema applied at search time
Elasticsearch:
Schema applied at ingestion
Reason 2: Data is distributed
Index
Shards
Nodes
Shard Types
Primary Shards
• Like RAID 0 – Need all shards to make the whole index
Replica Shards
• Like RAID 1
• Each primary shard can have an arbitrary number of copies
• Each copy can serve searches, which balances search load
Shards
• All shards belong to and make up an index
• Enables arbitrary horizontal scaling
• Spread evenly across all available hardware
• Designated a Primary or Replica
Full index data = Primary Shards 1, 2, 3 (Replica Shards 1, 2, 3 are copies)
Primaries and Replicas
Shards on nodes: Copy 1 = P0 P1, Copy 2 = R0 R1
Primaries and Replicas
Shards on nodes: Copy 1 = P0 P1, Copy 2 = R0 R1, Copy 3 = R0 R1
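A toy sketch of the idea behind primaries and replicas: every copy of a shard lands on a different node, so losing one node never loses the only copy. The function `place_shards` and the node names are hypothetical, and the round-robin placement is a simplification, not the allocator Elasticsearch actually uses.

```python
def place_shards(num_primaries: int, num_replicas: int, nodes: list) -> dict:
    """Spread primaries (P) and replicas (R) so copies of a shard never share a node."""
    if num_replicas >= len(nodes):
        raise ValueError("need more nodes than replicas per shard")
    layout = {node: [] for node in nodes}
    for shard in range(num_primaries):
        # Primary goes on one node...
        layout[nodes[shard % len(nodes)]].append(f"P{shard}")
        # ...and each replica on the next nodes in the ring.
        for copy in range(1, num_replicas + 1):
            layout[nodes[(shard + copy) % len(nodes)]].append(f"R{shard}")
    return layout


layout = place_shards(num_primaries=2, num_replicas=2,
                      nodes=["node-1", "node-2", "node-3"])
for node, shards in layout.items():
    print(node, shards)
```

With 2 primaries and 2 replicas each, the index exists three times in total, matching the "Copy 1 / Copy 2 / Copy 3" picture.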
Balancing Writes
Incoming Logs -> Shards P0 P1 P2 P3 P4 P5, spread across nodes
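A rough sketch of why more primary shards balance writes: each document is routed to one primary by hashing a routing value (the document ID by default) modulo the number of primary shards. `pick_shard` is a hypothetical helper, and `zlib.crc32` is only a stand-in for the hash Elasticsearch uses internally.

```python
import zlib


def pick_shard(routing_key: str, num_primary_shards: int) -> int:
    """Map a routing key to a primary shard number."""
    # crc32 is a stand-in hash for illustration only.
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


# Route 6000 hypothetical log IDs across 6 primaries (P0..P5).
counts = [0] * 6
for i in range(6000):
    counts[pick_shard(f"log-{i}", 6)] += 1
print(counts)
```

Because the shard number depends on the primary count, the number of primaries is fixed when the index is created; replicas can be changed later.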
Balancing Searches
Search Requests -> P0 R0 R0 R0 R0 R0 (one primary plus replicas, spread across nodes)
Balancing Searches: multi-shard
Search 1, Search 2, Search 3 -> each served by a different copy (P0 P1 / R0 R1 / R0 R1)
Documents to Fields
Document = single log (converted to JSON by Logstash), made up of fields
Documents
• Indices hold documents as serialized JSON objects
• 1 document = 1 log entry
• Contains "field : value" pairs
• Metadata
  • _index – Index the document belongs to
  • _id – Unique ID for that log
  • _source – Parsed log fields
Fields and Mappings
• Field – A key-value pair inside a document
• username: admin
• hostname: web-server1
• Mapping - Defines information about the fields
• Think "database schema"
• The data type for each field (integer, ip, keyword, etc.)
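For illustration, a mapping could look roughly like the body below, written as a Python dict and printed as JSON. The `src_ip` and `bytes` fields are hypothetical additions to the slide's examples, and the layout assumes the typeless mapping format of Elasticsearch 7 and later; older versions nest `properties` under a document type name.

```python
import json

# Hypothetical mapping: one data type per field, like a database schema.
mapping = {
    "mappings": {
        "properties": {
            "username": {"type": "keyword"},
            "hostname": {"type": "keyword"},
            "src_ip": {"type": "ip"},
            "bytes": {"type": "integer"},
        }
    }
}

print(json.dumps(mapping, indent=2))
```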
Key Concept: Keyword vs. Text
String datatypes are either text or keyword, or both!
• Keyword indexes the exact values
  • Example: Usernames, ID numbers, tags, FQDNs
  • All-or-nothing matching: a search matches the full exact value, or not at all
• Text type breaks things up into pieces
  • Example: "http://www.mywebmail.com/mailbox/mail1.htm"
  • Allows searching for "http", "www.mywebmail.com", "mailbox", "mail1.htm"
  • Fed through an "analyzer"
  • By default, text fields cannot be aggregated / visualized; use keyword for that
Text Data Type Example
Input: http://www.mywebmail.com/mailbox/mail1.htm
Character Filter: http www.mywebmail.com mailbox mail1.htm
Tokenizer: http, www.mywebmail.com, mailbox, mail1.htm
Tokens can be searched for
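A toy sketch of the keyword vs. text difference. `toy_analyze` is a hypothetical stand-in that splits on ":" and "/" and lowercases; real Elasticsearch analyzers are configurable and do considerably more.

```python
import re


def toy_analyze(text: str) -> list:
    """Break a string into lowercase tokens, splitting on ':' and '/'."""
    return [tok.lower() for tok in re.split(r"[:/]+", text) if tok]


def keyword_match(stored: str, query: str) -> bool:
    """Keyword behaviour: only the full exact value matches."""
    return stored == query


def text_match(stored: str, query: str) -> bool:
    """Text behaviour: any single token of the stored value matches."""
    return query.lower() in toy_analyze(stored)


url = "http://www.mywebmail.com/mailbox/mail1.htm"
print(toy_analyze(url))
print(keyword_match(url, "mailbox"), text_match(url, "mailbox"))
```

Searching for "mailbox" finds nothing on the keyword side but hits on the text side, which is why many string fields are indexed both ways.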
Where Tokens Go: Inverted Index
Lucene builds "inverted index" of tokens in text field data
Doc 1: "The woman is walking down the street."
Doc 2: "The man is walking into the store."
Tokens    Doc 1   Doc 2
the       x       x
woman     x
is        x       x
walking   x       x
down      x
street    x
man               x
into              x
store             x
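The table above can be reproduced with a few lines of Python. This is only a sketch of the concept (token -> set of documents containing it); Lucene's real on-disk structures are far more elaborate.

```python
import re
from collections import defaultdict

docs = {
    1: "The woman is walking down the street.",
    2: "The man is walking into the store.",
}


def build_inverted_index(documents: dict) -> dict:
    """Map each lowercase token to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in re.findall(r"[a-z0-9]+", text.lower()):
            index[token].add(doc_id)
    return index


index = build_inverted_index(docs)
for token in ("the", "woman", "walking", "store"):
    print(token, sorted(index[token]))
```

A search for "walking" becomes a single lookup that returns both documents, with no need to scan the original text.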
Elasticsearch Term Summary
• Cluster = Multiple Nodes
• Node – Elasticsearch instance
• Index – Holds one log type
• Shard – Partial index
• Lucene – Search engine
• Segment – "Inverted index"
Kibana
Kibana Interface
• Discover - Search and explore data
• Visualize - Create graphs and charts
• Dashboard – Display a collection of saved items
• Timelion – Unique time series data visualization
• Canvas – New visualization type
• Machine Learning – Ponies and magic
• Infrastructure – Monitor all Metricbeats
• Logs – Watch logs streaming from Filebeat
• Dev Tools – Console for API access
• Monitoring – Health of your cluster/agents/logstash
• Management – Manage the cluster
Using the Discover Tab
Histogram
Document data
Field list
Index pattern
Time filter
Discover Tab Details
Field must exist
Add as column
Filter out this field value
Filter for this field value
Data type
Move left/right
Remove this column
Sort by this column
Show document
Index Patterns
• Kibana must be told to show an index for searching
• Searching can be performed on more than 1 index at once
Example usage:
• "*" - Search ALL indices
• "firewall-*"
• "firewall-pfsense-*"
• "firewall-pfsense-2019-*"
• "alexa-top1M"
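A small sketch of how wildcard index patterns narrow the set of indices searched. The index names are hypothetical, and Python's `fnmatchcase` is used as a stand-in; only the `*` wildcard is assumed to behave like Kibana's.

```python
from fnmatch import fnmatchcase

# Hypothetical indices present in a cluster.
indices = [
    "firewall-pfsense-2019-01",
    "firewall-pfsense-2018-12",
    "firewall-asa-2019-01",
    "ids-2019-01",
]


def matching_indices(pattern: str, names: list) -> list:
    """Return the index names that a wildcard pattern would cover."""
    return [name for name in names if fnmatchcase(name, pattern)]


for pattern in ("*", "firewall-*", "firewall-pfsense-*", "firewall-pfsense-2019-*"):
    print(pattern, "->", matching_indices(pattern, indices))
```

Narrower patterns touch fewer indices, which usually means faster searches.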
Visualization Types
Creating Visualizations
• Metrics: What to calculate
• Buckets: How to group it
"I want to see <metric> per <bucket>"
• "Total bytes"
• "Total bytes per username"
• "Request count, bytes per HTTP method"
• "Requests per user per site"
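The "metric per bucket" idea can be sketched in plain Python: the bucket is the field to group by, the metric is what gets calculated inside each group. The log records and the helper `sum_per_bucket` are made up for illustration; Kibana does this through Elasticsearch aggregations rather than in client code.

```python
from collections import defaultdict

# Hypothetical parsed proxy logs.
logs = [
    {"username": "alice", "method": "GET", "bytes": 500},
    {"username": "bob", "method": "POST", "bytes": 1200},
    {"username": "alice", "method": "GET", "bytes": 300},
    {"username": "bob", "method": "GET", "bytes": 100},
]


def sum_per_bucket(records: list, bucket_field: str, metric_field: str) -> dict:
    """Sum `metric_field` (the metric) grouped by `bucket_field` (the bucket)."""
    totals = defaultdict(int)
    for record in records:
        totals[record[bucket_field]] += record[metric_field]
    return dict(totals)


print("Total bytes:", sum(r["bytes"] for r in logs))
print("Total bytes per username:", sum_per_bucket(logs, "username", "bytes"))
print("Total bytes per HTTP method:", sum_per_bucket(logs, "method", "bytes"))
```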
Bucket Options
• Date Histogram (time)
• Date Range
• Filters
• Histogram
• IPv4 Range
• Range
• Significant Terms
• Terms (log fields)
Visualization Demo
Default Elasticsearch Security
Elasticsearch is completely open by default
Options for Security
• N00b mode: nginx reverse proxy with basic auth
• Better:
• Best:
Logstash
Logstash
• Free, developed and maintained by Elastic
• Integrates with Beats
• Integrates with Elasticsearch
• Tons of plugins
• Easy to learn and use
• Built-in buffering
• Back-pressure support
Logstash – Ingestion Workhorse
Syslog, TCP, UDP, Other
Routing to Logstash
Load Balancer -> Logstash01, Logstash02, Logstash03
Input -> Filter -> Output
Logstash has 3 components:
• Input - Methods to listen for and accept logs
• Filter - Filters, parses, and enriches logs
• Output - Sends logs to another system or program
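The three stages can be pictured as chained functions. This is a conceptual sketch in Python, not how Logstash is implemented; the stage names and the `length` field are hypothetical.

```python
def input_stage(lines):
    """Input: accept raw lines and wrap each one in an event."""
    for line in lines:
        yield {"message": line.rstrip("\n")}


def filter_stage(events):
    """Filter: drop empty events and enrich the rest."""
    for event in events:
        if not event["message"]:
            continue  # log filtering, similar in spirit to a drop filter
        event["length"] = len(event["message"])  # trivial enrichment
        yield event


def output_stage(events):
    """Output: hand events to a destination (here, just a list)."""
    return list(events)


raw = ["login ok\n", "\n", "login failed\n"]
events = output_stage(filter_stage(input_stage(raw)))
print(events)
```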
Logstash Pipeline
Log source -> Input plugins -> Filter plugins -> Output plugins -> Log destination
Logstash Config Files
For our premade configs, see:
https://github.com/HASecuritySolutions/Logstash
Data Ingestion Demo
Input Plugins
• Input receives logs in multiple formats
• Key plugins:
  • Common options – beats, syslog, file, http, tcp, udp, elasticsearch
  • Database – jdbc, sqlite
  • Message Brokers – kafka, redis, rabbitmq
Filter Plugins
• Filter section parses, filters, and enriches logs
• Key plugins:
• Parsing - csv, grok, kv, json, syslog_pri, xml, date
• Log filtering - drop
  • Enrichment - dns, elasticsearch, geoip, mutate, rest, oui, useragent, tld, and ruby
Output Plugins
• Output steers parsed logs to multiple destinations
• Key plugins:
• elasticsearch – For storage
• stdout – for debugging and development
  • 3rd party applications - email, irc, csv, kafka, rabbitmq, graphite, google_cloud_storage, jira, nagios, pagerduty, sns, tcp/udp
Traditional Logging - Syslog
<81>Jan 4 14:43:13 logparse sudo: jhubbard : 1 incorrect password attempt ; TTY=pts/1 ; PWD=/var/log ; USER=root ; COMMAND=/bin/su
• PRI = <81>
• Time/date = Jan 4 14:43:13
• Source host = logparse
• Source process = sudo
• Message = jhubbard : 1 incorrect password attempt ; TTY=pts/1 ; PWD=/var/log ; USER=root ; COMMAND=/bin/su
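The PRI value packs two numbers into one: facility times 8 plus severity. A short sketch of decoding it follows; the helper name `decode_pri` is hypothetical, and the severity names follow the standard syslog numbering.

```python
SEVERITIES = [
    "emergency", "alert", "critical", "error",
    "warning", "notice", "informational", "debug",
]


def decode_pri(pri: int) -> tuple:
    """Split a syslog PRI value into (facility, severity)."""
    if not 0 <= pri <= 191:
        raise ValueError("PRI must be between 0 and 191")
    # PRI = facility * 8 + severity
    return divmod(pri, 8)


facility, severity = decode_pri(81)
print(facility, severity, SEVERITIES[severity])
```

For `<81>` this gives facility 10 (security/authorization) and severity 1 (alert).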
The Problems With Syslog
• Unstructured syslog is the worst
• Wrong regex? No parsing
• No pre-made regex? No parsing
• Poor regex? Poor performance = Low EPS
• Unparsed logs means your analytics don't work!
• Grok plugin in Logstash eases the pain of writing regex statements
  • Gives pre-made regexes a name
  • Use the name, and the statement becomes readable and dependable
• Ideally, newer log formats should be used when available
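The "named regex" idea can be approximated in Python with named capture groups. This is an analogy, not Grok itself; the pattern below is a hand-written assumption that fits the sudo example from the earlier slide, not a general syslog parser.

```python
import re

# Each named group plays the role of a named, reusable pattern.
SYSLOG_LINE = re.compile(
    r"^<(?P<pri>\d+)>"
    r"(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<logsource>\S+) "
    r"(?P<program>[^\s:\[]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<message>.*)$"
)

line = ("<81>Jan 4 14:43:13 logparse sudo: jhubbard : 1 incorrect password "
        "attempt ; TTY=pts/1 ; PWD=/var/log ; USER=root ; COMMAND=/bin/su")

match = SYSLOG_LINE.match(line)
fields = match.groupdict() if match else {}
for name, value in fields.items():
    print(f"{name} = {value}")
```

If the regex does not match, `fields` stays empty, which mirrors the "Wrong regex? No parsing" problem above.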
Log Standardization
Better log formats are becoming more prevalent
• Comma Separated Values (CSV)
• Key-Value pairs (KV)
• JavaScript Object Notation (JSON)
Logstash has plugins for these log formats
• csv, kv, and json
csv - Filter Plugin
Delimited values can be automatically extracted
csv {
  columns => ["src_ip","src_port","dst_ip",
              "method","virtual_host","uri"]
}
Example log message:
"10.4.55.1","50001","8.8.8.8","GET","sec455.com","/page.php"
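What the csv filter does to the example line can be sketched with Python's standard `csv` module. The helper `parse_csv_line` is hypothetical; the column names are the ones from the config above.

```python
import csv
import io

COLUMNS = ["src_ip", "src_port", "dst_ip", "method", "virtual_host", "uri"]


def parse_csv_line(line: str) -> dict:
    """Turn one delimited log line into named fields."""
    row = next(csv.reader(io.StringIO(line)))
    return dict(zip(COLUMNS, row))


event = parse_csv_line('"10.4.55.1","50001","8.8.8.8","GET","sec455.com","/page.php"')
print(event)
```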
kv - Filter Plugin
Syslog is still the most common transport method
• Syslog message portion is not standardized
• Standardization inside syslog message is becoming more common
Example: Firewall log message uses key=value pairs
kv {
  value_split => "="
  field_split => " "
}
Example log message:
src_ip=10.0.0.1 src_port=50001 dst_ip=8.8.8.8 dst_port=53 policyid=17 action=allow
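A minimal sketch of what the kv settings mean, assuming the firewall message from the example. `parse_kv` is a hypothetical helper that mirrors the `field_split` and `value_split` options; the real plugin handles quoting and many more cases.

```python
def parse_kv(message: str, field_split: str = " ", value_split: str = "=") -> dict:
    """Split a message into fields, then each field into key and value."""
    fields = {}
    for pair in message.split(field_split):
        if value_split in pair:
            key, value = pair.split(value_split, 1)
            fields[key] = value
    return fields


message = ("src_ip=10.0.0.1 src_port=50001 dst_ip=8.8.8.8 "
           "dst_port=53 policyid=17 action=allow")
print(parse_kv(message))
```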
kv + Logstash: Easing syslog pain
<81>Jan 4 14:43:13 logparse sudo: jhubbard : 1 incorrect password attempt ; TTY=pts/1 ; PWD=/var/log ; USER=root ; COMMAND=/bin/su
Applying Logstash config:
input {
  syslog {}
}
filter {
  kv {}
}
Resulting fields:
"severity" => 1,
"syslog_severity_code" => 5,
"syslog_facility" => "user-level",
"syslog_facility_code" => 1,
"program" => "sudo",
"message" => "jhubbard : 1 incorrect password attempt ; TTY=pts/1 ; PWD=/var/log ; USER=root ; COMMAND=/bin/su",
"priority" => 81,
"logsource" => "logparse",
"USER" => "root",
"syslog_severity" => "notice",
"@timestamp" => 2017-01-04T19:43:13.000Z,
"TTY" => "pts/1",
"COMMAND" => "/bin/su",
"PWD" => "/var/log",
"facility" => 10,
"severity_label" => "Alert",
"facility_label" => "security/authorization"
json - Filter Plugin
The easiest: the json plugin
json {
  source => "message"
}
That's all!
Windows logs have lots of fields, let JSON handle it!
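Conceptually, the json filter parses the JSON string held in the source field and adds its keys to the event. A rough Python equivalent follows; the event contents and the helper `expand_json_message` are hypothetical.

```python
import json


def expand_json_message(event: dict, source: str = "message") -> dict:
    """Parse the JSON string in `source` and merge its keys into the event."""
    parsed = json.loads(event[source])
    return {**event, **parsed}


# Hypothetical Windows-style log shipped as a JSON string.
event = {"message": '{"event_id": 4625, "user": "jhubbard", "host": "ws01"}'}
expanded = expand_json_message(event)
print(expanded)
```

Every key in the JSON becomes its own field, with no regex to write or maintain.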
Full Elastic Stack In a Nutshell
1. Send things to Logstash via agents or forwarding
2. Parse them in whatever way you want
3. Send them to Elasticsearch for storage
4. Query Elasticsearch via Kibana
Default Ports
• Elasticsearch HTTP – :9200
• Elasticsearch node-to-node transport – :9300
• Kibana HTTP – :5601
• Beats to Logstash – :5044
Dual Stack SIEM
Logstash to Multiple SIEMs
Logs
Commercial
SIEM
Elasticsearch
Logstash Log Pulling
Commercial
SIEM
Elasticsearch
Logs
Pull
Message Broker to SIEM
Logs
Commercial
SIEM
Elasticsearch
Log Agent
The Full Layout
https://www.elastic.co/assets/blt2614227bb99b9878/architecture-best-practices.pdf
Hardware
Backup Slides
CPU and Memory
• How much CPU and memory are required?
Memory will run out first
• Use as much as possible
• 8GB+ per node
• 64GB = sweet spot (Java limitations)
• <=31GB dedicated to Java max
• /etc/elasticsearch/jvm.options file
CPU – multi-core/node, 64bit
• More cores better than faster speed
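A back-of-the-envelope sketch of the memory guidance. It combines the slide's 31GB cap with the common rule of giving the heap no more than half of node RAM so the rest is left for the OS and Lucene; that half-of-RAM rule is an assumption taken from Elastic's heap sizing guidance, and `suggested_heap_gb` is a hypothetical helper.

```python
def suggested_heap_gb(node_ram_gb: int) -> int:
    """Suggest a JVM heap size: half of RAM, capped at 31GB, at least 1GB."""
    return max(1, min(node_ram_gb // 2, 31))


for ram in (8, 16, 64, 128):
    heap = suggested_heap_gb(ram)
    print(f"{ram}GB node -> {heap}GB heap, {ram - heap}GB for OS / Lucene")
```

On a 64GB node this lands at a 31GB heap with roughly half the RAM left for the filesystem cache, which is why 64GB is called the sweet spot.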
Node RAM = Heap (<=31GB) + OS / Lucene (all other RAM)
See: https://www.elastic.co/guide/en/elasticsearch/reference/current/heap-size.html
Networking
• You can never have too much bandwidth!
• Moving 50GB shards node to node
• Returning large query results
• Restoring from backup
• Network Setup:
• 1 Gbps is required
• 10 Gbps is better!
• Minimize latency
• Jumbo frames enabled
Hard Drives
• Disk speed for logging clusters is VERY important
• Lots of hard drives for high IO, not one big one
• RAID0 setup, replica shards take care of availability
Thanks!

Weitere ähnliche Inhalte

Was ist angesagt?

SIEM presentation final
SIEM presentation finalSIEM presentation final
SIEM presentation final
Rizwan S
 

Was ist angesagt? (20)

Security Information and Event Management (SIEM)
Security Information and Event Management (SIEM)Security Information and Event Management (SIEM)
Security Information and Event Management (SIEM)
 
SIEM presentation final
SIEM presentation finalSIEM presentation final
SIEM presentation final
 
Beginner's Guide to SIEM
Beginner's Guide to SIEM Beginner's Guide to SIEM
Beginner's Guide to SIEM
 
Getting Started with Amazon ElastiCache
Getting Started with Amazon ElastiCacheGetting Started with Amazon ElastiCache
Getting Started with Amazon ElastiCache
 
Elastic stack Presentation
Elastic stack PresentationElastic stack Presentation
Elastic stack Presentation
 
dlux - Splunk Technical Overview
dlux - Splunk Technical Overviewdlux - Splunk Technical Overview
dlux - Splunk Technical Overview
 
Elk - An introduction
Elk - An introductionElk - An introduction
Elk - An introduction
 
Centralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stackCentralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stack
 
Static Analysis Security Testing for Dummies... and You
Static Analysis Security Testing for Dummies... and YouStatic Analysis Security Testing for Dummies... and You
Static Analysis Security Testing for Dummies... and You
 
Fleet and elastic agent
Fleet and elastic agentFleet and elastic agent
Fleet and elastic agent
 
Combining Logs, Metrics, and Traces for Unified Observability
Combining Logs, Metrics, and Traces for Unified ObservabilityCombining Logs, Metrics, and Traces for Unified Observability
Combining Logs, Metrics, and Traces for Unified Observability
 
Security Operations Center (SOC) Essentials for the SME
Security Operations Center (SOC) Essentials for the SMESecurity Operations Center (SOC) Essentials for the SME
Security Operations Center (SOC) Essentials for the SME
 
CLOUD NATIVE SECURITY
CLOUD NATIVE SECURITYCLOUD NATIVE SECURITY
CLOUD NATIVE SECURITY
 
SOAR and SIEM.pptx
SOAR and SIEM.pptxSOAR and SIEM.pptx
SOAR and SIEM.pptx
 
SIEM Primer:
SIEM Primer:SIEM Primer:
SIEM Primer:
 
SIEM : Security Information and Event Management
SIEM : Security Information and Event Management SIEM : Security Information and Event Management
SIEM : Security Information and Event Management
 
Log analysis using elk
Log analysis using elkLog analysis using elk
Log analysis using elk
 
Security Information and Event Management (SIEM)
Security Information and Event Management (SIEM)Security Information and Event Management (SIEM)
Security Information and Event Management (SIEM)
 
Threat detection on AWS: An introduction to Amazon GuardDuty - FND216 - AWS r...
Threat detection on AWS: An introduction to Amazon GuardDuty - FND216 - AWS r...Threat detection on AWS: An introduction to Amazon GuardDuty - FND216 - AWS r...
Threat detection on AWS: An introduction to Amazon GuardDuty - FND216 - AWS r...
 
Security Information Event Management - nullhyd
Security Information Event Management - nullhydSecurity Information Event Management - nullhyd
Security Information Event Management - nullhyd
 

Ähnlich wie The Elastic Stack as a SIEM

Search On Hadoop Frontier Meetup
Search On Hadoop Frontier MeetupSearch On Hadoop Frontier Meetup
Search On Hadoop Frontier Meetup
gregchanan
 
Introduction to g reg 4.6.0
Introduction to g reg 4.6.0Introduction to g reg 4.6.0
Introduction to g reg 4.6.0
WSO2
 
Search onhadoopsfhug081413
Search onhadoopsfhug081413Search onhadoopsfhug081413
Search onhadoopsfhug081413
gregchanan
 

Ähnlich wie The Elastic Stack as a SIEM (20)

ELK stack introduction
ELK stack introduction ELK stack introduction
ELK stack introduction
 
Roaring with elastic search sangam2018
Roaring with elastic search sangam2018Roaring with elastic search sangam2018
Roaring with elastic search sangam2018
 
Introduction to SolrCloud
Introduction to SolrCloudIntroduction to SolrCloud
Introduction to SolrCloud
 
Why do you consider to adopt Koha Open Source Integrated Library System for y...
Why do you consider to adopt Koha Open Source Integrated Library System for y...Why do you consider to adopt Koha Open Source Integrated Library System for y...
Why do you consider to adopt Koha Open Source Integrated Library System for y...
 
SearchHub - How to Spend Your Summer Keeping it Real: Presented by Grant Inge...
SearchHub - How to Spend Your Summer Keeping it Real: Presented by Grant Inge...SearchHub - How to Spend Your Summer Keeping it Real: Presented by Grant Inge...
SearchHub - How to Spend Your Summer Keeping it Real: Presented by Grant Inge...
 
Logstash, Elasticsearch and Kibana
Logstash, Elasticsearch and KibanaLogstash, Elasticsearch and Kibana
Logstash, Elasticsearch and Kibana
 
Elasticsearch Introduction at BigData meetup
Elasticsearch Introduction at BigData meetupElasticsearch Introduction at BigData meetup
Elasticsearch Introduction at BigData meetup
 
Search On Hadoop Frontier Meetup
Search On Hadoop Frontier MeetupSearch On Hadoop Frontier Meetup
Search On Hadoop Frontier Meetup
 
Introduction to g reg 4.6.0
Introduction to g reg 4.6.0Introduction to g reg 4.6.0
Introduction to g reg 4.6.0
 
Faster Faster Faster! Datamarts with Hive at Yahoo
Faster Faster Faster! Datamarts with Hive at YahooFaster Faster Faster! Datamarts with Hive at Yahoo
Faster Faster Faster! Datamarts with Hive at Yahoo
 
Faster, Faster, Faster: The True Story of a Mobile Analytics Data Mart on Hive
Faster, Faster, Faster: The True Story of a Mobile Analytics Data Mart on HiveFaster, Faster, Faster: The True Story of a Mobile Analytics Data Mart on Hive
Faster, Faster, Faster: The True Story of a Mobile Analytics Data Mart on Hive
 
Elastic Stack Introduction
Elastic Stack IntroductionElastic Stack Introduction
Elastic Stack Introduction
 
Sumo Logic QuickStart Webinar - Jan 2016
Sumo Logic QuickStart Webinar - Jan 2016Sumo Logic QuickStart Webinar - Jan 2016
Sumo Logic QuickStart Webinar - Jan 2016
 
Solr + Hadoop: Interactive Search for Hadoop
Solr + Hadoop: Interactive Search for HadoopSolr + Hadoop: Interactive Search for Hadoop
Solr + Hadoop: Interactive Search for Hadoop
 
What's new in JBoss ON 3.2
What's new in JBoss ON 3.2What's new in JBoss ON 3.2
What's new in JBoss ON 3.2
 
Workshop: Big Data Visualization for Security
Workshop: Big Data Visualization for SecurityWorkshop: Big Data Visualization for Security
Workshop: Big Data Visualization for Security
 
Centralized Logging System Using ELK Stack
Centralized Logging System Using ELK StackCentralized Logging System Using ELK Stack
Centralized Logging System Using ELK Stack
 
SharePoint 2013 Search Operations
SharePoint 2013 Search OperationsSharePoint 2013 Search Operations
SharePoint 2013 Search Operations
 
Search onhadoopsfhug081413
Search onhadoopsfhug081413Search onhadoopsfhug081413
Search onhadoopsfhug081413
 
ESPC14 380 So you think you can crawl? Stretching the Boundaries of SharePoin...
ESPC14 380 So you think you can crawl? Stretching the Boundaries of SharePoin...ESPC14 380 So you think you can crawl? Stretching the Boundaries of SharePoin...
ESPC14 380 So you think you can crawl? Stretching the Boundaries of SharePoin...
 

Kürzlich hochgeladen

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Kürzlich hochgeladen (20)

Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 

The Elastic Stack as a SIEM

  • 1. The Elastic Stack as a SIEM Philly Security Shell 2019
  • 2. Who Am I? John Hubbard [@SecHubb] • Previous SOC Lead @ GlaxoSmithKline • Certified SANS Instructor • Author • SEC450: Blue Team Fundamentals – Security Analysis and Operations • SEC455: SIEM Design & Implementation (Elasticsearch as a SIEM) • Instructor • SEC511: Continuous Monitoring & Security Operations • SEC555: SIEM with Tactical Analytics • Mission: Make life awesome for the blue team • Data for this talk: https://github.com/SecHubb/SecShell_Demo
  • 3. What is a SIEM? • A central log repository that enriches logs and assists threat detection • Components • Log Sources • Log Aggregator • Log Storage & Indexing • Search & Viz. Interface + Alerting Engine Log Sources Log Aggregation / Queue Log Storage & Indexing Search, Visualization, & Alerting John Hubbard [@SecHubb] 3
  • 4. What is the Elastic Stack? • Open source, real-time search and analytics engine • Made up of 4 pieces: collection, ingestion, storage, and visualization John Hubbard [@SecHubb] 4
  • 5. History of Elastic 2010 Created by Shay Bannon Recipe search engine for his wife in culinary school Inspired by Minority Report 2012 Elastic Co. Founded 2019 Used by Wikipedia, Stack Overflow, GitHub, Netflix, LinkedIn, … One of the most popular projects on GitHub Iterating versions rapidly with awesome new features John Hubbard [@SecHubb] 5
  • 6. Elastic Stack vs. SIEM Log Sources Log Aggregation / Queue Log Storage & Indexing Search, Visualization, & Alerting John Hubbard [@SecHubb] 6
  • 7. Elastic stack as a SIEM Used for many different use cases • NOT a SIEM out of the box • Not in the magic quadrant as one • Can do the things a SIEM does Gartner's definition of a SIEM: "supports threat detection and security incident response through the real- time collection and historical analysis of security events from a wide variety of event and contextual data sources. It also supports compliance reporting and incident investigation through analysis of historical data from these sources." John Hubbard [@SecHubb] 7
  • 8. Elasticsearch as a SIEM • Collects, indexes, and stores high volumes of logs • Functional visualizations and dashboards • Reporting and alerting • Log enrichment through plugins • Compatible with almost every format • Log retention settings • Anomaly detection via machine learning • RBAC securable John Hubbard [@SecHubb] 8
  • 9. Elastic Stack Overview Raw Logs Raw Logs Log Ingestion & Parsing Log Storage Search & Visualization John Hubbard [@SecHubb] 9
  • 10. Winlogbeat Beats Agents Lightweight log agents written in Go • Filebeat • Winlogbeat • Packetbeat • Auditbeat • Functionbeat • Journalbeat • Community Beats FilebeatPacketbeat John Hubbard [@SecHubb] 10
  • 12. Clusters, Nodes, and Indices Cluster Node Indices John Hubbard [@SecHubb] 12
  • 13. Index Creation Across Time Firewall-2018-01 Firewall-2018-02 Firewall-2018-03 IDS-2018-01 IDS-2018-02 IDS-2018-03 John Hubbard [@SecHubb] 13
  • 14. Shards and Documents Index Shards Documents John Hubbard [@SecHubb] 14
  • 16. Reason 1: Schema on Ingest Many SIEMs: Schema applied at search time Elasticsearch: Schema applied at ingestion John Hubbard [@SecHubb] 16
  • 17. Reason 2: Data is distributed Index Shards Nodes John Hubbard [@SecHubb] 17
  • 18. Shard Types Primary Shards • Like RAID 0 – Need all shards to make the whole index Replica Shards • Like RAID 1 • Each primary shard has arbitrary number of copies • Each copy can be polled to balance search load John Hubbard [@SecHubb] 18
  • 19. Shards • All shards belong to and make up an index • Enables arbitrary horizontal scaling • Spread evenly across all available hardware • Designated a Primary or Replica Primary Shard 1 Primary Shard 2 Primary Shard 3 Replica Shard 1 Replica Shard 2 Replica Shard 3 Full Index Data John Hubbard [@SecHubb] 19
  • 20. Primaries and Replicas Copy 2 Shards Nodes P0 P1 R0 R1 Copy 1 John Hubbard [@SecHubb] 20
  • 21. Primaries and Replicas Copy 2 Shards Nodes P0 P1 R0 R1 Copy 1 Copy 3 R0 R1 John Hubbard [@SecHubb] 21
  • 22. Balancing Writes Incoming Logs Shards Nodes P0 P1 P2 P3 P4 P5 John Hubbard [@SecHubb] 22
  • 23. Balancing Searches Search Requests Shards Nodes P0 R0 R0 R0 R0 R0 John Hubbard [@SecHubb] 23
  • 24. Balancing Searches: multi-shard Search 2 Shards Nodes P0 P1 R0 R1 Search 1 Search 3 R0 R1 John Hubbard [@SecHubb] 24
  • 25. Documents to Fields Document Single Log (Converted to JSON by Logstash) Fields John Hubbard [@SecHubb] 25
  • 26. Documents • Indices hold documents in serialized JSON objects • 1 document = 1 log entry • Contains "field : value" pairs • Metadata • _index – Index the document belongs to • _id – unique ID for that log • _source – parsed log fields
  • 27. Fields and Mappings • Field – A key-value pair inside a document • username: admin • hostname: web-server1 • Mapping - Defines information about the fields • Think "database schema" • The data type for each field (integer, ip, keyword, etc.) John Hubbard [@SecHubb] 27
  • 28. Key Concept: Keyword vs. Text String datatypes are either text or keyword, or both! • Keyword indexes the exact values • Example: Usernames, ID numbers, tags, FQDNs • Binary search results – full exact matches, or not • Text type breaks things up into pieces • Example: "http://www.mywebmail.com/mailbox/mail1.htm" • Allows searching for "http", "www.mywebmail.com", "mailbox", "mail1.htm" • Fed through an "analyzer" • This data type cannot be aggregated / visualized John Hubbard [@SecHubb] 28
  • 29. Text Data Type Example • Input: http://www.mywebmail.com/mailbox/mail1.htm • Passed through a character filter, then a tokenizer • Resulting tokens: http, www.mywebmail.com, mailbox, mail1.htm • Tokens can be searched for
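The difference between the two string types can be sketched in a few lines. This is not Elasticsearch's analyzer: `analyze_text` is a hypothetical stand-in that simply splits on anything other than word characters and dots, which happens to reproduce the tokens listed for the URL example.

```python
import re


def analyze_keyword(value: str) -> list[str]:
    """Keyword: the whole value is indexed as a single exact term."""
    return [value]


def analyze_text(value: str) -> list[str]:
    """Text: break the value into lowercase tokens (simplified stand-in analyzer)."""
    return [t.lower() for t in re.split(r"[^\w.]+", value) if t]


url = "http://www.mywebmail.com/mailbox/mail1.htm"
print(analyze_keyword(url))
print(analyze_text(url))
```

With the keyword form, only a search for the full URL matches; with the text form, a search for `mailbox` alone matches as well.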
  • 30. Where Tokens Go: Inverted Index • Lucene builds an "inverted index" of tokens in text field data • Doc 1: "The woman is walking down the street." • Doc 2: "The man is walking into the store."
      Token     Doc 1   Doc 2
      the       x       x
      woman     x
      is        x       x
      walking   x       x
      down      x
      street    x
      man               x
      into              x
      store             x
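The inverted index for the two example sentences can be rebuilt with a short sketch. This is a simplified illustration, not Lucene's implementation: tokens are just lowercase runs of word characters, and `build_inverted_index` is a hypothetical helper.

```python
import re
from collections import defaultdict

docs = {
    1: "The woman is walking down the street.",
    2: "The man is walking into the store.",
}


def build_inverted_index(documents: dict[int, str]) -> dict[str, set[int]]:
    """Map each token to the set of document IDs that contain it."""
    index: dict[str, set[int]] = defaultdict(set)
    for doc_id, text in documents.items():
        for token in re.findall(r"\w+", text.lower()):
            index[token].add(doc_id)
    return index


index = build_inverted_index(docs)
for token, doc_ids in index.items():
    print(f"{token:8} {sorted(doc_ids)}")
```

A search for `walking` then becomes a single lookup that returns both documents, instead of a scan through every log.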
  • 31. Elasticsearch Term Summary • Cluster = multiple nodes • Node = Elasticsearch instance • Index = holds one log type • Shard = partial index • Lucene = search engine • Segment = "inverted index"
  • 33. Kibana Interface • Discover – Search and explore data • Visualize – Create graphs and charts • Dashboard – Display a collection of saved items • Timelion – Unique time series data visualization • Canvas – New visualization type • Machine Learning – Ponies and magic • Infrastructure – Monitor all Metricbeats • Logs – Watch logs streaming from Filebeat • Dev Tools – Console for API access • Monitoring – Health of your cluster/agents/Logstash • Management – Manage the cluster
  • 34. Using the Discover Tab (screenshot callouts: histogram, document data, field list, index pattern, time filter)
  • 35. Discover Tab Details (screenshot callouts: field must exist, add as column, filter out this field value, filter for this field value, data type, move left/right, remove this column, sort by this column, show document)
  • 36. Index Patterns • Kibana must be told which indices to show for searching • A search can run against more than 1 index at once • Example usage: • "*" – Search ALL indices • "firewall-*" • "firewall-pfsense-*" • "firewall-pfsense-2019-*" • "alexa-top1M"
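Index patterns behave like simple wildcards over index names. The sketch below approximates that with Python's `fnmatchcase`; the index names are hypothetical, and `fnmatch` supports more wildcard syntax than just `*`, so treat it as an analogy rather than Kibana's matching logic.

```python
from fnmatch import fnmatchcase

# Hypothetical index names in a cluster
indices = [
    "firewall-pfsense-2019-01",
    "firewall-pfsense-2018-12",
    "firewall-asa-2019-01",
    "windows-2019-01",
]


def matching_indices(pattern: str, names: list[str]) -> list[str]:
    """Return the index names a wildcard pattern would select."""
    return [name for name in names if fnmatchcase(name, pattern)]


for pattern in ["*", "firewall-*", "firewall-pfsense-*", "firewall-pfsense-2019-*"]:
    print(pattern, "->", matching_indices(pattern, indices))
```

Naming indices from general to specific (log type, product, date) is what makes each narrower pattern select a smaller slice of the data.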
  • 38. Creating Visualizations • Metrics: What to calculate • Buckets: How to group it • "I want to see <metric> per <bucket>" • "Total bytes" • "Total bytes per username" • "Request count, bytes per HTTP method" • "Requests per user per site"
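The "metric per bucket" idea maps directly onto a group-then-calculate step. Here is a minimal sketch over a handful of made-up log records; `metric_per_bucket` and the field names are hypothetical, and Elasticsearch performs the equivalent work with aggregations rather than Python loops.

```python
from collections import defaultdict

# Made-up proxy logs
logs = [
    {"username": "alice", "method": "GET", "bytes": 500},
    {"username": "bob", "method": "POST", "bytes": 1200},
    {"username": "alice", "method": "GET", "bytes": 300},
    {"username": "bob", "method": "GET", "bytes": 100},
]


def metric_per_bucket(records, bucket_field, metric):
    """Group records by bucket_field, then apply the metric to each group."""
    buckets = defaultdict(list)
    for record in records:
        buckets[record[bucket_field]].append(record)
    return {key: metric(group) for key, group in buckets.items()}


def total_bytes(group):
    return sum(record["bytes"] for record in group)


bytes_per_user = metric_per_bucket(logs, "username", total_bytes)  # "Total bytes per username"
requests_per_method = metric_per_bucket(logs, "method", len)       # "Request count per HTTP method"
print(bytes_per_user)
print(requests_per_method)
```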
  • 39. Bucket Options • Date Histogram (time) • Date Range • Filters • Histogram • IPv4 Range • Range • Significant Terms • Terms (log fields)
  • 41. Default Elasticsearch Security • Elasticsearch is completely open by default
  • 42. Options for Security • N00b mode: nginx reverse proxy with basic auth • Better: • Best:
  • 44. Logstash • Free, developed and maintained by Elastic • Integrates with Beats • Integrates with Elasticsearch • Tons of plugins • Easy to learn and use • Built-in buffering • Back-pressure support
  • 45. Logstash – Ingestion Workhorse (diagram: Logstash accepting logs via syslog, TCP, UDP, and other inputs)
  • 47. Input -> Filter -> Output • Logstash has 3 components: • Input – Methods to listen for and accept logs • Filter – Filters, parses, and enriches logs • Output – Sends logs to another system or program • (Diagram: log source -> input plugins -> filter plugins -> output plugins -> log destination, together forming the Logstash pipeline)
  • 48. Logstash Config Files • For our premade configs, see: https://github.com/HASecuritySolutions/Logstash
  • 49. Data Ingestion Demo
  • 50. Input Plugins • Input receives logs in multiple formats • Key plugins: • Common options – beats, syslog, file, http, tcp, udp, elasticsearch • Database – jdbc, sqlite • Message Brokers – kafka, redis, rabbitmq
  • 51. Filter Plugins • Filter section parses, filters, and enriches logs • Key plugins: • Parsing – csv, grok, kv, json, syslog_pri, xml, date • Log filtering – drop • Enrichment – dns, elasticsearch, geoip, mutate, rest, oui, useragent, tld, and ruby
  • 52. Output Plugins • Output steers parsed logs to multiple destinations • Key plugins: • elasticsearch – For storage • stdout – For debugging and development • 3rd party applications – email, irc, csv, kafka, rabbitmq, graphite, google_cloud_storage, jira, nagios, pagerduty, sns, tcp/udp
  • 53. Traditional Logging - Syslog • <81>Feb 21 14:43:13 logparse sudo: jhubbard : 1 incorrect password attempt ; TTY=pts/1 ; PWD=/var/log ; USER=root ; COMMAND=/bin/su • PRI = <81> • Time/date = Feb 21 14:43:13 • Source host = logparse • Source process = sudo • Message = jhubbard : 1 incorrect password attempt ; TTY=pts/1 ; PWD=/var/log ; USER=root ; COMMAND=/bin/su
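The PRI value at the front of a syslog message packs two numbers into one: facility and severity, combined as facility * 8 + severity. A small sketch of unpacking it follows; `decode_pri` and `extract_pri` are hypothetical helper names, and the severity labels follow the standard syslog numbering.

```python
import re

SEVERITY_LABELS = [
    "Emergency", "Alert", "Critical", "Error",
    "Warning", "Notice", "Informational", "Debug",
]


def decode_pri(pri: int) -> tuple[int, int]:
    """Split a syslog PRI value into (facility, severity)."""
    return divmod(pri, 8)


def extract_pri(line: str) -> int:
    """Pull the number out of the leading <NN> of a raw syslog line."""
    match = re.match(r"<(\d{1,3})>", line)
    if match is None:
        raise ValueError("line does not start with a syslog PRI")
    return int(match.group(1))


line = "<81>Feb 21 14:43:13 logparse sudo: jhubbard : 1 incorrect password attempt"
facility, severity = decode_pri(extract_pri(line))
print(facility, severity, SEVERITY_LABELS[severity])  # prints: 10 1 Alert
```

Facility 10 is the security/authorization facility, which fits a failed `sudo` attempt.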
  • 54. The Problems With Syslog • Unstructured syslog is the worst • Wrong regex? No parsing • No pre-made regex? No parsing • Poor regex? Poor performance = low EPS • Unparsed logs mean your analytics don't work! • The grok plugin in Logstash eases the pain of writing parsing statements • Gives pre-made regexes a name • Use the name, and the statement becomes readable and dependable • Ideally, newer structured log formats should be used when available
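The core idea behind grok, giving pre-made regexes a name, can be sketched in a few lines. The `%{NAME:field}` syntax is grok's own, but everything else here is an assumption for illustration: the three patterns are simplified stand-ins rather than the definitions shipped with Logstash, and `grok_to_regex` is a hypothetical helper.

```python
import re

# Simplified stand-ins for named patterns (not the real grok definitions)
PATTERNS = {
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",
    "WORD": r"\w+",
    "INT": r"\d+",
}


def grok_to_regex(expression: str) -> str:
    """Expand each %{NAME:field} into a named capture group using PATTERNS."""
    def expand(match: re.Match) -> str:
        pattern_name, field_name = match.group(1), match.group(2)
        return f"(?P<{field_name}>{PATTERNS[pattern_name]})"
    return re.sub(r"%\{(\w+):(\w+)\}", expand, expression)


regex = re.compile(grok_to_regex("%{IP:src_ip} %{WORD:method} %{INT:bytes}"))
result = regex.match("10.4.55.1 GET 512")
fields = result.groupdict() if result else {}
print(fields)
```

The readable expression on the left-hand side is all a config author has to maintain; the long regex is generated from the named pieces.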
  • 55. Log Standardization Better log formats are becoming more prevalent • Comma Separated Values (CSV) • Key-Value pairs (KV) • JavaScript Object Notation (JSON) Logstash has plugins for these log formats • csv, kv, and json
  • 56. csv - Filter Plugin • Delimited values can be automatically extracted • Config: csv { columns => ["src_ip","src_port","dst_ip","method","virtual_host","uri"] } • Example log message: "10.4.55.1","50001","8.8.8.8","GET","sec455.com","/page.php"
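What the csv filter does to the example line can be mirrored with Python's standard `csv` module. This is a sketch of the idea rather than the plugin's implementation; `parse_csv_line` is a hypothetical helper.

```python
import csv

COLUMNS = ["src_ip", "src_port", "dst_ip", "method", "virtual_host", "uri"]


def parse_csv_line(line: str, columns: list[str]) -> dict[str, str]:
    """Split one delimited log line and pair each value with its column name."""
    values = next(csv.reader([line]))
    return dict(zip(columns, values))


line = '"10.4.55.1","50001","8.8.8.8","GET","sec455.com","/page.php"'
event = parse_csv_line(line, COLUMNS)
print(event)
```

Note that every value comes out as a string, so numeric fields such as `src_port` still need a type conversion or a suitable mapping.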
  • 57. kv - Filter Plugin • Syslog is still the most common transport method • The syslog message portion is not standardized • Standardization inside the syslog message is becoming more common • Example: a firewall log message that uses key=value pairs • Config: kv { value_split => "=" field_split => " " } • Example log message: src_ip=10.0.0.1 src_port=50001 dst_ip=8.8.8.8 dst_port=53 policyid=17 action=allow
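The effect of `field_split` and `value_split` can be sketched as two nested splits. This is a simplified illustration, not the kv plugin itself (which handles quoting and other edge cases); `parse_kv` is a hypothetical helper whose parameter names mirror the plugin options.

```python
def parse_kv(message: str, field_split: str = " ", value_split: str = "=") -> dict[str, str]:
    """Split a message into fields, then split each field into key and value."""
    fields = {}
    for pair in message.split(field_split):
        if value_split in pair:
            key, value = pair.split(value_split, 1)
            fields[key] = value
    return fields


message = "src_ip=10.0.0.1 src_port=50001 dst_ip=8.8.8.8 dst_port=53 policyid=17 action=allow"
event = parse_kv(message)
print(event)
```

Pieces of the message without the value separator are skipped, which is why free-text words mixed into such a log do not turn into fields.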
  • 58. kv + Logstash: Easing syslog pain • Log: <81>Jan 4 14:43:13 logparse sudo: jhubbard : 1 incorrect password attempt ; TTY=pts/1 ; PWD=/var/log ; USER=root ; COMMAND=/bin/su • Applying Logstash config: input { syslog {} } filter { kv {} } • Resulting fields: "severity" => 1, "syslog_severity_code" => 5, "syslog_facility" => "user-level", "syslog_facility_code" => 1, "program" => "sudo", "message" => "jhubbard : 1 incorrect password attempt ; TTY=pts/1 ; PWD=/var/log ; USER=root ; COMMAND=/bin/su\n", "priority" => 81, "logsource" => "logparse", "USER" => "root", "syslog_severity" => "notice", "@timestamp" => 2017-01-04T19:43:13.000Z, "TTY" => "pts/1", "COMMAND" => "/bin/su\n", "PWD" => "/var/log", "facility" => 10, "severity_label" => "Alert", "facility_label" => "security/authorization"
  • 59. json - Filter Plugin • The easiest: the json plugin • json { source => "message" } • That's all! • Windows logs have lots of fields, let JSON handle it!
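When a log arrives already serialized as JSON, parsing reduces to a single decode step, which is why the json filter needs so little configuration. A rough Python equivalent is sketched below; `apply_json_filter` is a hypothetical helper, and the Windows-style field names in the sample event are illustrative.

```python
import json


def apply_json_filter(event: dict, source: str = "message") -> dict:
    """Decode the JSON string in event[source] and add its keys to the event."""
    parsed = json.loads(event[source])
    enriched = dict(event)   # keep the original fields, including the raw message
    enriched.update(parsed)  # parsed keys become top-level fields
    return enriched


raw_event = {
    "message": '{"EventID": 4625, "TargetUserName": "jhubbard", "IpAddress": "10.4.55.1"}'
}
event = apply_json_filter(raw_event)
print(event["EventID"], event["TargetUserName"], event["IpAddress"])
```

No regex and no column list are needed: the field names travel with the log itself.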
  • 60. Full Elastic Stack In a Nutshell 1. Send things to Logstash via agents or forwarding 2. Parse them in whatever way you want 3. Send them to Elasticsearch for storage 4. Query Elasticsearch via Kibana
  • 62. Dual Stack SIEM
  • 63. Logstash to Multiple SIEMs (diagram: logs flow into Logstash, which forwards them to both a commercial SIEM and Elasticsearch)
  • 65. Message Broker to SIEM (diagram: a log agent sends logs to a message broker, which feeds both a commercial SIEM and Elasticsearch)
  • 66. The Full Layout • See: https://www.elastic.co/assets/blt2614227bb99b9878/architecture-best-practices.pdf
  • 68. CPU and Memory • How much CPU and memory are required? • Memory will run out first • Use as much as possible • 8GB+ per node • 64GB = sweet spot (Java limitations) • <=31GB max dedicated to the Java heap • Set in the /etc/elasticsearch/jvm.options file • CPU – multi-core per node, 64-bit • More cores are better than faster clock speed • (Diagram: node RAM split into heap, <=31GB, and all other RAM for the OS / Lucene) • See: https://www.elastic.co/guide/en/elasticsearch/reference/current/heap-size.html
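The memory guidance can be turned into a quick sizing calculation. The sketch below combines the 31GB ceiling from the slide with an assumption taken from Elastic's heap sizing guidance: give the heap no more than half of the node's RAM so the rest stays available to the OS and Lucene. `recommended_heap_gb` and `jvm_options_lines` are hypothetical helper names; `-Xms` and `-Xmx` are the standard JVM heap flags.

```python
def recommended_heap_gb(node_ram_gb: int, cap_gb: int = 31) -> int:
    """Half of node RAM (assumed rule of thumb), never above the 31GB ceiling."""
    return max(1, min(node_ram_gb // 2, cap_gb))


def jvm_options_lines(heap_gb: int) -> list[str]:
    """Heap settings as they would appear in jvm.options (min and max set equal)."""
    return [f"-Xms{heap_gb}g", f"-Xmx{heap_gb}g"]


for ram in (8, 16, 64, 128):
    heap = recommended_heap_gb(ram)
    print(f"{ram}GB node -> {heap}GB heap, {ram - heap}GB left for OS / Lucene")

print(jvm_options_lines(recommended_heap_gb(64)))
```

This also shows why 64GB is described as the sweet spot: it is roughly the smallest node size that reaches the heap ceiling while leaving an equal share of RAM outside the heap.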
  • 69. Networking • You can never have too much bandwidth! • Moving 50GB shards node to node • Returning large query results • Restoring from backup • Network Setup: • 1 Gbps is required • 10 Gbps is better! • Minimize latency • Jumbo frames enabled
  • 70. Hard Drives • Disk speed for logging clusters is VERY important • Use lots of hard drives for high IO, not one big one • RAID 0 setup; replica shards take care of availability