SlideShare ist ein Scribd-Unternehmen logo
1 von 57
Downloaden Sie, um offline zu lesen
London | 14-15 November 2019
A Kernel of Truth
Matt Carroll
Matt Carroll
@grimmware
A Kernel of Truth
Intrusion Detection and Attestation with eBPF
● Matt Carroll
○ @grimmware
○ github.com/oholiab
● Infrastructure Security
Engineer at Yelp
● Ex-SRE (like a sysadmin but
with more yaml)
● Hand-wringing Linux
botherer
Who am I?
● We built a supplementary* IDS and it’s pretty cool!
● Utilizing OS features as security features
● Told in (roughly) the order it happened.
What is this about?
● How to get a greenfield security project off the ground
○ Treating defensive security like economics
○ Gluing together extant technologies to bootstrap
custom security tools
○ Using your business logic to maximize signal vs noise
What is this about?
Yelp’s Mission
Connecting people with great
local businesses.
● Built on Mesos + Marathon
+ Docker
● More recently migration
towards k8s
● Majority of our workloads
run here
● What are they all doing???
PaaSTA
Network IDS: Amazon GuardDuty
Kind of unsurprising, also pretty unhelpful...
Welp...
Uuuhhh… 🤔
WHAAAAAAA
😱
● What host class connected?
● What IP/ASN did it connect to?
● What’s on the other end?
● How long was the connection?
● What direction?
● How many bytes were transferred?
● What did the pslogs say?
Attestation From Inference
● What host class connected?
● What IP/ASN did it connect to?
● What’s on the other end?
● How long was the connection?
● What direction?
● How many bytes were transferred?
● What did the pslogs say?
Attestation From Inference
lol jk
Context is lost as soon as the
instantiating process ends
What if we could reduce MTTR for false
positives?
● When a GuardDuty alert fires I want to be able to
determine if it’s a false-positive quickly
● Only for GuardDuty traffic (not internal to our VPCs)
● Only for outbound TCP (i.e. non-RFC1918)
● I want the entire calling process tree so I can see full
local causality
● Include process ownership information
● Must not require workload tooling
The problem space
eBPF!
eBPF!
● “Berkeley Packet Filter” from BSD
● An in-kernel VM accessed as a device
(/dev/bpf)
● Limited number of registers
● No loops (to prevent kernel
deadlocking)
● Used for packet filtering
BPF
● An in-kernel VM in Linux (and now FreeBSD!)
● It’s “extended”!
● Moar registers than BPF
● Used for hooking syscalls, tracing, proxying sockets, and
(you guessed) in-kernel packet filtering
○ Can actually offload to some NICs!
● In our case, dispatching kprobes for the tcp_v4_connect
syscall
eBPF
Enjoy writing your filters
as an array of BPF VM
instructions...
bcc + psutil = PROFIT???
bcc + psutil = PROFIT???
✅
How it works
sd
54321
for each syscall...
● Filters in-kernel from Jinja2
templates which iterate over
subnets in YAML
configuration
● Events that don’t get filtered
out are passed to userland
Python daemon
● psutil used to crawl process
tree to init and log alongside
other metadata
The End.
Except it was a hackathon project so all
it did was print events to stdout and
could only match classful networks and
I developed it on my personal laptop.
The Road To Production
Don’t try
to be
clever
with
bitwise
network
matching
● I realised only the classful networks worked
because of the byte boundaries
● Don’t try to do clever bitwise shifting with the
mask length
● Endianness and byte ordering between network
and host don’t work how you think they do
● No srs
Matching
all CIDRs
● A coworker was trying to figure out which batch jobs
were accessing a service for a data auth project
● He asked me if we could match ports
● I said I’d have it:
○ Matching ports
○ Dockerized for adhoc usage
○ By the next day
● The next day he found all
unauthenticated clients.
Dockerizing for debugging
● Contains python2.7 and
dependencies (sorry)
● Needs some setup at
runtime
● Volume mount
/etc/passwd for uid
mapping
● Not your typical flags:
○ --privileged
○ --cap-add sys_admin
○ --pid host
● Don’t worry I am a
professional probably.
pidtree-bcc in Docker
● We run our own PaaS called PaaSTA which uses Docker
as containerizer
● Runs the vast majority of our workloads
● Can pull-from-registry and run in a systemd unit file
without further setup
● Don’t have to install dependencies
(inc. LLVM, python2)
● Get coverage quickly
Opportunistic deploy with Docker
● Previous projects with goaudit meant we already had a
secure logging pipeline for reading a FIFO and outputting
to Amazon Kinesis
○ syslog2kinesis adds other Yelpy metadata (e.g.
hostname, environment, Puppet role...)
● Originally fed to our Logstash => Elasticsearch SIEM
● Migrated to Kinesis Firehose => Splunk this quarter <3
Log aggregation
● Better to ask forgiveness than permission...
● Rolled out to two security devboxes and watched the
logs roll in!
● Negligible performance impact!!!
○ As postulated, cost of subnet filtering << cost of
instantiating a TCP connection
● Lots of connections out to public Amazon IPs creating a
lot of noise
Dip Test
If only Amazon maintained some kind
of list of their public prefixes...
Surely you can’t load ~200 netblocks
into the kernel and compare all non-
RFC1918 tcp_v4_connect syscalls to
them in a performant manner...
Surely you can’t load ~200 netblocks
into the kernel and compare all non-
RFC1918 tcp_v4_connect syscalls to
them in a performant manner...
● ~25,000 - ~50,000 messages per hour across dev and
stage
● Once accidentally load-tested at ~80,000 messages in
5m from one host for several hours
● Nobody on the host noticed
● TCP connections are way more expensive than the
filters!
Load
● bpf_trace_printk() -> BPF_PERF_OUTPUT()
○ Global (e.g. per-kernel) debug output with hand-
hacked json and string manipulation
○ To structured data in a ring buffer
○ Multi-tenancy makes it a better utility and more
testable!
● Added unit tests
● Adding integration tests
● Adding infrastructure for deploy in production
environment
Undoing my nasty hacks
● De-containerize (e.g. debian package)
● Python3
● Plugin for container awareness
○ Easy mapping to service and therefore owner!
● Enable immutable loginuid and add that to metadata
○ --loginuid-immutable under `man auditctl`
○ Cryptically says “but can cause some problems in
certain kinds of containers”
● Threat modelling/hardening!
Future work
● Performance improvements
○ BPF longest-match maps
○ Pre-processing masks
○ Probably totally unnecessary
● Moar syscalls!
○ TCP listens, ipv6, UDP, SUID, forwarded SSH socket
reads…
● SIEM tooling
○ ASN matching, bad IP matching, GuardDuty auto-
enrichment...
Future work
www.yelp.com/careers/
We're Hiring!
@YelpEngineering
fb.com/YelpEngineers
engineeringblog.yelp.com
github.com/yelp
London | 14-15 November 2019
https://github.com/Yelp/pidtree-bcc
@grimmware
Thanks for listening!

Weitere ähnliche Inhalte

Was ist angesagt?

Was ist angesagt? (20)

Ignacy Kowalczyk
Ignacy KowalczykIgnacy Kowalczyk
Ignacy Kowalczyk
 
DevSecCon London 2017: Permitting agility whilst enforcing security by Alina ...
DevSecCon London 2017: Permitting agility whilst enforcing security by Alina ...DevSecCon London 2017: Permitting agility whilst enforcing security by Alina ...
DevSecCon London 2017: Permitting agility whilst enforcing security by Alina ...
 
OpenStack Preso: DevOps on Hybrid Infrastructure
OpenStack Preso: DevOps on Hybrid InfrastructureOpenStack Preso: DevOps on Hybrid Infrastructure
OpenStack Preso: DevOps on Hybrid Infrastructure
 
On-Demand Image Resizing from Part of the monolith to Containerized Microserv...
On-Demand Image Resizing from Part of the monolith to Containerized Microserv...On-Demand Image Resizing from Part of the monolith to Containerized Microserv...
On-Demand Image Resizing from Part of the monolith to Containerized Microserv...
 
Modern Reconnaissance Phase on APT - protection layer
Modern Reconnaissance Phase on APT - protection layerModern Reconnaissance Phase on APT - protection layer
Modern Reconnaissance Phase on APT - protection layer
 
Использование Docker в CI / Александр Акбашев (HERE Technologies)
Использование Docker в CI / Александр Акбашев (HERE Technologies)Использование Docker в CI / Александр Акбашев (HERE Technologies)
Использование Docker в CI / Александр Акбашев (HERE Technologies)
 
OpenSCAP Overview(security scanning for docker image and container)
OpenSCAP Overview(security scanning for docker image and container)OpenSCAP Overview(security scanning for docker image and container)
OpenSCAP Overview(security scanning for docker image and container)
 
OpenShift & SELinux with Dan Walsh @rhatdan
OpenShift & SELinux with Dan Walsh @rhatdanOpenShift & SELinux with Dan Walsh @rhatdan
OpenShift & SELinux with Dan Walsh @rhatdan
 
6 Million Ways To Log In Docker - NYC Docker Meetup 12/17/2014
6 Million Ways To Log In Docker - NYC Docker Meetup 12/17/20146 Million Ways To Log In Docker - NYC Docker Meetup 12/17/2014
6 Million Ways To Log In Docker - NYC Docker Meetup 12/17/2014
 
A Distributed Malware Analysis System Cuckoo Sandbox
A Distributed Malware Analysis System Cuckoo SandboxA Distributed Malware Analysis System Cuckoo Sandbox
A Distributed Malware Analysis System Cuckoo Sandbox
 
Resilience Testing
Resilience Testing Resilience Testing
Resilience Testing
 
Introduction to InSpec and 1.0 release update
Introduction to InSpec and 1.0 release updateIntroduction to InSpec and 1.0 release update
Introduction to InSpec and 1.0 release update
 
Csw2016 wang docker_escapetechnology
Csw2016 wang docker_escapetechnologyCsw2016 wang docker_escapetechnology
Csw2016 wang docker_escapetechnology
 
Appsec DC - wXf -2010
Appsec DC - wXf  -2010Appsec DC - wXf  -2010
Appsec DC - wXf -2010
 
Security Checkpoints in Agile SDLC
Security Checkpoints in Agile SDLCSecurity Checkpoints in Agile SDLC
Security Checkpoints in Agile SDLC
 
Devoops: DoJ Annual Cybersecurity Training Symposium Edition 2015
Devoops: DoJ Annual Cybersecurity Training Symposium Edition 2015Devoops: DoJ Annual Cybersecurity Training Symposium Edition 2015
Devoops: DoJ Annual Cybersecurity Training Symposium Edition 2015
 
La importancia de versionar el código: GitHub, portafolio y recursos para est...
La importancia de versionar el código: GitHub, portafolio y recursos para est...La importancia de versionar el código: GitHub, portafolio y recursos para est...
La importancia de versionar el código: GitHub, portafolio y recursos para est...
 
Presentation security automation (Selenium Camp)
Presentation security automation (Selenium Camp)Presentation security automation (Selenium Camp)
Presentation security automation (Selenium Camp)
 
Wordpress y Docker, de desarrollo a produccion
Wordpress y Docker, de desarrollo a produccionWordpress y Docker, de desarrollo a produccion
Wordpress y Docker, de desarrollo a produccion
 
Deploying 3 times a day without a downtime @ Rocket Tech Summit in Berlin
Deploying 3 times a day without a downtime @ Rocket Tech Summit in BerlinDeploying 3 times a day without a downtime @ Rocket Tech Summit in Berlin
Deploying 3 times a day without a downtime @ Rocket Tech Summit in Berlin
 

Ähnlich wie DevSecCon London 2019: A Kernel of Truth: Intrusion Detection and Attestation with eBPF

Lightweight Virtualization with Linux Containers and Docker I YaC 2013
Lightweight Virtualization with Linux Containers and Docker I YaC 2013Lightweight Virtualization with Linux Containers and Docker I YaC 2013
Lightweight Virtualization with Linux Containers and Docker I YaC 2013
Docker, Inc.
 
Introduction to Docker and Containers
Introduction to Docker and ContainersIntroduction to Docker and Containers
Introduction to Docker and Containers
Docker, Inc.
 

Ähnlich wie DevSecCon London 2019: A Kernel of Truth: Intrusion Detection and Attestation with eBPF (20)

Lightweight Virtualization with Linux Containers and Docker | YaC 2013
Lightweight Virtualization with Linux Containers and Docker | YaC 2013Lightweight Virtualization with Linux Containers and Docker | YaC 2013
Lightweight Virtualization with Linux Containers and Docker | YaC 2013
 
Lightweight Virtualization with Linux Containers and Docker I YaC 2013
Lightweight Virtualization with Linux Containers and Docker I YaC 2013Lightweight Virtualization with Linux Containers and Docker I YaC 2013
Lightweight Virtualization with Linux Containers and Docker I YaC 2013
 
[KubeCon EU 2020] containerd Deep Dive
[KubeCon EU 2020] containerd Deep Dive[KubeCon EU 2020] containerd Deep Dive
[KubeCon EU 2020] containerd Deep Dive
 
The internet of $h1t
The internet of $h1tThe internet of $h1t
The internet of $h1t
 
PuppetConf 2016: Why Network Automation Matters, and What You Can Do About It...
PuppetConf 2016: Why Network Automation Matters, and What You Can Do About It...PuppetConf 2016: Why Network Automation Matters, and What You Can Do About It...
PuppetConf 2016: Why Network Automation Matters, and What You Can Do About It...
 
Kubernetes 101
Kubernetes 101Kubernetes 101
Kubernetes 101
 
BPF & Cilium - Turning Linux into a Microservices-aware Operating System
BPF  & Cilium - Turning Linux into a Microservices-aware Operating SystemBPF  & Cilium - Turning Linux into a Microservices-aware Operating System
BPF & Cilium - Turning Linux into a Microservices-aware Operating System
 
Pluggable Infrastructure with CI/CD and Docker
Pluggable Infrastructure with CI/CD and DockerPluggable Infrastructure with CI/CD and Docker
Pluggable Infrastructure with CI/CD and Docker
 
"Lightweight Virtualization with Linux Containers and Docker". Jerome Petazzo...
"Lightweight Virtualization with Linux Containers and Docker". Jerome Petazzo..."Lightweight Virtualization with Linux Containers and Docker". Jerome Petazzo...
"Lightweight Virtualization with Linux Containers and Docker". Jerome Petazzo...
 
Building a Small Datacenter
Building a Small DatacenterBuilding a Small Datacenter
Building a Small Datacenter
 
Kubernetes @ Squarespace: Kubernetes in the Datacenter
Kubernetes @ Squarespace: Kubernetes in the DatacenterKubernetes @ Squarespace: Kubernetes in the Datacenter
Kubernetes @ Squarespace: Kubernetes in the Datacenter
 
Building a Small DC
Building a Small DCBuilding a Small DC
Building a Small DC
 
MIPS-X
MIPS-XMIPS-X
MIPS-X
 
Introduction to Docker and Containers
Introduction to Docker and ContainersIntroduction to Docker and Containers
Introduction to Docker and Containers
 
Docker and-containers-for-development-and-deployment-scale12x
Docker and-containers-for-development-and-deployment-scale12xDocker and-containers-for-development-and-deployment-scale12x
Docker and-containers-for-development-and-deployment-scale12x
 
Truemotion Adventures in Containerization
Truemotion Adventures in ContainerizationTruemotion Adventures in Containerization
Truemotion Adventures in Containerization
 
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
 
OSDC 2017 | The evolution of the Container Network Interface by Casey Callend...
OSDC 2017 | The evolution of the Container Network Interface by Casey Callend...OSDC 2017 | The evolution of the Container Network Interface by Casey Callend...
OSDC 2017 | The evolution of the Container Network Interface by Casey Callend...
 
OSDC 2017 - Casey Callendrello -The evolution of the Container Network Interface
OSDC 2017 - Casey Callendrello -The evolution of the Container Network InterfaceOSDC 2017 - Casey Callendrello -The evolution of the Container Network Interface
OSDC 2017 - Casey Callendrello -The evolution of the Container Network Interface
 
Nagios Conference 2014 - Spenser Reinhardt - Detecting Security Breaches With...
Nagios Conference 2014 - Spenser Reinhardt - Detecting Security Breaches With...Nagios Conference 2014 - Spenser Reinhardt - Detecting Security Breaches With...
Nagios Conference 2014 - Spenser Reinhardt - Detecting Security Breaches With...
 

Mehr von DevSecCon

DevSecCon London 2019: Workshop: Cloud Agnostic Security Testing with Scout S...
DevSecCon London 2019: Workshop: Cloud Agnostic Security Testing with Scout S...DevSecCon London 2019: Workshop: Cloud Agnostic Security Testing with Scout S...
DevSecCon London 2019: Workshop: Cloud Agnostic Security Testing with Scout S...
DevSecCon
 
DevSecCon Seattle 2019: Containerizing IT Security Knowledge
DevSecCon Seattle 2019: Containerizing IT Security KnowledgeDevSecCon Seattle 2019: Containerizing IT Security Knowledge
DevSecCon Seattle 2019: Containerizing IT Security Knowledge
DevSecCon
 
DevSecCon Seattle 2019: Decentralized Authorization - Implementing Fine Grain...
DevSecCon Seattle 2019: Decentralized Authorization - Implementing Fine Grain...DevSecCon Seattle 2019: Decentralized Authorization - Implementing Fine Grain...
DevSecCon Seattle 2019: Decentralized Authorization - Implementing Fine Grain...
DevSecCon
 
DevSecCon Singapore 2019: Four years of reflection: How (not) to secure Web A...
DevSecCon Singapore 2019: Four years of reflection: How (not) to secure Web A...DevSecCon Singapore 2019: Four years of reflection: How (not) to secure Web A...
DevSecCon Singapore 2019: Four years of reflection: How (not) to secure Web A...
DevSecCon
 
DevSecCon Singapore 2019: Embracing Security - A changing DevOps landscape
DevSecCon Singapore 2019: Embracing Security - A changing DevOps landscapeDevSecCon Singapore 2019: Embracing Security - A changing DevOps landscape
DevSecCon Singapore 2019: Embracing Security - A changing DevOps landscape
DevSecCon
 
DevSecCon Singapore 2019: An attacker's view of Serverless and GraphQL apps S...
DevSecCon Singapore 2019: An attacker's view of Serverless and GraphQL apps S...DevSecCon Singapore 2019: An attacker's view of Serverless and GraphQL apps S...
DevSecCon Singapore 2019: An attacker's view of Serverless and GraphQL apps S...
DevSecCon
 
DevSecCon London 2018: Is your supply chain your achille's heel
DevSecCon London 2018: Is your supply chain your achille's heelDevSecCon London 2018: Is your supply chain your achille's heel
DevSecCon London 2018: Is your supply chain your achille's heel
DevSecCon
 

Mehr von DevSecCon (20)

DevSecCon London 2019: Workshop: Cloud Agnostic Security Testing with Scout S...
DevSecCon London 2019: Workshop: Cloud Agnostic Security Testing with Scout S...DevSecCon London 2019: Workshop: Cloud Agnostic Security Testing with Scout S...
DevSecCon London 2019: Workshop: Cloud Agnostic Security Testing with Scout S...
 
DevSecCon London 2019: Are Open Source Developers Security’s New Front Line?
DevSecCon London 2019: Are Open Source Developers Security’s New Front Line?DevSecCon London 2019: Are Open Source Developers Security’s New Front Line?
DevSecCon London 2019: Are Open Source Developers Security’s New Front Line?
 
DevSecCon London 2019: How to Secure OpenShift Environments and What Happens ...
DevSecCon London 2019: How to Secure OpenShift Environments and What Happens ...DevSecCon London 2019: How to Secure OpenShift Environments and What Happens ...
DevSecCon London 2019: How to Secure OpenShift Environments and What Happens ...
 
DevSecCon Seattle 2019: Containerizing IT Security Knowledge
DevSecCon Seattle 2019: Containerizing IT Security KnowledgeDevSecCon Seattle 2019: Containerizing IT Security Knowledge
DevSecCon Seattle 2019: Containerizing IT Security Knowledge
 
DevSecCon Seattle 2019: Decentralized Authorization - Implementing Fine Grain...
DevSecCon Seattle 2019: Decentralized Authorization - Implementing Fine Grain...DevSecCon Seattle 2019: Decentralized Authorization - Implementing Fine Grain...
DevSecCon Seattle 2019: Decentralized Authorization - Implementing Fine Grain...
 
DevSecCon Seattle 2019: Liquid Software as the real solution for the Sec in D...
DevSecCon Seattle 2019: Liquid Software as the real solution for the Sec in D...DevSecCon Seattle 2019: Liquid Software as the real solution for the Sec in D...
DevSecCon Seattle 2019: Liquid Software as the real solution for the Sec in D...
 
DevSecCon Seattle 2019: Fully Automated production deployments with HIPAA/HIT...
DevSecCon Seattle 2019: Fully Automated production deployments with HIPAA/HIT...DevSecCon Seattle 2019: Fully Automated production deployments with HIPAA/HIT...
DevSecCon Seattle 2019: Fully Automated production deployments with HIPAA/HIT...
 
DevSecCon Singapore 2019: Four years of reflection: How (not) to secure Web A...
DevSecCon Singapore 2019: Four years of reflection: How (not) to secure Web A...DevSecCon Singapore 2019: Four years of reflection: How (not) to secure Web A...
DevSecCon Singapore 2019: Four years of reflection: How (not) to secure Web A...
 
DevSecCon Singapore 2019: crypto jacking: An evolving threat for cloud contai...
DevSecCon Singapore 2019: crypto jacking: An evolving threat for cloud contai...DevSecCon Singapore 2019: crypto jacking: An evolving threat for cloud contai...
DevSecCon Singapore 2019: crypto jacking: An evolving threat for cloud contai...
 
DevSecCon Singapore 2019: Can "dev", "sec" and "ops" really coexist in the wi...
DevSecCon Singapore 2019: Can "dev", "sec" and "ops" really coexist in the wi...DevSecCon Singapore 2019: Can "dev", "sec" and "ops" really coexist in the wi...
DevSecCon Singapore 2019: Can "dev", "sec" and "ops" really coexist in the wi...
 
DevSecCon Singapore 2019: Workshop - Burp extension writing workshop
DevSecCon Singapore 2019: Workshop - Burp extension writing workshopDevSecCon Singapore 2019: Workshop - Burp extension writing workshop
DevSecCon Singapore 2019: Workshop - Burp extension writing workshop
 
DevSecCon Singapore 2019: Embracing Security - A changing DevOps landscape
DevSecCon Singapore 2019: Embracing Security - A changing DevOps landscapeDevSecCon Singapore 2019: Embracing Security - A changing DevOps landscape
DevSecCon Singapore 2019: Embracing Security - A changing DevOps landscape
 
DevSecCon Singapore 2019: Web Services aren’t as secure as we think
DevSecCon Singapore 2019: Web Services aren’t as secure as we thinkDevSecCon Singapore 2019: Web Services aren’t as secure as we think
DevSecCon Singapore 2019: Web Services aren’t as secure as we think
 
DevSecCon Singapore 2019: An attacker's view of Serverless and GraphQL apps S...
DevSecCon Singapore 2019: An attacker's view of Serverless and GraphQL apps S...DevSecCon Singapore 2019: An attacker's view of Serverless and GraphQL apps S...
DevSecCon Singapore 2019: An attacker's view of Serverless and GraphQL apps S...
 
DevSecCon Singapore 2019: The journey of digital transformation through DevSe...
DevSecCon Singapore 2019: The journey of digital transformation through DevSe...DevSecCon Singapore 2019: The journey of digital transformation through DevSe...
DevSecCon Singapore 2019: The journey of digital transformation through DevSe...
 
DevSecCon Singapore 2019: Preventative Security for Kubernetes
DevSecCon Singapore 2019: Preventative Security for KubernetesDevSecCon Singapore 2019: Preventative Security for Kubernetes
DevSecCon Singapore 2019: Preventative Security for Kubernetes
 
DevSecCon London 2018: Is your supply chain your achille's heel
DevSecCon London 2018: Is your supply chain your achille's heelDevSecCon London 2018: Is your supply chain your achille's heel
DevSecCon London 2018: Is your supply chain your achille's heel
 
DevSecCon London 2018: Get rid of these TLS certificates
DevSecCon London 2018: Get rid of these TLS certificatesDevSecCon London 2018: Get rid of these TLS certificates
DevSecCon London 2018: Get rid of these TLS certificates
 
DevSecCon London 2018: Open DevSecOps
DevSecCon London 2018: Open DevSecOpsDevSecCon London 2018: Open DevSecOps
DevSecCon London 2018: Open DevSecOps
 
DevSecCon London 2018: Building effective DevSecOps teams through role-playin...
DevSecCon London 2018: Building effective DevSecOps teams through role-playin...DevSecCon London 2018: Building effective DevSecOps teams through role-playin...
DevSecCon London 2018: Building effective DevSecOps teams through role-playin...
 

Kürzlich hochgeladen

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Kürzlich hochgeladen (20)

From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 

DevSecCon London 2019: A Kernel of Truth: Intrusion Detection and Attestation with eBPF

  • 1. London | 14-15 November 2019 A Kernel of Truth Matt Carroll
  • 2. Matt Carroll @grimmware A Kernel of Truth Intrusion Detection and Attestation with eBPF
  • 3. ● Matt Carroll ○ @grimmware ○ github.com/oholiab ● Infrastructure Security Engineer at Yelp ● Ex-SRE (like a sysadmin but with more yaml) ● Hand-wringing Linux botherer Who am I?
  • 4. ● We built a supplementary* IDS and it’s pretty cool! ● Utilizing OS features as security features ● Told in (roughly) the order it happened. What is this about?
  • 5. ● How to get a greenfield security project off the ground ○ Treating defensive security like economics ○ Gluing together extant technologies to bootstrap custom security tools ○ Using your business logic to maximize signal vs noise What is this about?
  • 6. Yelp’s Mission Connecting people with great local businesses.
  • 7. ● Built on Mesos + Marathon + Docker ● More recently migration towards k8s ● Majority of our workloads run here ● What are they all doing??? PaaSTA
  • 9.
  • 10. Kind of unsurprising, also pretty unhelpful...
  • 13.
  • 15. ● What host class connected? ● What IP/ASN did it connect to? ● What’s on the other end? ● How long was the connection? ● What direction? ● How many bytes were transferred? ● What did the pslogs say? Attestation From Inference
  • 16. ● What host class connected? ● What IP/ASN did it connect to? ● What’s on the other end? ● How long was the connection? ● What direction? ● How many bytes were transferred? ● What did the pslogs say? Attestation From Inference lol jk
  • 17. Context is lost as soon as the instantiating process ends
  • 18. What if we could reduce MTTR for false positives?
  • 19. ● When a GuardDuty alert fires I want to be able to determine if it’s a false-positive quickly ● Only for GuardDuty traffic (not internal to our VPCs) ● Only for outbound TCP (i.e. non-RFC1918) ● I want the entire calling process tree so I can see full local causality ● Include process ownership information ● Must not require workload tooling The problem space
  • 20. eBPF!
  • 21. eBPF!
  • 22. ● “Berkeley Packet Filter” from BSD ● An in-kernel VM accessed as a device (/dev/bpf) ● Limited number of registers ● No loops (to prevent kernel deadlocking) ● Used for packet filtering BPF
  • 23. ● An in-kernel VM in Linux (and now FreeBSD!) ● It’s “extended”! ● Moar registers than BPF ● Used for hooking syscalls, tracing, proxying sockets, and (you guessed) in-kernel packet filtering ○ Can actually offload to some NICs! ● In our case, dispatching kprobes for the tcp_v4_connect syscall eBPF
  • 24.
  • 25. Enjoy writing your filters as an array of BPF VM instructions...
  • 26.
  • 27.
  • 28. bcc + psutil = PROFIT???
  • 29. bcc + psutil = PROFIT??? ✅
  • 30.
  • 31. How it works sd 54321 for each syscall...
  • 32.
  • 33.
  • 34. ● Filters in-kernel from Jinja2 templates which iterate over subnets in YAML configuration ● Events that don’t get filtered out are passed to userland Python daemon ● psutil used to crawl process tree to init and log alongside other metadata
  • 36. Except it was a hackathon project so all it did was print events to stdout and could only match classful networks and I developed it on my personal laptop.
  • 37. The Road To Production
  • 39. ● I realised only the classful networks worked because of the byte boundaries ● Don’t try to do clever bitwise shifting with the mask length ● Endianness and byte ordering between network and host don’t work how you think they do ● No srs Matching all CIDRs
  • 40. ● A coworker was trying to figure out which batch jobs were accessing a service for a data auth project ● He asked me if we could match ports ● I said I’d have it: ○ Matching ports ○ Dockerized for adhoc usage ○ By the next day ● The next day he found all unauthenticated clients. Dockerizing for debugging
  • 41. ● Contains python2.7 and dependencies (sorry) ● Needs some setup at runtime ● Volume mount /etc/passwd for uid mapping ● Not your typical flags: ○ --privileged ○ --cap-add sys_admin ○ --pid host ● Don’t worry I am a professional probably. pidtree-bcc in Docker
  • 42. ● We run our own PaaS called PaaSTA which uses Docker as containerizer ● Runs the vast majority of our workloads ● Can pull-from-registry and run in a systemd unit file without further setup ● Don’t have to install dependencies (inc. LLVM, python2) ● Get coverage quickly Opportunistic deploy with Docker
  • 43. ● Previous projects with goaudit meant we already had a secure logging pipeline for reading a FIFO and outputting to Amazon Kinesis ○ syslog2kinesis adds other Yelpy metadata (e.g. hostname, environment, Puppet role...) ● Originally fed to our Logstash => Elasticsearch SIEM ● Migrated to Kinesis Firehose => Splunk this quarter <3 Log aggregation
  • 44. ● Better to ask forgiveness than permission... ● Rolled out to two security devboxes and watched the logs roll in! ● Negligible performance impact!!! ○ As postulated, cost of subnet filtering << cost of instantiating a TCP connection ● Lots of connections out to public Amazon IPs creating a lot of noise Dip Test
  • 45. If only Amazon maintained some kind of list of their public prefixes...
  • 46.
  • 47.
  • 48. Surely you can’t load ~200 netblocks into the kernel and compare all non- RFC1918 tcp_v4_connect syscalls to them in a performant manner...
  • 49. Surely you can’t load ~200 netblocks into the kernel and compare all non- RFC1918 tcp_v4_connect syscalls to them in a performant manner...
  • 50.
  • 51. ● ~25,000 - ~50,000 messages per hour across dev and stage ● Once accidentally load-tested at ~80,000 messages in 5m from one host for several hours ● Nobody on the host noticed ● TCP connections are way more expensive than the filters! Load
  • 52. ● bpf_trace_printk() -> BPF_PERF_OUTPUT() ○ Global (e.g. per-kernel) debug output with hand- hacked json and string manipulation ○ To structured data in a ring buffer ○ Multi-tenancy makes it a better utility and more testable! ● Added unit tests ● Adding integration tests ● Adding infrastructure for deploy in production environment Undoing my nasty hacks
  • 53. ● De-containerize (e.g. debian package) ● Python3 ● Plugin for container awareness ○ Easy mapping to service and therefore owner! ● Enable immutable loginuid and add that to metadata ○ --loginuid-immutable under `man auditctl` ○ Cryptically says “but can cause some problems in certain kinds of containers” ● Threat modelling/hardening! Future work
  • 54. ● Performance improvements ○ BPF longest-match maps ○ Pre-processing masks ○ Probably totally unnecessary ● Moar syscalls! ○ TCP listens, ipv6, UDP, SUID, forwarded SSH socket reads… ● SIEM tooling ○ ASN matching, bad IP matching, GuardDuty auto- enrichment... Future work
  • 57. London | 14-15 November 2019 https://github.com/Yelp/pidtree-bcc @grimmware Thanks for listening!