SlideShare ist ein Scribd-Unternehmen logo
1 von 54
Downloaden Sie, um offline zu lesen
Edgar Barbosa
H2HC 2009
São Paulo
Who am I?
• Currently working for COSEINC, a
Singapore based security company.
• Old rootkit.com contributor (opc0de)
• Discovered the KdVersionBlock trick (used
by Windows memory forensic analysis
tools)
• One of the developers of BluePill, a
hardware-based virtualization rootkit.
Outline
• Concepts
• Taint analysis on the x86 architecture
• Taint objects and instructions
• Advanced tainting
• References
Motivation
• The motivation for this research came from
the following questions:
– Is it possible to measure the level of
“influence” that external data have over some
application? E.g. network packets or PDF files.
CONCEPTS
Taint Analysis
Information flow
• Follow any application inside a debugger and
you‟ll see that data information is being copied
and modified all the time. In another words,
information is always moving.
• Taint analysis can be seen as a form of Information
Flow Analysis.
• Great definition provided by Dorothy Denning at
the paper “Certification of programs for secure
information flow”:
– “Information flows from object x to object y, denoted
x→y , whenever information stored in x is transferred
to, object y.”
Flow
• “An operation, or series of operations, that uses
the value of some object, say x, to derive a value
for another, say y, causes a flow from x to y.” [1]
Object X
Object Y
Operation
Information
Value derived
from X
Tainted objects
• If the source of the value of the object X is
untrustworthy, we say that X is tainted.
Object X
Untrustworthy
Source
TAINTED
Taint
• To “taint” user data is to insert some kind
of tag or label for each object of the user
data.
• The tag allow us to track the influence of
the tainted object along the execution of
the program.
Taint sources
• Files (*.mp3, *.pdf, *.svg, *.html, *.js, …)
• Network protocols (HTTP, UDP, DNS, ... )
• Keyboard, mouse and touchscreen input
messages
• Webcam
• USB
• Virtual machines (Vmware images)
Taint propagation
• If an operation uses the value of some
tainted object, say X, to derive a value for
another, say Y, then object Y becomes
tainted. Object X tainted the object Y
• Taint operator t
• X → t(Y)
• Taint operator is transitive
– X → t(Y) and Y → t(Z), then X → t(Z)
Taint propagation
Untrusted source #2
K
L
M
X
W
Z
Untrusted source #1
Merge of two different
tainted sources
Applications
• Exploit detection
– If we can track user data, we can detect if non-
trusted data reaches a privileged location
– SQL injection, buffer overflows, XSS, …
– Perl tainted mode
– Detects even unknown attacks!
– Taint analysis for web applications
• Before execution of any statement, the taint
analysis module checks if the statement is
tainted or not! If tainted issue an attack alert!
Applications
• Data Lifetime analysis
– Jin Chow – “Understanding data lifetime via whole
system emulation” – presented at Usenix‟04.
– Created a modified Bochs (TaintBochs) emulator to
taint sensitive data.
– Keep track of the lifetime of sensitive data (passwords,
pin numbers, credit card numbers) stored in the virtual
machine memory
– Tracks data even in the kernel mode.
– Concluded that most applications doesn‟t have any
measure to minimize the lifetime of the sensitive data
in the memory.
TAINT ANALYSIS ON THE X86
ARCHITECTURE
Taint Analysis
Languages
• There are taint analysis tools for C, C++
and Java programming languages.
• In this presentation we will focus on
tainted analysis for the x86 assembly
language.
• The advantages are to not need the source
code of applications and to avoid to create
a parser for each available high-level
language.
x86 instructions
• A taint analysis module for the x86
architecture must at least:
– Identify all the operands of each instruction
– Identify the type of operand
(source/destination)
– Track each tainted object
– Understand the semantics of each instruction
x86 instructions
• A typical instruction like mov eax, 040h has
2 explicit operands like eax and the
immediate value 040h.
• The destination operand:
– eax
• The source operands are:
– eax (register)
– 040h (immediate value)
• Some instructions have implicit operands
x86 instructions
• PUSH EAX
• Explicit operand  EAX
• Semantics:
– ESPESP–4 (subtraction operation)
– SS:[ESP]EAX ( move operation )
• Implicit operands  ESP register
 SS segment register
• How to deal with implicit operands or
complex instructions?
Intermediate languages
• Translate the x86 instructions into an
Intermediate language!
• VEX language  Valgrind
• VINE IL  BitBlaze project
• REIL Zynamics BinNavi
Intermediate languages
• With an intermediate language it becomes
much more easy to parse and identify the
operands.
• Example:
– REIL  Uses only 17 instructions!
– For more info about REIL, see Sebastian Porst
presentation today
– sample:
• 1006E4B00: str edi, , edi
• 1006E4D00: sub esp, 4, esp
• 1006E4D01: and esp, 4294967295, esp
TAINT OBJECTS AND
INSTRUCTIONS
Taint Analysis
Taint objects
• In the x86 architecture we have 2 possible
objects to taint:
1. Memory locations
2. Processor registers
• Memory objects:
– Keep track of the initial address of the memory
area
– Keep track of the area size
• Register objects:
– Keep track of the register identifier (name)
– Keep a bit-level track of each bit
Taint objects
• The tainted objects representation presented here keeps track
of each bit.
• Some tools uses a byte-level tracking mechanism (Valgrind
TaintChecker)
Range = [6..7]
Register AL
tainted
Range = [0..4]
tainted
Memory
tainted
area
Size
Instruction analysis
• The ISA (Instruction Set Architecture) of
any platform can be divided in several
categories:
– Assignment instructions (load/store  mov,
xchg, …)
– Boolean instructions
– Arithmetical instructions (add, sub, mul,
div,…)
– String instructions (rep movsb, rep scasb, …)
– Branch instructions (call, jmp, jnz, ret, iret,…)
Memory
Assignment instructions
• mov eax, dword ptr [4C001000h]
tainted
EAX
tainted
Range = [0..31]
MOV
Range =
[4c000000-
4c002000]
Boolean
• Taint analysis of the most common boolean
operators.
– AND
– OR
– XOR
• The analysis must consider if the result of the
boolean operator depends on the value of
the tainted input.
• Special care must be take in the case of both
inputs to be the same tainted object.
Boolean operators
• AND truth table
• If A is tainted
– And B is equal 0, then the result is UNTAINTED
because the result doesn‟t depends on the value of
A.
– And B is equal 1, then the result is TAINTED
because A can control the result of the operation.
A B A and B
0 0 0
0 1 0
1 0 0
1 1 1
Boolean operators
• OR truth table
• If A is tainted
– And B is equal 1, then the result is UNTAINTED
because the result doesn‟t depends on the value of
A.
– And B is equal 0, then the result is TAINTED
because A can control the result of the operation.
A B A or B
0 0 0
0 1 1
1 0 1
1 1 1
Boolean operators
• OR truth table
• If A is tainted
– And B is equal 1, then the result is UNTAINTED
because the result doesn‟t depends on the value of
A.
– And B is equal 0, then the result is TAINTED
because A can control the result of the operation.
A B A or B
0 0 0
0 1 1
1 0 1
1 1 1
Boolean operators
• XOR truth table
• If A is tainted,then all possible results are
TAINTED indepently of any value of B.
• Special case  A XOR A
A B A xor B
0 0 0
0 1 1
1 0 1
1 1 0
Boolean operators
• For the tautology and contradiction
truth tables the result is always
UNTAINTED because none of the inputs
can can influentiate the result.
• In general operations which always results
on constant values produces untainted
objects.
Boolean operators
• and al, 0xdf
AL
tainted
Range = [0..7]
AND
0xDF
Range = [6..7]
0xDF = 11011111
AL
tainted
Range = [0..4]
Boolean operators
• Special case:
xor al, al AL
tainted
Range = [0..7]
AND
AL
UNTAINTED
AL
tainted
Range = [0..7]
A XOR A  0 (constant)
Arithmetical instructions
• add, sub, div, mul, idiv, imul, inc, dec
• All arithmetical instructions can be expressed
using boolean operations.
• ADD expressed using only AND and XOR
operators.
• Generally if one of the operands of an
arithmetical operation is tainted, the result is
also tainted.
• The affected flags in the EFLAGS register are
also tainted.
String instructions
• Strings are just a linear array of characters.
• x86 string instructions – scas, lods, cmps, …
• As a general rule any string instruction
applied to a tainted string results in a
tainted object.
• String operations used to:
– calculate the string size  Tainted
– search for some specific char and set a flag if
found/not found  Tainted
Lifetime of a tainted object
• Creation:
– Assignment from an unstruted object
• mov eax, userbuffer[ecx]
– Assignment from a tainted object
• add eax, eax
• Deletion:
– Assignment from an untainted object
• mov eax, 030h
– Assignment from a tainted object which results in
a constant value.
• xor eax, eax
ADVANCED TAINTING
Taint Analysis
Level of details
• Some taint-based tools does not taint every
object which is affected by a tainted object.
• For example, TaintBochs doesn`t taint
comparison flags (eflags zf, cf, of,...). Others
taint at a byte-level.
• This sometimes provides easy ways to bypass
these tools.
• This section deals with more „agressive‟ taint
methods.
Optional taint objects
• Bit-level tracking instead of a byte-level.
• Conditional branch instructions tainting the
EIP register and all the flag affect in the
eflags register.
• Taint the code execution time.
• Taint at the code-block level of a control
flow graph (CFG).
Comparison instructions
• x86 instructions  cmp, test
• CMP EAX, 020h
pseudo-code:
temp = eax – 20h
set_eflags(temp)
• Lots of flags (Carry, Zero, Parity, Overflow,...)
Conditional branch instructions
• 0100h: cmp eax, 020h
0108h: jnz 0120h
010dh: inc eax
…
…
0120h: xor ebx, ebx
Target if not zero
Target if zero
Conditional branch instructions
• We already taint comparison flags like the
Zero Flag.
• Branch instructions affects the EIP register.
• If a jump is dependent of the flag value,
then the EIP must be tainted.
• How to express in a intermediate language
the conditional jump to show relationship
between the EIP and the ZF?
Tainted EIP
Jump if TRUE
085h: cmp eax, ebx
088h: jnz 100h
08ch: mov ecx, edx
...
100h: xchg ecx, eax
Jump if FALSE
DELTA
Next instruction
after jnz
Formula for conditional jumps
• NIA  Next instruction address after the
conditional jump
• TT  True Target (address of the target
address if comparison is evaluated to TRUE)
• FT  Jump If False Target (008Ch)
• B  Flag value (always Boolean)
• D  Delta = abs (JITT - JIFT)
• We can now express EIP: EIP = NIA + BD
Tainted EIP
TT
085h: cmp eax, ebx
088h: jnz 100h
08ch: mov ecx, edx
...
100h: xchg ecx, eax
FT
DELTA
NIA
DELTA = abs( 100h – 88h) = 13h
NIA = 100
EIP  8Ch + ZF * 13h
Tainted EIP
• What is the consequence of Tainted(EIP) =
TRUE?
• The target code blocks of the Control Flow
Graph are TAINTED!
• We can also use taint analysis to solve
reachability problems!
– Can I create a mp3 file which will make
Winamp to execute the code block #357 of
the function playSound()?
Full control
• A tainted EIP is not SUFFICIENT condition
to define a vulnerability. It is necessary that
the contents of the memory pointed by EIP
to also be tainted:
• IF IsVulnerable() = TRUE then
(IsTainted(EIP) = TRUE)
AND
(IsTainted(*EIP) = TRUE)
Algorithmic complexity attacks
• First published by Scott Crosby and Dan
Wallach from Rice University at USENIX
• “Denial of Service via Algorithmic Complexity
Attacks”
• Based on the fact that lots of algorithms have
the worst-case.
• Creates special input which will direct the
execution of the algorithm to the worst-case
performance eventually causing a Denial of
Service (DoS) attack.
Algorithmic complexity analysis
• If the time of the execution of a hash-algorithm
depends of unstrusted data, then we can also taint
time!
• Tainted Time Analysis (TTA) is generally more
complex due to the use of more advanced
mathematical analysis methods normally found on
books about Analysis of algorithms.
• I applied a very basic TTA to detect the BluePill
rootkit presented at SyScan`07.
• OS X Kernel Mach-O File Loading Denial of Service
Vulnerability
– http://www.securityfocus.com/bid/13222
Thank you for allowing me to
taint your precious time! 
QUESTIONS?
References
1. “Certification of programs for secure information flow” – Dorothy E.
Denning and Peter J. Denning. 1977 – Communication of the ACM
2. “A lattice model for secure information flow” – Dorothy E. Denning – 1976
– Communication of the ACM.
3. “Dytan: A generic dynamic taint analysis framework” – James Clause,
Wanchun Li, and Alessandro Orso. Georgia Institute of Technology.
4. “Understanding data lifetime via whole system emulation” – Jim Chow, Tal
Garfinkel, Kevi Christopher, Mendel Rosenblum – USENIX – Stanford
University
5. “LIFT: A Low-Overhead Practical Information Flow Tracking System for
Detecting Security Attacks” - Feng Qin, Cheng Wang, Zhenmin Li, Ho-seop
Kim, Yuanyuan zhou, Youfeng Wu - University of Illinois at Urbana-
Champaign
6. “BitBlaze: A New Approach to Computer Security via Binary Analysis” -
Dawn Song
7. “Denial of Service via Algorithmic Complexity Attacks” - Scott A. Crosby
EXTRA SLIDES
Taint Analysis
Dynamic taint analysis
• To implement a Dynamic taint analysis tools it
is necessary:
– Debuggers  Paimei is great for prototypes.
Necessary to insert breakpoint at the functions
which provides interface to tainted sources like
fopen, fread, ...
– For performance the amazing Rafal`s UMSS
available at Avert Labs. It is around 100x faster
than any debugger. Lazy evaluation of the affected
flags of the eflags register also helps a lot.
– Tainted objects tracking – tree/graph algorithms.

Weitere ähnliche Inhalte

Was ist angesagt?

Intelligence-Driven Industrial Security with Case Studies in ICS Attacks
Intelligence-Driven Industrial Security with Case Studies in ICS Attacks  Intelligence-Driven Industrial Security with Case Studies in ICS Attacks
Intelligence-Driven Industrial Security with Case Studies in ICS Attacks Dragos, Inc.
 
Windows internals Essentials
Windows internals EssentialsWindows internals Essentials
Windows internals EssentialsJohn Ombagi
 
Open Source IDS Tools: A Beginner's Guide
Open Source IDS Tools: A Beginner's GuideOpen Source IDS Tools: A Beginner's Guide
Open Source IDS Tools: A Beginner's GuideAlienVault
 
Thick client pentesting_the-hackers_meetup_version1.0pptx
Thick client pentesting_the-hackers_meetup_version1.0pptxThick client pentesting_the-hackers_meetup_version1.0pptx
Thick client pentesting_the-hackers_meetup_version1.0pptxAnurag Srivastava
 
An Active Case Study on Insider Threat Detection in your Applications
An Active Case Study on Insider Threat Detection in your ApplicationsAn Active Case Study on Insider Threat Detection in your Applications
An Active Case Study on Insider Threat Detection in your ApplicationsAmazon Web Services
 
Osint {open source intelligence }
Osint {open source intelligence }Osint {open source intelligence }
Osint {open source intelligence }AkshayJha40
 
Introduction To Ethical Hacking
Introduction To Ethical HackingIntroduction To Ethical Hacking
Introduction To Ethical HackingNeel Kamal
 
Threat intelligence notes
Threat intelligence notesThreat intelligence notes
Threat intelligence notesAmgad Magdy
 
Metasploit
MetasploitMetasploit
Metasploithenelpj
 
CNIT 126 11. Malware Behavior
CNIT 126 11. Malware BehaviorCNIT 126 11. Malware Behavior
CNIT 126 11. Malware BehaviorSam Bowne
 
OSINT- Leveraging data into intelligence
OSINT- Leveraging data into intelligenceOSINT- Leveraging data into intelligence
OSINT- Leveraging data into intelligenceDeep Shankar Yadav
 
Next Generation Memory Forensics
Next Generation Memory ForensicsNext Generation Memory Forensics
Next Generation Memory ForensicsAndrew Case
 
Mirroring web site using ht track
Mirroring web site using ht trackMirroring web site using ht track
Mirroring web site using ht trackVishal Kumar
 
The Cyber Threat Intelligence Matrix
The Cyber Threat Intelligence MatrixThe Cyber Threat Intelligence Matrix
The Cyber Threat Intelligence MatrixFrode Hommedal
 

Was ist angesagt? (20)

Event Viewer
Event ViewerEvent Viewer
Event Viewer
 
Intelligence-Driven Industrial Security with Case Studies in ICS Attacks
Intelligence-Driven Industrial Security with Case Studies in ICS Attacks  Intelligence-Driven Industrial Security with Case Studies in ICS Attacks
Intelligence-Driven Industrial Security with Case Studies in ICS Attacks
 
Windows internals Essentials
Windows internals EssentialsWindows internals Essentials
Windows internals Essentials
 
Open Source IDS Tools: A Beginner's Guide
Open Source IDS Tools: A Beginner's GuideOpen Source IDS Tools: A Beginner's Guide
Open Source IDS Tools: A Beginner's Guide
 
Thick client pentesting_the-hackers_meetup_version1.0pptx
Thick client pentesting_the-hackers_meetup_version1.0pptxThick client pentesting_the-hackers_meetup_version1.0pptx
Thick client pentesting_the-hackers_meetup_version1.0pptx
 
An Active Case Study on Insider Threat Detection in your Applications
An Active Case Study on Insider Threat Detection in your ApplicationsAn Active Case Study on Insider Threat Detection in your Applications
An Active Case Study on Insider Threat Detection in your Applications
 
Osint {open source intelligence }
Osint {open source intelligence }Osint {open source intelligence }
Osint {open source intelligence }
 
Introduction To Ethical Hacking
Introduction To Ethical HackingIntroduction To Ethical Hacking
Introduction To Ethical Hacking
 
Kali linux and hacking
Kali linux  and hackingKali linux  and hacking
Kali linux and hacking
 
Threat Intelligence
Threat IntelligenceThreat Intelligence
Threat Intelligence
 
Threat intelligence notes
Threat intelligence notesThreat intelligence notes
Threat intelligence notes
 
Metasploit
MetasploitMetasploit
Metasploit
 
Octave
OctaveOctave
Octave
 
CNIT 126 11. Malware Behavior
CNIT 126 11. Malware BehaviorCNIT 126 11. Malware Behavior
CNIT 126 11. Malware Behavior
 
OSINT- Leveraging data into intelligence
OSINT- Leveraging data into intelligenceOSINT- Leveraging data into intelligence
OSINT- Leveraging data into intelligence
 
NMAP - The Network Scanner
NMAP - The Network ScannerNMAP - The Network Scanner
NMAP - The Network Scanner
 
Next Generation Memory Forensics
Next Generation Memory ForensicsNext Generation Memory Forensics
Next Generation Memory Forensics
 
Memory Forensics
Memory ForensicsMemory Forensics
Memory Forensics
 
Mirroring web site using ht track
Mirroring web site using ht trackMirroring web site using ht track
Mirroring web site using ht track
 
The Cyber Threat Intelligence Matrix
The Cyber Threat Intelligence MatrixThe Cyber Threat Intelligence Matrix
The Cyber Threat Intelligence Matrix
 

Ähnlich wie Taint analysis

A guided fuzzing approach for security testing of network protocol software
A guided fuzzing approach for security testing of network protocol softwareA guided fuzzing approach for security testing of network protocol software
A guided fuzzing approach for security testing of network protocol softwarebinish_hyunseok
 
[2010 CodeEngn Conference 04] passket - Taint analysis for vulnerability disc...
[2010 CodeEngn Conference 04] passket - Taint analysis for vulnerability disc...[2010 CodeEngn Conference 04] passket - Taint analysis for vulnerability disc...
[2010 CodeEngn Conference 04] passket - Taint analysis for vulnerability disc...GangSeok Lee
 
Performance analysis and troubleshooting using DTrace
Performance analysis and troubleshooting using DTracePerformance analysis and troubleshooting using DTrace
Performance analysis and troubleshooting using DTraceGraeme Jenkinson
 
Search at Twitter: Presented by Michael Busch, Twitter
Search at Twitter: Presented by Michael Busch, TwitterSearch at Twitter: Presented by Michael Busch, Twitter
Search at Twitter: Presented by Michael Busch, TwitterLucidworks
 
Mykhailo Zarai "Be careful when dealing with C++" at Rivne IT Talks
Mykhailo Zarai "Be careful when dealing with C++" at Rivne IT TalksMykhailo Zarai "Be careful when dealing with C++" at Rivne IT Talks
Mykhailo Zarai "Be careful when dealing with C++" at Rivne IT TalksVadym Muliavka
 
Arduino Platform with C programming.
Arduino Platform with C programming.Arduino Platform with C programming.
Arduino Platform with C programming.Govind Jha
 
Using Deep Learning to do Real-Time Scoring in Practical Applications
Using Deep Learning to do Real-Time Scoring in Practical ApplicationsUsing Deep Learning to do Real-Time Scoring in Practical Applications
Using Deep Learning to do Real-Time Scoring in Practical ApplicationsGreg Makowski
 
Introduction to libre « fulltext » technology
Introduction to libre « fulltext » technologyIntroduction to libre « fulltext » technology
Introduction to libre « fulltext » technologyRobert Viseur
 
SDEC2011 NoSQL concepts and models
SDEC2011 NoSQL concepts and modelsSDEC2011 NoSQL concepts and models
SDEC2011 NoSQL concepts and modelsKorea Sdec
 
Protocol T50: Five months later... So what?
Protocol T50: Five months later... So what?Protocol T50: Five months later... So what?
Protocol T50: Five months later... So what?Nelson Brito
 
Anatomy of a Buffer Overflow Attack
Anatomy of a Buffer Overflow AttackAnatomy of a Buffer Overflow Attack
Anatomy of a Buffer Overflow AttackRob Gillen
 
Using Static Binary Analysis To Find Vulnerabilities And Backdoors in Firmware
Using Static Binary Analysis To Find Vulnerabilities And Backdoors in FirmwareUsing Static Binary Analysis To Find Vulnerabilities And Backdoors in Firmware
Using Static Binary Analysis To Find Vulnerabilities And Backdoors in FirmwareLastline, Inc.
 
Remnux tutorial-1 Statically Analyse Portable Executable(PE) Files
Remnux tutorial-1  Statically Analyse Portable Executable(PE) FilesRemnux tutorial-1  Statically Analyse Portable Executable(PE) Files
Remnux tutorial-1 Statically Analyse Portable Executable(PE) FilesRhydham Joshi
 
ImplementingCryptoSecurityARMCortex_Doin
ImplementingCryptoSecurityARMCortex_DoinImplementingCryptoSecurityARMCortex_Doin
ImplementingCryptoSecurityARMCortex_DoinJonny Doin
 
(2) logic design lec foe-rd
(2) logic design lec   foe-rd(2) logic design lec   foe-rd
(2) logic design lec foe-rdwhataaa99
 

Ähnlich wie Taint analysis (20)

A guided fuzzing approach for security testing of network protocol software
A guided fuzzing approach for security testing of network protocol softwareA guided fuzzing approach for security testing of network protocol software
A guided fuzzing approach for security testing of network protocol software
 
[2010 CodeEngn Conference 04] passket - Taint analysis for vulnerability disc...
[2010 CodeEngn Conference 04] passket - Taint analysis for vulnerability disc...[2010 CodeEngn Conference 04] passket - Taint analysis for vulnerability disc...
[2010 CodeEngn Conference 04] passket - Taint analysis for vulnerability disc...
 
Performance analysis and troubleshooting using DTrace
Performance analysis and troubleshooting using DTracePerformance analysis and troubleshooting using DTrace
Performance analysis and troubleshooting using DTrace
 
Java Tutorial
Java Tutorial Java Tutorial
Java Tutorial
 
Search at Twitter: Presented by Michael Busch, Twitter
Search at Twitter: Presented by Michael Busch, TwitterSearch at Twitter: Presented by Michael Busch, Twitter
Search at Twitter: Presented by Michael Busch, Twitter
 
Mykhailo Zarai "Be careful when dealing with C++" at Rivne IT Talks
Mykhailo Zarai "Be careful when dealing with C++" at Rivne IT TalksMykhailo Zarai "Be careful when dealing with C++" at Rivne IT Talks
Mykhailo Zarai "Be careful when dealing with C++" at Rivne IT Talks
 
Arduino Platform with C programming.
Arduino Platform with C programming.Arduino Platform with C programming.
Arduino Platform with C programming.
 
Eusecwest
EusecwestEusecwest
Eusecwest
 
Using Deep Learning to do Real-Time Scoring in Practical Applications
Using Deep Learning to do Real-Time Scoring in Practical ApplicationsUsing Deep Learning to do Real-Time Scoring in Practical Applications
Using Deep Learning to do Real-Time Scoring in Practical Applications
 
Introduction to libre « fulltext » technology
Introduction to libre « fulltext » technologyIntroduction to libre « fulltext » technology
Introduction to libre « fulltext » technology
 
SDEC2011 NoSQL concepts and models
SDEC2011 NoSQL concepts and modelsSDEC2011 NoSQL concepts and models
SDEC2011 NoSQL concepts and models
 
Protocol T50: Five months later... So what?
Protocol T50: Five months later... So what?Protocol T50: Five months later... So what?
Protocol T50: Five months later... So what?
 
Anatomy of a Buffer Overflow Attack
Anatomy of a Buffer Overflow AttackAnatomy of a Buffer Overflow Attack
Anatomy of a Buffer Overflow Attack
 
Fuzzing - Part 1
Fuzzing - Part 1Fuzzing - Part 1
Fuzzing - Part 1
 
Using Static Binary Analysis To Find Vulnerabilities And Backdoors in Firmware
Using Static Binary Analysis To Find Vulnerabilities And Backdoors in FirmwareUsing Static Binary Analysis To Find Vulnerabilities And Backdoors in Firmware
Using Static Binary Analysis To Find Vulnerabilities And Backdoors in Firmware
 
Remnux tutorial-1 Statically Analyse Portable Executable(PE) Files
Remnux tutorial-1  Statically Analyse Portable Executable(PE) FilesRemnux tutorial-1  Statically Analyse Portable Executable(PE) Files
Remnux tutorial-1 Statically Analyse Portable Executable(PE) Files
 
Java Datatypes
Java DatatypesJava Datatypes
Java Datatypes
 
ImplementingCryptoSecurityARMCortex_Doin
ImplementingCryptoSecurityARMCortex_DoinImplementingCryptoSecurityARMCortex_Doin
ImplementingCryptoSecurityARMCortex_Doin
 
(2) logic design lec foe-rd
(2) logic design lec   foe-rd(2) logic design lec   foe-rd
(2) logic design lec foe-rd
 
Jvm memory model
Jvm memory modelJvm memory model
Jvm memory model
 

Kürzlich hochgeladen

Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 

Kürzlich hochgeladen (20)

Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 

Taint analysis

  • 2. Who am I? • Currently working for COSEINC, a Singapore based security company. • Old rootkit.com contributor (opc0de) • Discovered the KdVersionBlock trick (used by Windows memory forensic analysis tools) • One of the developers of BluePill, a hardware-based virtualization rootkit.
  • 3. Outline • Concepts • Taint analysis on the x86 architecture • Taint objects and instructions • Advanced tainting • References
  • 4. Motivation • The motivation for this research came from the following questions: – Is it possible to measure the level of “influence” that external data have over some application? E.g. network packets or PDF files.
  • 6. Information flow • Follow any application inside a debugger and you‟ll see that data information is being copied and modified all the time. In another words, information is always moving. • Taint analysis can be seen as a form of Information Flow Analysis. • Great definition provided by Dorothy Denning at the paper “Certification of programs for secure information flow”: – “Information flows from object x to object y, denoted x→y , whenever information stored in x is transferred to, object y.”
  • 7. Flow • “An operation, or series of operations, that uses the value of some object, say x, to derive a value for another, say y, causes a flow from x to y.” [1] Object X Object Y Operation Information Value derived from X
  • 8. Tainted objects • If the source of the value of the object X is untrustworthy, we say that X is tainted. Object X Untrustworthy Source TAINTED
  • 9. Taint • To “taint” user data is to insert some kind of tag or label for each object of the user data. • The tag allow us to track the influence of the tainted object along the execution of the program.
  • 10. Taint sources • Files (*.mp3, *.pdf, *.svg, *.html, *.js, …) • Network protocols (HTTP, UDP, DNS, ... ) • Keyboard, mouse and touchscreen input messages • Webcam • USB • Virtual machines (Vmware images)
  • 11. Taint propagation • If an operation uses the value of some tainted object, say X, to derive a value for another, say Y, then object Y becomes tainted. Object X tainted the object Y • Taint operator t • X → t(Y) • Taint operator is transitive – X → t(Y) and Y → t(Z), then X → t(Z)
  • 12. Taint propagation Untrusted source #2 K L M X W Z Untrusted source #1 Merge of two different tainted sources
  • 13. Applications • Exploit detection – If we can track user data, we can detect if non- trusted data reaches a privileged location – SQL injection, buffer overflows, XSS, … – Perl tainted mode – Detects even unknown attacks! – Taint analysis for web applications • Before execution of any statement, the taint analysis module checks if the statement is tainted or not! If tainted issue an attack alert!
  • 14. Applications • Data Lifetime analysis – Jin Chow – “Understanding data lifetime via whole system emulation” – presented at Usenix‟04. – Created a modified Bochs (TaintBochs) emulator to taint sensitive data. – Keep track of the lifetime of sensitive data (passwords, pin numbers, credit card numbers) stored in the virtual machine memory – Tracks data even in the kernel mode. – Concluded that most applications doesn‟t have any measure to minimize the lifetime of the sensitive data in the memory.
  • 15. TAINT ANALYSIS ON THE X86 ARCHITECTURE Taint Analysis
  • 16. Languages • There are taint analysis tools for C, C++ and Java programming languages. • In this presentation we will focus on tainted analysis for the x86 assembly language. • The advantages are to not need the source code of applications and to avoid to create a parser for each available high-level language.
  • 17. x86 instructions • A taint analysis module for the x86 architecture must at least: – Identify all the operands of each instruction – Identify the type of operand (source/destination) – Track each tainted object – Understand the semantics of each instruction
  • 18. x86 instructions • A typical instruction like mov eax, 040h has 2 explicit operands like eax and the immediate value 040h. • The destination operand: – eax • The source operands are: – eax (register) – 040h (immediate value) • Some instructions have implicit operands
  • 19. x86 instructions • PUSH EAX • Explicit operand  EAX • Semantics: – ESPESP–4 (subtraction operation) – SS:[ESP]EAX ( move operation ) • Implicit operands  ESP register  SS segment register • How to deal with implicit operands or complex instructions?
  • 20. Intermediate languages • Translate the x86 instructions into an Intermediate language! • VEX language  Valgrind • VINE IL  BitBlaze project • REIL Zynamics BinNavi
  • 21. Intermediate languages • With an intermediate language it becomes much more easy to parse and identify the operands. • Example: – REIL  Uses only 17 instructions! – For more info about REIL, see Sebastian Porst presentation today – sample: • 1006E4B00: str edi, , edi • 1006E4D00: sub esp, 4, esp • 1006E4D01: and esp, 4294967295, esp
  • 23. Taint objects • In the x86 architecture we have 2 possible objects to taint: 1. Memory locations 2. Processor registers • Memory objects: – Keep track of the initial address of the memory area – Keep track of the area size • Register objects: – Keep track of the register identifier (name) – Keep a bit-level track of each bit
  • 24. Taint objects • The tainted objects representation presented here keeps track of each bit. • Some tools uses a byte-level tracking mechanism (Valgrind TaintChecker) Range = [6..7] Register AL tainted Range = [0..4] tainted Memory tainted area Size
  • 25. Instruction analysis • The ISA (Instruction Set Architecture) of any platform can be divided in several categories: – Assignment instructions (load/store  mov, xchg, …) – Boolean instructions – Arithmetical instructions (add, sub, mul, div,…) – String instructions (rep movsb, rep scasb, …) – Branch instructions (call, jmp, jnz, ret, iret,…)
  • 26. Memory Assignment instructions • mov eax, dword ptr [4C001000h] tainted EAX tainted Range = [0..31] MOV Range = [4c000000- 4c002000]
  • 27. Boolean • Taint analysis of the most common boolean operators. – AND – OR – XOR • The analysis must consider if the result of the boolean operator depends on the value of the tainted input. • Special care must be take in the case of both inputs to be the same tainted object.
  • 28. Boolean operators • AND truth table • If A is tainted – And B is equal 0, then the result is UNTAINTED because the result doesn‟t depends on the value of A. – And B is equal 1, then the result is TAINTED because A can control the result of the operation. A B A and B 0 0 0 0 1 0 1 0 0 1 1 1
  • 29. Boolean operators • OR truth table • If A is tainted – And B is equal 1, then the result is UNTAINTED because the result doesn‟t depends on the value of A. – And B is equal 0, then the result is TAINTED because A can control the result of the operation. A B A or B 0 0 0 0 1 1 1 0 1 1 1 1
  • 30. Boolean operators • OR truth table • If A is tainted – And B is equal 1, then the result is UNTAINTED because the result doesn‟t depends on the value of A. – And B is equal 0, then the result is TAINTED because A can control the result of the operation. A B A or B 0 0 0 0 1 1 1 0 1 1 1 1
  • 31. Boolean operators • XOR truth table • If A is tainted,then all possible results are TAINTED indepently of any value of B. • Special case  A XOR A A B A xor B 0 0 0 0 1 1 1 0 1 1 1 0
  • 32. Boolean operators • For the tautology and contradiction truth tables the result is always UNTAINTED because none of the inputs can can influentiate the result. • In general operations which always results on constant values produces untainted objects.
  • 33. Boolean operators • and al, 0xdf AL tainted Range = [0..7] AND 0xDF Range = [6..7] 0xDF = 11011111 AL tainted Range = [0..4]
  • 34. Boolean operators • Special case: xor al, al AL tainted Range = [0..7] AND AL UNTAINTED AL tainted Range = [0..7] A XOR A  0 (constant)
  • 35. Arithmetical instructions • add, sub, div, mul, idiv, imul, inc, dec • All arithmetical instructions can be expressed using boolean operations. • ADD expressed using only AND and XOR operators. • Generally if one of the operands of an arithmetical operation is tainted, the result is also tainted. • The affected flags in the EFLAGS register are also tainted.
  • 36. String instructions • Strings are just a linear array of characters. • x86 string instructions – scas, lods, cmps, … • As a general rule any string instruction applied to a tainted string results in a tainted object. • String operations used to: – calculate the string size  Tainted – search for some specific char and set a flag if found/not found  Tainted
  • 37. Lifetime of a tainted object • Creation: – Assignment from an unstruted object • mov eax, userbuffer[ecx] – Assignment from a tainted object • add eax, eax • Deletion: – Assignment from an untainted object • mov eax, 030h – Assignment from a tainted object which results in a constant value. • xor eax, eax
  • 39. Level of details • Some taint-based tools does not taint every object which is affected by a tainted object. • For example, TaintBochs doesn`t taint comparison flags (eflags zf, cf, of,...). Others taint at a byte-level. • This sometimes provides easy ways to bypass these tools. • This section deals with more „agressive‟ taint methods.
  • 40. Optional taint objects • Bit-level tracking instead of a byte-level. • Conditional branch instructions tainting the EIP register and all the flag affect in the eflags register. • Taint the code execution time. • Taint at the code-block level of a control flow graph (CFG).
  • 41. Comparison instructions • x86 instructions  cmp, test • CMP EAX, 020h pseudo-code: temp = eax – 20h set_eflags(temp) • Lots of flags (Carry, Zero, Parity, Overflow,...)
  • 42. Conditional branch instructions • 0100h: cmp eax, 020h 0108h: jnz 0120h 010dh: inc eax … … 0120h: xor ebx, ebx Target if not zero Target if zero
  • 43. Conditional branch instructions • We already taint comparison flags like the Zero Flag. • Branch instructions affects the EIP register. • If a jump is dependent of the flag value, then the EIP must be tainted. • How to express in a intermediate language the conditional jump to show relationship between the EIP and the ZF?
  • 44. Tainted EIP Jump if TRUE 085h: cmp eax, ebx 088h: jnz 100h 08ch: mov ecx, edx ... 100h: xchg ecx, eax Jump if FALSE DELTA Next instruction after jnz
  • 45. Formula for conditional jumps • NIA  Next instruction address after the conditional jump • TT  True Target (address of the target address if comparison is evaluated to TRUE) • FT  Jump If False Target (008Ch) • B  Flag value (always Boolean) • D  Delta = abs (JITT - JIFT) • We can now express EIP: EIP = NIA + BD
  • 46. Tainted EIP TT 085h: cmp eax, ebx 088h: jnz 100h 08ch: mov ecx, edx ... 100h: xchg ecx, eax FT DELTA NIA DELTA = abs( 100h – 88h) = 13h NIA = 100 EIP  8Ch + ZF * 13h
  • 47. Tainted EIP • What is the consequence of Tainted(EIP) = TRUE? • The target code blocks of the Control Flow Graph are TAINTED! • We can also use taint analysis to solve reachability problems! – Can I create a mp3 file which will make Winamp to execute the code block #357 of the function playSound()?
  • 48. Full control • A tainted EIP is not SUFFICIENT condition to define a vulnerability. It is necessary that the contents of the memory pointed by EIP to also be tainted: • IF IsVulnerable() = TRUE then (IsTainted(EIP) = TRUE) AND (IsTainted(*EIP) = TRUE)
  • 49. Algorithmic complexity attacks • First published by Scott Crosby and Dan Wallach from Rice University at USENIX • “Denial of Service via Algorithmic Complexity Attacks” • Based on the fact that lots of algorithms have the worst-case. • Creates special input which will direct the execution of the algorithm to the worst-case performance eventually causing a Denial of Service (DoS) attack.
  • 50. Algorithmic complexity analysis • If the time of the execution of a hash-algorithm depends of unstrusted data, then we can also taint time! • Tainted Time Analysis (TTA) is generally more complex due to the use of more advanced mathematical analysis methods normally found on books about Analysis of algorithms. • I applied a very basic TTA to detect the BluePill rootkit presented at SyScan`07. • OS X Kernel Mach-O File Loading Denial of Service Vulnerability – http://www.securityfocus.com/bid/13222
  • 51. Thank you for allowing me to taint your precious time!  QUESTIONS?
  • 52. References 1. “Certification of programs for secure information flow” – Dorothy E. Denning and Peter J. Denning. 1977 – Communication of the ACM 2. “A lattice model for secure information flow” – Dorothy E. Denning – 1976 – Communication of the ACM. 3. “Dytan: A generic dynamic taint analysis framework” – James Clause, Wanchun Li, and Alessandro Orso. Georgia Institute of Technology. 4. “Understanding data lifetime via whole system emulation” – Jim Chow, Tal Garfinkel, Kevi Christopher, Mendel Rosenblum – USENIX – Stanford University 5. “LIFT: A Low-Overhead Practical Information Flow Tracking System for Detecting Security Attacks” - Feng Qin, Cheng Wang, Zhenmin Li, Ho-seop Kim, Yuanyuan zhou, Youfeng Wu - University of Illinois at Urbana- Champaign 6. “BitBlaze: A New Approach to Computer Security via Binary Analysis” - Dawn Song 7. “Denial of Service via Algorithmic Complexity Attacks” - Scott A. Crosby
  • 54. Dynamic taint analysis • To implement a Dynamic taint analysis tools it is necessary: – Debuggers  Paimei is great for prototypes. Necessary to insert breakpoint at the functions which provides interface to tainted sources like fopen, fread, ... – For performance the amazing Rafal`s UMSS available at Avert Labs. It is around 100x faster than any debugger. Lazy evaluation of the affected flags of the eflags register also helps a lot. – Tainted objects tracking – tree/graph algorithms.