Dytan: A Generic Dynamic Taint Analysis Framework
James Clause, Wanchun (Paul) Li, and Alessandro Orso
College of Computing
Georgia Institute of Technology
Partially supported by:
NSF awards CCF-0541080 and CCR-0205422 to Georgia Tech,
DHS and US Air Force Contract No. FA8750-05-2-0214
Dynamic taint analysis
(aka dynamic information-flow analysis)
[Figure: animated example with data items A, B, C, and Z and taint marks 1, 2, and 3]
Dynamic tainting applications
Information policy enforcement
Attack detection / prevention
Testing
Data lifetime / scope
Attack detection / prevention
Detect / prevent attacks such as SQL injection, buffer overruns,
stack smashing, cross site scripting
e.g., Suh et al. 04, Newsome and Song 05,
Halfond et al. 06, Kong et al. 06, Qin et al. 06
Information policy enforcement
Ensure classified information does not leak outside the system
e.g., Vachharajani et al. 04, McCamant and Ernst 06
Testing
Coverage metrics, test data generation heuristic, ...
e.g., Masri et al. 05, Leek et al. 07
Data lifetime / scope
Track how long sensitive data, such as passwords or account numbers, remain in the application
e.g., Chow et al. 04
Motivation
[Figure: several independent "Ad-hoc taint analysis implementation" boxes, each producing its own "Results"]
Motivation
[Figure: Configuration, Dytan Generic Framework, Custom Dynamic Taint Analysis, Results]
• Flexible
• Easy to use
• Accurate
Outline
✓Motivation & overview
• Framework (Dytan)
• flexibility
• ease of use
• accuracy
• Empirical evaluation
• Conclusions
Framework: flexibility
Configuration
• Taint sources: which data to tag, and how to tag it
• Propagation policy: how tags should be propagated at runtime
• Taint sinks: where and how tags should be checked
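The three configuration parts above can be pictured as a small data structure. The following Python sketch is only an illustration: every class, field, and string value in it is hypothetical and is not Dytan's actual configuration format.

```python
from dataclasses import dataclass, field
from typing import List

# All names below are hypothetical; this is not Dytan's configuration format.

@dataclass
class TaintSource:
    what: str  # which data to tag, e.g. "file:a.txt"
    how: str   # how to tag it, e.g. "single" or "multiple"

@dataclass
class PropagationPolicy:
    affecting_data: List[str]  # e.g. ["data"] or ["data", "control"]
    mapping: str               # how tags are combined, e.g. "union" or "max"

@dataclass
class TaintSink:
    where: str  # where to check, e.g. "entry:exec"
    what: str   # which data to check, e.g. "param:0"
    how: str    # condition and action, e.g. "absent:3 -> log"

@dataclass
class TaintAnalysisConfig:
    sources: List[TaintSource] = field(default_factory=list)
    policy: PropagationPolicy = field(
        default_factory=lambda: PropagationPolicy(["data"], "union"))
    sinks: List[TaintSink] = field(default_factory=list)

# One possible instantiation: tag a.txt, propagate along data and control
# dependencies with union, and check the first parameter of exec.
config = TaintAnalysisConfig(
    sources=[TaintSource(what="file:a.txt", how="single")],
    policy=PropagationPolicy(affecting_data=["data", "control"], mapping="union"),
    sinks=[TaintSink(where="entry:exec", what="param:0", how="absent:3 -> log")],
)
```

Keeping sources, policy, and sinks as separate records mirrors the slide's point that each part can be varied independently of the others.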
Taint sources
What to tag: identify what program data should be assigned tags
• Variables (local or global)
• Function parameters
• Function return values
• Data from an input stream (network, filesystem, keyboard, ...)
• Specific input stream (141.195.121.134:80, a.txt, ...)
How to tag: describe how tags should be assigned for identified data
• Single tag
• One tag per source
• Multiple tags per source
• ...
Taint sources
What to tag: a.txt
How to tag: single tag
[Figure: a.txt, with all of its data marked with the same tag: 1 1 1 1 1 1]
Taint sources
What to tag: a.txt
How to tag: multiple tags
[Figure: a.txt, with its data marked with distinct tags: 1 2 3 4 5 ... n]
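The difference between the two tagging modes can be sketched in a few lines of Python. This is a minimal illustration, assuming byte-level granularity and integer tags; the function name `tag_input` and the sample contents are hypothetical and not part of Dytan.

```python
from typing import Dict, FrozenSet

def tag_input(data: bytes, how: str) -> Dict[int, FrozenSet[int]]:
    """Map each byte offset of `data` to a set of taint tags.

    how="single":   every byte gets the same tag (1).
    how="multiple": byte i gets its own tag (i + 1).
    """
    if how == "single":
        return {i: frozenset({1}) for i in range(len(data))}
    if how == "multiple":
        return {i: frozenset({i + 1}) for i in range(len(data))}
    raise ValueError(f"unknown tagging mode: {how}")

contents = b"abcdef"  # stands in for the bytes read from a.txt
single = tag_input(contents, "single")
multiple = tag_input(contents, "multiple")

print([sorted(tags) for tags in single.values()])    # [[1], [1], [1], [1], [1], [1]]
print([sorted(tags) for tags in multiple.values()])  # [[1], [2], [3], [4], [5], [6]]
```

A single tag answers "did this value come from a.txt at all?", while one tag per byte makes it possible to ask which part of the input a value depends on.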
Propagation policy
[Figure: data items A, B, and C with taint marks 1, 2, and 3]
Affecting data: data that affects the outcome of a statement through
• Data dependencies
• Control dependencies
A policy can consider both or only data dependencies
Mapping function: define how tags associated with affecting data should be combined
• Union
• Max
• ...
Propagation policy
Example (X carries mark 3; A and B carry marks 1 and 2):
if(X) {
C = A + B;
}
Options: affecting data (data dependence, control dependence); mapping function (union, max)
• Data dependence with union: C receives marks 1 2
• With control dependence also selected, the mark on X (3) is also taken into account when computing the marks for C
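The effect of the policy options on this example can be sketched as follows. This is a hedged illustration, not Dytan's implementation: the function names are hypothetical, the marks on A, B, and X (1, 2, and 3) are read off the slide, and treating "max" as "keep only the numerically largest tag" is an assumption.

```python
from typing import Callable, Dict, FrozenSet, Iterable, List

Tags = FrozenSet[int]

def union(tag_sets: Iterable[Tags]) -> Tags:
    """Mapping function: the result carries every tag of the affecting data."""
    result = set()
    for tags in tag_sets:
        result |= tags
    return frozenset(result)

def max_tag(tag_sets: Iterable[Tags]) -> Tags:
    """Mapping function (assumed semantics): keep only the largest tag, if any."""
    merged = union(tag_sets)
    return frozenset({max(merged)}) if merged else frozenset()

def propagate(taint: Dict[str, Tags],
              data_operands: List[str],
              control_operands: List[str],
              use_control: bool,
              mapping: Callable[[Iterable[Tags]], Tags]) -> Tags:
    """Compute the tags of a statement's result under the chosen policy."""
    affecting = list(data_operands)
    if use_control:
        affecting += list(control_operands)
    return mapping(taint.get(name, frozenset()) for name in affecting)

taint = {"A": frozenset({1}), "B": frozenset({2}), "X": frozenset({3})}

# if (X) { C = A + B; }  -> data operands A and B, control operand X
data_only = propagate(taint, ["A", "B"], ["X"], use_control=False, mapping=union)
both_union = propagate(taint, ["A", "B"], ["X"], use_control=True, mapping=union)
both_max = propagate(taint, ["A", "B"], ["X"], use_control=True, mapping=max_tag)

print(sorted(data_only))   # [1, 2]
print(sorted(both_union))  # [1, 2, 3]
print(sorted(both_max))    # [3]
```

Separating "which operands count" from "how their tags are merged" is what lets the same example produce different results under different policies.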
Taint sinks
Where to check: location in the program to perform a check
• Function entry / exit
• Statement type
• Specific program point
What to check: the data whose tags should be checked
• Variables
• Function parameters
• Function return value
How to check: set of conditions to check and a set of actions to perform if the conditions are not met
• Validate presence of tags (exit or log)
• Ensure absence of tags (exit or log)
• ...
Taint sinks
cmd = read(file);
args = read(socket);
cmd = trim(cmd + args);
...
tok[] = parse(cmd);
exec(tok[0], tok[1]);
Tags: data read from file: 2; data read from socket: 3; exec parameter 0: 2 3
Where / what to check: function: exec, param: 0
How to check: validate presence of: 2; validate absence of: 3
Result: ✘
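The check in this example can be sketched as a small Python function. This is an illustration under stated assumptions: `check_sink` is a hypothetical name, tags are combined by union, and the parsed token is assumed to keep all tags of `cmd`.

```python
from typing import FrozenSet, List

def check_sink(tags: FrozenSet[int],
               must_have: FrozenSet[int],
               must_not_have: FrozenSet[int]) -> List[str]:
    """Return the list of violated conditions (empty if the check passes)."""
    violations = []
    missing = must_have - tags
    forbidden = tags & must_not_have
    if missing:
        violations.append(f"missing required tags: {sorted(missing)}")
    if forbidden:
        violations.append(f"forbidden tags present: {sorted(forbidden)}")
    return violations

file_tags = frozenset({2})    # cmd = read(file)
socket_tags = frozenset({3})  # args = read(socket)
cmd_tags = file_tags | socket_tags  # cmd = trim(cmd + args), union propagation
tok0_tags = cmd_tags          # assumption: parse() keeps the tags of cmd

# Sink: entry of exec, parameter 0; require tag 2, forbid tag 3.
violations = check_sink(tok0_tags,
                        must_have=frozenset({2}),
                        must_not_have=frozenset({3}))
print(violations)  # ['forbidden tags present: [3]']
```

The check fails because data from the socket reached the command passed to exec, which matches the ✘ on the slide. What happens next (exit or log) would be the configured action.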
Framework: ease of use
Provide two ways to configure the framework
• Basic
  • Select sources, propagation policies, and sinks from a set of predefined options
  • XML-based configuration
• Advanced
  • Suitable for more esoteric applications
  • Extend OO implementation
Framework: accuracy
• Dytan operates at the binary level
  • considers the actual program semantics
  • transparently handles libraries
• Dytan accounts for both data- and control-flow dependencies
Framework: accuracy
The most common source of inaccuracy is incorrectly identifying the information produced and consumed by a statement
Two common examples:
• Implicit operands
add %eax, %ebx // A = A + B
produced: %eax, %eflags
• Address generators
add %eax, [%ebx] // A = A + *B
consumed: %eax, [%ebx], %ebx
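The two examples can be made concrete with a toy operand model. The Python sketch below is an assumption-laden illustration, not Dytan's code: `Reg`, `Mem`, and `add_effects` are hypothetical names, and only the `add dst, src` form used on the slide (destination first) is modeled.

```python
from dataclasses import dataclass
from typing import Set, Tuple, Union

@dataclass(frozen=True)
class Reg:
    name: str  # e.g. "eax"

@dataclass(frozen=True)
class Mem:
    base: Reg  # register that generates the address, e.g. [%ebx]

Operand = Union[Reg, Mem]

def add_effects(dst: Operand, src: Operand,
                implicit: bool = True,
                address_generators: bool = True) -> Tuple[Set[str], Set[str]]:
    """Return (consumed, produced) locations for `add dst, src` (dst = dst + src)."""
    consumed: Set[str] = set()
    for op in (dst, src):
        if isinstance(op, Mem):
            consumed.add(f"[%{op.base.name}]")
            if address_generators:
                # The register used to compute the address is also consumed.
                consumed.add(f"%{op.base.name}")
        else:
            consumed.add(f"%{op.name}")
    produced = {f"[%{dst.base.name}]" if isinstance(dst, Mem) else f"%{dst.name}"}
    if implicit:
        # add also writes the flags register, an implicit operand.
        produced.add("%eflags")
    return consumed, produced

# add %eax, [%ebx]  // A = A + *B
consumed, produced = add_effects(Reg("eax"), Mem(Reg("ebx")))
print(sorted(consumed))  # ['%eax', '%ebx', '[%ebx]']
print(sorted(produced))  # ['%eax', '%eflags']

# An inaccurate policy that ignores both kinds of operands:
naive_consumed, naive_produced = add_effects(
    Reg("eax"), Mem(Reg("ebx")), implicit=False, address_generators=False)
print(sorted(naive_consumed))  # ['%eax', '[%ebx]']
print(sorted(naive_produced))  # ['%eax']
```

Turning the two flags off drops %eflags from the produced set and %ebx from the consumed set, which is the kind of missed information the slide describes.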
Outline
✓Motivation & overview
✓Framework
✓flexibility
✓ease of use
✓accuracy
• Empirical evaluation
• Conclusions
Empirical evaluation
• RQ1: Can Dytan be used to (easily) implement
existing dynamic taint analyses?
• RQ2: How do inaccurate propagation policies
affect the analysis results?
• In addition: discussion on performance
RQ1: flexibility
• Selected two techniques:
• Overwrite attack detection [Qin et al. 04]
• SQL injection detection [Halfond et al. 06]
• Used Dytan to re-implement both techniques
• Measure implementation time
• Validate against the original implementation
Goal: show that Dytan can be used to (easily)
implement existing dynamic taint analyses
RQ1: results
• Implementation time:
• Overwrite attack detection: < 1 hour
• SQL injection detection: < 1 day
• Comparison with original implementations:
• Successfully stopped same attacks as the
original implementations
RQ2: accuracy impact
Goal: measure the effect of inaccurate propagation policies on analysis results
• Selected two subjects:
  • Gzip (75kb w/o libraries)
  • Firefox (850kb w/o libraries)
• Use Dytan to taint program inputs and measure the amount of heap data tainted at program exit
• Compare Dytan against inaccurate policies
  • no implicit operands (no IM)
  • no address generators (no AG)
  • no implicit operands, no address generators (no IM, no AG)
RQ2: results
[Figure: bar chart, y-axis from 0% to 100%; subjects: Firefox (1 page), Firefox (3 pages), Gzip; series: Dytan, No IM, No AG, No IM and no AG]
Performance
• Measured for gzip:
  ≈30x for data flow
  ≈50x for data and control flow
• High overhead, but...
  • In line with existing implementations
  • Designed for experimentation
  • Favors flexibility over performance
  • Implementation can be further optimized
Related work
• Existing dynamic tainting approaches
[Suh et al. 04, Newsome and Song 05, Halfond et al. 06, Kong et al. 06, ...]
• Ad-hoc
• Other dynamic taint analysis frameworks
[Xu et al. 06 and Lam and Chiueh 06]
• Focused on security applications
• Single taint mark
• No control-flow propagation
• Operate at the source code level
Conclusions
• Dytan
• a general framework for dynamic tainting
• allows for instantiating and experimenting with
different dynamic taint analysis approaches
• Initial evaluation
• flexible
• easy to use
• accurate
Future directions
• Tool release (documentation, code cleanup)
http://www.cc.gatech.edu/~clause/dytan/
(pre-release on request)
• Optimization (general and specific)
• Applications
• Memory protection
• Debugging
Questions?
http://www.cc.gatech.edu/~clause/dytan/

The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 

Dytan: A Generic Dynamic Taint Analysis Framework (ISSTA 2007)

  • 1. Dytan: A Generic Dynamic Taint Analysis Framework. James Clause, Wanchun (Paul) Li, and Alessandro Orso, College of Computing, Georgia Institute of Technology. Partially supported by NSF awards CCF-0541080 and CCR-0205422 to Georgia Tech, and by DHS and US Air Force Contract No. FA8750-05-2-0214.
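  • 2. Dynamic taint analysis (aka dynamic information-flow analysis). [Diagram: nodes A, B, C, and Z]
  • 3. Dynamic taint analysis (aka dynamic information-flow analysis). [Diagram: nodes A, B, C, and Z, with tags 1, 2, and 3 attached]
  • 4. Dynamic taint analysis (aka dynamic information-flow analysis). [Diagram: nodes A, B, C, and Z, with tags 1, 2, and 3 attached]
  • 5. Dynamic taint analysis (aka dynamic information-flow analysis). [Diagram: nodes A, B, C, and Z, with tags 1, 2, and 3 attached and an additional tag 3 shown]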
  • 6. Dynamic tainting applications: information policy enforcement; attack detection / prevention; testing; data lifetime / scope.
  • 7. Dynamic tainting applications: attack detection / prevention. Detect or prevent attacks such as SQL injection, buffer overruns, stack smashing, and cross-site scripting (e.g., Suh et al. 04, Newsome and Song 05, Halfond et al. 06, Kong et al. 06, Qin et al. 06).
  • 8. Dynamic tainting applications: information policy enforcement. Ensure that classified information does not leak outside the system (e.g., Vachharajani et al. 04, McCamant and Ernst 06).
  • 9. Dynamic tainting applications: testing. Coverage metrics, test data generation heuristics, ... (e.g., Masri et al. 05, Leek et al. 07).
  • 10. Dynamic tainting applications: data lifetime / scope. Track how long sensitive data, such as passwords or account numbers, remains in the application (e.g., Chow et al. 04).
  • 11. Dynamic tainting applications: information policy enforcement; attack detection / prevention; testing; data lifetime / scope.
  • 12. Motivation. [Diagram: three separate ad-hoc taint analysis implementations, each producing its own results]
  • 13. Motivation. [Diagram: four separate ad-hoc taint analysis implementations, each producing its own results]
  • 14. Motivation. [Diagram: four separate ad-hoc taint analysis implementations, each producing its own results]
  • 17. Motivation: a generic framework that is flexible and easy to use. [Diagram: Configuration; Dytan Generic Framework; Custom Dynamic Taint Analysis; Results]
  • 18. Motivation: a generic framework that is flexible, easy to use, and accurate. [Diagram: Configuration; Dytan Generic Framework; Custom Dynamic Taint Analysis; Results]
  • 19. Outline: ✓ Motivation & overview • Framework (Dytan): flexibility, ease of use, accuracy • Empirical evaluation • Conclusions
  • 26. Taint sources: what to tag; how to tag.
  • 27. Taint sources. What to tag: identify what program data should be assigned tags. How to tag.
  • 28. Taint sources. What to tag: identify what program data should be assigned tags: variables (local or global); function parameters; function return values; data from an input stream (network, filesystem, keyboard, ...); a specific input stream (141.195.121.134:80, a.txt, ...). How to tag.
  • 29. Taint sources. What to tag: identify what program data should be assigned tags: variables (local or global); function parameters; function return values; data from an input stream (network, filesystem, keyboard, ...); a specific input stream (141.195.121.134:80, a.txt, ...). How to tag: describe how tags should be assigned for identified data.
  • 30. Taint sources. What to tag: identify what program data should be assigned tags: variables (local or global); function parameters; function return values; data from an input stream (network, filesystem, keyboard, ...); a specific input stream (141.195.121.134:80, a.txt, ...). How to tag: describe how tags should be assigned for identified data: single tag; one tag per source; multiple tags per source.
  • 31. Taint sources. What to tag: identify what program data should be assigned tags: variables (local or global); function parameters; function return values; data from an input stream (network, filesystem, keyboard, ...); a specific input stream (141.195.121.134:80, a.txt, ...). How to tag: describe how tags should be assigned for identified data: single tag; one tag per source; multiple tags per source; ...
  • 32. Taint sources. What to tag: a.txt. How to tag: single tag. [Diagram: a.txt]
  • 33. Taint sources. What to tag: a.txt. How to tag: single tag. [Diagram: a.txt]
  • 34. Taint sources. What to tag: a.txt. How to tag: single tag. [Diagram: a.txt]
  • 35. Taint sources. What to tag: a.txt. How to tag: single tag. [Diagram: data from a.txt tagged 1 1 1 1 1 1]
  • 36. Taint sources. What to tag: a.txt. How to tag: single tag. [Diagram: a.txt]
  • 37. Taint sources. What to tag: a.txt. How to tag: multiple tags. [Diagram: a.txt]
  • 38. Taint sources. What to tag: a.txt. How to tag: multiple tags. [Diagram: data from a.txt tagged 1, 2, 3, 4, 5, ..., n]
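The two tagging policies from the a.txt example (single tag versus multiple tags) can be sketched in a few lines of Python. This is an illustrative sketch only, not Dytan's implementation: the function names `tag_single` and `tag_per_unit`, the per-byte granularity, and the sample file contents are all assumptions made for the example.

```python
from typing import List, Set


def tag_single(data: bytes, tag: int = 1) -> List[Set[int]]:
    """Single-tag policy: every unit of data from the source gets the same tag."""
    return [{tag} for _ in data]


def tag_per_unit(data: bytes, first_tag: int = 1) -> List[Set[int]]:
    """Multiple-tags policy: unit i of the source gets its own tag (first_tag + i)."""
    return [{first_tag + i} for i in range(len(data))]


# Hypothetical stand-in for the bytes read from a.txt.
contents = b"secret"

single = tag_single(contents)      # one shared tag for the whole source
multiple = tag_per_unit(contents)  # a distinct tag per byte
```

A single tag answers "did this value come from a.txt at all?", while distinct tags let an analysis trace a value back to a particular part of the input, at the cost of tracking larger tag sets.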
  • 40. Propagation policy: affecting data; mapping function. [Diagram: A, B, and C with tags 1, 2, and 3]
  • 41. Propagation policy. Affecting data: data that affects the outcome of a statement through ... Mapping function. [Diagram: A, B, and C with tags 1, 2, and 3]
  • 42. Propagation policy. Affecting data: data that affects the outcome of a statement through data dependencies. Mapping function. [Diagram: A, B, and C with tags 1, 2, and 3]
  • 43. Propagation policy. Affecting data: data that affects the outcome of a statement through data dependencies and control dependencies. Mapping function. [Diagram: A, B, and C with tags 1, 2, and 3]
  • 44. Propagation policy. Affecting data: data that affects the outcome of a statement through data dependencies and control dependencies. A policy can consider both or only data dependencies. Mapping function. [Diagram: A, B, and C with tags 1, 2, and 3]
  • 45. Propagation policy. Affecting data: data that affects the outcome of a statement through data dependencies and control dependencies. A policy can consider both or only data dependencies. Mapping function: define how tags associated with affecting data should be combined. [Diagram: A, B, and C with tags 1, 2, and 3]
  • 46. Propagation policy. Affecting data: data that affects the outcome of a statement through data dependencies and control dependencies. A policy can consider both or only data dependencies. Mapping function: define how tags associated with affecting data should be combined: union. [Diagram: A, B, and C with tags 1, 2, and 3]
  • 47. Propagation policy. Affecting data: data that affects the outcome of a statement through data dependencies and control dependencies. A policy can consider both or only data dependencies. Mapping function: define how tags associated with affecting data should be combined: union; max. [Diagram: A, B, and C with tags 1, 2, and 3]
  • 48. Propagation policy. Affecting data: data that affects the outcome of a statement through data dependencies and control dependencies. A policy can consider both or only data dependencies. Mapping function: define how tags associated with affecting data should be combined: union; max; ... [Diagram: A, B, and C with tags 1, 2, and 3]
  • 49. Propagation policy. Example: if(X) { C = A + B; }
  • 50. Propagation policy. Example: if(X) { C = A + B; } Tags shown: 3 (at X), 1 and 2 (at A and B).
  • 51. Propagation policy. Example: if(X) { C = A + B; } Tags shown: 3, 1, 2. Affecting data: data dependence; control dependence. Mapping function: union; max.
  • 52. Propagation policy. Example: if(X) { C = A + B; } Tags shown: 3, 1, 2. Affecting data: data dependence ✔; control dependence. Mapping function: union; max.
  • 53. Propagation policy. Example: if(X) { C = A + B; } Tags shown: 3, 1, 2. Affecting data: data dependence ✔; control dependence. Mapping function: union ✔; max.
  • 54. Propagation policy. Example: if(X) { C = A + B; } Tags shown: 3, 1, 2. Affecting data: data dependence ✔; control dependence. Mapping function: union ✔; max.
  • 55. Propagation policy. Example: if(X) { C = A + B; } Tags shown: 3, 1, 2. Affecting data: data dependence ✔; control dependence. Mapping function: union ✔; max. Resulting tags: 1, 2.
  • 56. Propagation policy. Example: if(X) { C = A + B; } Tags shown: 3, 1, 2. Affecting data: data dependence; control dependence. Mapping function: union; max.
  • 57. Propagation policy. Example: if(X) { C = A + B; } Tags shown: 3, 1, 2. Affecting data: data dependence ✔; control dependence ✔. Mapping function: union; max.
  • 58. Propagation policy. Example: if(X) { C = A + B; } Tags shown: 3, 1, 2. Affecting data: data dependence ✔; control dependence ✔. Mapping function: union; max ✔.
  • 59. Propagation policy. Example: if(X) { C = A + B; } Tags shown: 3, 1, 2. Affecting data: data dependence ✔; control dependence ✔. Mapping function: union; max ✔.
  • 60. Propagation policy. Example: if(X) { C = A + B; } Tags shown: 3, 1, 2. Affecting data: data dependence ✔; control dependence ✔. Mapping function: union; max ✔. Resulting tag: 3.
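The if(X) { C = A + B; } example can be replayed with a small Python model of a propagation policy. This is a hedged sketch rather than Dytan's code: the names `propagate`, `union`, and `max_tag` are hypothetical, the tag assignment (A = 1, B = 2, X = 3) is one reading of the slide, and "max" is assumed to mean "keep only the numerically largest tag".

```python
from typing import Callable, Dict, Iterable, List, Set

Tags = Set[int]


def union(tag_sets: Iterable[Tags]) -> Tags:
    """Mapping function: the result carries every tag of every affecting operand."""
    merged: Tags = set()
    for tags in tag_sets:
        merged |= tags
    return merged


def max_tag(tag_sets: Iterable[Tags]) -> Tags:
    """Mapping function (assumed semantics): keep only the largest tag."""
    merged = union(tag_sets)
    return {max(merged)} if merged else set()


def propagate(data_ops: List[str], control_ops: List[str],
              shadow: Dict[str, Tags], use_control: bool,
              mapping: Callable[[Iterable[Tags]], Tags]) -> Tags:
    """Compute the tags of a statement's result under the chosen policy."""
    affecting = list(data_ops)
    if use_control:
        # Control dependence: the branch condition also affects the result.
        affecting += control_ops
    return mapping([shadow.get(name, set()) for name in affecting])


shadow = {"A": {1}, "B": {2}, "X": {3}}

# Policy 1: data dependence only, combined with union.
c_data_union = propagate(["A", "B"], ["X"], shadow, False, union)

# Policy 2: data and control dependence, combined with max.
c_both_max = propagate(["A", "B"], ["X"], shadow, True, max_tag)
```

Under these assumptions the first policy tags C with {1, 2} and the second with {3}, matching the two outcomes shown on the slides.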
  • 61. Taint sinks: where to check; what to check; how to check.
  • 62. Taint sinks. Where to check: location in the program to perform a check. What to check. How to check.
  • 63. Taint sinks. Where to check: location in the program to perform a check (function entry / exit; statement type; specific program point). What to check. How to check.
  • 64. Taint sinks. Where to check: location in the program to perform a check (function entry / exit; statement type; specific program point). What to check: the data whose tags should be checked. How to check.
  • 65. Taint sinks. Where to check: location in the program to perform a check (function entry / exit; statement type; specific program point). What to check: the data whose tags should be checked (variables; function parameters; function return value). How to check.
  • 66. Taint sinks. Where to check: location in the program to perform a check (function entry / exit; statement type; specific program point). What to check: the data whose tags should be checked (variables; function parameters; function return value). How to check: a set of conditions to check and a set of actions to perform if the conditions are not met.
  • 67. Taint sinks. Where to check: location in the program to perform a check (function entry / exit; statement type; specific program point). What to check: the data whose tags should be checked (variables; function parameters; function return value). How to check: a set of conditions to check and a set of actions to perform if the conditions are not met: validate presence of tags (exit or log); ensure absence of tags (exit or log); ...
  • 68. Taint sinks. Example: cmd = read(file); args = read(socket); cmd = trim(cmd + args); ... tok[] = parse(cmd); exec(tok[0], tok[1]);
  • 69. Taint sinks. Example: cmd = read(file); args = read(socket); cmd = trim(cmd + args); ... tok[] = parse(cmd); exec(tok[0], tok[1]); [Tag shown: 2]
  • 70. Taint sinks. Example: cmd = read(file); args = read(socket); cmd = trim(cmd + args); ... tok[] = parse(cmd); exec(tok[0], tok[1]); [Tags shown: 2, 3]
  • 71. Taint sinks. Example: cmd = read(file); args = read(socket); cmd = trim(cmd + args); ... tok[] = parse(cmd); exec(tok[0], tok[1]); Where / what to check: function: exec, param: 0. How to check: validate presence of; validate absence of. Result: [Tags shown: 2, 3]
  • 72. Taint sinks. Example: cmd = read(file); args = read(socket); cmd = trim(cmd + args); ... tok[] = parse(cmd); exec(tok[0], tok[1]); Where / what to check: function: exec, param: 0. How to check: validate presence of; validate absence of. Result: [Tags shown: 2, 3]
  • 73. Taint sinks. Example: cmd = read(file); args = read(socket); cmd = trim(cmd + args); ... tok[] = parse(cmd); exec(tok[0], tok[1]); Where / what to check: function: exec, param: 0. How to check: validate presence of; validate absence of. Result: ✘ [Tags shown: 2, 3]
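The exec example can be walked through with a small Python sketch of a sink check. Several things here are assumptions, since the transcript does not preserve them: that data read from the file carries tag 2 and data read from the socket carries tag 3, that the check on parameter 0 of exec requires tag 2 to be present and tag 3 to be absent, and that tags are combined by union. The helper name `check_sink` is hypothetical.

```python
from typing import Iterable, Set, Tuple

FILE_TAG = 2    # assumed tag for data returned by read(file)
SOCKET_TAG = 3  # assumed tag for data returned by read(socket)


def check_sink(tags: Set[int], required: Iterable[int] = (),
               forbidden: Iterable[int] = ()) -> Tuple[bool, Set[int], Set[int]]:
    """Return (passed, missing required tags, forbidden tags that are present)."""
    missing = set(required) - tags
    present = set(forbidden) & tags
    return (not missing and not present), missing, present


# Replay the slide's code on tag sets instead of values (union propagation).
cmd_tags = {FILE_TAG}            # cmd = read(file)
args_tags = {SOCKET_TAG}         # args = read(socket)
cmd_tags = cmd_tags | args_tags  # cmd = trim(cmd + args)
tok0_tags = set(cmd_tags)        # tok[] = parse(cmd); tok[0] derives from cmd

# Sink: function exec, parameter 0.
passed, missing, present = check_sink(
    tok0_tags, required={FILE_TAG}, forbidden={SOCKET_TAG})
```

Under these assumptions the check fails because the command name passed to exec is influenced by socket data, which corresponds to the ✘ on the final slide of the example. The action taken on failure (exit or log) is left to the caller.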
  • 74. Framework: ease of use. Provide two ways to configure the framework.
  • 75. Framework: ease of use. Provide two ways to configure the framework. Basic: select sources, propagation policies, and sinks from a set of predefined options; XML-based configuration.
  • 76. Framework: ease of use. Provide two ways to configure the framework. Basic: select sources, propagation policies, and sinks from a set of predefined options; XML-based configuration. Advanced: suitable for more esoteric applications; extend the OO implementation.
  • 77. Framework: accuracy • Dytan operates at the binary level: it considers the actual program semantics and transparently handles libraries • Dytan accounts for both data-flow and control-flow dependencies
  • 78. Framework: accuracy. The most common source of inaccuracy is incorrectly identifying the information produced and consumed by a statement.
  • 79. Framework: accuracy. The most common source of inaccuracy is incorrectly identifying the information produced and consumed by a statement. Two common examples:
  • 80. Framework: accuracy. The most common source of inaccuracy is incorrectly identifying the information produced and consumed by a statement. Two common examples: • Implicit operands: add %eax, %ebx // A = A + B
  • 81. Framework: accuracy. The most common source of inaccuracy is incorrectly identifying the information produced and consumed by a statement. Two common examples: • Implicit operands: add %eax, %ebx // A = A + B; produced: %eax
  • 82. Framework: accuracy. The most common source of inaccuracy is incorrectly identifying the information produced and consumed by a statement. Two common examples: • Implicit operands: add %eax, %ebx // A = A + B; produced: %eax, %eflags
  • 83. Framework: accuracy. The most common source of inaccuracy is incorrectly identifying the information produced and consumed by a statement. Two common examples: • Implicit operands: add %eax, %ebx // A = A + B; produced: %eax, %eflags • Address generators: add %eax, %ebx // A = A + B
  • 84. Framework: accuracy. The most common source of inaccuracy is incorrectly identifying the information produced and consumed by a statement. Two common examples: • Implicit operands: add %eax, %ebx // A = A + B; produced: %eax, %eflags • Address generators: add %eax, [%ebx] // A = A + *B
  • 85. Framework: accuracy. The most common source of inaccuracy is incorrectly identifying the information produced and consumed by a statement. Two common examples: • Implicit operands: add %eax, %ebx // A = A + B; produced: %eax, %eflags • Address generators: add %eax, [%ebx] // A = A + *B; consumed: %eax, [%ebx]
  • 86. Framework: accuracy. The most common source of inaccuracy is incorrectly identifying the information produced and consumed by a statement. Two common examples: • Implicit operands: add %eax, %ebx // A = A + B; produced: %eax, %eflags • Address generators: add %eax, [%ebx] // A = A + *B; consumed: %eax, [%ebx], %ebx
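The effect of ignoring implicit operands and address generators can be illustrated with a toy Python model of a single instruction. This is a sketch under stated assumptions, not how Dytan is implemented: the `Instr` class, the `step` function, the string operand names, and the initial tags are all hypothetical, and tags are combined by union.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Instr:
    explicit_src: List[str]  # operands read according to the instruction text
    explicit_dst: List[str]  # operands written according to the instruction text
    implicit_dst: List[str] = field(default_factory=list)  # e.g. %eflags
    addr_regs: List[str] = field(default_factory=list)     # registers used to form an address


def step(instr: Instr, shadow: Dict[str, Set[int]],
         implicit_operands: bool = True,
         address_generators: bool = True) -> Dict[str, Set[int]]:
    """Return a new shadow state after propagating tags through one instruction."""
    consumed = list(instr.explicit_src)
    if address_generators:
        consumed += instr.addr_regs      # the address register also affects the result
    tags: Set[int] = set()
    for operand in consumed:
        tags |= shadow.get(operand, set())
    produced = list(instr.explicit_dst)
    if implicit_operands:
        produced += instr.implicit_dst   # flags are written as a side effect
    result = dict(shadow)
    for operand in produced:
        result[operand] = set(tags)
    return result


# add %eax, [%ebx]   // A = A + *B
add = Instr(explicit_src=["%eax", "[%ebx]"], explicit_dst=["%eax"],
            implicit_dst=["%eflags"], addr_regs=["%ebx"])

# Assumed starting tags: %eax carries tag 1, the pointer %ebx carries tag 2,
# and the memory it points to is untainted.
shadow = {"%eax": {1}, "%ebx": {2}, "[%ebx]": set()}

accurate = step(add, shadow)
inaccurate = step(add, shadow, implicit_operands=False, address_generators=False)
```

In the accurate run both %eax and %eflags end up with tags {1, 2}; in the inaccurate run %eax keeps only {1} and %eflags is never tagged, so a later branch on the flags would lose the taint entirely.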
  • 87. Outline ✓Motivation & overview ✓Framework ✓flexibility ✓ease of use ✓accuracy • Empirical evaluation • Conclusions
  • 88. Empirical evaluation • RQ1: Can Dytan be used to (easily) implement existing dynamic taint analyses? • RQ2: How do inaccurate propagation policies affect the analysis results? • In addition: discussion on performance
  • 89. RQ1: flexibility. Goal: show that Dytan can be used to (easily) implement existing dynamic taint analyses • Selected two techniques: overwrite attack detection [Qin et al. 04] and SQL injection detection [Halfond et al. 06] • Used Dytan to re-implement both techniques • Measure implementation time • Validate against the original implementation
  • 90. RQ1: results • Implementation time: • Overwrite attack detection: < 1 hour • SQL injection detection: < 1 day • Comparison with original implementations: • Successfully stopped same attacks as the original implementations
  • 91. RQ2: accuracy impact. Goal: measure the effect of inaccurate propagation policies on analysis results.
  • 92. RQ2: accuracy impact. Goal: measure the effect of inaccurate propagation policies on analysis results • Selected two subjects: Gzip (75kb w/o libraries) and Firefox (850kb w/o libraries)
  • 93. RQ2: accuracy impact. Goal: measure the effect of inaccurate propagation policies on analysis results • Selected two subjects: Gzip (75kb w/o libraries) and Firefox (850kb w/o libraries) • Use Dytan to taint program inputs and measure the amount of heap data tainted at program exit
  • 94. RQ2: accuracy impact. Goal: measure the effect of inaccurate propagation policies on analysis results • Selected two subjects: Gzip (75kb w/o libraries) and Firefox (850kb w/o libraries) • Use Dytan to taint program inputs and measure the amount of heap data tainted at program exit • Compare Dytan against inaccurate policies: no implicit operands (no IM); no address generators (no AG); no implicit operands and no address generators (no IM, no AG)
  • 95. RQ2: results. [Bar chart: axis from 0% to 100%; subjects: Gzip, Firefox (1 page), Firefox (3 pages); policies compared: Dytan, no IM, no AG, no IM and no AG]
  • 96. Performance • Measured for gzip: ≈30x for data flow; ≈50x for data and control flow • High overhead, but...
  • 97. Performance • Measured for gzip: ≈30x for data flow; ≈50x for data and control flow • High overhead, but... • In line with existing implementations
  • 98. Performance • Measured for gzip: ≈30x for data flow; ≈50x for data and control flow • High overhead, but... • In line with existing implementations • Designed for experimentation • Favors flexibility over performance
  • 99. Performance • Measured for gzip: ≈30x for data flow; ≈50x for data and control flow • High overhead, but... • In line with existing implementations • Designed for experimentation • Favors flexibility over performance • Implementation can be further optimized
  • 100. Related work • Existing dynamic tainting approaches [Suh et al. 04, Newsome and Song 05, Halfond et al. 06, Kong et al. 06, ...]: ad-hoc • Other dynamic taint analysis frameworks [Xu et al. 06, Lam and Chiueh 06]: focused on security applications; single taint mark; no control-flow propagation; operate at the source code level
  • 101. Conclusions • Dytan • a general framework for dynamic tainting • allows for instantiating and experimenting with different dynamic taint analysis approaches • Initial evaluation • flexible • easy to use • accurate
  • 102. Future directions • Tool release (documentation, code cleanup) http://www.cc.gatech.edu/~clause/dytan/ (pre-release on request) • Optimization (general and specific) • Applications • Memory protection • Debugging