This presentation discusses some of our experiences and results over the years developing tools for "what if..." testing of large-scale software systems. This work has been sponsored by many public and private organizations.
This talk was originally presented at a Virginia Tech Computer Science seminar.
Developing Tools for “What if…” Testing of Large-scale Software Systems
1. DEVELOPING TOOLS FOR
“WHAT IF…” TESTING OF
LARGE-SCALE SOFTWARE
SYSTEMS
Dr. James H. Hill
Indiana U.-Purdue U.
Indianapolis
hillj@cs.iupui.edu
http://www.cs.iupui.edu/~hillj
@Virginia Tech, November 14, 2014
2. Presentation Outline
1. Research Overview
2. Modeling Effort Research
3. System Execution Trace Research
4. Wrapping Up
5. Domain: Large-scale Distributed
Real-time & Embedded (DRE)
Systems
Large-scale distributed real-time &
embedded (DRE) systems have
the following characteristics:
Stringent quality-of-service
(QoS) requirements w/
conflicting functional
requirements
e.g., performance, reliability,
security, fault tolerance,
low latency, etc.
Heterogeneity in both
operational environment &
technology
Often developed as monolithic,
stove-piped applications, where
small changes have a large impact
Historically, these characteristics have resulted in elongated software
lifecycles realized at the expense of overrun project deadlines & effort
6. Examples of Large-scale DRE
Systems
Mission avionics
systems
Shipboard computing
environments
Traffic management
systems
Emergency response
systems
Manufacturing
control systems
7. Shortcomings of Conventional Testing
Techniques
Conventional testing techniques
focus on functional concerns in the
development phase of the software
system
Functional Testing Techniques
(design) System Integration (production)
Software Lifecycle Timeline
8. Shortcomings of Conventional Testing
Techniques
Performance concerns
are not validated until
system integration
Performance Testing
Functional Testing
(design) System Integration (production)
Software Lifecycle Timeline
Conventional testing techniques make it hard to understand
performance properties of software systems early in the software
lifecycle
9. Cost of Fixing “Bugs”
Performance
testing/validation
typically begins
when a system is
completely
developed
In theory, you
already begin with
a cost of >150x
Boehm, B., Software Engineering Economics, Prentice Hall, New Jersey, 1981.
* Image from http://www.superwebdeveloper.com/2009/11/25/the-incredible-rate-of-diminishing-
returns-of-fixing-software-bugs/
10. CUTS System Execution Modeling
Tool
[Diagram: (1) Behavior/Workload Model → (2) Source Code → (3) … → (4) QoS Performance Graph]
This process is repeated by the developer throughout the entire software
lifecycle
12. Areas of Research
• Measuring & reducing modeling effort
Model-Driven
Engineering
• Accurately emulating resource usage for
early performance testing
Software System
Emulation
• Techniques for integrating software systems
built from heterogeneous technologies
Software System
Integration
• Methods for increasing how much data is
collected without decreasing performance
Real-time Software
Instrumentation
• Understanding performance properties from
system execution traces
Software Performance
Analytics
15. Context: MDE & DSMLs
Model-driven engineering (MDE)
powered by domain-specific
modeling languages (DSMLs)
captures the syntax & constraints
of the target domain
Constraints enforce "correct-by-construction"
models
Graphical artifacts make it more
intuitive to construct models for
the target domain
DSML Environments: Generic
Modeling Environment, Domain-specific
Language Tools, &
Generic Eclipse Modeling
System
16. Downside of Using a DSML
DSMLs increase the level of abstraction, but there can be a “high”
cost associated with using a DSML
i.e., high modeling effort
Question:
Where is each
component
deployed?
17. Downside of Using a DSML
DSMLs increase the level of abstraction, but there can be a “high”
cost associated with using a DSML
i.e., high modeling effort
Solution: Right-click
to show
deployments
Question: Is
there a way to
know beforehand
how much
effort a modeling
task will take?
18. Modeling Effort in DSMLs
Modeling effort. The number of
steps it takes a modeler to
complete a high-level modeling
goal (or a task)
Adding an element, clicking an
element, typing, etc.
19. Peoples’ Thoughts Really Do
Matter
In the domain of Software Performance Engineering…
Human
Think Time
Computer
Processing
Time
Measured
Performance
20. Measuring Modeling Effort
Modeling tasks are a series of actions composed of 2 independent
components:
variant human component, i.e., think time
measurable computer component

M(T) = Σ_{t ∈ T} (Z_t + time(t))

where,
T = task
t = action
Z_t = think time for a given action
time(t) = computer time to complete its portion of the task
Example. M(2) means it takes 2 actions to complete the task; M(n) means
it takes n actions to complete the task.
21. Reducing Modeling Effort

M(T) = Σ_{t ∈ T} (Z_t + time(t))

It is hard to impact think time
because it is variable
i.e., dependent on the
modeler's experience
i.e., M(n+2) ≠ M(n)
True impact of think time
requires focus groups &
unbiased feedback
One can remove think time
associated with a given action
by removing the action
completely from the task!
Approach allows DSML developers
to reduce modeling effort without
requiring external assistance…
22. Approaches for Reducing
Modeling
Model observers are
artifacts that run in the
background and observe
modeling events
GME Add-on
Model decorators are
artifacts that manage the
visualization aspect of a
model element
GME Decorator
Model solvers are artifacts
that complete parts of a
model based on the current
state of the model
27. Reducing Modeling Effort in
PICML
Platform Independent Component Modeling Language (PICML) is a DSML
within CoSMIC for modeling component-based systems based on the CORBA
Component Model (CCM) specification

Task                                            Original    Approach    Optimized
Defining unique identifier for new components   M(n + 2)    observer    M(1)
Defining the value of a component's attribute   M(5)        solver      M(3)
Changing an attribute's type                    M(n + 1)    observer    M(1)
Determining component deployment status         M(n × m)    decorator   M(n)
Modeling a component's implementation           M(4n + 6)   solver      M(2)
29. Lessons Learned
Model observers can reduce
modeling effort to a constant
Most effective when auto-completing
manual actions
by inferring future actions
based on the model's state
Model decorators can remove
a factor from modeling effort
Provides a mechanism for
exposing hidden information
Does not remove actions
from a given task
30. Past & Current Research
Projects
Parameterized
Modeling
• Method for
supporting many
different
configurations of
same model
without modifying
the original model
• Realized using
parameterized
attributes
Proactive Modeling
• Reduce the
amount of manual
modeling required
when using a
graphical domain-specific
modeling
language
• Assist in step-by-step
creation of a
model
State-based
Decorators
• Allow domain-specific
modeling
language creators
to easily create
decorators based on
the state of the
model
• Scripting
framework for
implementing
decorators for the
GME modeling tool
32. Why Automate Modeling?
Modeling can be tedious, time-consuming
and error-prone—
especially when dealing with
complex DSMLs & large models
Current MDE approaches mainly
focus on checking a model for
correctness, but many enterprise
domains require finding an error-free
model
MDE techniques are gaining
popularity and thus, it is necessary
to increase their usability for a
less-skilled demographic
33. Current: Constraint Solvers
Manually create a
partial model using
the DSML
Constraint Solver
Solution 1 Solution 2 ……………………… Solution
n
Limitations:
Manually creating the partial model
Modelers must fully understand how to use the DSML
Modelers must identify an appropriate partial model
34. Current: Model Guidance
Engines
Modelers select a model
element and guidance
engines can either highlight
valid associations or provide
valid editing operations
Limitations:
Existing implementations
are based on "fixing" an
existing model and are more
reactive…
38. Current Limitations in Context of
LMS
Modeler must have in-depth
knowledge of the
metamodel and constraints
Current model intelligence
approaches do not address
the following aspects:
Managing the model
proactively (e.g.,
required elements that
must be added
manually)
Executing valid actions
inferred from the
metamodel
39. Our Approach: Proactive
Modeling
Proactive modeling means “foreseeing modeling”
Automate—as much as possible—the modeling process by foreseeing
valid model transformations and automatically executing them
Proposed Benefits
Reduce modeling effort by removing many tedious and error-prone
modeling actions from the modeling process, such as manually creating
required model elements
Assist novice modelers who are not familiar with a DSML
40. Proactive Modeling: Automation
Aspects
Automated Modeling
Automatically managing
different model elements when
an associated model element
is created/deleted/modified
Decision-Making
Presenting the modeler with a
list of valid model actions
based on the current model
state
41. Proactive Modeling: Automation
Process
Constraint Analysis
Parsing and analyzing the constraints
collected during the syntax analysis
process
Supports model guidance and
resolution
Syntax Analysis
Analyzing the metamodel at run-time
to discover static
information
Operates independent of the
target metamodel
42. Proactive Modeling: Constraint
Types
Reference Constraints
Containment Constraints
Attribute Constraints
Association Constraints
44. PME: Containment Handler
Responsible for auto-generating
model
elements by resolving the
containment relationships
between model elements.
45. PME: Association Handler
Responsible for identifying
valid destination model
elements for a given
source model element
when making a
connection between the
two model elements
46. PME: Modeler Guidance
Handler
Responsible for providing a
modeler with a list of possible
operations to choose from
when the proactive modeling
engine has finished auto-generating
model elements.
Supports adding an
element, deleting an
element, and creating a
connection
47. Modeling Effort for Generic
Tasks

Task                                                      Without PME   With PME
Adding a model element (e.g., Model, Atom, Reference)     M(1)          M(1)
Setting a reference                                       M(mn + 2)     M(n + 1)
Creating a connection between two elements                M(n + 3)      M(2)
48. Modeling Effort in the LMS

Task: Creating Library model
Actions: Adding 1 Library element; adding 3 Book elements; adding 2 HRStaff
elements; adding 2 Librarian elements; adding 2 Shelf elements.
Without PME: M(10)   With PME: M(1)

Task: Creating and setting Patronref
Actions: Adding 1 Patronref element; finding and dragging the element to be
referred.
Without PME: M(mn + 3)   With PME: M(n + 2)
49. Modeling Effort in the LMS

Task: Creating a Borrows connection
Actions: Selecting a Patron element and checking his/her major; selecting and
checking all the books in the library for their department; creating a
connection between the two.
Without PME: M(n + 3)   With PME: M(2)

Task: Creating a Hires connection
Actions: Finding the librarians; creating a connection between Librarian and
HRStaff elements.
Without PME: M(n + 2)   With PME: M(2)
51. System Execution Traces
...
activating ConfigOp
...
ConfigOp recv request 6 at 1234945638
validating username and password for request 6
username and password is valid
granting access at 1234945652 to request 6
...
deactivating the ConfigOp
...
Example System Execution Trace
Generated by Distributed Components
(generates)
52. System Execution Trace
Collection
• Problem: How do you
reconstruct all the execution
flows from the merged
execution traces?
DB
Logging
Client
Logging
Client
Logging
Client
Logging
Server
53. Dealing with Deployments
SensorA
SensorB
Config
Host 1 Host 2
Sensor components are
deployed on same host
54. Dealing with Deployments
Sensor components are
deployed on separate host
Config
SensorB SensorA
Host 1 Host 2 Host 3
55. Dealing with Deployments
SensorA
SensorB
Config
Host 1 Host 2
Config
• The dataflow model remains
the same irrespective of the
system’s composition and
structure
• e.g., addition/removal of
components at runtime
S C
Dataflow Model for this Example
SensorB SensorA
Host 1 Host 2 Host 3
56. Mapping Dataflow Models To
System Execution Traces
• System trace can contain many log messages that capture the behavior of
a system
...
activating ConfigOp
...
ConfigOp recv request 6 at 1234945638
validating username and password for request 6
username and password is valid
granting access at 1234945652 to request 6
...
deactivating the ConfigOp
...
Example System Trace Generated by
Distributed Components
• Log messages are instances of well-defined log formats that capture key
information
ConfigOp recv request {INT evid} at {INT recvTime}
The log formats can map directly to the nodes in the dataflow model. We
just need to show how each log format relates to one another.
57. The UNITE Approach
...
activating ConfigOp
...
ConfigOp recv request 6 at 1234945638
validating username and password for request 6
username and password is valid
granting access at 1234945652 to request 6
...
deactivating the ConfigOp
...
Example System Trace Generated by
Distributed Components
LF1: ConfigOp recv request {INT
evid} at {INT recvTime}
LF2: granting access at {INT
grantTime} to request {INT evid}
Example Log Formats for System Trace
Generated by Distributed Components
1.Identify the log messages in
system trace that contain metrics
of interest
2.Convert corresponding log
messages to log formats
58. The UNITE Approach
1.Identify the log messages in
system trace that contain metrics
of interest
2.Convert corresponding log
messages to log formats
3.Determine causality between log
formats using variables in the log
formats
4.Define evaluation function using
variables in log formats where
LF1.evid = LF2.evid
Avg. Authentication Time
AVG(LF2.grantTime – LF1.recvTime)
Use relational database theory
techniques to evaluate user-defined
functions
• O(n) processing time
• Does not require use of time
LF1: ConfigOp recv request {INT
evid} at {INT recvTime}
LF2: granting access at {INT
grantTime} to request {INT evid}
Example Log Formats for System Trace
Generated by Distributed Components
59. Requirements for UNITE
Method
Properties
• Dataflow models
possess properties to
support the UNITE
method
• e.g., keys and unique
relations
Dataflow Model
• There exists a
dataflow model that
can be used to define
analysis equations
• Becomes a challenge
as systems increase
in size and
complexity
60. Solutions for Requirements
Properties
• Adapt the system execution
trace at runtime to inject
missing properties
• Does not require
modification of existing
program
• System Execution Trace
Adaptation Framework
(SETAF)
Dataflow Model
• Auto-reconstruct dataflow
model from system
execution trace
• We used frequent pattern
sequence mining and
Dempster-Shafer theory
• Dataflow Model Auto-Constructor (DMAC)
• 94% coverage of existing
system execution trace
61. Forms of Software
Instrumentation
Intrusive
• The process of collecting data
from software for analytical
purposes by modifying the
original source code
Non-intrusive
• The process of collecting data
from software for analytical
purposes without modifying
the original source code
• e.g., Dynamic binary
instrumentation (DBI)
• Examples: Pin, Solaris
Dynamic Tracing (Dtrace),
DynamoRIO, DynInst
62. Dynamic Binary Instrumentation
Dynamic Binary Instrumentation (DBI) is a form of
non-intrusive software instrumentation where
instrumentation code is injected into a binary executable
at runtime
Monitor both application- and system-level behavior
E.g., resource usage, system calls, multi-threading behavior,
branching, etc.
Original Binary Binary with Injected
Instrumentation Code
63.
#include "pin.H"
#include <fstream>
#include <iostream>

ofstream OutFile;

// Running count of instructions is kept here; make it static to help
// the compiler optimize docount
static UINT64 icount = 0;

// This function is called before every instruction is executed
VOID docount() { icount++; }

// Pin calls this function every time a new instruction is encountered
VOID Instruction(INS ins, VOID *v) {
  INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)docount, IARG_END);
}

// This function is called when the application exits
VOID Fini(INT32 code, VOID *v) {
  OutFile.setf(ios::showbase);
  OutFile << "Count " << icount << endl;
  OutFile.close();
}
64. KNOB<string> KnobOutputFile(KNOB_MODE_WRITEONCE, "pintool",
"o", "inscount.out", "specify output file name");
int main(int argc, char * argv[]) {
// Initialize pin
if (PIN_Init(argc, argv)) return Usage();
OutFile.open(KnobOutputFile.Value().c_str());
// Register Instruction to be called to instrument instructions
INS_AddInstrumentFunction(Instruction, 0);
// Register Fini to be called when the application exits
PIN_AddFiniFunction(Fini, 0);
// Start the program, never returns
PIN_StartProgram();
return 0;
}
65. Main Challenges Writing
Pintools
Hard to see the design and structure of a
Pintool
[Class diagram: a Pintool (shared library) is composed of Tool classes
(notification methods), Instrument classes (instrumentation methods), and
Callback classes (analysis methods)]
66. Main Challenges Writing
Pintools
Hidden constraints between analysis routine
definition and its registration with Pin
// This function is called before every instruction is executed
VOID docount() { icount++; }

// Pin calls this function every time a new instruction is encountered
VOID Instruction(INS ins, VOID *v) {
  INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)docount, IARG_END);
}
67. Pin++: Lightweight Framework for
Pin
An object-oriented framework for writing Pintools
Uses design patterns to promote reuse and reduce
complexity of Pintools
Uses template metaprogramming to reduce potential
development errors and optimize the performance of a
Pintool at compile time
Designed to promote reuse of different components in a
Pintool
Codifies many requirements of a Pintool so developers do
not have to re-implement them for each and every tool
e.g., bootstrapping, initialization, registration, etc.
Freely available at github.com/SEDS/PinPP
68. An Example Pintool in Pin++
// Pin++ callback object: the analysis callback is transformed into a
// strongly typed object
class docount : public Callback <docount (void)> {
public:
  docount (void) : count_ (0) { }
  void handle_analyze (void) { ++ count_; } // required analysis method
  UINT64 count (void) const { return count_; }
private:
  UINT64 count_;
};

// Pin++ instruction-level instrument; the insert method on the callback
// handles the complex insertion task
class Instruction : public Instruction_Instrument <Instruction> {
public:
  void handle_instrument (const Ins & ins) {
    cnt_.insert (IPOINT_BEFORE, ins);
  }
  UINT64 count (void) const { return cnt_.count (); }
private:
  docount cnt_;
};
69. // one and only Pin++ tool object; signals are implemented as
// callback hooks on the tool object
class inscount : public Tool <inscount> {
public:
  inscount (void) { enable_fini_callback (); }

  void handle_fini (INT32 code) {
    std::ofstream fout (outfile_.Value ().c_str ());
    fout.setf (ios::showbase);
    fout << "Count " << instruction_.count () << std::endl;
    fout.close ();
  }
private:
  // One and only instrument; instrumentation methods are automatically
  // registered with the Pintool when instantiated
  Instruction instruction_;
  static KNOB <string> outfile_;
};

KNOB <string> inscount::outfile_ (KNOB_MODE_WRITEONCE, "pintool", "o",
  "inscount.out", "output file name");

// macro simplifying Pintool initialization; removes repetitive bootstrap code
DECLARE_PINTOOL (inscount);
70. A Pin++ Tool with Contextual
Data
class Mem_Read : public Callback <Mem_Read (ARG_INST_PTR,
                                            ARG_MEMORYOP_EA)> {
public:
  Mem_Read (FILE * file) : file_ (file) { }
  void handle_analyze (arg1_type ip, arg2_type addr) {
    fprintf (file_, "%p: R %p\n", ip, addr);
  }
private:
  FILE * file_;
};

class Instrument : public Instruction_Instrument <Instrument> {
public:
  void handle_instrument (const Ins & ins) {
    UINT32 operands = ins.memory_operand_count ();
    for (UINT32 op = 0; op < operands; ++ op)
      if (ins.is_memory_operand_read (op))
        mr_.insert_predicated (IPOINT_BEFORE, ins, op);
  }
private:
  Mem_Read mr_;
};
71. Evaluating the Performance of
Pin++
Can Pin++ reduce the complexity of writing analysis
tools for Pin?
The lower the complexity, the easier it will be to write (or code) an
analysis tool for Pin
Lower complexity is also known to correlate with fewer "bugs"
Can Pin++ improve the modularity of analysis tools for
Pin?
The more modular the source code, the more possible it is to
reuse the source code
Can Pin++ improve the performance of analysis tools
for Pin?
Improving performance equates to reduced instrumentation
overhead
72. Reducing Complexity with Pin++
[Bar chart: cyclomatic complexity of Pintools from Pin's user manual
(detach, emudiv, follow_child_tool, fork_jit_tool, imageload, inscount_tls,
inscount0, inscount1, inscount2, invocation, isampling, itrace, malloc_mt,
malloctrace, nonstatica, pinatrace, proccount, replacesigprobed, statica,
staticcount, strace), comparing Traditional, Pin++, and Pin++ w/ Reuse]
73. Improving Modularity with Pin++
[Bar chart: modularity of the Pintools from Pin's user manual, comparing
Traditional, Pin++, and Pin++ w/ Reuse]