This presentation discusses some of our experiences and results over the years developing tools for "what if..." testing of large-scale software systems. This work has been sponsored by many public and private organizations.
This talk was originally presented at a Virginia Tech Computer Science seminar.
Developing Tools for “What if…” Testing of Large-scale Software Systems
1. DEVELOPING TOOLS FOR
“WHAT IF…” TESTING OF
LARGE-SCALE SOFTWARE
SYSTEMS
Dr. James H. Hill
Indiana U.-Purdue U.
Indianapolis
hillj@cs.iupui.edu
http://www.cs.iupui.edu/~hillj
@Virginia Tech, November 14, 2014
2. Presentation Outline
1. Research Overview
2. Modeling Effort Research
3. System Execution Trace Research
4. Wrapping Up
5. Domain: Large-scale Distributed
Real-time & Embedded (DRE)
Systems
Large-scale distributed real-time &
embedded (DRE) systems have
the following characteristics:
Stringent quality-of-service
(QoS) requirements w/
conflicting functional
requirements
e.g., performance, reliability,
security, fault tolerance,
low latency, etc.
Heterogeneity in both
operational environment &
technology
Often developed as monolithic,
stove-piped applications, where
small changes have a large impact
Historically, these characteristics have resulted in elongated software
lifecycles realized at the expense of overrun project deadlines & effort
6. Examples of Large-scale DRE
Systems
Mission avionics
systems
Shipboard computing
environments
Traffic management
systems
Emergency response
systems
Manufacturing
control systems
7. Shortcomings of Conventional Testing
Techniques
Conventional testing techniques
focus on functional concerns in the
development phase of the software
system
Functional Testing Techniques
(design) System Integration (production)
Software Lifecycle Timeline
8. Shortcomings of Conventional Testing
Techniques
Performance concerns
are not validated until
system integration
Performance Testing
Functional Testing
(design) System Integration (production)
Software Lifecycle Timeline
Conventional testing techniques make it hard to understand
performance properties of software systems early in the software
lifecycle
9. Cost of Fixing “Bugs”
Performance
testing/validation
typically begins
when a system is
completely
developed
In theory, you
already begin with
a cost of >150x
Boehm, B., Software Engineering Economics, Prentice Hall, New Jersey, 1981.
* Image from http://www.superwebdeveloper.com/2009/11/25/the-incredible-rate-of-diminishing-
returns-of-fixing-software-bugs/
10. CUTS System Execution Modeling
Tool
[Diagram: (1) Behavior/Workload Model → (2) Source Code → (3) … → (4) QoS Performance Graph]
This process is repeated by the developer throughout the entire software
lifecycle
12. Areas of Research
• Measuring & reducing modeling effort
Model-Driven
Engineering
• Accurately emulating resource usage for
early performance testing
Software System
Emulation
• Techniques for integrating software systems
built from heterogeneous technologies
Software System
Integration
• Methods for increasing how much data is
collected without decreasing performance
Real-time Software
Instrumentation
• Understanding performance properties from
system execution traces
Software Performance
Analytics
15. Context: MDE & DSMLs
Model-driven engineering (MDE)
powered by domain-specific
modeling languages (DSMLs)
captures the syntax & constraints
of the target domain
Constraints enforce "correct-by-construction"
models
Graphical artifacts make it more
intuitive to construct models for
the target domain
DSML Environments: Generic
Modeling Environment, Domain-specific
Language Tools, &
Generic Eclipse Modeling
System
16. Downside of Using a DSML
DSMLs increase the level of abstraction, but there can be a “high”
cost associated with using a DSML
i.e., high modeling effort
Question:
Where is each
component
deployed?
17. Downside of Using a DSML
DSMLs increase the level of abstraction, but there can be a “high”
cost associated with using a DSML
i.e., high modeling effort
Solution: Right-click
to show
deployments
Question: Is
there a way to
know beforehand
how much
effort a modeling
task will take?
18. Modeling Effort in DSMLs
Modeling effort. The number of
steps it takes a modeler to
complete a high-level modeling
goal (or a task)
Adding an element, clicking an
element, typing, etc.
19. Peoples’ Thoughts Really Do
Matter
In the domain of Software Performance Engineering…
Human
Think Time
Computer
Processing
Time
Measured
Performance
20. Measuring Modeling Effort
Modeling tasks are a series of actions composed of 2 independent
components:
variant human component, i.e., think time
measurable computer component

M(T) = Σ_{t ∈ T} (Z_t + time(t))

where,
T = task
t = action
Z_t = think time for a given action
time(t) = computer time to complete its portion of the task
Example. M(2) means it takes 2 actions to complete the task; M(n) means
it takes n actions to complete the task.
21. Reducing Modeling Effort

M(T) = Σ_{t ∈ T} (Z_t + time(t))

It is hard to impact think time
because it is variable
i.e., dependent on the
modeler's experience
i.e., M(n+2) ≠ M(n)
True impact of think time
requires focus groups &
unbiased feedback
One can remove think time
associated with a given action
by removing the action
completely from the task!
Approach allows DSML developers
to reduce modeling effort without
requiring external assistance…
22. Approaches for Reducing
Modeling
Model observers are
artifacts that run in the
background and observe
modeling events
GME Add-on
Model decorators are
artifacts that manage the
visualization aspect of a
model element
GME Decorator
Model solvers are artifacts
that complete parts of a
model based on the current
state of the model
27. Reducing Modeling Effort in
PICML
Platform Independent Component Modeling Language (PICML) is a DSML
within CoSMIC for modeling component-based systems based on the CORBA
Component Model (CCM) specification

Task                                            Original    Approach    Optimized
Defining unique identifier for new components   M(n + 2)    observer    M(1)
Defining the value of a component's attribute   M(5)        solver      M(3)
Changing an attribute's type                    M(n + 1)    observer    M(1)
Determining component deployment status         M(n × m)    decorator   M(n)
Modeling a component's implementation           M(4n + 6)   solver      M(2)
29. Lessons Learned
Model observers can reduce
modeling effort to a constant
Most effective when auto-completing
manual actions
by inferring future actions
based on the model's state
Model decorators can remove
a factor from modeling effort
Provides a mechanism for
exposing hidden information
Does not remove actions
from a given task
30. Past & Current Research
Projects
Parameterized
Modeling
• Method for
supporting many
different
configurations of
same model
without modifying
the original model
• Realized using
parameterized
attributes
Proactive Modeling
• Reduce the
amount of manual
modeling required
when using a
graphical domain-specific
modeling
language
• Assist in step-by-step
creation of a
model
State-based
Decorators
• Allow domain-specific
modeling
language creators
to easily create
decorators based on
the state of the
model
• Scripting
framework for
implementing
decorators for the
GME modeling tool
32. Why Automate Modeling?
Modeling can be tedious, time-consuming
and error-prone—
especially when dealing with
complex DSMLs & large models
Current MDE approaches mainly
focus on checking a model for
correctness, but many enterprise
domains require finding an error-free
model
MDE techniques are gaining
popularity and thus, it is necessary
to increase their usability for a
less-skilled demographic
33. Current: Constraint Solvers
Manually create a
partial model using
the DSML
Constraint Solver
Solution 1 Solution 2 ……………………… Solution
n
Limitations:
Manually creating the partial model
Modelers must fully understand how to use the DSML
Modelers must identify an appropriate partial model
34. Current: Model Guidance
Engines
Modelers select a model
element and guidance
engines can either highlight
valid associations or provide
valid editing operations
Limitations:
Existing implementations
are based on "fixing" an
existing model and are more
reactive…
38. Current Limitations in Context of
LMS
Modeler must have in-depth
knowledge of the
metamodel and constraints
Current model intelligence
approaches do not address
the following aspects:
Managing the model
proactively (e.g.,
required elements that
must be added
manually)
Executing valid actions
inferred from the
metamodel
39. Our Approach: Proactive
Modeling
Proactive modeling means “foreseeing modeling”
Automate—as much as possible—the modeling process by foreseeing
valid model transformations and automatically executing them
Proposed Benefits
Reduce modeling effort by removing many tedious and error-prone
modeling actions from the modeling process, such as manually creating
required model elements
Assist novice modelers who are not familiar with a DSML
40. Proactive Modeling: Automation
Aspects
Automated Modeling
Automatically managing
different model elements when
an associated model element
is created/deleted/modified
Decision-Making
Presenting the modeler with a
list of valid model actions
based on the current model
state
41. Proactive Modeling: Automation
Process
Constraint Analysis
Parsing and analyzing the constraints
collected during the syntax analysis
process
Supports model guidance and
resolution
Syntax Analysis
Analyzing the metamodel at run-time
to discover static
information
Operates independent of the
target metamodel
42. Proactive Modeling: Constraint
Types
Reference Constraints
Containment Constraints
Attribute Constraints
Association Constraints
44. PME: Containment Handler
Responsible for auto-generating
model
elements by resolving the
containment relationships
between model elements.
45. PME: Association Handler
Responsible for identifying
valid destination model
elements for a given
source model element
when making a
connection between the
two model elements
46. PME: Modeler Guidance
Handler
Responsible for providing a
modeler with a list of possible
operations to choose from
when the proactive modeling
engine has finished auto-generating
model elements.
Supports adding an
element, deleting an
element, and creating a
connection
47. Modeling Effort for Generic
Tasks

Task                                                      Without PME   With PME
Adding a model element (e.g., Model, Atom, Reference)     M(1)          M(1)
Setting a reference                                       M(mn + 2)     M(n + 1)
Creating a connection between two elements                M(n + 3)      M(2)
48. Modeling Effort in the LMS

Task: Creating Library model
Actions: Adding 1 Library element; adding 3 Book elements; adding 2 HRStaff
elements; adding 2 Librarian elements; adding 2 Shelf elements.
Without PME: M(10)   With PME: M(1)

Task: Creating and setting Patronref
Actions: Adding 1 Patronref element; finding and dragging the element to be
referred.
Without PME: M(mn + 3)   With PME: M(n + 2)
49. Modeling Effort in the LMS

Task: Creating a Borrows connection
Actions: Selecting a Patron element and checking his/her major; selecting and
checking all the books in the library for their department; creating a
connection between the two.
Without PME: M(n + 3)   With PME: M(2)

Task: Creating a Hires connection
Actions: Finding the librarians; creating a connection between Librarian and
HRStaff elements.
Without PME: M(n + 2)   With PME: M(2)
51. System Execution Traces
...
activating ConfigOp
...
ConfigOp recv request 6 at 1234945638
validating username and password for request 6
username and password is valid
granting access at 1234945652 to request 6
...
deactivating the ConfigOp
...
Example System Execution Trace
Generated by Distributed Components
(generates)
52. System Execution Trace
Collection
• Problem: How do you
reconstruct all the execution
flows from the merged
execution traces?
DB
Logging
Client
Logging
Client
Logging
Client
Logging
Server
53. Dealing with Deployments
SensorA
SensorB
Config
Host 1 Host 2
Sensor components are
deployed on same host
54. Dealing with Deployments
Sensor components are
deployed on separate host
Config
SensorB SensorA
Host 1 Host 2 Host 3
55. Dealing with Deployments
SensorA
SensorB
Config
Host 1 Host 2
Config
• The dataflow model remains
the same irrespective of the
system’s composition and
structure
• e.g., addition/removal of
components at runtime
S C
Dataflow Model for this Example
SensorB SensorA
Host 1 Host 2 Host 3
56. Mapping Dataflow Models To
System Execution Traces
• System trace can contain many log messages that capture the behavior of
a system
...
activating ConfigOp
...
ConfigOp recv request 6 at 1234945638
validating username and password for request 6
username and password is valid
granting access at 1234945652 to request 6
...
deactivating the ConfigOp
...
Example System Trace Generated by
Distributed Components
• Log messages are instances of well-defined log formats that capture key
information
ConfigOp recv request {INT evid} at {INT recvTime}
The log formats can map directly to the nodes in the dataflow model. We
just need to show how each log format relates to one another.
57. The UNITE Approach
...
activating ConfigOp
...
ConfigOp recv request 6 at 1234945638
validating username and password for request 6
username and password is valid
granting access at 1234945652 to request 6
...
deactivating the ConfigOp
...
Example System Trace Generated by
Distributed Components
LF1: ConfigOp recv request {INT
evid} at {INT recvTime}
LF2: granting access at {INT
grantTime} to request {INT evid}
Example Log Formats for System Trace
Generated by Distributed Components
1.Identify the log messages in
system trace that contain metrics
of interest
2.Convert corresponding log
messages to log formats
58. The UNITE Approach
1.Identify the log messages in
system trace that contain metrics
of interest
2.Convert corresponding log
messages to log formats
3.Determine causality between log
formats using variables in the log
formats
4.Define evaluation function using
variables in log formats where
LF1.evid = LF2.evid
Avg. Authentication Time
AVG(LF2.grantTime – LF1.recvTime)
Use relational database theory
techniques to evaluate user-defined
functions
• O(n) processing time
• Does not require use of time
LF1: ConfigOp recv request {INT
evid} at {INT recvTime}
LF2: granting access at {INT
grantTime} to request {INT evid}
Example Log Formats for System Trace
Generated by Distributed Components
59. Requirements for UNITE
Method
Properties
• Dataflow models
possess properties to
support the UNITE
method
• e.g., keys and unique
relations
Dataflow Model
• There exists a
dataflow model that
can be used to define
analysis equations
• Becomes a challenge
as systems increase
in size and
complexity
60. Solutions for Requirements
Properties
• Adapt the system execution
trace at runtime to inject
missing properties
• Does not require
modification of existing
program
• System Execution Trace
Adaptation Framework
(SETAF)
Dataflow Model
• Auto-reconstruct dataflow
model from system
execution trace
• We used frequent pattern
sequence mining and
Dempster-Shafer theory
• Dataflow Model Auto-Constructor (DMAC)
• 94% coverage of existing
system execution trace
61. Forms of Software
Instrumentation
Intrusive
• The process of collecting data
from software for analytical
purposes by modifying the
original source code
Non-intrusive
• The process of collecting data
from software for analytical
purposes without modifying
the original source code
• e.g., Dynamic binary
instrumentation (DBI)
• Examples: Pin, Solaris
Dynamic Tracing (Dtrace),
DynamoRIO, DynInst
62. Dynamic Binary Instrumentation
Dynamic Binary Instrumentation (DBI) is a form of
non-intrusive software instrumentation where
instrumentation code is injected into a binary executable
at runtime
Monitor both application- and system-level behavior
E.g., resource usage, system calls, multi-threading behavior,
branching, etc.
Original Binary Binary with Injected
Instrumentation Code
63.
#include "pin.H"
#include <fstream>
#include <iostream>

ofstream OutFile;

// Running count of instructions is kept here; make it static to help
// the compiler optimize docount
static UINT64 icount = 0;

// This function is called before every instruction is executed
VOID docount() { icount++; }

// Pin calls this function every time a new instruction is encountered
VOID Instruction(INS ins, VOID *v) {
  INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)docount, IARG_END);
}

// This function is called when the application exits
VOID Fini(INT32 code, VOID *v) {
  OutFile.setf(ios::showbase);
  OutFile << "Count " << icount << endl;
  OutFile.close();
}
64. KNOB<string> KnobOutputFile(KNOB_MODE_WRITEONCE, "pintool",
"o", "inscount.out", "specify output file name");
int main(int argc, char * argv[]) {
// Initialize pin
if (PIN_Init(argc, argv)) return Usage();
OutFile.open(KnobOutputFile.Value().c_str());
// Register Instruction to be called to instrument instructions
INS_AddInstrumentFunction(Instruction, 0);
// Register Fini to be called when the application exits
PIN_AddFiniFunction(Fini, 0);
// Start the program, never returns
PIN_StartProgram();
return 0;
}
65. Main Challenges Writing
Pintools
Hard to see the design and structure of a
Pintool
[Class diagram: a Pintool (shared library) is composed of Tool classes
(notification methods), Instrument classes (instrumentation methods), and
Callback classes (analysis methods)]
66. Main Challenges Writing
Pintools
Hidden constraints between analysis routine
definition and its registration with Pin
// This function is called before every instruction is executed
VOID docount() { icount++; }

// Pin calls this function every time a new instruction is encountered
VOID Instruction(INS ins, VOID *v) {
  INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)docount, IARG_END);
}
67. Pin++: Lightweight Framework for
Pin
An object-oriented framework for writing Pintools
Uses design patterns to promote reuse and reduce
complexity of Pintools
Uses template metaprogramming to reduce potential
development errors and optimize the performance of a
Pintool at compile time
Designed to promote reuse of different components in a
Pintool
Codifies many requirements of a Pintool so developers do
not have to re-implement them for each and every tool
e.g., bootstrapping, initialization, registration, etc.
Freely available at github.com/SEDS/PinPP
68. An Example Pintool in Pin++
// Pin++ callback object: the analysis callback is transformed into a
// strongly typed object
class docount : public Callback <docount (void)> {
public:
  docount (void) : count_ (0) { }
  void handle_analyze (void) { ++ count_; } // required analysis method
  UINT64 count (void) const { return count_; }
private:
  UINT64 count_;
};

// Pin++ instruction-level instrument; the insert method on the callback
// handles the complex insertion task
class Instruction : public Instruction_Instrument <Instruction> {
public:
  void handle_instrument (const Ins & ins) {
    cnt_.insert (IPOINT_BEFORE, ins);
  }
  UINT64 count (void) const { return cnt_.count (); }
private:
  docount cnt_;
};
69. // one and only Pin++ tool object; signals are implemented as
// callback hooks on the tool object
class inscount : public Tool <inscount> {
public:
  inscount (void) { enable_fini_callback (); }

  void handle_fini (INT32 code) {
    std::ofstream fout (outfile_.Value ().c_str ());
    fout.setf (ios::showbase);
    fout << "Count " << instruction_.count () << std::endl;
    fout.close ();
  }
private:
  // One and only instrument; instrumentation methods are automatically
  // registered with the Pintool when instantiated
  Instruction instruction_;
  static KNOB <string> outfile_;
};

KNOB <string> inscount::outfile_ (KNOB_MODE_WRITEONCE, "pintool", "o",
  "inscount.out", "output file name");

// macro simplifying Pintool initialization; removes repetitive bootstrap code
DECLARE_PINTOOL (inscount);
70. A Pin++ Tool with Contextual
Data
class Mem_Read : public Callback <Mem_Read (ARG_INST_PTR,
                                            ARG_MEMORYOP_EA)> {
public:
  Mem_Read (FILE * file) : file_ (file) { }
  void handle_analyze (arg1_type ip, arg2_type addr) {
    fprintf (file_, "%p: R %p\n", ip, addr);
  }
private:
  FILE * file_;
};

class Instrument : public Instruction_Instrument <Instrument> {
public:
  void handle_instrument (const Ins & ins) {
    UINT32 operands = ins.memory_operand_count ();
    for (UINT32 op = 0; op < operands; ++ op)
      if (ins.is_memory_operand_read (op))
        mr_.insert_predicated (IPOINT_BEFORE, ins, op);
  }
private:
  Mem_Read mr_;
};
71. Evaluating the Performance of
Pin++
Can Pin++ reduce the complexity of writing analysis
tools for Pin?
The lower the complexity, the easier it will be to write (or code) an
analysis tool for Pin
Lower complexity is also known to correlate with fewer "bugs"
Can Pin++ improve the modularity of analysis tools for
Pin?
The more modular the source code, the more possible it is to
reuse the source code
Can Pin++ improve the performance of analysis tools
for Pin?
Improving performance equates to reduced instrumentation
overhead
72. Reducing Complexity with Pin++
[Bar chart: cyclomatic complexity of Pintools from Pin's user manual
(detach, emudiv, follow_child_tool, fork_jit_tool, imageload, inscount_tls,
inscount0, inscount1, inscount2, invocation, isampling, itrace, malloc_mt,
malloctrace, nonstatica, pinatrace, proccount, replacesigprobed, statica,
staticcount, strace), comparing Traditional, Pin++, and Pin++ w/ Reuse]
73. Improving Modularity with Pin++
[Bar chart: modularity of the Pintools from Pin's user manual, comparing
Traditional, Pin++, and Pin++ w/ Reuse]