2. SWEBOK: the 10 Knowledge Areas
Software Requirements
Software Design
Software Construction
Software Testing
Software Maintenance
Software Configuration Management
Software Engineering Management
Software Engineering Process
Software Engineering Tools and Methods
Software Quality
27-Sep-11 Software Engineering / Fernando Brito e Abreu 2
3. Motivation - The Bad News ...
Software bugs cost the U.S. economy an
estimated $59.5 billion annually, or about 0.6%
of the gross domestic product.
Software users shoulder more than half of the costs
Software developers and vendors bear the remainder
of the costs.
Source:The Economic Impacts of Inadequate Infrastructure for
Software Testing, Technical Report, National Institute of
Standards and Technology, USA, May 2002
http://www.nist.gov/director/prog-ofc/report02-3.pdf
4. Motivation - The GOOD News!
According to the same report:
More than 1/3 of the costs (an estimated $22.2
billion) can be eliminated with earlier and more
effective identification and removal of software
defects.
Savings can mainly occur in the development
stage, when errors are introduced.
More than half of these errors aren't detected until
later in the development process or during post-sale
software use.
5. Motivation
Reliability is one of the most important software
quality characteristics
Reliability has a strong financial impact:
better image of the producer
reduction of maintenance costs
signing or renewal of maintenance contracts,
new developments, etc.
The quest for Reliability is the aim of V&V !
6. Verification and Validation (V&V)
Verification - checking product correctness and
consistency in a given development phase, against
the products and standards used as input to that
phase - "Do the Job Right"
Validation - checking product conformity with the
specified requirements - "Do the Right Job"
Basically, there are two complementary V&V techniques:
Reviews (Walkthroughs, Inspections, ...)
Tests
7. Summary
Software Testing Fundamentals
Test Levels
Test Techniques
Test-related Measures
Test Process
9. Testing is …
… an activity performed for evaluating product
quality, and for improving it, by identifying
defects and problems.
… the dynamic verification of the behavior of a
program on a finite set of test cases, suitably
selected from the usually infinite executions
domain, against the expected behavior.
10. Dynamic versus static verification
Testing always implies executing the program on
(valued) inputs; therefore it is a dynamic technique
The input value alone is not always sufficient to determine a
test, since a complex, nondeterministic system might react to
the same input with different behaviors, depending on its state
Different from testing and complementary to it are static
techniques (described in the Software Quality KA)
11. Terminology issues
Error
the human cause for defect existence (although bugs walk …)
Fault or defect (aka bug)
incorrectness, omission or undesirable characteristic in a deliverable
the cause of a failure
Failure
Undesired effect (malfunction) observed in the system’s delivered service
Incorrectness in the functioning of a system
See: IEEE Standard for SE Terminology (IEEE610-90)
12. Testing views
Testing for defect identification
A successful test is one which causes a system to fail
Testing can reveal failures, but it is the faults (defects) that
must be removed
Testing to demonstrate (that the software meets its
specifications or other desired properties)
A successful test is one where no failures are observed
Detecting the fault (e.g. in code) from an exposed failure
is often hard
Identifying all failure-causing input sets (i.e. those sets of inputs that
cause a failure to appear) may not be feasible
13. Summary
Software Testing Fundamentals
Test Levels
Test Techniques
Test-related Measures
Test Process
14. Test Levels
Objectives of testing
Testing can be aimed at verifying different properties:
Checking if functional specifications are implemented right
aka conformance testing, correctness testing, or functional testing
Checking nonfunctional properties
E.g. performance, reliability evaluation, reliability measurement,
usability evaluation, etc
Stating the objective in precise, quantitative terms
allows control to be established over the test process
Often objectives are qualitative or not even stated explicitly
16. Test Levels – Objectives of testing
Acceptance / Qualification testing
Checks the system behavior against the
customer’s requirements
The customer may not exist yet, so someone has to
anticipate the intended requirements
This testing activity may or may not involve the
developers of the system
17. Test Levels – Objectives of testing
Installation testing
Installation testing can be viewed as system
testing conducted once again according to
hardware configuration requirements
Usually performed in the target environment at the
customer’s premises
Installation procedures may also be verified
e.g. is the customer's local expert able to add a new
user to the delivered system?
18. Test Levels – Objectives of testing
Alpha and beta testing
Before the software is released, it is sometimes
given to a small, representative set of potential
users for trial use. Those users may be:
in-house (alpha testing)
external (beta testing)
These users report problems with the product
Alpha and beta use is often uncontrolled, and is not
always referred to in a test plan
19. Test Levels – Objectives of testing
Conformance / Functional / Correctness testing
Conformance testing is aimed at validating
whether or not the observed behavior of the
tested software conforms to its specifications
20. Test Levels – Objectives of testing
Reliability achievement and evaluation
Testing is a means to improve reliability
By randomly generating test cases according to
the operational profile, statistical measures of
reliability can be derived
Reliability growth models can be used to express
this effect
21. Reliability growth models
Provide a prediction of reliability based on the
failures observed under reliability achievement
and evaluation
They assume, in general, that:
a growing number of successful tests increases
our confidence in the system's reliability
the faults that caused the observed failures are fixed
after being found (thus, on average, the product's
reliability has an increasing trend)
22. Reliability growth models
Many models have been published; they fall into
two classes:
failure-count models
time-between-failures models
23. Test Levels – Objectives of testing
Regression testing (1/2)
Regression testing is:
The “selective retesting of a system or component to verify
that modifications have not caused unintended effects.”
(IEEE610.12-90)
Any repetition of tests intended to show that the software’s
behavior is unchanged, except insofar as required
A technique to combat side-effects!
In practice, the idea is to show that software which
previously passed the tests still does
24. Test Levels – Objectives of testing
Regression testing (2/2)
A trade-off must be made between:
the assurance given by regression testing every time a change is made
… and the resources required to do it
To allow regression tests we must build, incrementally, a
test battery
Regression testing is more feasible if we have tools to
record and playback test cases
Several commercial user interface event-caption tools (black-
box testing) exist
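The idea of an incrementally built test battery can be sketched as follows. This is a minimal illustration, not a tool recommendation; the `discount` function and the recorded cases are hypothetical.

```python
# Hypothetical function under test: a discount calculator.
def discount(amount):
    """Return the discounted price; 10% off orders of 100 or more."""
    return amount * 0.9 if amount >= 100 else amount

# The incrementally built regression battery: (input, expected output)
# pairs recorded from previously passing runs.
REGRESSION_BATTERY = [
    (50, 50),
    (100, 90.0),
    (200, 180.0),
]

def run_regression(fn, battery):
    """Re-run every recorded case; return the cases that now fail."""
    return [(x, expected, fn(x))
            for x, expected in battery
            if fn(x) != expected]

failures = run_regression(discount, REGRESSION_BATTERY)
# An empty list means the change introduced no observable regression.
```

After each change, the whole battery is replayed; any non-empty result flags an unintended side-effect.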
25. Test Levels – Objectives of testing
Performance testing / Stress testing
Aimed at verifying that the software meets the
specified performance requirements:
e.g. volume testing and response-time measurement
The performance degradation under increasingly
exigent scenarios should be plotted
If we exercise software at the maximum design
load (or beyond it), we call it stress testing
26. Test Levels – Objectives of testing
Back-to-back testing
A single test set is performed on two
implemented versions of a software product
The results are compared
Whenever a mismatch occurs, at least one of the two
versions is probably failing
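Back-to-back testing can be sketched in a few lines. The two integer-square-root implementations below are illustrative examples (not from the slides); the point is running one test set against both versions and flagging disagreements.

```python
import math
import random

# Two independent implementations of the same spec: integer square root.
def isqrt_v1(n):
    return int(math.sqrt(n))        # floating-point shortcut

def isqrt_v2(n):
    r = 0
    while (r + 1) * (r + 1) <= n:   # simple iterative version
        r += 1
    return r

# Back-to-back testing: run a single test set against both versions and
# report every input on which the two implementations disagree.
random.seed(0)
test_set = [random.randrange(10 ** 6) for _ in range(1000)]
mismatches = [n for n in test_set if isqrt_v1(n) != isqrt_v2(n)]
# Each mismatch means at least one of the two versions is failing.
```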
27. Test Levels – Objectives of testing
Recovery testing
Aimed at verifying software restart capabilities
after a “disaster”
Recovery testing is a fundamental step in
building a contingency plan
28. Test Levels – Objectives of testing
Configuration testing
When software is built to serve different users,
configuration testing analyzes the software under
the various specified configurations
The problem is similar when the hardware or software
platform varies (e.g. different mobile phone
models, different browsers)
This is one of the main issues in software
product lines development
See: http://www.sei.cmu.edu/plp/framework.html
29. Test Levels – Objectives of testing
Usability testing
This process evaluates how easy it is for end-
users to use and learn the software, including:
user documentation
initial installation and extension through add-ons
effective support of user tasks
…
30. Test Levels
The target of the test
Unit testing
the target is a single module
Integration testing
the target is a group of modules (related by purpose, use,
behavior, or structure)
System testing
the target is a whole system
31. Test Levels – The target of the test
Unit testing
Verifies the functioning in isolation of software pieces
which are separately testable
Depending on the context, they can be individual subprograms
or a larger component made of tightly related units
Typically, unit testing occurs with:
access to the code being tested
support of debugging tools
the programmers who wrote the code
32. Test Levels – The target of the test
Integration testing
Is the process of verifying the interaction between
software components
Classical integration testing strategies
top-down or bottom-up, are used with hierarchically structured sw
Modern systematic integration strategies
architecture-driven, which implies integrating the software
components or subsystems based on identified functional threads
Except for small, simple software, systematic,
incremental integration testing strategies are usually
preferred to putting all the components together at once
The latter is called “big bang” testing
33. Test Levels – The target of the test
System testing
The majority of functional failures should already have
been identified during unit and integration testing
Main concerns:
Assessing whether the system complies with the non-functional
requirements, such as security, speed, accuracy, and reliability
Assessing whether the external interfaces to other applications,
utilities, hardware devices, or the operating environment
work correctly
34. Test Levels
Identifying the test set
Test adequacy criteria
Is the test set consistent?
How much testing is enough?
How many test cases should be selected?
Test selection criteria
How is the test set composed?
Which test cases should be selected?
35. Test case selection
Proposed test techniques differ essentially in
how they select the test set, which may yield
vastly different degrees of effectiveness
In practice, risk analysis techniques and test
engineering expertise are applied to identify the
most suitable selection criterion under given
conditions
36. How large should a test battery be?
Even in simple programs, so many test cases are
theoretically possible that exhaustive testing
could require months or years to execute
In practice the whole test set can generally be
considered infinite
Testing always implies a trade-off:
limited resources and schedules on the one hand
inherently unlimited test requirements on the other
37. After testing …
Even after successful completion of extensive
testing, the software could still contain faults
The remedy for sw failures found after delivery is
provided by corrective maintenance actions
This will be covered in the Software Maintenance KA
38. Summary
Software Testing Fundamentals
Test Levels
Test Techniques
Test-related Measures
Test Process
39. Test Techniques
Based on tester's intuition and experience
Specification-based (functional / black-box)
Code-based (white-box)
Usage-based
Fault-based
Based on nature of application
Selecting and combining techniques
40. Functional Tests (Black-Box) actors
A relevant aspect of black-box testing is that it is not
compulsory to use programming experts to produce a
test battery
Extensive characterization of invalid inputs relies
heavily on tester experience
42-47. Functional Test Tools
[Screenshots of functional test tools (Visual Test and a Rational test tool):
test cases are grouped into a test battery (test suite) with reusable test
code; the tool integrates with other Rational tools, lets the tester choose
the test cases to execute in a suite, and reports the observed failures.]
48. Assessing Functional Test Coverage
The ReModeler tool from the
QUASAR team takes an
innovative model-based
approach to represent this
kind of testing coverage
The color represents the
percentage of the scenarios of
each use case that were
executed by a given test suite
49. Test Techniques
Based on tester's intuition and experience
Ad hoc testing
Perhaps the most widely practiced technique remains
ad hoc testing
Tests are derived relying on the software engineer’s
skill, intuition, and experience with similar programs
Ad hoc testing might be useful for identifying special
tests, those not easily captured by formalized
techniques
50. Test Techniques
Based on tester's intuition and experience
Exploratory testing
Simultaneous learning, test design and execution
The tests are not defined in advance in an established test
plan, but are dynamically designed, executed, and modified
The effectiveness of this approach relies on the tester's
knowledge, which can be derived from many sources:
observed product behavior during the testing of previous versions
familiarity with the application, the platform, the failure process
the types of possible faults and failures
the risk associated with a particular product
…
51. Test Techniques
Specification-based
Equivalence partitioning
Boundary-value analysis
Decision table
Finite-state machine-based
Testing from formal specifications
Random testing
52. Test Techniques – Specification-based
Equivalence partitioning
The input domain is subdivided into a collection
of subsets, or equivalence classes, which are
deemed equivalent according to a specified
relation, and a representative set of tests
(sometimes only one) is taken from each class.
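A minimal sketch of this idea in Python; the exam-mark classifier and its partitions are hypothetical, chosen only to illustrate one representative per class.

```python
# Hypothetical spec: classify an exam mark (0-100) as "fail" (< 50),
# "pass" (50-84) or "distinction" (85-100); anything else is invalid.
def classify(mark):
    if not 0 <= mark <= 100:
        return "invalid"
    if mark < 50:
        return "fail"
    if mark < 85:
        return "pass"
    return "distinction"

# Equivalence partitioning: one representative test per class is assumed
# to stand for every input in that class.
partitions = {
    "invalid-low":  (-5,  "invalid"),
    "fail":         (30,  "fail"),
    "pass":         (70,  "pass"),
    "distinction":  (90,  "distinction"),
    "invalid-high": (120, "invalid"),
}
results = {name: classify(x) == expected
           for name, (x, expected) in partitions.items()}
```

Five test cases cover the whole (infinite, if we count invalid inputs) input domain, one per class.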
53. Test Techniques – Specification-based
Boundary-value analysis
Test cases are chosen on and near the boundaries of
the input domain of variables, with the underlying
rationale that many faults tend to concentrate near the
extreme values of inputs
An extension of this technique is robustness testing,
wherein test cases are also chosen outside the input
domain of variables, to test program robustness to
unexpected or erroneous inputs
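For a single variable with a known valid range, the boundary (and robustness) test points can be generated mechanically. A small sketch; the [0, 100] range and the step of 1 are illustrative assumptions.

```python
# For an input variable with valid range [lo, hi], boundary-value analysis
# picks test points on and immediately around each boundary.
def boundary_values(lo, hi, step=1):
    """Candidate test inputs at and around the domain boundaries."""
    return [lo - step, lo, lo + step, hi - step, hi, hi + step]

# Robustness testing keeps the out-of-range points (lo - step, hi + step)
# to check the program's reaction to erroneous inputs.
cases = boundary_values(0, 100)
```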
55. Triangle Classifier
Classic problem proposed in [Myers79] and
[Hetzel84]:
Distinct classification criteria:
lengths of the sides - equilateral, isosceles, or scalene
largest angle - acute, right, or obtuse
56. Triangle Classifier: specification
Input:
dimensions of the three sides: three numbers,
separated by commas (or two angles instead).
Algorithm:
If the length of one side is greater than the sum
of the other two, then write "Not a triangle!"
If it is a valid triangle, then write its classification:
according to the largest angle - obtuse, right, or
acute
according to the side lengths - scalene, isosceles, or
equilateral
57. Triangle Classifier: exercise
Write a test case battery for the triangle
classifier
For each test case, specify:
input values (including invalid or unexpected
conditions)
corresponding expected output values
Example: 3,4,5 -> scalene, right
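As a reference for the exercise, here is one possible implementation of the classifier itself. This is a sketch, not an official solution; note that it treats the degenerate case (longest side equal to the sum of the other two) as "not a triangle", whereas the slides list it as an extreme case.

```python
def classify_triangle(a, b, c):
    """Classify a triangle by side lengths and by its largest angle."""
    sides = sorted((a, b, c))
    if sides[0] <= 0:
        return "invalid"
    if sides[2] >= sides[0] + sides[1]:   # degenerate included here
        return "not a triangle"
    # Side lengths: equilateral / isosceles / scalene
    if a == b == c:
        kind = "equilateral"
    elif a == b or b == c or a == c:
        kind = "isosceles"
    else:
        kind = "scalene"
    # Largest angle, via the converse of Pythagoras on the longest side
    x, y, z = sides
    if z * z == x * x + y * y:
        angle = "right"
    elif z * z < x * x + y * y:
        angle = "acute"
    else:
        angle = "obtuse"
    return f"{kind}, {angle}"
```

For instance, `classify_triangle(3, 4, 5)` yields `"scalene, right"`, matching the example above.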
58. Triangle Classifier
equivalence partitioning
For a complete test battery, we need to:
divide the solution space in partitions
identify typical cases for each partition
identify frontier cases
identify extreme cases
identify invalid cases.
Now it is your turn to work ...
Don't turn the page until you have finished!
60. Triangle Classifier: boundary values
4.001, 4, 3.999 -> almost equilateral (scalene, acute)
4.0001, 4, 4 -> almost equilateral (isosceles, acute)
3, 4.9999, 5 -> almost isosceles (scalene, acute)
9, 4.9999, 5 -> almost isosceles (scalene, obtuse)
4.9999, 4, 3 -> almost right (scalene, acute)
5.0001, 4, 3 -> almost right (scalene, obtuse)
1, 1, 1.4141 -> almost right (isosceles, acute)
1, 1, 1.4143 -> almost right (isosceles, obtuse)
61. Triangle Classifier: extreme cases
1, 2, 3 -> line segment!
0, 0, 0 -> point!
Note: extreme cases are not invalid!
62. Triangle Classifier: invalid cases
6, 4, 0 -> null side!
12, 4, 3 -> not a triangle!
5, 3, 2, 5 -> four sides!
2, 5 -> one side missing!
3.45 -> only one side!
(empty input) -> no value!
3, , 4, 6 -> incorrect format
4A, 3, 7 -> invalid value
6, -1, 4 -> negative value
63. Triangle Classifier
As we saw, apparently simple problems often have
subtleties that make testing more complex than
expected!
Boundary values and invalid inputs are the situations
most likely to produce failures
64. Test Techniques – Specification-based
Decision table
Decision tables represent logical relationships between
conditions (roughly, inputs) and actions (roughly,
outputs)
Test cases are systematically derived by considering
every possible combination of conditions and actions
A related technique is cause-effect graphing
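The systematic derivation can be sketched by enumerating every combination of condition values. The loan-approval rules below are a hypothetical example, not from the slides.

```python
from itertools import product

# A decision table as a mapping from condition combinations to the
# expected action (hypothetical loan-approval rules, for illustration):
#   conditions: has_income, good_history
def expected_action(has_income, good_history):
    if has_income and good_history:
        return "approve"
    if has_income:
        return "manual review"
    return "reject"

# Systematic derivation: one test case per combination of conditions.
test_cases = [((inc, hist), expected_action(inc, hist))
              for inc, hist in product([True, False], repeat=2)]
# 2 binary conditions -> 2**2 = 4 test cases
```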
65. Test Techniques – Specification-based
Finite-state machine-based
By modeling a program as a finite state machine,
tests can be selected in order to cover states and
transitions on it
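A tiny sketch with an assumed example (the classic turnstile): the model is a transition table, and a test sequence is judged by the transitions it covers.

```python
# Finite-state model of a turnstile: (state, event) -> next state.
FSM = {
    ("locked",   "coin"): "unlocked",
    ("locked",   "push"): "locked",
    ("unlocked", "push"): "locked",
    ("unlocked", "coin"): "unlocked",
}

def run(events, state="locked"):
    """Drive the model with an event sequence; track covered transitions."""
    covered = set()
    for e in events:
        covered.add((state, e))
        state = FSM[(state, e)]
    return state, covered

# One event sequence chosen to exercise all four transitions:
final_state, covered = run(["push", "coin", "coin", "push"])
all_transitions_covered = covered == set(FSM)
```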
66. Test Techniques – Specification-based
Testing from formal specifications
Giving the specifications in a formal language
allows for the automatic derivation of functional
test cases
At the same time, it provides a reference output, an
oracle, for checking test results
This is an active research topic
67. Test Techniques – Specification-based
Random testing
Tests are generated in a stochastic (non-deterministic)
way
This form of testing falls under the heading of the
specification-based entry, since at least the input
domain must be known, to be able to pick random
points within it
68. Test Techniques – Specification-based
Random testing
We simulate the data input by generating sequences
of values that may occur in practice
This process must be repeated over and over, since only
in the long run can a large share of the possible input
combinations be covered
This approach is only feasible with a tool, a test case
generator - its input is some description of the
possible input values, their sequencing, and their
probability of occurrence
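A minimal test case generator of this kind can be sketched as below. The event names and probabilities are illustrative assumptions.

```python
import random

# Description of the possible input values and their probability of
# occurrence (an operational-style profile, illustrative numbers).
profile = {
    "login":  0.6,
    "query":  0.3,
    "logout": 0.1,
}

def generate_sequence(n, seed=42):
    """Stochastically generate an input sequence following the profile."""
    random.seed(seed)
    events = list(profile)
    weights = list(profile.values())
    return random.choices(events, weights=weights, k=n)

seq = generate_sequence(1000)
# In the long run the generated frequencies approach the profile.
login_share = seq.count("login") / len(seq)
```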
69. Test Techniques – Specification-based
Random testing
Random tests are often used to test compilers,
through the generation of random programs
The description of possible input sequences can be made
with BNF (Backus Naur Form)
Random testing can also be used in testing
communications protocol software
The description of possible input sequences can be made
out of the state machines that describe each of the involved
parties
70. Test Techniques
Code-based (aka white box)
Control-flow-based criteria
Data flow-based criteria
71. Test Techniques – Code-based
Control-flow-based criteria
Several testing tools allow the generation of
Control Flow Graphs from source code.
By instrumenting the source code, these tools allow us
to verify graphically the execution of each edge and
node in the graph
72. Test Techniques – Code-based
Control-flow-based criteria
The strongest control-flow-based criterion is path
testing, which aims at executing all entry-to-exit
control-flow paths in the flowgraph
Full path testing is generally not feasible because of
loops
73. Test Techniques – Code-based
Control-flow-based criteria
Control-flow-based coverage criteria are aimed at
covering all the statements or blocks of statements in a
program
Several coverage criteria have been proposed, like
condition/decision coverage
A test battery's coverage is the percentage of the code (e.g.
statements or branches/decisions) exercised by that
battery
Code coverage is a much less stringent criterion than path
coverage
75. Control flow graphs
Are a graphical representation of programs that
captures the ways they can be traversed
during execution
nodes represent decisions
oriented edges represent sets of sequential
instructions
In more complex code segments, the graph looks like
spaghetti, and more tests are needed
77. Example: tax calculation
Consider an IRS tax system that reads annual
income revenues and determines the
corresponding tax due:
If the total income is less than 25K EUROS no tax is
deducted
If it is above that, but less than 100K EUROS, the tax
is 7%
otherwise is 15%
78. Example: tax calculation

Function Calculates_Tax (Int n)
   Array of Int income;
   Int total, tax;
1. total, tax = 0;
2. for i = 1 to n
3. { read(income[i]);
4.   total = total + income[i] };
5. if total >= 25000 then
6.   tax = total * 0.07
7. else if total >= 100000 then
8.   tax = total * 0.15;
9. return(tax)

[The slide also shows the corresponding control flow graph, with nodes
numbered after the statements above.]

Note: the problem solution is wrong, because the condition for the 100K
EURO limit should be tested first. This defect would be caught by
structural testing.
79. Example: how many test cases?
Based on graph theory, Tom McCabe proposed
the cyclomatic complexity metric, which counts the
linearly independent paths and hence the number of
test cases needed for full branch coverage:
v(G) = # edges - # nodes + 2 (one entry plus one exit)
In the current case we obtain:
11 - 9 + 2 = 4 (complete graph)
6 - 4 + 2 = 4 (reduced graph)
Therefore we should be able to produce 4 test cases
that, when applied, would lead to 100% coverage.
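The computation is straightforward once the flowgraph is an edge list. The edge list below is my reconstruction of the complete graph from the slide's numbered statements (1 to 9), so treat it as an assumption.

```python
def cyclomatic_complexity(edges):
    """v(G) = #edges - #nodes + 2, for a connected single-entry,
    single-exit control flow graph given as a list of (src, dst) edges."""
    nodes = {n for edge in edges for n in edge}
    return len(edges) - len(nodes) + 2

# Edge list assumed from the slide's numbered statements (1-9):
flow = [(1, 2), (2, 3), (3, 4), (4, 2),   # for-loop body and back edge
        (2, 5),                           # loop exit
        (5, 6), (5, 7),                   # if total >= 25000
        (6, 9),
        (7, 8), (7, 9),                   # else-if total >= 100000
        (8, 9)]
cyclomatic_complexity(flow)   # 11 - 9 + 2 = 4
```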
80. Call graphs
Are a graphical representation of the
dependencies of functions, procedures, or
methods on each other
nodes (boxes) represent functions, methods, etc.
oriented edges represent the invocations made
This kind of white-box testing is often used for
profiling execution snapshots
81-83. Call graph based testing
[Screenshots of call-graph-based test tools; colors are often used to
represent coverage percentages]
84. Assessing structural test coverage
The ReModeler tool from
the QUASAR team uses
a model-based approach
to represent this kind of
testing coverage
Each class or package is
colored according to the
percentage of executed
methods
85. Test Techniques – Code-based
Data-flow-based criteria
In data-flow-based testing, the control flowgraph is
annotated with information about how the program
variables are defined, used, and killed (undefined)
The strongest criterion, all definition-use paths, requires
that, for each variable, every control flow path segment
from a definition of that variable to a use of that
definition is executed
In order to reduce the number of paths required, weaker
strategies such as all-definitions and all-uses are
employed
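Definition-use pairs can be illustrated on a toy straight-line listing. Real data-flow tools work on the annotated control flow graph; the list-of-events representation here is a deliberate simplification of mine.

```python
# Toy representation of def/use events for one variable in the tax
# example: (line number, event kind, variable name).
listing = [
    (1, "def", "total"),   # total = 0
    (2, "use", "total"),   # total = total + income[i]: reads total ...
    (2, "def", "total"),   # ... then redefines it
    (3, "use", "total"),   # if total >= 25000
]

def def_use_pairs(events, var):
    """Collect (definition line, use line) pairs for one variable."""
    pairs, last_def = [], None
    for line, kind, v in events:
        if v != var:
            continue
        if kind == "def":
            last_def = line
        elif last_def is not None:
            pairs.append((last_def, line))
    return pairs

pairs = def_use_pairs(listing, "total")
```

The all-definition-use-paths criterion would require a test executing each such pair along every path segment between definition and use.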
86. Test Techniques
Fault-based
With different degrees of formalization, fault-
based testing techniques devise test cases
specifically aimed at revealing categories of likely
or predefined faults
Two main techniques exist:
Error guessing
Mutation testing
87. Test Techniques – Fault-based
Error guessing
In error guessing, test cases are specifically
designed by software engineers trying to figure
out the most plausible faults in a given program
A good source of information is the history of
faults discovered in earlier projects, as well as
the software engineer’s expertise
88. Test Techniques – Fault-based
Mutation testing
A mutant is a slightly modified version of the program under test, differing from it by a
small, syntactic change
Every test case exercises both the original and all generated mutants: if a test case is
successful in identifying the difference between the program and a mutant, the latter is
said to be “killed”
Originally conceived as a technique to evaluate a test set, mutation testing is also a
testing criterion in itself: either tests are randomly generated until enough mutants have
been killed, or tests are specifically designed to kill surviving mutants
In the latter case, mutation testing can also be categorized as a code-based technique
The underlying assumption of mutation testing, the coupling effect, is that by looking for
simple syntactic faults, more complex but real faults will be found
For the technique to be effective, a large number of mutants must be automatically
derived in a systematic way.
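A minimal illustration of the mechanics: the mutant differs from the original by one syntactic change (`>=` becomes `>`), and a test case "kills" it when the two versions produce different results. The age-check function is a hypothetical example.

```python
# Original program fragment under test (hypothetical example).
def original(x):
    return "adult" if x >= 18 else "minor"

# A mutant: one small syntactic change (>= mutated to >).
def mutant(x):
    return "adult" if x > 18 else "minor"

def kills(test_input):
    """A test kills the mutant if the two versions disagree on it."""
    return original(test_input) != mutant(test_input)

kills(30)   # False: this test cannot tell the versions apart
kills(18)   # True: the boundary test kills the mutant
```

A test set that leaves many mutants alive is, by this measure, too weak.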
89. Test Techniques
Usage-based
Operational profile
Software Reliability Engineered Testing
90. Test Techniques – Usage-based
Operational profile
In testing for reliability evaluation, the test
environment must reproduce the operational
environment of the software as closely as
possible
The idea is to infer, from the observed test
results, the future reliability of the software when
in actual use
To do this, inputs are assigned a probability
distribution, or profile, according to their
occurrence in actual operation
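The inference step can be sketched as follows; the operations, their probabilities, and the injected fault are all illustrative assumptions.

```python
import random

random.seed(1)

# Hypothetical system: the rarely used "export" operation always fails.
def system_under_test(op):
    return op != "export"   # True = correct service delivered

# Assumed operational profile: probability of each operation in actual use.
profile = [("view", 0.7), ("edit", 0.28), ("export", 0.02)]
ops, weights = zip(*profile)

# Draw test cases according to the profile and observe pass/fail.
runs = [system_under_test(op)
        for op in random.choices(ops, weights=weights, k=10_000)]
estimated_reliability = sum(runs) / len(runs)
# Close to 0.98 here: failures occur only on "export" (2% of actual use).
```

Note how the estimate reflects usage, not code: the same fault in a heavily used operation would yield a far lower reliability figure.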
91. Test Techniques – Usage-based
Software Reliability Engineered Testing
Software Reliability Engineered Testing (SRET)
is a testing method encompassing the whole
development process, whereby testing is
“designed and guided by reliability objectives and
expected relative usage and criticality of different
functions in the field.”
92. Test Techniques
Based on nature of application
Object-oriented testing
Component-based testing
Web-based testing
GUI testing
Testing of concurrent programs
Protocol conformance testing
Testing of real-time systems
Testing of safety-critical systems
93. Test Techniques
Selecting and combining techniques
Specification-based and code-based test
techniques are often contrasted as functional vs.
structural testing
These two approaches to test selection are not to
be seen as alternatives but rather as complementary
in fact, they use different sources of information and
have proved to highlight different kinds of problems
they should be used in combination, depending on
budgetary considerations
94. Automatic Construction of Test Cases
Test generation is possible from:
model-based specifications
algebraic (formal) specifications
Slicing and branch analysis techniques are
used to identify partitions
95. Automatic Construction of Test Cases
TTCN (Tree and Tabular Combined Notation)
1983: ISO TC 97/SC 16 and later in ISO/IEC JTC 1/SC
21 and in CCITT SG VII as part of the work on OSI
conformance testing methodology and framework
Has been widely used since then for describing protocol
conformance test suites in standardization organizations such
as ITU-T, ISO/IEC, ATM Forum, ETSI and industry
1998: TTCN-2, in ISO/IEC and in ITU-T
New features: concurrency mechanism, concepts of module and package,
manipulation of ASN.1 encoding
TTCN-3: the current version (renamed Testing and Test Control
Notation), later standardized by ETSI
96. Automatic Construction of Test Cases
TTCN (Tree and Tabular Combined Notation)
TTCN is a standardized test case format
The main characteristics of TTCN are that:
its tabular notation allows the user to describe, easily and
naturally, in tree form all possible scenarios of stimuli and
reactions exchanged between the tester and the target
its verdict system is designed to facilitate a conformance
judgment on whether the test result agrees with the test
purpose, and
it provides a mechanism to describe appropriate constraints on
received messages, so that conformance of the received
messages can be automatically evaluated against the test
purpose
97. TTCN-3 example
The following is an example of an Abstract Test Suite (ATS)
where we are trying to test a weather service
The tester sends a request consisting of a location, a date
and a kind of report to some on-line weather service, and
receives a response with confirmation of the location and
date, along with the temperature, the wind velocity and the
weather conditions at this location
98. TTCN-3 example
A TTCN-3 ATS is always composed of four sections:
1. type definitions: data structures as in C, but also an easy-to-use
concept of lists and sets
2. template definitions: a TTCN-3 template merges two separate
concepts into one:
test data definition
test data matching rules
3. test case definitions: specify the sequences, and alternatives of
sequences, of messages sent to and received from the System Under
Test (SUT)
4. test control definitions: define the order of execution of the various
test cases
99. Sample TTCN-3 Abstract Test Suite
module SimpleWeather {

    type record weatherRequest {
        charstring location,
        charstring date,
        charstring kind
    }

    type record weatherResponse {
        charstring location,
        charstring date,
        charstring kind,
        integer temperature,
        integer windVelocity,
        charstring conditions
    }

    template weatherRequest ParisWeekendWeatherRequest := {
        location := "Paris",
        date := "15/06/2006",
        kind := "actual"
    }

    template weatherResponse ParisResponse := {
        location := "Paris",
        date := "15/06/2006",
        kind := "actual",
        temperature := (15..30),
        windVelocity := (0..20),
        conditions := "sunny"
    }
100. Sample TTCN-3 Abstract Test Suite
    type port weatherPort message {
        in weatherResponse;
        out weatherRequest;
    }

    type component MTCType {
        port weatherPort weatherOffice;
    }

    testcase testWeather() runs on MTCType {
        weatherOffice.send(ParisWeekendWeatherRequest);
        alt {
            [] weatherOffice.receive(ParisResponse) {
                setverdict(pass)
            }
            [] weatherOffice.receive {
                setverdict(fail)
            }
        }
    }

    control {
        execute(testWeather())
    }
}
101. Automatic Construction of Test Cases
Implies the resolution of several problems:
program decomposition (slicing)
classification of partitions found
selection of test paths
test case generation to exercise those paths
validation of generated cases
The last problem is solved by the construction of an oracle
(software) whose function is to determine whether, for a given
test, the program responds according to its specification.
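As a minimal illustration (a Python sketch, not part of the slides), an oracle can be an executable specification that assigns a pass/fail verdict to each generated test; the function names here are hypothetical:

```python
def spec_abs(x):
    # Executable specification: the required behaviour, stated
    # independently of the implementation under test.
    return x if x >= 0 else -x

def program_under_test(x):
    # The implementation whose responses we want to check.
    return abs(x)

def oracle(test_input):
    # The oracle decides whether, for a given test, the program
    # responds according to its specification.
    expected = spec_abs(test_input)
    actual = program_under_test(test_input)
    return "pass" if actual == expected else "fail"

verdicts = [oracle(x) for x in (-3, 0, 7)]
```

In practice the hard part is obtaining `spec_abs` at all; automatic test generation from model-based or algebraic specifications (previous slides) is one way to derive it.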
102. Automatic Construction of Test Cases
An example
Who? » Siemens + Swiss PTT
What? » SAMSTAG (Sdl And Msc baSed Test cAse Generation)
How to model system & tests?
Target system (SDL)
Test scenarios (MSC)
SDL - Specification and Description Language [ITU Z.100]
MSC - Message Sequence Chart [ITU Z.120]
TTCN (Tree and Tabular Combined Notation) [ISO/IEC JTC1/SC21]
103. Automatic Construction of Test Cases
Some tools
Validator (Aonix)
SoftTest (?)
ObjectGEODE TestComposer
(Verilog)
TestFactory (Rational)
104. Summary
Software Testing Fundamentals
Test Levels
Test Techniques
Test-related Measures
Test Process
105. Test-related Measures
Evaluation of the program under test
Program measurements to aid in planning and
designing testing
To guide testing we may use measures based
on:
program size
E.g. SLOC or function points
program structure
E.g. McCabe’s metrics or frequency with which modules
call each other
106. Test-related Measures
Evaluation of the program under test
Fault types, classification, and statistics
Testing literature is rich in classifications and
taxonomies of faults
To make testing more effective, it is important to know:
which types of faults could be found in the software under test
the relative frequency with which these faults have occurred in
the past
This information can be very useful in making quality
predictions, as well as for process improvement
107. Test-related Measures
Evaluation of the program under test
Fault density
A program under test can be assessed by counting and
classifying the discovered faults by their types
For each fault class, fault density is measured as the
ratio between the number of faults found and the size of
the program
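The fault-density computation can be sketched in a few lines of Python (the fault classes and program size below are invented for illustration):

```python
def fault_density(faults_found, size_ksloc):
    # Fault density for one fault class: faults found / program size
    # (size in KSLOC here; function points would serve equally well).
    if size_ksloc <= 0:
        raise ValueError("size must be positive")
    return faults_found / size_ksloc

# Discovered faults, counted and classified by type
faults_by_class = {"logic": 12, "interface": 5, "data": 3}
program_size_ksloc = 8.0

densities = {cls: fault_density(n, program_size_ksloc)
             for cls, n in faults_by_class.items()}
# e.g. 12 logic faults in 8 KSLOC -> 1.5 faults/KSLOC
```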
108. Test-related Measures
Evaluation of the tests performed
Coverage/thoroughness measures
Several test adequacy criteria require that the test cases
systematically exercise a set of elements identified in the program
or in the specifications
To evaluate the thoroughness of the executed tests, testers can
monitor the elements covered, so that they can dynamically
measure the ratio between covered elements and their total
number
For example, it is possible to measure the percentage of covered branches
in the program flowgraph, or that of the functional requirements exercised
among those listed in the specifications document
Code-based adequacy criteria require appropriate
instrumentation of the program under test
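The instrumentation idea can be sketched in Python: probes record which branches execute, and the coverage measure is the ratio between covered elements and their total number (the probe and branch identifiers are hypothetical):

```python
covered = set()

def hit(branch_id):
    # Instrumentation probe: record that a branch was taken.
    covered.add(branch_id)

def classify(n):
    # Program under test, instrumented on both branches of its decision.
    if n >= 0:
        hit("b1-true")
        return "non-negative"
    else:
        hit("b1-false")
        return "negative"

ALL_BRANCHES = {"b1-true", "b1-false"}

def branch_coverage():
    # Ratio between covered branches and their total number.
    return len(covered & ALL_BRANCHES) / len(ALL_BRANCHES)

classify(5)
half = branch_coverage()   # only the true branch exercised so far
classify(-2)
full = branch_coverage()   # both branches now exercised
```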
109. Example:
Static and dynamic metrics
used to guide white-box testing
110. Static Metrics Collection
Some examples collected by White-box tools:
– Number of private, protected and public attributes
– Overloading, overriding and visibility of operations
– Comments density (e.g. JavaDoc comments per class)
– Inheritance metrics (e.g. depth, width, inherited features)
– MOOSE metrics (Chidamber and Kemerer)
– MOOD metrics (Brito e Abreu)
– QMOOD metrics (Jagdish Bansiya)
111. Static Metrics - ex: Cantata++
112. Dynamic Metrics Collection
Class, Operation, Branch, Exception clause coverage
Example: Multiple Condition Coverage
Measures whether each combination of condition
outcomes for a decision has been exercised; are f() and
g() called in the following code extract?
if ((a == b || f()) && (c == d || g()))
x();
else
y();
Note that the expression can be evaluated to true without
calling f() or g().
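The same point can be checked in Python, whose `or` and `and` short-circuit exactly like C's `||` and `&&`; the instrumented `f()` and `g()` below record whether they were ever called:

```python
calls = []

def f():
    calls.append("f")
    return True

def g():
    calls.append("g")
    return True

def decide(a, b, c, d):
    # Same decision as the slide's code extract.
    if (a == b or f()) and (c == d or g()):
        return "x"
    return "y"

result1 = decide(1, 1, 2, 2)     # a == b and c == d: decision is true
calls_after_first = list(calls)  # ... yet f() and g() never ran

result2 = decide(1, 2, 3, 3)     # a != b forces f(); c == d still skips g()
```

Branch coverage of the decision is therefore possible with `f()` and `g()` never exercised, which is exactly what multiple condition coverage is designed to detect.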
113. Test-related Measures
Evaluation of the tests performed
Fault seeding
Some faults are artificially introduced into the program before test
When the tests are executed, some of these seeded faults will be
revealed, and possibly some faults which were already there will
be as well
depending on which of the artificial faults are discovered, and how many,
testing effectiveness can be evaluated, and the remaining number of
genuine faults can be estimated
Problems:
distribution and representativeness of seeded faults relative to original ones
small sample size on which any extrapolations are based
inserting faults into software involves the obvious risk of leaving them there
114. Test-related Measures
Evaluation of the tests performed
Mutation score
In mutation testing, the ratio of killed mutants to
the total number of generated mutants can be a
measure of the effectiveness of the executed test
set
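A toy mutation-testing sketch in Python (the hand-written mutants stand in for what a real mutation tool would generate automatically):

```python
def original_max(a, b):
    # Program under test
    return a if a > b else b

# Hand-written mutants: single mutations of the `>` operator
mutants = [
    lambda a, b: a if a < b else b,    # > mutated to <
    lambda a, b: a if a >= b else b,   # > mutated to >= (equivalent mutant)
    lambda a, b: a if a == b else b,   # > mutated to ==
]

test_inputs = [(1, 2), (2, 1), (3, 3)]

def killed(mutant):
    # A mutant is killed when some test case distinguishes its
    # output from the original program's output.
    return any(mutant(a, b) != original_max(a, b) for a, b in test_inputs)

score = sum(killed(m) for m in mutants) / len(mutants)
# The >= mutant returns the same value as the original on every input
# (they only branch differently when a == b), so no test can kill it:
# equivalent mutants are a known limitation of the mutation score.
```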
115. Summary
Software Testing Fundamentals
Test Levels
Test Techniques
Test-related Measures
Test Process
116. Test Process – Practical Considerations
Attitudes / Egoless programming
A very important component of successful testing is a
collaborative attitude towards testing and quality
assurance activities
Managers have a key role in fostering a generally
favorable reception towards failure discovery during
development and maintenance
for instance, by preventing a mindset of code ownership
among programmers, so that they will not feel responsible for
failures revealed by their code
117. Test Process – Practical Considerations
Test guides
The testing phases could be guided by various
aims, for example:
risk-based testing, which uses the product risks
to prioritize and focus the test strategy
scenario-based testing, in which test cases are
defined based on specified software scenarios
118. Test Process – Practical Considerations
Test documentation and work products
Documentation is an integral part of the formalization of
the test process
Test documents may include:
Test Plan
Test Design Specification
Test Procedure Specification
Test Case Specification
Test Log
Test Incident or Problem Report
119. Test Process – Practical Considerations
Internal vs. independent test team
External members may bring an unbiased, independent
perspective
The decision on an internal, external or blended team
should be based upon considerations of:
cost
schedule
maturity levels of the involved organizations
criticality of the application
121. Test Process – Practical Considerations
Cost/effort estimation and other process measures
Several measures related to the resources spent
on testing, as well as to the relative fault-finding
effectiveness of the various test phases, are
used by managers to control and improve the
test process, such as:
number of test cases specified
number of test cases executed
number of test cases passed
number of test cases failed
122. Test Process – Practical Considerations
Cost/effort estimation and other process measures
Evaluation of test phase reports can be combined with
root cause analysis to evaluate test process
effectiveness in finding faults as early as possible
Such an evaluation could be associated with the analysis of
risks
Moreover, the resources that are worth spending on
testing should be commensurate with the use/criticality
of the application:
different techniques have different costs and yield different
levels of confidence in product reliability
123. Test Process – Practical Considerations
Termination
A decision must be made as to how much testing is
enough and when a test stage can be terminated
Thoroughness measures, such as …
achieved code coverage
functional completeness
estimates of fault density or of operational reliability
… provide useful support, but are not sufficient in
themselves
124. Test Process – Practical Considerations
Termination
The decision also involves considerations about the
costs and risks incurred by the potential for remaining
failures, as opposed to the costs implied by continuing
to test
There are two possible approaches to this problem
Termination based on test efficiency
Termination based on test effectiveness
125. Test efficiency-based termination
To decide on test termination or to compare
distinct V&V procedures and tools we need to
know their Efficiency
walkthroughs, inspections, black-box, white-box ?
Efficiency = work produced / resources spent
» Test efficiency = defects found / effort spent
= benefit / cost
126. Test efficiency-based termination
As testing proceeds …
defect density decreases
test efficiency decreases - more and more
effort is spent (cost) to find new defects
(benefit)
reliability grows - probability that users
experience defect effects (failures) reduces
128. Case Study
[Chart: Defects found per week, weeks 1-16]
[Chart: Cumulative Defects (Benefit), weeks 1-16]
129. Case Study
[Chart: Cost / Benefit Ratio (Test Efficiency), weeks 1-16]
[Chart: Benefit / Cost Ratio (Test Efficiency), weeks 1-16]
These ratios can be used to set test stopping thresholds
130. Test effectiveness-based termination
"Testing can only show the presence of bugs but never their absence"
Dijkstra
Is this statement correct ?
131. Test effectiveness-based termination
Test effectiveness makes it possible to decide
when tests should be stopped
the test plan should indicate that level (e.g. 90%)
Effectiveness = achieved effect / desired effect
» Test effectiveness = percentage of total defects found
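A sketch of this stopping rule in Python (the defect counts are invented; estimating the total number of defects is the subject of the defect injection slides that follow):

```python
def stop_testing(defects_found, estimated_total_defects, target=0.90):
    # Test effectiveness = percentage of total defects found;
    # testing stops once it reaches the level set in the test plan.
    effectiveness = defects_found / estimated_total_defects
    return effectiveness >= target

reached = stop_testing(450, 500)   # 90% of the defects found -> stop
not_yet = stop_testing(400, 500)   # only 80% found -> keep testing
```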
132. Test Effectiveness - Case Study
[Chart: Weekly % of Defects Found (Weekly Test Effectiveness), weeks 1-16]
[Chart: Cumulative % of Defects Found (Cumulative Test Effectiveness), weeks 1-16]
Conclusion: it is not worth testing beyond a certain point; that
point can be based on a given effectiveness threshold
133. Test Effectiveness
To calculate it we need to know:
the total number of defects
or the number of remaining defects
total = found + remaining
Remaining defects can be known a posteriori
Simply wait for users to find them (not a good choice ...)
Even then, we have to set an observation period
Obs. period = f (system complexity, transaction rate)
some defects may only cause failures after intensive use
134. Defect Injection Technique
This technique allows estimating remaining defects and
therefore obtaining test effectiveness
1. A member of the development team (not necessarily the
producer) deliberately inserts some defects into the target
system, neither clustered together nor hidden in a tricky way.
2. He documents and describes the location of the injected
defects and delivers that information to the project leader.
3. The target system is passed on to the testing team.
4. Test process effectiveness is verified through the percentage of
injected defects that were found.
5. Remaining defects (not injected) are then estimated
135. Defect Injection (continued)
Before the beginning of the test we have:
DOi Original Defects (unknown !)
DIi Injected Defects (known)
At all moments after the beginning of the test we have:
DOe Original defects found
DIe Injected defects found
DOr = DOi - DOe Original defects remaining (not found)
DIr = DIi - DIe Injected defects remaining (not found)
136. Defect Injection (continued)
Let:
ERO = DOe / DOi Effectiveness in Original Defects Removal (unknown !)
ERI = DIe / DIi Effectiveness in Injected Defects Removal (known !)
Considering ERO ≈ ERI, which will be close to the truth if the
number of injected defects is sufficiently large:
DOi = DOe / ERO ≈ DOe / ERI
DOr = DOi ( 1 - ERO ) = DOe ( 1 / ERO - 1 ) ≈ DOe ( 1 / ERI - 1 )
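These estimates are easy to compute; a Python sketch with invented counts:

```python
def estimate_original_defects(doe, die, dii):
    # DOe: original defects found, DIe: injected defects found,
    # DIi: injected defects (known in advance).
    # Assuming ERO is close to ERI = DIe / DIi:
    #   DOi is about DOe / ERI           (estimated original defects)
    #   DOr is about DOe * (1/ERI - 1)   (estimated remaining defects)
    eri = die / dii
    doi = doe / eri
    dor = doe * (1 / eri - 1)
    return doi, dor

doi, dor = estimate_original_defects(doe=80, die=40, dii=50)
# ERI = 0.8, so an estimated 100 original defects, 20 of them remaining
```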
137. Test Process – Test activities
Defect tracking
Detected defects can be analyzed to determine:
when they were introduced into the software
what kind of error caused them to be created
E.g. poorly defined requirements, incorrect variable
declaration, memory leak, programming syntax error, …
when they could have been first observed in the
software
138. Test Process – Test activities
Defect tracking
Defect-tracking information is used to determine
what aspects of software engineering need
improvement and how effective previous
analyses and testing have been
This causal analysis allows introducing prevention
actions
Prevention is better than the cure and is a typical
characteristic of higher levels of maturity in the
software development process
139. Defect prevention in CMMI
140. Bibliography
[Bec02] K. Beck, Test-Driven Development by Example, Addison-Wesley, 2002.
[Bei90] B. Beizer, Software Testing Techniques, International Thomson Press, 1990, Chap. 1-3, 5, 7s4, 10s3, 11, 13.
[Jor02] P. C. Jorgensen, Software Testing: A Craftsman's Approach, second edition, CRC Press, 2004, Chap. 2, 5-10, 12-15, 17, 20.
[Kan99] C. Kaner, J. Falk, and H.Q. Nguyen, Testing Computer Software, 2nd ed., John Wiley & Sons, 1999, Chaps. 1, 2, 5-8, 11-13, 15.
[Kan01] C. Kaner, J. Bach, and B. Pettichord, Lessons Learned in Software Testing, Wiley Computer Publishing, 2001.
[Lyu96] M.R. Lyu, Handbook of Software Reliability Engineering, McGraw-Hill/IEEE, 1996, Chap. 2s2.2, 5-7.
[Per95] W. Perry, Effective Methods for Software Testing, John Wiley & Sons, 1995, Chap. 1-4, 9, 10-12, 17, 19-21.
[Pfl01] S. L. Pfleeger, Software Engineering: Theory and Practice, 2nd ed., Prentice Hall, 2001, Chap. 8, 9.
[Zhu97] H. Zhu, P.A.V. Hall and J.H.R. May, “Software Unit Test Coverage and Adequacy,” ACM Computing Surveys, vol. 29, iss. 4 (Sections 1, 2.2, 3.2, 3.3), Dec. 1997, pp. 366-427.
141. Applicable standards
(IEEE610.12-90) IEEE Std 610.12-1990 (R2002), IEEE Standard Glossary of Software Engineering Terminology, IEEE, 1990.
(IEEE829-98) IEEE Std 829-1998, Standard for Software Test Documentation, IEEE, 1998.
(IEEE982.1-88) IEEE Std 982.1-1988, IEEE Standard Dictionary of Measures to Produce Reliable Software, IEEE, 1988.
(IEEE1008-87) IEEE Std 1008-1987 (R2003), IEEE Standard for Software Unit Testing, IEEE, 1987.
(IEEE1044-93) IEEE Std 1044-1993 (R2002), IEEE Standard for the Classification of Software Anomalies, IEEE, 1993.
(IEEE1228-94) IEEE Std 1228-1994, Standard for Software Safety Plans, IEEE, 1994.
(IEEE12207.0-96) IEEE/EIA 12207.0-1996 // ISO/IEC 12207:1995, Industry Implementation of Int. Std. ISO/IEC 12207:95, Standard for Information Technology - Software Life Cycle Processes, IEEE, 1996.
142. Black-Box Tools - Web Links
JavaStar (http://www.sun.com/workshop/testingtools/javastar.html)
JavaLoad (http://www.sun.com/workshop/testingtools/javaload.html)
VisualTest, Scenario Recorder, Test Suite Manager
(http://www.rational.com/)
SoftTest (http://www.softtest.com/pages/prod_st.htm)
AutoTester (http://www.autotester.com/)
WinRunner (http://www.merc-int.com/products/winrunguide.html)
LoadRunner (http://www.merc-int.com/products/loadrunguide.html)
QuickTest (http://www.mercury.com)
TestComplete (http://www.automatedqa.com)
S-Unit test framework (http://sunit.sourceforge.net)
eValid™ Automated Web Testing Suite (http://www.soft.com/eValid/)