Rapid Software Testing: Reporting

MP
PM Tutorial
9/30/2013 1:00:00 PM

"Rapid Software Testing:
Reporting"
Presented by:
James Bach
Satisfice Inc

Brought to you by:

340 Corporate Way, Suite 300, Orange Park, FL 32073
888-268-8770 ∙ 904-278-0524 ∙ sqeinfo@sqe.com ∙ www.sqe.com

James Bach
Satisfice, Inc.
James Bach is founder and principal consultant of Satisfice, Inc., a software testing and quality
assurance company. In the eighties, James cut his teeth as a programmer, tester, and SQA
manager in Silicon Valley in the world of market-driven software development. For nearly ten
years, he has traveled the world teaching rapid software testing skills and serving as an expert
witness on court cases involving software testing.

james@satisfice.com
www.satisfice.com

Rapid Testing
Rapid testing is a mind-set
and a skill-set of testing
focused on how to do testing
more quickly,
less expensively,
with excellent results.
This is a general testing methodology.
It adapts to any kind of project or product.

The Premises of Rapid Testing
1. Software projects and products are relationships between people, who are creatures both of emotion and rational thought.
2. Each project occurs under conditions of uncertainty and time pressure.
3. Despite our best hopes and intentions, some degree of inexperience, carelessness, and incompetence is normal.
4. A test is an activity; it is performance, not artifacts.
5. Testing’s purpose is to discover the status of the product and any threats to its value, so that our clients can make informed decisions about it.
6. We commit to performing credible, cost-effective testing, and we will inform our clients of anything that threatens that commitment.
7. We will not knowingly or negligently mislead our clients and colleagues.
8. Testers accept responsibility for the quality of their work, although they cannot control the quality of the product.

What is a test report?


A test report is any description, explanation, or justification of the
status of a test project.



A comprehensive test report is all of those things together.



A professional test report is one competently, thoughtfully, and
ethically designed to serve your clients in that context.



A test report isn’t “just the facts.” It’s a story about facts.

Learn to tell the testing story!

Advice for Test Reporting









Build credibility (by being credible).
Know the context of your tests (test framing).
Never use a number out of context (e.g. no test case counts).
Highlight general test activities (put tests in context).
Highlight product risk (put bugs in context).
Practice “safety language” (avoid misleading speech).
Tell a three-level testing story (product status, how you tested, value of the testing).
Don’t waste people’s time (fit the report to the context).

The First Law of Reporting:
Be Credible!


Unless you are credible:
They won’t listen to uncomfortable information.
They’ll assume you’re mistaken about surprising information.
They’ll assume you’re exaggerating about risks.
They’ll micro-manage your reporting.













Actually care about the project.
Actually care about people on the project.
Actually know how to do your job.
Do not tell lies or exaggerate.
Sweat the details in your own work.
Gain experience.
Study the technology.
Read all documents carefully.
Find things to appreciate about the work of others.
Acknowledge mistakes, correct them and learn from them.
Keep a journal and become the historian of your project.











A Narrative Model of Testing




This is a map of the Rapid
Testing methodology that
I teach.
It is organized in the
structure of a story,
because story construction
is at the heart of what it
means to test.











Let’s Count Unicorns!

Do you know what a
Unicorn is? Okay.
Answer this question:
How many
unicorns will
fit into your
cubicle?

In the absence of context…
test case counts mean NOTHING!
How much testing is 40 test cases?
How much is 400?
How about 40,000 test cases?


“Pass Rate” is a Stupid Metric.
[Chart: pass rate, on a scale from 0 to 1, plotted by date from 2/1 to 3/5]

You shouldn’t take
test case counts seriously because…











Test cases are not independent.
Test cases are not interchangeable.
Test cases vary widely in value from case to case, tester to
tester, product to product, project to project, test technique to
test technique, and over time.
Test case design is subjective, so counts are easy to inflate.
Test cases do not, and cannot, capture all the testing that
occurs (example: bug investigation).
Testers often don’t follow the test cases, anyway.
Automated test cases are fundamentally different from sapiently
executed tests.
Test cases represent what’s easy to put into a test case.
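
The point that test cases are not interchangeable can be shown with a little arithmetic. The sketch below is only an illustration: the suite contents and the `pass_rate` helper are hypothetical, not taken from any real project.

```python
def pass_rate(results):
    """Fraction of passing tests; results is a list of (name, passed) pairs."""
    passed = sum(1 for _, ok in results if ok)
    return passed / len(results)

# Hypothetical suites: nine passing checks plus one failure each.
passing = [(f"check {i}", True) for i in range(9)]
suite_a = passing + [("tooltip has a typo", False)]
suite_b = passing + [("saving a file destroys data", False)]

print(pass_rate(suite_a), pass_rate(suite_b))
```

Both suites report the same pass rate, yet the single failure in the second one is far more serious. The number alone cannot tell the two situations apart.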

Testing Dashboard

Updated: 2/21    Build: 38

Area            Effort       C.    Q.
file/edit       high         1
view            low          1+
insert          low          2
format          low          2+
tools           blocked      1
slideshow       low          2
online help     blocked      0
clipart         none         1
converters      none         1
install         start 3/17   0
compatibility   start 3/17   0
general GUI     low          3

Comments (in order, top to bottom):
1345, 1363, 1401
automation broken
crashes: 1406, 1407
animation memory leak
new files not delivered
need help to test...
need help to test...
lab time is scheduled
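
A dashboard like the one above is easy to keep as plain data. The following is a minimal sketch, not the tool actually used: the `AreaStatus` class and both helper functions are hypothetical, and reading the "C." column as a coverage level is an assumption.

```python
from dataclasses import dataclass

@dataclass
class AreaStatus:
    area: str
    effort: str    # "high", "low", "blocked", "none", or a start date
    coverage: str  # the "C." column; reading it as coverage is an assumption

# Area, Effort, and C. values as they appear in the dashboard above.
ROWS = [
    AreaStatus("file/edit", "high", "1"),
    AreaStatus("view", "low", "1+"),
    AreaStatus("insert", "low", "2"),
    AreaStatus("format", "low", "2+"),
    AreaStatus("tools", "blocked", "1"),
    AreaStatus("slideshow", "low", "2"),
    AreaStatus("online help", "blocked", "0"),
    AreaStatus("clipart", "none", "1"),
    AreaStatus("converters", "none", "1"),
    AreaStatus("install", "start 3/17", "0"),
    AreaStatus("compatibility", "start 3/17", "0"),
    AreaStatus("general GUI", "low", "3"),
]

def needs_attention(rows):
    """Areas that are blocked or whose C. value is still 0."""
    return [r.area for r in rows if r.effort == "blocked" or r.coverage == "0"]

def render(rows):
    """Plain-text table, one line per area."""
    header = f"{'Area':<15}{'Effort':<12}C."
    body = [f"{r.area:<15}{r.effort:<12}{r.coverage}" for r in rows]
    return "\n".join([header] + body)

print(render(ROWS))
print(needs_attention(ROWS))
```

Keeping the rows as data means the same source can feed a whiteboard, a text file, or a status email.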












Activity-based test management
is designed to facilitate reporting
Thread-based Test Management:
This means organizing your whole test effort around test
activities that comprise your testing story. You manage
testing AND report status from a mind-map.
Session-based Test Management:
This means organizing testing into “sessions” which are
normalized units of uninterrupted test time. You can count
these more safely.
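
As a rough illustration of counting sessions, the sketch below converts logged blocks of uninterrupted test time into normalized session units per area. The 90-minute nominal session length, the `sessions_by_area` function, and the log entries are all assumptions made for the example.

```python
from collections import defaultdict

SESSION_MINUTES = 90  # assumed nominal session length, not from the source

def sessions_by_area(log):
    """Sum normalized sessions per area from (area, minutes) entries."""
    totals = defaultdict(float)
    for area, minutes in log:
        totals[area] += minutes / SESSION_MINUTES
    return dict(totals)

# Hypothetical log of uninterrupted test time, in minutes.
log = [("install", 90), ("install", 45), ("view", 180)]
print(sessions_by_area(log))
```

Because every session stands for roughly the same amount of focused test time, a total such as "1.5 sessions on install" still needs context, but it is a steadier unit than a count of test cases.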

Visualizing Test Progress











Risk-Based Testing Makes
Reporting More Relevant
Risk Area 1

Status of the product
and what we did to
test it….

Risk Area 2

and what we did to
test it….

Risk Area 3

and what we did to
test it…

(I rarely make a grid like this with a written report, because the artifacts I
use to manage testing, day-to-day, are focused on activities, not risks, and
I would have to create a special document to do a risk-based report.)











Safety Language
(aka “epistemic modalities”)
“Safety language,” in software testing, means qualifying
or otherwise drafting statements of fact so
as to avoid false confidence.
Examples:
I think…
It appears…
So far…
I infer…
It seems…
apparently…
I assumed…


Instead of: “The feature worked.”

Say: “I have not yet seen any
failures in the feature…”
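
A crude way to practice this is to scan a draft for statements that carry no qualifier at all. The sketch below is a naive substring check, offered only as an exercise aid: the function name is hypothetical, and "not yet" is added to the phrase list as an assumption so that the example sentence above is recognized.

```python
# Qualifiers from the examples above, plus "not yet" (an added assumption).
SAFETY_PHRASES = (
    "i think", "it appears", "so far", "i infer",
    "it seems", "apparently", "i assumed", "not yet",
)

def uses_safety_language(statement):
    """True if the statement contains any of the listed qualifiers."""
    text = statement.lower()
    return any(phrase in text for phrase in SAFETY_PHRASES)

print(uses_safety_language("The feature worked"))
print(uses_safety_language("I have not yet seen any failures in the feature"))
```

A check like this cannot judge whether a claim is actually justified. It only flags sentences worth a second look.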

Safety Language In Action











To test is to construct three stories

(plus a bit more)

Level 1: A story about the status of the PRODUCT…
…about how it failed, and how it might fail...
…in ways that matter to your various clients.

Level 2: A story about HOW YOU TESTED it…
…how you configured, operated and observed it…
…about what you haven’t tested, yet…
…and won’t test, at all…

Level 3: A story about the VALUE of the testing…
…what the risks and costs of testing are…
…how testable (or not) the product is…
…things that make testing harder or slower…
…what you need and what you recommend…

Why should I be pleased
with your work?

(Level 3+: A story about the VALUE of the stories.)
…do you know what happened? can you report? does report serve its purpose?











James Bach and Chris Ojaste

12/25/08

james@satisfice.com
godai92@live.com

Incident Report
Analysis and Repair of Kraft “Grate-It Fresh” Parmesan Cheese
Dispenser
Overview
We fixed a broken Kraft “Grate-It Fresh” self-contained disposable parmesan cheese dispensing unit.
This report details the incident, including the problem as it presented to us, analysis of the problem,
and corrective action we took.

Situation and Problem
The investigators (Chris and James) were attending a Christmas banquet at which was served pasta
along with grated parmesan cheese. The cheese was dispensed from a self-contained disposable unit,
inside of which there appeared to be a block of cheese.
“KRAFT Grate-It-Fresh Parmesan Cheese is the easy way to get the bold flavor of
freshly grated Parmesan cheese. This unique and convenient all-in-one package,
with 100% pure Parmesan cheese and a built-in grater, dispenses freshly grated
Parmesan cheese with each easy turn. It’s the most convenient way to top off all
your favorite dishes with the dynamic flavor of freshly grated Parmesan.”
(http://brands.kraftfoods.com/KraftParm/parmProducts.htm)
By rotating the dial on the bottom of the unit in a clockwise fashion, the cheese is
shaved off the block and delivered to the plate by means of gravity. However, our
cheese dispenser was not working. Multiple rotations of the dial delivered no
cheese at all.
Someone had to save Christmas! We resolved to investigate and repair the problem if possible.

Analysis and Repair Process
1. External physical inspection ruled out the possibility of cheese exhaustion as a cause of the
problem. By the weight of the unit and by visual inspection through the plastic case, we determined
that about 1/3 of a block of cheese remained to be grated.
2. Also by visual inspection we determined the apparent mechanism by which the grater works is
consistent with the cheese grater described in US patent 6,412,717. Specifically, a rotatable grating
plate is attached to a threaded spindle that passes through the cheese and through a pressure plate
on the opposite side of the cheese. Rotating the grating plate forces the pressure plate toward
the grating plate by means of the threads on the spindle. This pushes the cheese into the blades of the grating
plate. The grating plate and blades are plastic. The spindle and the pressure plate are also plastic.
The spindle seems to be made of a softer plastic than that of the pressure plate.
3. Experimentation established that the mechanism was functioning at least at a minimal level by
turning the grater in reverse and observing that the pressure plate pulled away from the cheese.
Turning the grater in the correct direction (clockwise) brought the pressure plate back into contact
with the cheese, pushing it into the grater. We then noted an increased resistance to turning
consistent with the pressure being placed on the cheese. However, the pressure approached a
maximum, then eased, as if the pressure plate was slipping on the threads of the spindle. We
conjectured that the threads were stripped.
4. Our first repair strategy was to push the cheese into the grater by hand. We thought that might
move the pressure plate past the point where the spindle threads were stripped (assuming that the
pressure plate itself was not damaged). To get at the cheese, we removed the grating cap with brute
force (surprisingly this did not appear to damage it), which freed the entire mechanism from the
enclosing plastic case. This allowed us to provide a great deal of pressure to the pressure plate, in
addition to that of the damaged threads on the spindle. This strategy failed. No matter how much
pressure we applied, very little cheese came through the grater.
5. This led us to a systematic examination of possible failure mechanisms. Here’s what we came up
with:
The grater blades may be damaged.
The grating plate may be warped so that the grater blades fail to engage.
The shape of the cheese face may cause the grater blades to fail to engage.
6. Visual inspection of the blades and grating plate failed
to corroborate the hypothesis that the problem lay
with the grater mechanism, whereas examination of
the cheese block revealed grooves in the cheese face
that perhaps could account for the blades failing to get
any bite.
7. Our second repair strategy was to remove the
cheese from the spindle, flip it over, and replace it so
that the grater engaged a pristine face of cheese. This
improved the grating by a little bit. At this point we
returned to our first strategy and applied manual force
to the pressure plate. This improved grating
effectiveness dramatically, and slowly moved the
pressure plate past the damaged portion of the spindle.
We then reassembled the unit.

Outcome
The grater appeared to work.
Subsequent web searches on the product name suggested
the probable cause of the initial failure: The downward
facing part of the cheese block dried out and became too
hard to grate. (Interesting that we did not consider the
possibility of dried out cheese in our list of failure modes,
in step 5. However, our repair strategy coincidentally
worked, even though we misunderstood the root cause.)
Other people online have experienced this. Apparently,
the cheese is meant to be used within 14 days of breaking
the seal. This seems like an unrealistic requirement.

Contrast-enhancement of low-res
photo of spindle we were examining,
showing healthy threads below the
region of stripped threads. The
pressure plate (at bottom) now rests
on healthy threads.

Development Notes on “Incident Report”
By James Bach
Overview
I wrote this report as an exercise to help teach the art of performing an investigation and reporting
upon it.
Maybe you are young, inexperienced, or a self-taught thinker. Maybe you’d like to compete better to get
a job doing something that involves problem-solving or rapid learning. If so, then look for opportunities
in your own daily experience to perform an investigation such as this, and write a report about it. Do
several of them, and you will have a portfolio of your work to show prospective employers. Regardless
of your formal educational background, showing examples of your work speaks boldly about what you
can do.
Although this report describes an investigation, the general approach I’ve taken here can be applied to
many kinds of reports.

General Approach to Reporting
I begin with the question “who am I serving with this report?” and then “what is my goal in making this
report?” Usually, I am serving a paying client and my primary goal has to do with helping them solve
some specific problem. That’s a start. In this case, however, my clients are my students and colleagues.

My goal, here, is to successfully tell the story of a thought process. Success means several things:
The reader obtains a clear picture of the investigation.
The reader obtains a useful example of a report.
The reader feels able to contribute to or criticize the investigation, based on the report.
The reader learns how a simple event might become a showcase for scientific thinking.

In writing reports, there is nearly always another goal. The author may not be aware of this goal, but
here it is:
The author’s own reputation as a thinker is enhanced and not diminished.
Remember: every report you write affects how people think about you. Your ability to reason, your eye for
detail, your commitment, your professionalism, your care for others—all of these qualities and more are
being evaluated in the minds of your readers.

I want to write a clean, simple report. I try to minimize clutter and text. I want it to be short, punchy,
and readable. I use formatting to help the reader’s eye find relevant information quickly, but I try to
reduce the number of formatting elements in the document to avoid slipping into visual confusion. I’m
not always sure if I succeed, but that’s my goal.

Speaking of formatting, I used the “modern report” template, from Microsoft Word, as a base. Then I
changed the fonts to Cambria and Calibri. I use Calibri for bold facing, since a non-serif font looks better
in bold and helps to distinguish text from the un-bolded serifed text in the body of paragraphs. Also
notice that when I bold text inside a paragraph, I increase the size by one point. I use bolding for
emphasis, occasionally italics, but never underlining. Underlining is messy and old-fashioned. I often
highlight key ideas with bolding, so that the body of the text will not look like a big gray mass. This
improves readability and browsability.

I want to help the reader come to his own conclusions even if they might differ from mine. To do
that, I include not only my observations, but also information about how my observations were
obtained and how they might be mistaken. I separate my inferences from the observations on which
they are based (example: “by visual inspection and weight…”) and show how one follows from the other. I
also consider including background information that will help the reader make a better assessment of
what I did, such as the references to the patent and to the Kraft website.

The structure of the report should support the thinking the reader needs to do. As I design the
report, I anticipate the questions the reader will have, and arrange for the answers to those questions
to “pop out” from the text. In this case, I felt that a play-by-play narrative of the investigation would
serve that need best.

I want to use professional vocabulary. Although it can be perfectly fine to write a report in an
informal tone, I felt in this case it would be amusing to apply a more formal writing style to this trivial
investigation. I was going for something like the rhetorical tone of an NTSB accident investigation.
Aside from tone, I also wanted to practice “talking like a tester.” That means speaking with extra
precision and objectivity, as compared to casual conversation.

Walkthrough
Let me walk you through the report to show you how I did it and why I did it that way.

This is the masthead that comes with the “modern report” template in Microsoft Word. I like to use a
minimalist approach: author, contact information, and date. In some situations I may include more
information, such as who commissioned the report, or the version number of the report.

Sometimes I struggle with the title of a report. The title is important, because the report may be sitting
on a desk with lots of other papers. The title will be the part that catches the eye first. One way to title
the report would be to make it quite specific. This can be fine for a one-off report, but usually a report I
write is part of a series, or one example from a category of reports. So, I generally prefer a short title
that identifies the type of report this is, then I provide specific information in the sub-title.
Incident report is an okay title. But I fear it’s a little too generic. Incident could mean anything.
Investigation report might be better. I chose “incident” because one typical investigative situation is a
customer coming to a technical support organization with a problem. These are often called incidents.

The purpose of the overview is to communicate the essence of the whole report so that the reader may
decide if it’s worth reading at all. The essence of the report is that we found a problem and fixed it. But I
can’t just write “Overview: we found a problem and fixed it.” I don’t want my report to sound generic—
as if I’ve simply copied the text from another report. Anytime I write something that seems generic, I
want to replace it with something that gives at least a bit of detail that is specific to the situation at
hand. That’s why I named and described the object that we fixed.

Also notice that there is no table of contents in this report. The biggest problem with a table of contents
in a short document is that it conveys the subtle message to the reader that the report is full of fluff that
must be puffed up as much as possible to make it look more impressive. I’m annoyed with tables of
contents in reports that are less than about fifty pages long. I think they are a waste of space. If the
report is more than about 7 or 8 pages long, then I will list the sections of the report in the overview,
but I won’t give page numbers. It’s a simple matter for the reader to find the sections in a short
document.

The reason I describe the situation and problem is to show the focus and motivation of the
investigation. This creates a tension that is resolved in the meat of the report. At the end of the report I
go back to the top and ask myself if I have answered the questions or dealt with the challenges posed in
the situation and problem section.
I initially expected to have separate sections to describe the situation and the problem, but there
seemed to be too little to say about each of those things, individually. Combining them created a
better flow and a critical mass of content.
One of the little challenges in writing this was to describe the object we repaired. After trying to
describe it in original words, I realized that I could use an official description of it, and a few moments
of web searching brought me to the Kraft site. The description was brief enough that I could include it
handily in the report. Anything included must be properly attributed, of course. In this case, the full link
to the web page makes sense to include, so the reader can look up more information.
I took a cell phone picture of the actual unit we repaired, but when I discovered the Kraft website had a
handsome official picture, I used that one instead.

I initially expected to have separate sections for analysis and repair, but as in the case of situation and
problem, I ended up combining them. In this case, analysis and repair activities were intertwined. I
didn’t see a graceful way of detangling them.
I numbered the paragraphs to convey a sense of step-by-step order. In fact, the investigation bounced
around a lot and branched. Reality is complicated, but part of the reporting process is to organize what
happened into a comprehensible narrative. That means the flow of events I report is going to be a bit
simpler than what happened in real life. In a complicated investigation I will often film it or take detailed
notes to preserve the sequence of events.

In a narrative style of reporting, I strive to create anticipation and interest in the mind of the readers.
That keeps them reading and thinking. I want them to follow along and get a sense of the things I
considered, and the false steps I made as well as the productive steps.
The highlight of the first step of the investigation is the method we used to examine the grater. I wrote
“external physical inspection” to distinguish what we did from plausible alternatives such as
disassembling the unit, or reading about the unit online.
Note on phrasing: See the words “cheese exhaustion.” I suppose I could have written “…the unit had run
out of cheese.” That would have been simpler and more accessible, but I was going for a more scientific
tone. I once saw an NTSB report refer to “fuel exhaustion” as a cause of an airplane accident, so I
emulated that.

In order to report credibly about the investigation and repair of a mechanical problem, I needed to
describe the mechanism with sufficient detail to allow the reader to appreciate the situation. As I tried
to do that, I found myself making up my own terms to describe the various parts of the grater. After a
few attempts writing in my own words, I realized that there might be a patent associated with the
grater. That patent may include exactly the description I needed.
I went to Google patent search and quickly discovered a cheese grater from 1978 (patent 4,082,230)
that looked something like the one we had repaired. I thought I would use that patent, until a few
minutes later I thought perhaps I should search for “food grater” or just “grater” instead of specifying a
cheese grater. This is because patents are sometimes written from the most general standpoint possible
in order to maximize the scope of the patent. That search turned up exactly the invention I was looking
for.
I considered pasting the exact description of the invention from the patent into the report. That didn’t
work well. The text was too long and complicated. Therefore, I settled for summarizing it using
technical terms drawn from the two patents.
In making my description, I referred to the patent. That way I have a good reason not to explain the
mechanism in any great detail, since the details are implicitly included by reference.

I tried to make the steps consistent by putting the action first in each step. Each step begins with some
variation of “we did this.” Here the experiment is briefly described. Just enough to create a reasonably
detailed mental image in the minds of readers.

The first repair strategy failed. In a report that seeks to describe only the problem and the solution, it is
not necessary to describe failed strategies. I included it because this report is also concerned with
demonstrating the investigative process itself.

In real life, we did not say “let’s systematically examine all possible failure mechanisms.” What we did
was bat around some ideas while each of us tried to force the cheese through the grater. In retrospect,
however, our chatter seemed equivalent to an open brainstorm of reasons why the product was failing.

The narrative would be incomplete unless I show how we ruled out the various possible causes of the
problem. That’s done in paragraph 6, which leads into the second, successful repair strategy.
The picture of the spindle is crude. It was based on the photo, below, taken with my Blackberry. I should
have photographed the spindle outside of the plastic case. It would have been much sharper. I didn’t
realize I was going to be writing a full report on this incident at the time of the investigation, or I would
have taken (and included) many more photos. Photographs, diagrams, and video bring a wonderful
dimension to investigative reports.
Because the photo of the spindle was so blurry, I used an image enhancement program to play with the
contrast and color balance until I was able to see the threads. Then I added annotation using Microsoft
Paint.

In the first draft of the report, I forgot to include the simplest information about the outcome: that the
grater appeared to be working. On reading through the draft several times, I fixed that.
Only as I was finishing up the report did it occur to me that I could use Google to discover whether
anyone else had been experiencing problems with the Kraft grater. Sure enough there are several
reports online. My first reaction to these was “don’t people have better things to do than to complain
about a trivial food product on their blogs?” and then I remembered that is sort of what I’m doing by
writing this report. Heh heh. People are motivated by lots of different things, I guess.
A troublesome element of the report is that it reveals a major oversight of the investigators: we failed
to consider over-dried cheese as the cause of the problem. This makes us look bad, in a way, but in
another way, including that information as a post-script shows that we might accept our mistakes and
learn from them.

Potential Improvements to the Investigation
It can be difficult to decide how much investigation is enough. We felt satisfied with achieving the
repair of the unit, but we hardly exhausted the possible branches of exploration and learning. Here are
some ideas for what we could have done:
We could attempt to measure the properties of the cheese block to quantify the amount of
drying that has occurred. We could perform experiments to track the drying process. We could
attempt to develop home-spun countermeasures to prevent the drying from taking place or
reverse the drying process, then report on their efficacy.
We could interview the homeowners to determine the history and provenance of the cheese
grating unit. How long had they owned it? When did they first open it?
We could search for more information online about the properties of the product and its
reported problems.
We could contact Kraft directly and ask about the product.
We could try the dried cheese with traditional metal graters to see if part of the blame lies with
the plastic grating plate.
We could have consulted other guests at the dinner.
We could have purchased several units and tested them in parallel.

OEW Case Tool
QA Analysis, 8/26/94

Summary
OEW is a complex application that is fairly stable, although not up to our standards for fit and
finish.
There are no existing tests for the product, only a rudimentary test outline that will need to be
translated from German. One full-time and one part-time tester work on the project. Those testers
are neither trained nor particularly experienced. The vendor’s primary strategy for quality
assurance is a fairly extensive beta test program.
We suggest a minimum of one tester to validate the changes to OEW. We also
suggest that the developer of OEW work onsite with our test team under our
supervision.

Feature Analysis
Complexity

This is a complex application.
8 interesting menus
68 interesting menu items
40 obvious dialogs
5 kinds of windows
27 buttons on the speedbar
120 thousand lines of code

Functionality

This application has substantial functionality.
Code Generation
Code Parsing
Code Diagramming
Build Invocation

Volatility

The changes in the codebase will be minor.
Bug fixes.
Smallish U.I. tweaks.
Disable support for various things, including build invocation.

Operability

The application is ready for testing immediately.
It operates like a late beta or shipping application.
The proposed changes will be unlikely to destabilize the app.

Customers

We expect that large codebases will be generated, parsed or
diagrammed with this application.
About 25% of our beta testers have codebases larger than 200,000 lines.
The parsing capability will encourage customers to import their apps.

Risk Analysis


The risk of catastrophes occurring due to changes in the codebase is small.



The risk that the much larger and probably more demanding Borland market will be dissatisfied with
OEW is significant.

QA Strategies









Get this into beta 2, or send a special beta 2B to our testers who have large codebases.
Find beta bangers with large codebases and have them import into OEW.
Perform rudimentary performance analysis with big codebases.
Bring the existing OEW testers from Germany onsite.
Hire a dedicated OEW tester (contractor, perhaps).
Participate in a doc. and help review.
Translate existing test outline from German.
Perform at least one round of compatibility testing.

Schedule



The QA schedule will track the development schedule.
It may take a little while to recruit a tester.

Issues


Are there international QA issues?

PCE Security
(mind map: PCE Security.mmap, 4/9/2011)

Prioritizing Security Problems
    Levels of Required Access
        1. No access to LAN, access to PCE over Internet
        2. Access to LAN, but no account on PCE
        3. PCE Account, but no rights within account
        4. Rights to some projects, not others
        5. No rights for particular action within project
        6. All rights and access
    Level of Attack
        1. No special knowledge/accidental
        2. Casual hacker knowledge
        3. User level knowledge of PCE
        4. Special hacker knowledge
        5. Developer level knowledge of PCE
    Levels of Damage
    Levels of Responsibility

Attack Vectors
    Web Client
    API
    Server-to-Server Communication
    Database Direct Attack
    MS Project Attack
    LDAP attack
    Man-in-the-middle
    DNS Poisoning
    Shoulder Surfing
    Social Engineering
    Keyloggers and Malware

"Blood in the Water"
    Unconstrained input
    Obscure functions
    Low level error messages
    Technically informative error messages
    Third-party components and interfaces
    Generic O/S features and interfaces
    Default configurations
    Source Code
    Security based on assumption of no malice
    Degrees of freedom in input
    Recent vulnerability disclosure in platform component

Efforts Going Forward
    Testing and Analysis Activities
        Testers must learn security testing basics
        Produce a security-specific test coverage outline
        Document a concise security-specific test strategy
        Consider security implications for testing of each fix and enhancement
        Periodically perform general security regression testing
        Monitor and apply patches to platform elements
        Create installation notes that clearly delineate security issues
    Development Activities
        Explain security architecture to testers.
        Make finding obscure problems easier.
        Consider reviewing Microsoft security design checklists
        Review internal permissions architecture

Security Observed

Testing Activities
    Sniffing/Man-in-Middle Attack
    Documentation Review
    Whitebox Hazard Analysis
    Fingerprinting
    Google Hacking
    Vulnerability Scanning/Lookup
    SQL Injection
    Directory Traversal
    Cross-site Scripting
    Input Constraint Attacks
    HTTP Manipulation
    Session Hijacking
    Permissions Testing

Problems Found

Spot Check Test Report
Prepared by James Bach, Principal Consultant, Satisfice, Inc.

8/14/11

1. Overview
This report describes one day of a paired exploratory survey of the Multi-Phasic Invigorator and
Workstation. This testing was intended to provide a spot check of the formal testing already routinely
performed on this project. The form of testing we used is routinely applied in court proceedings and
occasionally by 3rd-party auditors for this purpose.
Overall, we found important instabilities in the product. Some of them could impair patient safety,
and many would pose a business risk of product recall.
The product has new capabilities since August, but it has not advanced much in terms of stability since
then. The nature of the problems we found, and the ease with which we found them, suggest that these
are not just simple and unrelated mistakes. It is my opinion that:


The product has not yet been competently tested (or if it has been tested, many obvious
problems have not been reported or fixed).



The developers are probably not systematically anticipating the conditions, orientations, and
combinations of conditions that the product may encounter in the field. Error handling is generally
weak and brittle. It may be that the developers are too rushed for methodical design and
implementation.



The requirements are probably not systematically being reviewed and tested by people with
good competency in English. (e.g. the “Pulse Transmitter” checkbox works in a manner that is
exactly opposite to that specified in the requirements; error messages are not clearly written.)

These are fixable issues. I recommend:


Pair up the developers and testers periodically for intensive exploratory testing and fixing
sessions lasting at least one full day, or more.



Require the testers to be continuously on guard for anomalies of any kind, regardless of the test
protocol they are following at any given moment. Testers should be encouraged to use their
initiative, vary their use of the product, and speak up about what they see. Do not postpone the
discovery or reporting of any defect, even small ones—or else they will build up and the
processes creating these defects will not be corrected.



The requirements should be reviewed by testers who are fluent in English.



The developers should carefully diagram and analyze the state model of the product, and redesign the code as necessary to assure that it faithfully implements that state model.



Institute unit-level testing by the developers, and systematic code inspection, as per FDA guidance.
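
One way to picture the state-model recommendation above is an explicit transition table, in which every allowed (state, event) pair is written down and anything not listed is rejected. The following is only a minimal sketch in Python, not the product's design: the state and event names are hypothetical, chosen to mirror the exit-menu behavior described later in this report.

```python
from enum import Enum, auto


class State(Enum):
    # Hypothetical states; the real product has more.
    IDLE = auto()
    EXIT_MENU = auto()
    EXFOLIATING = auto()
    CLOSED = auto()


class Event(Enum):
    # Hypothetical user actions.
    START = auto()
    STOP = auto()
    EXIT_PRESSED = auto()
    EXIT_CONFIRMED = auto()
    EXIT_CANCELLED = auto()


# Every legal (state, event) pair is listed here; anything absent is rejected.
TRANSITIONS = {
    (State.IDLE, Event.START): State.EXFOLIATING,
    (State.IDLE, Event.EXIT_PRESSED): State.EXIT_MENU,
    (State.EXIT_MENU, Event.EXIT_CONFIRMED): State.CLOSED,
    (State.EXIT_MENU, Event.EXIT_CANCELLED): State.IDLE,
    (State.EXFOLIATING, Event.STOP): State.IDLE,
}


class IllegalTransition(Exception):
    """Raised when an event arrives in a state that does not allow it."""


def step(state, event):
    """Return the next state, or raise if the pair is not in the table."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise IllegalTransition(
            f"{event.name} is not allowed in {state.name}"
        ) from None


def unlisted_pairs():
    """Return every (state, event) pair the table rejects, for review."""
    return [(s, e) for s in State for e in Event if (s, e) not in TRANSITIONS]
```

With a table like this, pressing start while the exit menu is showing, `step(State.EXIT_MENU, Event.START)`, raises `IllegalTransition` instead of silently beginning a session. The list from `unlisted_pairs()` gives developers and testers a concrete set of combinations to review and to try against the real product.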

2. Test Process
The test team consisted of consulting tester James Bach (who led the testing) and Satisfice, Inc. intern
Oliver Bach.
The test session itself spanned about seven hours, most of which consisted of problem investigation.
Finding the problems listed below took only about two hours of that time.
The process we used was a paired exploratory survey (PES). This means two testers working on the same
product at the same time to discover and examine the primary features and workflows of the product
while evaluating them for basic capability and stability. One tester “plays” while the other leads,
organizes and records the work. A PES session is a good way to find a lot of problems quickly. I have
used this method on court cases and other consulting assignments over the years to evaluate the
quality of testing. The process is similar to that published by Microsoft as the General Functionality and
Stability Test Procedure (1999).
In this method of testing, we walk through the features of the product that are readily accessible,
learning about them, studying their states and interactions, while continuously applying consistency
heuristics as test oracles in our search for bugs. Ten such heuristics in particular are on our minds. These
ten have been published as the “HICCUPP” model in the Rapid Software Testing methodology. (See
http://www.satisfice.com/rst.pdf for more on that.)
We filmed most of the testing that we did, and delivered those videos to Antoine Rubicam.
We did not test the entire product during our one-day session. However, we sampled the product
broadly and deeply enough to get a good feel for its quality.

3. Test Results
The severe problems we found were as follows:
1. System crash after switching probes. If the orientation mode is improperly configured with the
circular probe such that there are no flip-flop mode cathodes active, and the probe is then
switched to “dissipated”, the application will crash at the end of the very next exfoliation
performed. (This is related to problems #6 and #7.)
Risk: delay of procedure, loss of user confidence, potential violation of essential performance
standard of IEC60601, product recall
Implications: The developer may not have anticipated all the necessary code modifications
when dissipated mode probe support was added. Testers may not be doing systematic probe
swap testing.
2. No error displayed after ion transmitter failure during exfoliation. By pressing the start button
more than once in quick succession after an ion transmitter error is cleared, an exfoliation may
begin even though the transmitter was not in the correct pulse mode. The system is now in a
weird state. After that point, manually stopping the transmitter, changing the pulse rate, or
cutting power to the transmitter will not result in any error message being displayed.

Risk: patient death from skin abrasions formed due to unintentionally intensified exfoliation,
loss of user confidence, violation of IEC60601-1-8 and 60601-1-6, product recall
Implications: There seems to be a timing issue with error handling. The product acts differently
when buttons are pressed quickly than when buttons are pressed slowly. Testers may not be
varying their pace of use during testing.
3. Error message that SHOULD put system in safe mode does NOT. Ion transmitter error
messages can be ignored (e.g. "Exfoliation stopped. Ion flow is not high!"). After two or three
presses of the start button, exfoliation will begin even though multiple error messages are still
on the screen.
Risk: Requirements violation, violation of IEC 60601-1-8 and 60601-1-6, product recall.
Implications: Suggests that the testers may not be concerned with usability problems.
4. Can start exfoliation while exit menu is active (and subsequently exit during exfoliation). It
should not be possible to press the exit button while exfoliating. However, if you press the exit
button before exfoliating and the exit menu appears, the start button has not been disabled,
and the exfoliation will begin with the exit menu active. The user may then exit.
Risk: unintentional exfoliation, loss of user confidence, violation of IEC60601-1-6, product recall
Implications: Problems like this are why a careful review of the product state model and redesign of the code would be a good idea. The bug itself is not likely to cause trouble, but the
fact that this bug exists suggests that many more similar bugs also exist in the product.
5. Probe menu freezes up after visiting settings screen (and at other apparently random times).
Going to settings screen, then returning, locks the probe mode menu until an exfoliation is
started, at which point the probe mode frees up again. We found that the menu may also lock
at apparently random intervals.
Risk: loss of user confidence
Implications: Indicates state model confusion; variables not properly initialized or re-initialized.
6. Partial system freeze after orientation mode failure. When in orientation mode with no
cathodes selected for flip-flop, an exfoliation session can be started, which is allowed to
proceed until flip-flop phase is activated. At that point, an error message displays and system is
locked with "orientation and flip-flop" modes both selected on the exfoliation mode menu. The
settings and exit buttons are also inoperative at that point. (This state can also be created by
switching probes. It is related to problems #1 and #7.)
Risk: Procedure delay, loss of user confidence, product recall
Implications: Indicates state model confusion; variables not properly initialized or re-initialized.
7. No error is displayed when orientation session begins and flip-flop cathodes are not activated.
When in orientation mode with no cathodes selected for flip-flop, an exfoliation session can be
started. Instead, an error message should be generated. (This is related to problems #1 and #6.)

Risk: loss of user confidence, creates opportunity for worse problems
Implications: Suggests the need for a deeper analysis of required error handling. Testers may
not be reviewing error handling behaviors.
8. Cathode 10 active in standing mode after deactivating all cathodes in flip-flop mode. De-selection of cathodes in flip-flop or standing mode should cause de-selection of corresponding
cathodes in the other mode. However, de-selecting all flip-flop cathodes leaves cathode 10 still
active in standing mode. It’s easy to miss that cathode 10 is still active.
Risk: creates opportunity for confusion, possible inadvertent exfoliation with cathode 10,
possible violation of IEC60601-1-6
9. Error message box can be shown off-screen. Error message boxes display at the location where
the previous box was dragged. This memory effect means that a message box may be dragged
to the side, or even off the screen, and thus the next occurrence of an error may be missed by
the operator.
Risk: creates opportunity for confusion, possible for operator to miss an error, violation of
IEC60601-1-8 and 60601-1-6. When combined with problem #3, it could result in harm to
the patient.
10. Behavior of the "Pulse Transmitter" checkbox is the opposite of that specified in the FRS. The
FRS states "By selecting Pulse Transmitter checkbox application shall allow to perform
exfoliation session with manual controlled transmitter." However, it is actually de-selecting the
checkbox which allows manual control.
Risk: business risk of failing an audit. It is potentially dangerous, as well as illegal, for the
product to behave in a manner that is the opposite of its Design Inputs and Instructions for Use.
Implications: This is a common and understandable problem in cases where the specifications
are written by someone not fluent in English. It is vital, however, to word requirements
precisely and to test the product against them. Bear in mind that the FDA personnel probably
will be native English-speakers.
11. Setting power to zero on a cathode does not cause the power to be less than 10 watts.
According to the log file, the power is well above the standard for “0” laid out in IEC60601.
(Also, displaying a “---” instead of “0” does not get around the requirement laid out in the
standard. This is true not only because it violates the spirit of the standard, but also because the
target value is displayed as “0” and the log file lists it as “0”.)
Risk: violation of IEC60601, product recall
Implications: The testers may not be familiar with the requirements of IEC60601. They may not
be testing at zero power because the formal test protocol does not require it.
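
The rule stated in problem #8 (de-selecting a cathode in one mode should de-select it in the other) can be written down as an executable check. The sketch below is illustrative only and does not use the product's code or interfaces. `SelectionModel` is a hypothetical reference model of the expected rule, `BuggyModel` is a hypothetical fault that imitates the behavior we observed, and the numbering of cathodes from 1 to 10, all active at the start, is an assumption.

```python
CATHODES = range(1, 11)  # assumption: cathodes are numbered 1 through 10
MODES = ("flip-flop", "standing")


def other_mode(mode):
    return "standing" if mode == "flip-flop" else "flip-flop"


class SelectionModel:
    """Reference model of the expected rule (not the product's code)."""

    def __init__(self):
        # Assumption: every cathode starts out active in both modes.
        self.active = {mode: set(CATHODES) for mode in MODES}

    def deselect(self, mode, cathode):
        # Expected behavior: de-selecting in one mode de-selects in both.
        for m in MODES:
            self.active[m].discard(cathode)


class BuggyModel(SelectionModel):
    """Hypothetical fault imitating problem #8: cathode 10 is not mirrored."""

    def deselect(self, mode, cathode):
        self.active[mode].discard(cathode)
        if not (mode == "flip-flop" and cathode == 10):
            self.active[other_mode(mode)].discard(cathode)


def check_mirroring(make_system):
    """De-select every cathode in one mode; report leftovers in the other."""
    failures = []
    for mode in MODES:
        system = make_system()
        for cathode in CATHODES:
            system.deselect(mode, cathode)
        leftover = sorted(system.active[other_mode(mode)])
        if leftover:
            failures.append((mode, other_mode(mode), leftover))
    return failures
```

Here `check_mirroring(SelectionModel)` returns an empty list, while `check_mirroring(BuggyModel)` reports cathode 10 left active in standing mode. Running the same check against the real product would require an adapter that drives its user interface, which is outside the scope of this sketch.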
Here are the lower severity problems we found:

12. "Time allocated for cathode 10 is too short" message displays when time is rapidly dialed
down. The message only displays when the time is dialed down rapidly, and we were not able
to get it to display for any cathode other than 10.
13. Pressing the Ctrl key from the exit menu causes an immediate exit.
14. Exfoliation tones mysteriously change when only one cathode is active in standing mode. The
exfoliation tone for flip-flop mode is sounded for standing mode when all but one cathode is deactivated.
15. Power can be set to zero during exfoliation without cancelling exfoliation. Since an exfoliation
cannot be started without at least one cathode set to a power greater than 0, and since de-activating a cathode during an exfoliation session prevents it from being re-activated, it is
inconsistent to allow cathodes to be set to “0” power during an exfoliation unless they are
subsequently de-activated.
16. Power can be set to 1, which is unstable. Does it make sense to allow a power level of 1? The
display keeps flickering between 1 and “---”.
17. If orientation is used, the user may inadvertently fail to set temperature limit on one of the
exfoliation modes. Flip-flop and standing have different temperature limit settings. In our
testing, we found it difficult to remember to set the limit on both modes before beginning the
exfoliation session. This is a potential usability issue.
18. "Error-flow in standby mode should be low" message displayed at the same time as
"Exfoliation stopped. Transmitter flow is not high!" This is a confusing pair of messages, which
seem to require that the transmitter be in low flow and high flow at the same time.
19. Error messages stack on top of each other. If you press start with 0 power more than once,
then more than one error message is displayed. Each additional press displays another error
message.
