Encapsulation and Abstraction for
Modeling and Visualizing
Information Uncertainty
Alexander Streit
Bachelor of Information Technology (Honours)
Queensland University of Technology
A thesis submitted in partial fulfilment of the requirements for the degree of
Doctor of Philosophy
November 2007
Principal Supervisor: Prof. Binh Pham
Associate Supervisor: Dr. Ross Brown
Faculty of Information Technology
Queensland University of Technology
Brisbane, Queensland, AUSTRALIA
Abstract
Information uncertainty is inherent in many real-world problems and adds a layer of
complexity to modeling and visualization tasks. This often causes users to ignore
uncertainty, especially when it comes to visualization, thereby discarding valuable
knowledge. A coherent framework for the modeling and visualization of information
uncertainty is needed to address this issue.
In this work, we have identified four major barriers to the uptake of uncertainty
modeling and visualization. Firstly, there are numerous uncertainty modeling tech-
niques and users are required to anticipate their uncertainty needs before building their
data model. Secondly, parameters of uncertainty tend to be treated at the same level
as variables, making it easy to introduce avoidable errors. This causes the uncertainty
technique to dictate the structure of the data model. Thirdly, propagation of uncertainty
information must be manually managed. This requires user expertise, is error prone,
and can be tedious. Finally, uncertainty visualization techniques tend to be developed
for particular uncertainty types, making them largely incompatible with other forms
of uncertainty information. This narrows the choice of visualization techniques and
results in a tendency for ad hoc uncertainty visualization.
The aim of this thesis is to present an integrated information uncertainty modeling
and visualization environment that has the following main features: information and
its uncertainty are encapsulated into atomic variables, the propagation of uncertainty is
automated, and visual mappings are abstracted from the uncertainty information data
type.
Spreadsheets have previously been shown to be well suited as an approach to visu-
alization. In this thesis, we devise a new paradigm extending the traditional spreadsheet
to intrinsically support information uncertainty.
Our approach is to design a framework that organizes uncertainty modeling techniques
into a hierarchy based on levels of detail. The uncertainty information
is encapsulated and treated as a unit allowing users to think of their data model in terms
of the variables instead of the uncertainty details. The system is intrinsically aware of
the encapsulated uncertainty and is therefore able to automatically select appropriate
uncertainty propagation methods.
A user-objectives based approach to uncertainty visualization is developed to guide
the visual mapping of abstracted uncertainty information. Two main abstractions of
uncertainty information are explored for the purpose of visual mapping: the Unified
Uncertainty Model and the Dual Uncertainty Model. The Unified Uncertainty Model
provides a single view of uncertainty for visual mapping, whereas the Dual Uncertainty
Model distinguishes between possibilistic and probabilistic views. Such abstractions
provide a buffer between the visual mappings and the uncertainty type of the underly-
ing data, enabling the user to change the uncertainty detail without causing the visual-
ization to fail.
Two main case studies are presented. The first case study covers exploratory
and forecasting tasks in a business planning context. The second case study inves-
tigates sensitivity analysis for financial decision support. Two minor case studies are
also included: one to investigate the relevancy visualization objective applied to busi-
ness process specifications, and the second to explore the extensibility of the system
through General Purpose Graphics Processing Unit (GPGPU) use. A quantitative analysis
compares our approach to traditional analytical and numerical spreadsheet-based
approaches. Two surveys were conducted to gain feedback from potential users.
The significance of this work is that we reduce barriers to uncertainty modeling
and visualization in three ways. Users do not need a mathematical understanding of
the uncertainty modeling technique to use it; uncertainty information is easily added,
changed, or removed at any stage of the process; and uncertainty visualizations can be
built independently of the uncertainty modeling technique.
Publications
1. Pham, B. and Streit, A. and Brown, R. “Visualisation of Information Uncertainty: Progress and
Challenges,” in Interactive Visualisation: A State-of-the-Art Survey, Elena Zudilova-Seinstra,
Tony Adriaansen and Robert van Liere (eds.), 2007, Springer, UK. In Print.
2. Streit, A. and Pham, B. and Brown, R. “A Spreadsheet Approach to Facilitate Visualization of
Uncertainty in Information,” IEEE Transactions on Visualization and Computer Graphics, 11
July 2007. IEEE Computer Society Digital Library. IEEE Computer Society, 30 September
2007 <http://doi.ieeecomputersociety.org/10.1109/TVCG.2007.70426>
3. Streit, A. and Pham, B. and Brown, R. Visualisation Support for Managing Large Business Pro-
cess Specifications. International Conference on Business Process Management (BPM). Nancy,
France, September 6-8, 2005. Lecture Notes in Computer Science, Springer. Acceptance rate:
13%
4. Campbell, A. and Berglund, E. and Streit, A. Graphics Hardware Implementation of the Parameter-
Less Self-Organising Map. International Conference on Intelligent Data Engineering and Au-
tomated Learning (IDEAL’05). Brisbane, July 6-8, 2005. Pages 343-350. Lecture Notes in
Computer Science, Springer.
Acknowledgments
This thesis would not have been possible without my principal supervisor, Prof. Binh
Pham, and my associate supervisor, Dr. Ross Brown. Both collaborated to teach me
their process for completing research projects, invaluable knowledge for which I am
very grateful.
I wish especially to thank Fral, who supported me even when it didn’t seem ratio-
nal to do so. My mother, Jilli, who should really be receiving this degree herself. I
also wish to thank my Honours supervisor, Ruth Christie, who inspired me to pursue
postgraduate studies in the first instance.
I wish to thank my colleague, Dr. Robert Smith, who provided me with extensive
insight and feedback. Alexander Campbell for his many comments and suggestions.
Finally, I wish to thank my business associate, Dr. Andy Boud, for acting as an unof-
ficial mentor.
Abbreviations
ASP Analytical Spreadsheet Package
BPM Business Process Management
DUM Dual Uncertainty Model
EBNF Extended Backus-Naur Form
GIS Geographic Information Systems
GPGPU General Purpose Graphics Processing Unit
LIC Line Integral Convolution
NaN Not A Number
NIST National Institute of Standards and Technology
NPV Net Present Value
PDF Probability Density Function
PMF Probability Mass Function
QUM Quad Uncertainty Model
SI The Spreadsheet for Images
SIV Spreadsheet for Information Visualization
UML Unified Modeling Language
UUM Unified Uncertainty Model
VTK The Visualization Toolkit
Statement of Original Authorship
The work contained in this thesis has not been previously submitted for a
degree or diploma at any other higher education institution. To the best of
my knowledge and belief, the thesis contains no material previously pub-
lished or written by another person except where due reference is made.
Signature:
Alexander Streit
Date:
CHAPTER 1
Introduction
1.1 Motivation
The term information uncertainty refers to vagueness, imprecision, fuzziness, likeli-
hood, and related uncertainty as it is present in information. Many problems are subject
to information uncertainty and, in response, numerous techniques have been developed
to model this uncertainty. Modeling information uncertainty not only provides greater
confidence in results, but can also give an indication of how much confidence to place
in the results. While visualization is a popular tool, information uncertainty visualiza-
tion is far less widespread.
In this work we have identified four major barriers to the uptake of information
uncertainty modeling and visualization. Firstly, there are numerous information uncer-
tainty modeling techniques, each of which is treated differently. This forces users to
anticipate their information uncertainty needs before building their data model. Sec-
ondly, parameters of the uncertainty space tend to be treated at the same level as vari-
ables, which makes it easier to introduce avoidable errors and causes the information
uncertainty modeling technique to dictate the structure of the user’s model. Thirdly,
propagation of uncertainty information must be manually managed by the user, which
requires expertise, is error prone, and can be tedious. Fourthly, uncertainty visualiza-
tion techniques tend to be developed for particular information uncertainty types and
they are largely incompatible with other forms of uncertainty information. This nar-
rows the selection of visualization techniques available and results in a tendency for ad
hoc information uncertainty visualization techniques.
Information uncertainty modeling makes it more difficult to manage the data model
due to the additional information it introduces. Furthermore, it is common that a chosen uncertainty
modeling technique will subsequently need to be changed, since knowledge about the
uncertainty changes as more information becomes available. This is currently a diffi-
cult and error prone process.
Visualization of information uncertainty poses its own unique challenges. Existing
visualization techniques may not be appropriate for uncertainty information and there
are issues with information overloading and interpretability of results. On a practical
level, there is a lack of tools that are conducive to visualizing information uncertainty.
Easing the burden of managing the information overload in modeling and visualization
requires an integrated system that covers the entire workflow cycle from data
acquisition to visualization. Tools are also needed to help users with higher-level tasks
such as selection of modeling and propagation options, and organization and compar-
ison of visual mappings. More specifically, the architecture should support automated
uncertainty propagation and allow easy switching between different uncertainty mod-
els, and different methods of display.
Spreadsheets are often used to perform uncertainty-based analysis and they have
previously been shown to be well suited as an approach to visualization. However, the
benefit of a spreadsheet approach to uncertainty modeling and visualization has not yet
been explored. This thesis extends the spreadsheet paradigm to support information
uncertainty modeling and visualization in an integrated whole.
1.2 Aims
The overall aim for this thesis is to devise an integrated information uncertainty mod-
eling and visualization environment that has the following features:
Hierarchical structure: The system should differentiate between levels of detail in
the data model. Uncertainty information is of a lower level of detail than the
variables.
Reduce data-type lock-in: If a data model is constructed using particular informa-
tion uncertainty modeling techniques, the cost to change to another modeling
technique should be minimized.
Adaptive: Information about the uncertainty space of a variable should be easy to add,
change, or remove at any stage of the modeling and visualization process.
Seamless integration of information and its uncertainty: There should not be an ar-
tificial separation between the information and its uncertainty.
Simplify information uncertainty modeling: Users should not be required to have
an intimate understanding of the modeling technique mechanics in order to use
it.
Automate propagation: Uncertainty information needs to be propagated and the sys-
tem should carry this out automatically.
Less error prone: The system should reduce the potential for user-induced errors.
Flexible: Users should be able to map uncertainty information into alternative models
and visual features so that they can explore the impacts of different modeling
and visualization techniques.
Robust: When the uncertainty information changes, the existing data model and vi-
sualizations should continue to function correctly.
Extensible: There are numerous information uncertainty modeling techniques and the
design of the system should allow for more to be added.
In order to achieve this aim, the following tasks are performed:
• Examine the field to determine the current state of play, covering information
uncertainty modeling techniques, visualization processes and practices, and un-
certainty visualization;
• Design an integrated information uncertainty modeling and visualization framework;
• Investigate how the spreadsheet paradigm can be extended to intrinsically sup-
port information uncertainty modeling and visualization;
• Explore uncertainty encapsulation as an approach to semantic association of in-
formation and its uncertainty;
• Develop an automated propagation mechanism and a method for resolving un-
usual modeling technique combinations;
• Design uncertainty abstractions that enable visualization mappings to be data-
type independent;
• Explore the user-objectives approach as a means for defining visualization char-
acteristics;
• Conduct a quantitative analysis comparing the cost of our approach to existing
methods;
• Analyze feedback from potential users;
• Conduct case studies on financial decision support and business planning to es-
tablish the viability of the spreadsheet for commercial uses;
• Investigate the capability of the architecture to be applied to non-uncertainty
uses through a case study; and
• Draw conclusions and make recommendations for future work.
1.3 Scope
This thesis deals with information uncertainty, which is uncertainty about the true value
of a unit of information. The intrinsic connection between uncertainty and information
is the basis for our encapsulation approach, which underpins the automatic propagation
and visualization-oriented uncertainty abstraction. However, there exist several other
forms of uncertainty, such as uncertainty arising from interpretation, for which the
encapsulation approach may not be suitable. The methods presented in this thesis are
limited to those forms of uncertainty that can be parametrized in some quantifiable
way.
Modeling of uncertainty has its foundation in mathematics. This project is con-
cerned with the frameworks, approaches, and methods for applying these modeling
techniques. As such, mathematical issues will be touched on; however, detailed cov-
erage of mathematical models is beyond the scope of this work and it is assumed that
users will use the mathematical techniques appropriate to their problem.
1.4 Original Contribution
Current investigations into information uncertainty visualization have focused on vi-
sualization techniques for particular information uncertainty data types. We approach
the problem of information uncertainty visualization holistically, from modeling and
automated propagation through to user-objectives in visualization.
We produce an integrated information uncertainty modeling and visualization frame-
work and design the information uncertainty visualization spreadsheet, which intrinsically
supports information uncertainty modeling, automated uncertainty propagation,
and uncertainty model abstracted visualization.
To achieve this we extend the spreadsheet paradigm to incorporate information un-
certainty and visualization features. This requires a number of components. Firstly,
our encapsulation of uncertainty information approach semantically links the infor-
mation to its uncertainty. Secondly, we introduce the uncertainty propagation model
to manage the mechanics of propagating uncertainty, including operations involving
mixed data-type parameters. Thirdly, we present hierarchical heterogeneous propaga-
tion, which automatically determines suitable combinations of the available methods
to ensure that the propagation can be achieved. Fourthly, we produce uncertainty ab-
straction models, which abstract the uncertainty information for visual mapping in vi-
sualizations by providing a common plural value type. Fifthly, we incorporate flexible
visualization capabilities into the spreadsheet using a visualization sheet.
Abstraction from the information uncertainty data type means that traditional data-
type specific visual mapping criteria may no longer be applicable, leaving a gap in the
knowledge. To address this, we investigate user-objectives for information uncertainty
visualization, which describe the characteristics of uncertainty space that the user is
seeking to visualize. User-objectives provide a data-type abstracted means of describ-
ing, executing, and evaluating visualizations.
1.5 Significance
The significance of this work is that it provides an intuitive and non-intrusive
environment for modeling and visualizing information uncertainty. This has
three major effects. Firstly, access to information uncertainty visualization is designed
into the system from the outset and it does not require user expertise in uncertainty
techniques to manage information uncertainty. Secondly, uncertainty information is
easily added, changed, or removed at any stage of the process. Thirdly, information
uncertainty visualizations can be built independently of the modeling technique, pro-
viding a coherent foundation for the development of visualization techniques while
reducing their tendency to be ad hoc.
Information uncertainty is a problem in many fields. Overcoming barriers to its
modeling and visualization is an important step in managing a difficult problem.
1.6 Organization of the Thesis
The organization of this thesis is as follows. Chapter 2 introduces background mate-
rial on uncertainty modeling techniques, visualization techniques, and what has been
done to visualize uncertainty. Chapter 3 describes the framework that integrates infor-
mation uncertainty modeling and visualization tasks together into a coherent whole.
Chapters 4 through 6 cover the components of the framework: Chapter 4 elaborates
on the spreadsheet paradigm as a mechanism for integrating and managing these tasks.
Chapter 5 investigates the encapsulation approach to information uncertainty, which
includes the unified hierarchy and automated propagation. Chapter 6 explores the ab-
straction approach, which includes uncertainty abstraction models and user-objectives
for visualization. Chapter 7 integrates the components into a core system, covering
the requirements, design, and architecture. Chapter 8 considers advanced features and
extensibility of the system. Chapter 9 presents the evaluations of the system, with a
comparative analysis of different approaches, a discussion of a survey, and a case study
in business planning. Chapter 10 provides a conclusion and points to future work.
CHAPTER 2
Background
“As far as the laws of mathematics refer to reality, they are not certain;
and as far as they are certain, they do not refer to reality.”
– Albert Einstein1
2.1 Introduction
Information uncertainty is a complex subject that is inherent in many real-world prob-
lems. The uncertainty comes from different sources and can be interpreted and mod-
eled in various ways. There are often subtle interactions between variables and uncer-
tainty, which can be difficult to understand. Visualization of information uncertainty
presents an opportunity to provide deeper insights into the nature of the information,
its uncertainty, and the impact it has on outcomes. However, the difficulty of adopt-
ing information uncertainty and the lack of visualization tool support has caused many
practitioners to ignore the uncertainty completely or to ignore situations where the
1In J. R. Newman (ed.) The World of Mathematics, New York: Simon and Schuster, 1956
uncertainty is deemed too high. This practice results in valuable knowledge being dis-
carded and reduced quality in outcomes, or worse, can even result in entirely wrong
outcomes.
There are aspects of information uncertainty that have been given considerable at-
tention, particularly in the field of mathematics. Two aspects have been especially
well developed: the first includes the various mathematical models that exist for repre-
senting, measuring, and recording uncertainty. The second aspect is the collection of
rules and techniques for propagating, estimating, and minimizing information uncer-
tainty. These models and techniques range from the statistical methods and probabil-
ities through to fuzzy models. Research into visualization of information uncertainty
has only been carried out sporadically during the last decade. Earlier work has focused
on a data-driven approach, with visual data representations for particular data types or
responding to the needs of specific applications. More recent work has investigated
task-based approaches and sought to integrate higher-level issues, such as software
architectures and frameworks for visualization systems.
The aim of this chapter is to provide background for understanding information un-
certainty modeling and visualization by examining relevant works and identifying key
issues. The chapter is organized as follows. Section 2.2 describes information uncer-
tainty in general, covering sources of information uncertainty, understanding informa-
tion uncertainty and its usage, and information uncertainty modeling techniques. Sec-
tion 2.3 discusses relevant issues in visualization, focusing on the process and sense-
making cycle, and visualization techniques. Section 2.4 examines current progress and
key techniques in information uncertainty visualization. A summary of this chapter is
given in Section 2.5.
2.2 Information Uncertainty
In many circumstances the true value of a variable is not fully known, giving rise
to information uncertainty. The information that is known about the variable can be
stored and this technique is referred to as information uncertainty modeling. As an
example, information uncertainty modeling can be used to aid analysis of the potential
environmental impact of a new road. Data is required about the type, amount, and
distribution of vegetation; the variety, location, and habits of local animals; and how
all of these interact. The data that is collected will only be accurate to a certain level
of precision, which can be modeled. Further, much of the information derived from
expert knowledge will be qualitative in nature and thus dependent on interpretation.
It is already a significant task to understand the structure, characteristics, trends,
and interdependency of data. However, information uncertainty serves to complicate
things even further as it requires an understanding of the propagation of uncertainty,
the potential for variation in outcomes, and impacts due to changes in the level of
information uncertainty. Effective visualization of the information and its uncertainty
can help to overcome this problem.
Historically, uncertainty was regarded as an undesirable factor to be
avoided. Only in the 20th century has it become a fundamental component of sci-
ence [62]. However, the term uncertainty itself can vary depending on the author and
the field. For example, Hunter and Goodchild, dealing with spatial databases, reserve
the term uncertainty to refer exclusively to unknown inaccuracy and instead use the
term error for objectively known inaccuracy [47]. Pang et al. use the term uncer-
tainty to cover three categories [86]: statistical, including probabilistic and confidence
methods; error, which refers to differences between estimates and actual values; and
range, which covers intervals of possible values. Klir [58, 60] and Gershon [33] offer
a more general definition of uncertainty as some deficiency in information, and from
there define a measure of information in terms of reduction in uncertainty. Standards
and guidelines have been developed for the management of uncertainty in measure-
ment. One such guide by the National Institute of Standards and Technology (NIST)
describes measurements as approximations and contends that “the result is complete
only when accompanied by a quantitative statement of its uncertainty” [110, pp. 1]. A
similar guide that was issued for analytical chemistry by Eurachem defines measure-
ment uncertainty as a parameter “that characterizes the dispersion of the values that
could reasonably be attributed to the measurand” [29, pp. 4]. The common theme is
that uncertainty can be characterized for a particular unit of information, and we use
the term information uncertainty to refer to situations where this condition holds.
2.2.1 Sources of Information Uncertainty
Pang et al. [86] investigated uncertainty visualization and categorized sources of in-
formation uncertainty based on the point of the visualization process in which it is
introduced. The resulting three categories are acquisition, where information uncer-
tainty is introduced from the measurements and models; transformation, introduced
during the information processing step for visualization; and visualization, referring to
the uncertainty introduced through the act of the visualization itself. These categories
are helpful in characterizing the introduced uncertainty for visualization, but lack gran-
ularity in describing the reason for the uncertainty. Thomson et al. [112] focused on
the tasks of information analysts in the field and used their descriptive terms to derive
a categorization for uncertainty in geospatially referenced information.
Information uncertainty can arise due to a number of reasons. Whenever predic-
tions are made, they are uncertain. Errors and imprecision in measurement are another
common source. The Eurachem guide lists eleven sources of measurement uncertainty [29],
but is careful to point out that these may not necessarily be independent.
While their list includes “operator effects” to cover human introduced uncertainty, the
sources are mostly concerned with acts of measurement. Pham and Brown [90] pro-
vide a categorization of uncertainty into three categories: factual, pseudo-measurement
and pseudo-numerical, and perceptual-based. Factual information is numerical and
measurement-based. Pseudo-measurement and pseudo-numerical information are nu-
meric approximations. Perceptual-based information is typically linguistic, but can
also be image- or sound-based. Table 2.1 lists typical sources of information un-
certainty and examples of causes (from [91]). Earlier work by Reznik and Pham
matched nine similar categories of uncertainty sources to uncertainty modeling tech-
niques [100].
Limited accuracy: Limitation in measuring instruments, or computational processes, or standards.

Missing data: Physical limitation of experiments; limited sample size or non-representative sample.

Incomplete definition: Impossibility or difficulty in articulating exact functional relationships or rules.

Inconsistency: Conflicts arising from multiple sources or models.

Imperfect realisation of a definition: Physical or conceptual limitation.

Inadequate knowledge about the effects of the change in environment: Model does not cover all influence factors; or is made under slightly different conditions; or is based on the views of different experts.

Personal bias: Differences in individual perception.

Ambiguity in linguistic descriptions: A word may have many meanings; or a state may be described by many words.

Approximation or assumptions embedded in model design methods or procedures: Requirements or limitations of models or methods.

Table 2.1: Sources and Causes of Information Uncertainty
2.2.2 Understanding Information Uncertainty
The search for truth is a goal of science and the presence of uncertainty can imply
a deficiency in our understanding. This explains why throughout most of recorded
history scientific thought has sought to avoid uncertainty2. However, attitudes toward
uncertainty have begun to shift, partly due to discoveries such as the Heisenberg un-
certainty principle. Today, uncertainty is viewed as an intrinsic property of problems
in most fields. For example, Couclelis noted that considerable effort had been devoted
to fighting uncertainty in Geographic Information Systems (GIS), but that there are
2The interested reader is directed to Appendix A of [2] for a history of perspectives on knowledge.
many things that cannot be known, and the inability to know was not due to human
limitation [19].
Many disciplines use information uncertainty modeling techniques to manage un-
certainty. The incorporation of information uncertainty techniques enables practition-
ers to describe and quantify the uncertainty space. Uncertainty information at the
inputs can then be propagated through the model to the outputs. The output can now
provide additional information: for example, how much confidence we should place in
the result, what alternatives the result may have, and others, depending on the modeling
technique and the inputs.
A recent use for information uncertainty techniques is to simplify systems by re-
moving less important information. This mirrors human reasoning, where we reserve
detail for items of interest. For example, when ascertaining whether to jump out of the
way of a moving vehicle, a rough estimate of the vehicle’s velocity is usually sufficient
to determine the appropriate action [77] and precise knowledge of the actual velocity
is usually not necessary.
There are two main approaches to information uncertainty modeling and propaga-
tion. The first approach is to use analytical techniques, which require an understanding
of mathematical principles involved. The second approach is to use numerical tech-
niques, such as Monte-Carlo simulation. Sometimes numerical techniques are used
implicitly without the user realizing, usually by manually varying the inputs and ob-
serving their effects.
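The contrast between the two approaches can be made concrete on a toy model. The sketch below, an illustration with made-up numbers rather than an example from the thesis, propagates the uncertainty of a simple sum both analytically and by Monte-Carlo simulation.

```python
import random
import statistics

random.seed(42)

# Toy model: y = a + b, with a ~ N(10, 0.5) and b ~ N(5, 1.2).
mu_a, s_a = 10.0, 0.5
mu_b, s_b = 5.0, 1.2

# Analytical technique: for a sum of independent variables,
# means add and variances add.
mu_y = mu_a + mu_b                      # 15.0
s_y = (s_a**2 + s_b**2) ** 0.5          # sqrt(1.69) = 1.3

# Numerical technique: Monte-Carlo simulation samples the inputs,
# pushes each sample through the model, and summarizes the outputs.
samples = [random.gauss(mu_a, s_a) + random.gauss(mu_b, s_b)
           for _ in range(200_000)]
mc_mu = statistics.fmean(samples)
mc_s = statistics.stdev(samples)
# mc_mu and mc_s agree with mu_y and s_y to within sampling error.
```

The analytical route requires knowing the propagation formula; the numerical route only requires the ability to evaluate the model, which is why it generalizes to models with no closed-form propagation rule.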
Uncertainty is so intrinsic to information that Klir [62, 63, 58, 61, 59, 60] has been
working on a generalized information theory, which aims to incorporate
uncertainty and information into a unified theory. The approach is to conceive of
uncertainty-based information as being the result of a reduction in uncertainty. As a
result of some action, the a priori uncertainty U1 becomes the a posteriori uncertainty
U2, and the information derived from this action is therefore given by U1 − U2 [59].
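As a concrete special case of this idea, chosen here for familiarity (Klir's generalized theory also covers non-probabilistic uncertainty measures), take Shannon entropy as the measure of uncertainty:

```python
import math

def entropy(probs):
    """Shannon entropy in bits: one concrete measure of uncertainty."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A priori: four equally likely outcomes.
u1 = entropy([0.25, 0.25, 0.25, 0.25])   # 2 bits

# A posteriori: an observation has ruled out two outcomes.
u2 = entropy([0.5, 0.5])                 # 1 bit

# Information gained by the action, in Klir's sense: U1 - U2.
information = u1 - u2                    # 1 bit
```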
Using information uncertainty modeling techniques not only provides greater con-
fidence in results, but can also give an indication of how much confidence to place in
the result.
2.2.3 Approaches to Modeling Information Uncertainty
There are numerous uncertainty modeling techniques and we will describe several of
the major ones here. Since the sources and causes of uncertainty are different, various
mathematical models have been developed to faithfully represent different types of
information. We summarize these models into four common types:
Probability denotes the likelihood of an event to occur or a positive match. Prob-
ability theory provides the foundation for statistical inference (e.g. Bayesian
methods) [1, 5].
Possibility provides alternative matches, e.g. a range of errors in measurement [2].
Provability is a measure of the ability to prove a positive match, i.e. to express that
the probability of a positive match is exactly one. Provability is the central theme of
techniques such as Dempster-Shafer calculus.
Membership denotes the degree of match and allows for partially positive matches,
e.g. fuzzy sets [3, 6], rough sets [4].
Probability theory models uncertainty in terms of anticipation: the expectation that an
outcome will eventuate is characterized by a probability.
Classical probability theory describes the ratio between favorable and indifferent
outcomes, which has several shortcomings. This led to the development of frequentist
probability theory, which defines the chance of a given result under random conditions.
Thus, in a repeated experiment the probability of an event will tend toward the ratio
of the number of times it occurs to the number of times the experiment was run:
Pr(x) = x / (x + x̄)

where Pr : X → [0, 1] is the probability function, x ∈ X is the event, and x̄ = X − {x}
is all other outcomes. A probability distribution completely describes the expected
outcomes of a random variable. For real-valued random variables, the probability dis-
tribution can be defined by
F(x) = ∑_{xi ≤ x} Pr(xi)

for discrete probabilities and

F(x) = ∫_{−∞}^{x} f(t) dt
for continuous probabilities, where f is the probability density function. A probability
density function (PDF) is effectively a histogram of expected outcomes, with a scale
such that the integral is unity, ∫_{−∞}^{∞} f(t) dt = 1.
Probability distributions can take any form; however, several well-studied distributions
exist. Of these, the two most commonly used are the uniform distribution and the
normal distribution. The uniform distribution assigns every outcome an equal
probability. The normal distribution³ has a PDF of

f(x) = (1 / (σ√(2π))) e^(−(x−μ)² / (2σ²))
where μ is the mean and σ is the standard deviation. Normal distributions find
common use because the sum of many independent random variables will approximate
a normal distribution.
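This convergence toward the normal distribution is easy to demonstrate numerically. The sketch below is illustrative; the choice of 12 uniform summands and the sample size are arbitrary:

```python
import random
import statistics

# Draw 50,000 samples, each the sum of 12 independent U(0,1) variables.
rng = random.Random(0)
sums = [sum(rng.random() for _ in range(12)) for _ in range(50_000)]

# The sum of 12 U(0,1) variables has mean 12 * 1/2 = 6 and
# variance 12 * 1/12 = 1, so the histogram of `sums` approximates
# a normal distribution with mu = 6 and sigma = 1.
mean, stdev = statistics.mean(sums), statistics.stdev(sums)
within_one_sigma = sum(1 for s in sums if abs(s - mean) <= stdev) / len(sums)
print(mean, stdev, within_one_sigma)
```

The fraction of samples within one standard deviation of the mean comes out near the 68% expected of a normal distribution.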
Monte-Carlo simulation is a numerical approach to uncertainty that uses probabil-
ity distributions [76, 3]. Input variables are assigned a probability distribution, com-
monly uniform or normal. Numerous random instances of the inputs are drawn
according to these distributions, and the resulting calculated outputs can then be used
to characterize the behavior of the system.
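A minimal sketch of such Monte-Carlo propagation, assuming a hypothetical rectangle whose width and height are normally distributed measurements:

```python
import random
import statistics

def monte_carlo(f, sample_inputs, n=100_000, seed=1):
    """Propagate input uncertainty through f by repeated random sampling."""
    rng = random.Random(seed)
    return [f(*sample_inputs(rng)) for _ in range(n)]

# Area of a rectangle whose sides are uncertain (hypothetical measurements):
# width ~ N(2.0, 0.1), height ~ N(3.0, 0.2).
areas = monte_carlo(lambda w, h: w * h,
                    lambda rng: (rng.gauss(2.0, 0.1), rng.gauss(3.0, 0.2)))

# The output distribution characterizes the uncertainty of the area.
print(statistics.mean(areas), statistics.stdev(areas))
```

The spread of the outputs quantifies how the input uncertainties combine, without any analytical derivation.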
The frequentist view of probability models an expectation based purely on the fre-
quency of events. An alternative view is Bayesian probability theory, which has found
widespread use in fields such as machine learning and computer vision [89], and
³Also known as the Gaussian distribution, after Gauss.
econometrics [35]. The mathematician Thomas Bayes introduced a theorem that was
generalized by Laplace but is still referred to as Bayes' Theorem. The theorem relates
the conditional and marginal probabilities of two random variables, x and y:
Pr(x|y) = Pr(y|x) Pr(x) / Pr(y)
Bayes’ theorem enabled a new philosophical view of probabilities as modeling
belief, using what is called Bayesian inference. Thus, our expectation of an event can
be revised: Pr(x|y) is the posterior probability, our revised expectation of event x given
evidence y; Pr(y|x) is the conditional probability of seeing y given the hypothesis that
x is true; Pr(x) is the prior probability of x; and Pr(y) is the marginal probability of y,
whether or not x is true.
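This revision of belief can be worked through numerically. The sketch below assumes a hypothetical diagnostic test; the prevalence and error rates are invented for illustration:

```python
def posterior(prior, likelihood, marginal):
    """Bayes' theorem: Pr(x|y) = Pr(y|x) * Pr(x) / Pr(y)."""
    return likelihood * prior / marginal

# Hypothetical diagnostic test: prevalence Pr(x) = 0.01,
# sensitivity Pr(y|x) = 0.99, false-positive rate Pr(y|not x) = 0.05.
prior = 0.01
marginal = 0.99 * 0.01 + 0.05 * 0.99    # Pr(y): a positive result either way
belief = posterior(prior, 0.99, marginal)
print(round(belief, 3))   # revised expectation of disease given a positive test
```

Despite the accurate test, the posterior is only about 0.167, because the prior probability of disease is so low.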
Probabilities, whether frequentist or Bayesian, are not the only means of describing
uncertainty. Classical sets can also express uncertainty. For example, the diagnosis
made by a medical doctor could be that a patient suffers from the flu or the cold. In
this situation there is a possibility that either (or both) be true, and there is uncertainty
about which it is. Lotfi Zadeh, in his seminal 1965 paper, proposed fuzzy sets,
which, along with fuzzy logic, enable human-like reasoning using partial truth⁴. A
fuzzy set is a set where each member can be assigned a truth value. This can be
expressed as a fuzzy membership function μ : X → [0, 1], where 0 indicates not a
member, 1 indicates definitely a member, and all numbers in between indicate partial
membership.
A good example of fuzzy sets is given by Mendel in [77]. College students were
asked to rank words such as “somewhat” and “quite a bit” against a numerical scale of
quantity. Although individual answers varied significantly, a clear ordering emerged
and Mendel was able to produce a mapping between these words and their indication of
quantity. From there, fuzzy sets can be constructed, such as the set lots, which captures
the degree to which numeric values represent each of the words.
⁴Zadeh was not the first to investigate partial truth; interested readers are directed to the works of
Łukasiewicz [71].
For example, assuming that 24°C is normal room temperature and 30°C is completely
hot, temperatures between 24°C and 30°C are partially hot. The set of hot temperatures
might therefore be given by the following membership function, illustrated graphically
in Figure 2.1:
μHot(x) =
    1                       if x > 30
    (x − 24) / (30 − 24)    if 24 ≤ x ≤ 30
    0                       if x < 24
Figure 2.1: Fuzzy Set for Hot
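The membership function for Hot translates directly into code, as in the following sketch:

```python
def mu_hot(x):
    """Membership in the fuzzy set Hot (Figure 2.1): a linear ramp from
    24 degrees C (not hot at all) to 30 degrees C (completely hot)."""
    if x > 30:
        return 1.0
    if x < 24:
        return 0.0
    return (x - 24) / (30 - 24)

print(mu_hot(27))   # halfway up the ramp
```

A temperature of 27°C is thus a partial member of Hot with degree 0.5.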
Fuzzy logic is the inference counterpart for fuzzy sets. A complete fuzzy logic sys-
tem consists of a fuzzifier, inference engine, and defuzzifier [77]. The fuzzifier maps
numeric values to partial memberships of fuzzy sets. The inference engine executes
operations and rules on fuzzified information. The defuzzifier converts a fuzzy repre-
sentation into a bi-valued representation, typically using an α-cut.
The graph in Figure 2.2 shows an example fuzzifier that maps room temperatures to
the fuzzy sets Cold, Normal, and Hot. A temperature of 25.5°C will be wholly outside
the set Cold, mostly within the set Normal, and to a lesser extent partially within the
set Hot.
Methods are defined for several fuzzy set operations, including AND (intersection),
OR (union), and NOT (complement)⁵. Traditional fuzzy logic, also called Zadeh fuzzy
logic after its inventor, is shown graphically in Figure 2.3. Given two fuzzy variables,
⁵Implementations of these operators can vary depending on the application; the interested reader is
directed to [77].
Figure 2.2: Fuzzification for Temperature
a and b:
a ∪ b = a OR b = max(μ(a), μ(b))
a ∩ b = a AND b = min(μ(a), μ(b))
¬a = NOT a = 1 − μ(a)
Figure 2.3: Results of Fuzzy Operations are Shown by the Grey Shaded Regions
Thus rules can be established using constructs such as D = A ∧ B ∨ ¬C, where A,
B, C, and D are fuzzy sets.
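These operator definitions, and a rule of the form D = A ∧ B ∨ ¬C, can be sketched as follows; the membership degrees are arbitrary illustrative values:

```python
# Zadeh fuzzy logic operators on membership degrees in [0, 1].
def f_and(a, b):   # intersection: minimum
    return min(a, b)

def f_or(a, b):    # union: maximum
    return max(a, b)

def f_not(a):      # complement
    return 1.0 - a

# Evaluate the rule D = A AND B OR NOT C on example membership degrees.
A, B, C = 0.8, 0.4, 0.3
D = f_or(f_and(A, B), f_not(C))
print(D)
```

Here min(0.8, 0.4) = 0.4 and 1 − 0.3 = 0.7, so the rule yields D = 0.7.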
The defuzzifier typically uses an α-cut, which is a mechanism to translate the fuzzy
output into traditional bi-valued truth, most typically:
μ′(d) =
    1    if μ(d) ≥ α
    0    otherwise

where α ∈ [0, 1] is called the “α-cut plane”. An example
of defuzzification with an α-cut of 0.5 is given graphically in Figure 2.4. As the graph
shows, the set of normal temperatures is mapped to the interval [21.5,26.5].
Figure 2.4: Defuzzification Using an α-cut
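Defuzzification by α-cut can be sketched as below. The triangular shape assumed for Normal is hypothetical, chosen so that its 0.5-cut matches the interval [21.5, 26.5] of Figure 2.4:

```python
def alpha_cut(mu, alpha=0.5):
    """Defuzzify a membership degree into bi-valued truth via an alpha-cut."""
    return 1 if mu >= alpha else 0

# An assumed triangular membership function for Normal, centred at 24 degrees C
# with half-width 5, so that mu >= 0.5 exactly on the interval [21.5, 26.5].
def mu_normal(x):
    return max(0.0, 1.0 - abs(x - 24) / 5.0)

in_set = [t for t in range(20, 29) if alpha_cut(mu_normal(t))]
print(in_set)   # integer temperatures surviving the 0.5 cut
```

The integer temperatures 22 through 26 survive the cut, consistent with the interval shown in the figure.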
A popular representation for uncertainty is the rough set [95]. Rough sets extend
classical sets to allow an element to be both inside and outside the set. Thus there are
three modes: inside, outside, and both. Three operations are defined that translate to
a classical set: the upper limit, the lower limit, and the boundary. The upper limit in-
cludes all items that are wholly inside, or both inside and out. The lower limit includes
only those items that are inside the set. The boundary includes only items that are both
inside and outside the set. This is illustrated graphically in Figure 2.5. An example
application of rough sets is in classifying customer details: the rough set information
provided will contain a customer if all information has been provided, not contain a
customer if no information is provided, and hold a customer in both states if some
information is provided but some is missing. The company can send letters requesting
information to
¬LOWER(c) where c is the “information provided” rough set.
Figure 2.5: Example Rough Set for Containment of a Region
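A minimal sketch of the customer example, representing a rough set as a pair of classical sets; this representation is an assumption for illustration, not taken from [95]:

```python
# A rough set represented as (inside, boundary): elements wholly inside,
# and elements both inside and outside (hypothetical representation).
def lower_limit(rough):
    inside, boundary = rough
    return inside                 # only items wholly inside the set

def upper_limit(rough):
    inside, boundary = rough
    return inside | boundary      # items inside, plus the boundary

# Customer example: complete details -> inside, partial details -> boundary.
info_provided = ({"alice", "bob"}, {"carol"})
all_customers = {"alice", "bob", "carol", "dave"}

# Letters go to NOT LOWER(c): everyone not certainly complete.
recipients = all_customers - lower_limit(info_provided)
print(sorted(recipients))
```

Both the customer with partial details and the one with none receive a letter, while those with complete details do not.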
Another common classical set-based uncertainty modeling technique is the inter-
val. Intervals define the upper and lower boundaries on a continuum, most commonly
R. The boundaries themselves can be inclusive, which is indicated using square brack-
ets; or exclusive, indicated using rounded brackets. Thus, [0,1) includes zero but ex-
cludes one. Interval arithmetic defines the propagation of uncertainty under common
arithmetic operators. For example, addition is defined as:
[a, b] + [c, d] = [a + c, b + d]
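Interval addition, together with multiplication taken over the endpoint products, can be sketched as:

```python
def interval_add(a, b):
    """[a0, a1] + [b0, b1] = [a0 + b0, a1 + b1]."""
    return (a[0] + b[0], a[1] + b[1])

def interval_mul(a, b):
    """Multiplication takes the extremes over all four endpoint products."""
    products = [x * y for x in a for y in b]
    return (min(products), max(products))

print(interval_add((1, 2), (3, 5)))    # (4, 7)
print(interval_mul((-1, 2), (3, 5)))   # (-5, 10)
```

The multiplication rule shows why the endpoints alone are not enough: an interval straddling zero can flip the sign of the extremes.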
2.3 Visualization
2.3.1 The Sensemaking Process
The user has a visualization objective. To achieve this objective they will use a visual-
ization technique. The technique transforms information and displays it according to
the parameters of the technique. To reach their objective, users will typically iteratively
adjust the parameters and view the results, repeating as often as necessary. There are
three general classes of user objectives, which can also be described as visualization
phases [56]:
1. Exploration, searching the data for relations and patterns
2. Analysis, exploring known relations
3. Presentation, preparing the visualization to communicate information to others
Visualization requires an iteration of choose, inspect, view, and adjust, which can be
cumbersome, particularly for novice visualization users. Several studies have there-
fore considered improving the user experience while going through the visualization
process [21, 49, 92, 11].
One study sought to encode the visualization exploration process using an XML-
based language [49]. The encoding captures the parameters used for each iteration of
the loop. By defining a parameter derivation calculus, the results of several visualiza-
tion sessions can then be visualized. Such visualizations of visualization sessions are
designed to aid the user in understanding the progression of their use of the system.
Although the work seeks to formalize the visualization process, it does not improve the
process and is limited to modulating parameters of a particular visualization technique.
A significant drawback of the work in [49] is that changing to another type of
representation or selecting alternate data is not included in the model.
Visualization is a tool, not an objective in and of itself. However, it has been observed
(e.g. [73]) that some visualizations are good for publications, tending to be colorful
and showy images, but not informative or applicable to real-world problem-solving.
Ma [73] argues that scientists need to be involved in evaluating the effec-
tiveness of visualization methods, and suggests working with users from application
domains both to devise the requirements of the visualization and to subsequently eval-
uate the techniques through case studies.
Another suggestion for overcoming these obstacles is for visualization to be task-
driven instead of data-driven [92, 11]. One approach to this is through an agent-based
framework [92] (see Figure 2.6), where a profile agent observes the user's choice of
visualizations and adjusts the system's behavior to improve workflow.
Figure 2.6: The Multi-agent Visualization Support System
Another proposal is the “Visualization Task Network (VTN)” [11] (see Figure 2.7,
from [11, pp.603]), which can learn the requirements of the user. A VTN is a task-
oriented approach, where the user first selects the task to be achieved. For each chosen
task a set of techniques are proposed by the system. Once a technique is chosen, a
list of attributes (e.g. glyph information, grid spacing, and color) is presented. These
parameters are similar to those in the work of Jankun-Kelly et al. [49]. Each time the
user selects a {task, technique, attribute} set for visualization, the system can increase
the weight of that combination. When the user selects a task, the techniques with the
highest weighting are shown first. Similarly, once a task and technique are chosen, the
attributes with the highest weighting are shown first.
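A minimal sketch of this weighting scheme follows; the data structures and names here are assumptions for illustration, not taken from [11]:

```python
from collections import Counter

# Each selected (task, technique, attribute) combination gains weight;
# suggestions for a task are then ranked by accumulated weight.
weights = Counter()

def select(task, technique, attribute):
    weights[(task, technique, attribute)] += 1

def suggest(task):
    candidates = [c for c in weights if c[0] == task]
    return sorted(candidates, key=lambda c: -weights[c])

select("find-outliers", "scatter-plot", "color")
select("find-outliers", "scatter-plot", "color")
select("find-outliers", "parallel-coords", "opacity")
print(suggest("find-outliers")[0])   # highest-weighted combination first
```

After two selections of the scatter-plot combination, it is ranked first for that task.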
One approach to mapping visual features to visualization techniques takes an objective-
oriented viewpoint, and is derived from the visualization data ontology outlined in [11].
The mapping begins with the choice of data attributes to be represented: relation-
ships, resemblances, order and proportion [34]. These attributes are then mapped to
visual features depending on the visualization task to be performed. The visualization
task is chosen to enhance the perception of the information required by the viewer for
their specific objective in performing the data analysis. The knowledge required for
such a task-oriented approach is encapsulated in an agent-based visualization architec-
ture [11].
A workshop [27] established the visualization ontology represented in Figure 2.8
(adapted from [27]). Development of the ontology involved investigation
Figure 2.7: The Visualization Task Network (VTN) Learns Task-oriented Visualization
Parameters
of visualization from multiple perspectives. The result is a clear anatomy of
visualization, except that it is missing one vital part: the role of the user. The user
plays an integral role in the visualization process, driving parameters and the visual-
ization tasks. By excluding the user from the ontology the authors have neglected to
consider not only usability and cultural issues, but also opportunities such as adaptive
visualization systems.
Figure 2.8: An Ontology of Visualization
2.3.2 Visualization Techniques
The topic of visualization is traditionally introduced through reference to a taxonomy
of visualization techniques [41, 98, 15, 16]. SIGGRAPH’s visualization education
program [41] introduces visualization techniques through a data-type based classifica-
tion, which is reproduced in Figure 2.9. Examples of selected techniques are given in
Figure 2.10. The limitation of this classification is that it deals only with continuous
ordinal values. Visualizations for other types of data, such as trees, are not included.
Figure 2.9: Visualization Techniques Categorized by the Type of Data to be Visualized
Shneiderman [106] recognized the lack of trees and network graphs and addressed
this by including non-ordinal types in his taxonomy. However, the taxonomy itself
continues to be based on the data-type being visualized. The data-types identified are
[106, pp. 337-339]:
Figure 2.10: Examples of Selected Visualization Techniques: (a) 2D Line-based
Contouring, (b) 2D Histogram, (c) 3D Streamlines
• 1-Dimensional, such as textual documents, program source code, and alphabeti-
cal lists of names.
• 2-Dimensional, such as geographic maps, floor plans, and newspaper layouts.
• 3-Dimensional, real world objects such as molecules, the human body, and build-
ings.
• Temporal, such as medical records, project management, or historical presentations;
time's distinct semantics make this a data-type separate from 1-dimensional data.
• Multi-dimensional, such as records in relational databases.
• Trees, which are hierarchies where each item, except the root item, has a link to
its parent.
• Networks, which represent relations that cannot be captured as trees.
OLIVE [98] is an online catalog of visualization systems categorized according to
this taxonomy, although at the time of writing it is only current up to 1997. While
Shneiderman’s taxonomy covers a wider range of visualizations, not all visualization
systems fit conveniently. For example, visualizations that present temporally ordered
3-Dimensional data could fit into either the temporal or the 3-Dimensional category.
To overcome these inconsistencies Card and Mackinlay [15] offer a classification based
on additional factors that need to be considered during visualization. Their analysis of
visualization systems considers not only the type of data, but also the filtering functions
applied to them, the controlled (text) and automatic (glyph) processing techniques, the
viewing transformations, and the user interaction elements for every variable in the
visualization. Data types are classified as [15, pp. 92-93]:
• Nominal, meaning they are only equal or unequal to other values.
• Ordinal, meaning they obey a less-than relation.
• Quantitative, meaning it is possible to do arithmetic on them.
• Intrinsically spatial, which are the subset of quantitative types that represent spa-
tial points.
• Geographical, which are the subset of intrinsically spatial types that represent
geographic locations.
• A Set mapped to itself, which is the case in graphs and trees.
This taxonomy is cumbersome for the purpose of categorizing visualization techniques.
Unlike the preceding taxonomies, there is no single category for a visualization tech-
nique. Instead, each variable used in the visualization is decomposed according to
twelve factors and presented in a matrix. The matrices of two techniques can be com-
pared to pinpoint the exact differences between them. The intent of the authors is not
only to describe the differences in visualization techniques, but also to suggest new
possibilities for visualization techniques [15, pp. 92].
Chi [16] provides a taxonomy to help implementers understand how to implement
visualization techniques. The proposed taxonomy is based on their earlier work on
the Data State Reference Model [18]. A visualization technique is broken down into
four stages according to the state of the data, as shown in Table 2.2, and the data
transformation operators that transform the data from one stage to another, as listed in
Table 2.3.
Stage                      Description
Value                      The raw data.
Analytical Abstraction     Data about data, or information (a.k.a. meta-data).
Visualization Abstraction  Information that is visualizable on the screen using a
                           visualization technique.
View                       The end product of the visualization mapping, where the
                           user sees and interprets the picture presented.

Table 2.2: Data Stages in the Data State Model
Processing Step                Description
Data Transformation            Generates some form of analytical abstraction from
                               the value (usually by extraction).
Visualization Transformation   Takes an analytical abstraction and further reduces it
                               into some form of visualization abstraction, which is
                               visualizable content.
Visual Mapping Transformation  Takes information that is in a visualizable format
                               and presents a graphical view.

Table 2.3: Transformation Operators in the Data State Model
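The four stages and three operators can be sketched as a chain of functions. The concrete transformations below are illustrative stand-ins, not taken from [18]:

```python
# A sketch of the Data State Reference Model as a chain of the three
# transformation operators between the four data stages.
def data_transformation(value):
    """Value -> Analytical Abstraction: extract meta-data."""
    return {"n": len(value), "min": min(value), "max": max(value)}

def visualization_transformation(abstraction):
    """Analytical Abstraction -> Visualization Abstraction: visualizable content."""
    return [("min", abstraction["min"]), ("max", abstraction["max"])]

def visual_mapping_transformation(content):
    """Visualization Abstraction -> View (a textual stand-in for a picture)."""
    return " | ".join(f"{label}={v}" for label, v in content)

view = visual_mapping_transformation(
    visualization_transformation(
        data_transformation([3, 1, 4, 1, 5])))
print(view)
```

Each operator consumes the output of the previous stage, mirroring the pipeline of Tables 2.2 and 2.3.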
Tory and Möller [113] argue that these taxonomies are vague because of the termi-
nology used. As an example, they cite the use of the word “often” in Card and Mackin-
lay’s definition [113, p. 1]. To reduce this ambiguity, their taxonomy is based on the
data model rather than the type of data itself. A data model is a representation of data
that may include structure, attributes, relationships, and the data values themselves.
Visualization algorithms create visual representations of data using a data model. The
taxonomy is outlined in Figure 2.11. The visualization algorithms are first classified
as continuous or discrete. Scientific visualization corresponds largely to continuous
models while information visualization corresponds largely to discrete models.
Unlike Card and Mackinlay’s taxonomy, the model-based taxonomy maintains
scalar, vector, and tensor categories for dependent variables. Additionally the taxon-
omy shows greater flexibility than Shneiderman’s taxonomy, categorizing temporally
ordered 3D data as nD data in the continuous model. A limitation of the taxonomy is
that it does not treat temporal data as distinct from 1D data.
Figure 2.11: The Model-based Visualization Taxonomy
2.4 Information Uncertainty Visualization Approaches
Johnson and Sanderson argue that “development of formal theoretical frameworks and
the new visual representations of error and uncertainty will be fundamental to a better
understanding of 3D experimental and simulation data” [51, pp. 5]. This is a relatively
new field in visualization, which is generally referred to as uncertainty visualization.
However, uncertainty and its sources are diverse and the term can have broad meaning.
On the other hand, error visualization (e.g. [51, 83]) and fuzzy visualization (e.g. [93,
39, 5]) imply particular uncertainty modeling techniques. In this thesis we use the
term information uncertainty visualization to refer to visualization of all modeling
techniques where the uncertainty can be codified in information. Thus error- and fuzzy-
visualization are sub-categories of information uncertainty visualization, which is itself
a sub-category of uncertainty visualization. This relationship is shown in Figure 2.12.
Visualization techniques map data variables and information to visual feature di-
mensions for the purpose of highlighting trends, making comparisons, establishing
outliers, examining data composition, and similar reasons. The introduction of uncer-
tainty requires that appropriate visual features be selected to represent it. Blurring
simulates the visual percept caused by an incorrectly focused visual system and
therefore has the most immediately intuitive mapping for uncertainty [91]. Blurring effectively
Figure 2.12: Relationship between Uncertainty Visualization, Information Uncertainty
Visualization, Error Visualization, and Fuzzy Visualization
smears the boundary of the graphic representing the data value, creating a sense of
uncertainty as to where it begins and ends. A number of visual features may be used
in a similar manner, including hue, luminance, and saturation, and such mappings can
be extended into the temporal domain through animation [67, 34, 10].
Brown [10, pp. 84] offers a summary of available features drawn from the literature
(e.g. [44, 33, 90]):
• Intrinsic representations - position, size, brightness, texture, color, orientation,
and shape;
• Further related representations - boundary (thickness, texture and color), blur,
transparency and extra dimensionality;
• Extrinsic representations - dials, thermometers, arrows, bars, different shapes,
and complex objects - pie charts, graphs, bars, or complex error bars.
2.4.1 Low-level Features
We now consider how several low-level features can be used to indicate uncertainty
within information. The features to be considered are: hue and luminance, opacity,
depth, texture, particles, glyphs, and sonification.
Hue and Luminance are commonly used to highlight data that is different, or to rep-
resent gradients in the data [115, 56]. Saturation of the hue can be used to high-
light the precision or certainty of the data. The more saturated the hue, the
more certain or crisp the value contained in that region is, while low saturation
regions have the appearance of washing into each other, and can be used to in-
dicate the fuzziness of spatial region boundaries [50, 42]. Variation in hue can
also be used to indicate precision. Regions of higher uncertainty can have fewer
shades, while more precise areas have a smoother appearance. A lack of back-
ground/foreground separation (e.g. red on purple) can also imply uncertainty, as
the region may only just be distinguishable [122]. Brown and Pham [93] used
the color hues to represent the membership values of data points. Color hues
were also used by Lowe et al. [70] to represent belief values in the form of a
flame to facilitate decision making in an anaesthetic monitoring system.
Opacity offers an intuitive method for creating blurriness. The more uncertain regions
can be shown with reduced opacity, creating a ghost-like effect. The inverse ap-
proach, used by Djurcilov et al. [24, 25], is to map regions of high uncertainty
to high opacity, thus drawing attention to the uncertain areas in volume visu-
alization (see Figure 2.13). Johnson and Sanderson [51] show an example of a
Magnetic Resonance Imaging (MRI) scan with an added error volume. The error
volume represents the space of possible variation and is transparent so that the
other data is still visible.
Depth can be used to indicate an order or spatial positioning for the data. Pang et
al. [86] and Brown [10] displayed intentionally different images to each eye,
exploiting a lack of binocular fusion to indicate fuzziness. Blurring or depth
of field effects from spatial frequency components being removed in the image
plane can be used to show the indistinct nature of data points [34, 64].
Texture may be applied to objects to indicate the level of precision, ambiguity, or
fuzziness at a spatial location upon an object. Pang
Figure 2.13: Using Opacity to Show the Structure of Uncertainty. Color Scheme (left),
Normal Rendering (center), Uncertainty Structure (right)
and Alper [84] used random normal perturbation to create a textured surface.
The effect was proportional to the amount of uncertainty, creating rough regions
where the uncertainty is high. Certain shimmering effects, usually to be avoided
in visualization [115], can be used to indicate ambiguity within the region [111].
Particles can be used to represent the uncertainty of a region or object by varying
their density, opacity, and color. Grigoryan and Rheingans [37, 38] use particle
density to indicate uncertainty. These particle clouds create a similar effect to
transparent volumes. Cartography often also uses a form of this by drawing
dashed lines to represent imprecise lines and boundaries, or by using different
dot densities to represent shading effects [36].
Glyphs are the most widespread method for displaying uncertainty. The size of a
glyph is often used to indicate a scalar measure of uncertainty. For example, error
bars are a traditional technique for indicating errors in measurement [115]. The
larger the error bar, the more uncertainty there is. This concept was expanded
upon by Pang and Freeman [85], who used the size of spherical and ellipsoidal
glyphs to indicate uncertainty in radiosity applications. Lodha et al. [67] inves-
tigated uncertainty glyphs for flow visualizations, also using length to indicate
degree of disagreement. In separate work [68] they used glyphs to show variation
between surface interpolants, finding them to be more precise than using other
features. Wittenbrink et al. [123] mapped variation in vectors to glyph length
and width, to show uncertainty in magnitude and direction. In the same work
they explored glyphs in keyframed animation to expose differences between in-
terpolation techniques.
Sonification is an approach that was explored early on. There are two main meth-
ods, one is to map the uncertainty directly to the pitch or volume, while the
second uses the degree of uncertainty to regulate a noise generator. Fisher [31]
allowed the user to scan a cursor over a landscape while the program emitted
sound depending on the degree of uncertainty. Lodha et al. [66] went further by
allowing multiple sound variables to be mapped simultaneously, thus increasing
the amount of information conveyed.
These low-level features offer an added dimension to which we can map uncertainty
information for a particular plot point. Zhou and Pang [124] looked at several examples
to visualize the level of error between original and reduced resolution meshes in a
multi-resolution mesh algorithm (see Figure 2.14). We now consider how these are
used in higher level constructions and methods that require multiple data points. In our
discussion we include different spatial arrangements, use of image based techniques,
addition and modification of geometry, and the use of animation.
Figure 2.14: Visual Mappings Showing Difference. From Left to Right: Overlay,
Rainbow Mapping, White-black-white Pseudo-coloring, Glyph (High-pass), Glyph
(Low-pass)
2.4.2 Higher-level Constructions
Uncertainty can be represented in several ways using 2D Cartesian graphs. Some
examples of graphs include histograms, bar charts, tree diagrams, time histories of 1D
slices, maps, iconic and glyph-based diagrams. For example, graphs are often used to
represent the fuzzy membership functions (e.g. Figures 2.1-2.4) or probability density
functions. The structure and inter-relationships of rules can be illustrated using graphs,
trees and flowcharts.
Fuzzy rules involving two inputs can be graphed in three dimensions. Figure 2.15
shows an example from the Matlab Fuzzy Toolbox [75], where the output shows the
amount of tip, as determined by the quality of the food and service. Nürnberger ex-
plored drawing such classifiers as overlapping pyramid shapes. 2D classifiers are visu-
alized as contours for a top-down view [80], whereas 3D classifiers are 3D shapes. An
extension to this work discusses the effects that antecedent pruning has on the shapes
[81]. Pruning of antecedents involves removal of restrictive rules and simplification
of existing rules with the aim of improving the ability of the classification system to
generalize to previously unseen input data. The authors argue that rule simplifications
can have a dramatic impact on results and that visualization of these changes can pro-
vide an intuitive aid for fuzzy classifier designers. While the technique produces an
intuitive aid, the authors have not gone far enough. Since the classifier is visualized as
a shape that occupies the same space as the data, it suggests that it can be visualized
together with the data. This would allow the user to observe how the data points of a
particular data set classify, particularly when combined with animation or interactive
techniques. Possible extensions include using size, color, and translucency to enhance
perception of the classification given to a data point. Cox et al. [20] applied thresholds
to produce convex hull plots of data point clusters, using glyphs of different shapes and
sizes for the data points.
A limitation of these techniques is that they are not well suited to multi-dimensional
data. Techniques such as multi-dimensional scaling [6] and parallel coordinates [39]
Figure 2.15: Tip Level Based on the Quality of the Food and Service Using Fuzzy
Inference
provide ways to display multi-dimensional fuzzy data in 2D without loss of informa-
tion. However, the degree of membership is not indicated in a standard parallel coordi-
nate plot. Berthold and Hall [5] use blurring to expose the level of fuzzy membership
on parallel coordinates. An alternative proposal by Pham and Brown [90] extends
coordinates to the third dimension, where the new dimension represents the member-
ship value. One technique for multi-dimensional scaling involves an algorithm that
minimizes the inter-point distances. The rule set is then visualized as a 2D scatter
plot, where gray scales denote different classes and the size of each square indicates
the number of examples [93]. Another technique for viewing high-dimensional fuzzy
rules in 2D places rules as shapes on a grid. The distance between rules in high-
dimensional space is mapped to their distance in 2D. The technique uses a gradient
descent algorithm to minimize the error between the 2D and actual distances [6].
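The distance-preserving step in these layouts can be illustrated with a minimal stress-minimization sketch. The function below is a hypothetical simplification for illustration only, not the actual algorithm of [6] or [93]: it moves each 2D point down the gradient of the squared error between its 2D distances and the given high-dimensional distances.

```python
import math
import random

def mds_gradient_descent(dist, iters=2000, lr=0.01, seed=0):
    """Embed n items in 2D so that pairwise 2D distances approximate the
    given distance matrix `dist` (simple stress minimization by gradient
    descent). Illustrative sketch; parameter names are assumptions."""
    rng = random.Random(seed)
    n = len(dist)
    pos = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n)]
    for _ in range(iters):
        for i in range(n):
            gx = gy = 0.0
            for j in range(n):
                if i == j:
                    continue
                dx = pos[i][0] - pos[j][0]
                dy = pos[i][1] - pos[j][1]
                d = math.hypot(dx, dy) or 1e-9
                # gradient of (d - dist[i][j])^2 with respect to pos[i]
                g = 2.0 * (d - dist[i][j]) / d
                gx += g * dx
                gy += g * dy
            pos[i][0] -= lr * gx
            pos[i][1] -= lr * gy
    return pos
```

For example, three rules at equal pairwise distance settle into a near-equilateral triangle in 2D.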
When visualizing clusters it is often a requirement to find outliers in the data. One
method to improve the identification of outliers in fuzzy classification problems is to
modify the “objective function”. Keller proposed additional weighting parameters for
“representativeness” [55, pp. 143]. The application of this technique produces the
same principal clusters, but outliers are more easily detected since they are excluded
to a greater degree from the fuzzy clusters.
Fujiwara et al. [32] and Gershon [34] produced a 3D flowchart to represent rule
structure to facilitate understanding of rule-based programs. This is an extension of
the cone tree visualization technique [101]. Dickerson et al. [23] used a graph to
encode relationships in a complex interacting system. This technique is useful for
encoding expert information which is commonly present in fuzzy control systems.
Brown and Pham [11] extended these techniques further by mapping additional
features (such as opacity) to uncertainty for each node.
Image based techniques can also be used to convey uncertainty. These methods are
the uncertainty analogs of image based visualization techniques such as Line Integral
Convolution (LIC) [40]. In these methods a pattern is generated that abstractly reflects
the uncertainty. One difference between image based techniques and glyphs is that
image based techniques apply a regular pattern over a continuous area. This avoids
clutter sometimes experienced by glyph techniques where the glyphs obstruct one an-
other. Sanderson et al. [103] used reaction-diffusion models in flow visualizations and
conveyed uncertainty through spot size and orientation.
Pang and Freeman [85] (see also [86]) observed that geometry can be added or
modified to indicate uncertainty. An example of modification is to create a texturing-
like effect by perturbing the orientation of faces within a geometric mesh model. The
amount of perturbation is governed by the degree of uncertainty. There are two com-
mon examples of adding geometry. The first is to add geometry for a single data point,
typically to give a direct indication of the extent over which the object can exist. An-
other is to connect successive data point extents, simulating the volume of possibility.
An example of the latter was demonstrated by Lopes and Brodlie [69], who used tubes
for particle flow visualization.
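The perturbation approach can be sketched in a few lines. The function name and the uniform jitter model below are illustrative assumptions, not the authors' implementation: each vertex is displaced by a random offset whose magnitude grows with that vertex's uncertainty, so the surface looks rougher where the model is least certain.

```python
import random

def perturb_vertices(vertices, uncertainty, scale=0.1, seed=0):
    """Jitter each (x, y, z) vertex by a random offset proportional to its
    uncertainty value. Certain vertices are left untouched; uncertain ones
    produce a visibly rough, 'noisy' region of the mesh."""
    rng = random.Random(seed)
    out = []
    for (x, y, z), u in zip(vertices, uncertainty):
        r = scale * u
        out.append((x + rng.uniform(-r, r),
                    y + rng.uniform(-r, r),
                    z + rng.uniform(-r, r)))
    return out
```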
All of the low-level visual features that have been discussed can be animated: for
example, motion blur, flickering, or animated glyphs can represent the precision
of the measurements of a moving object [123, 10]. Brown [10] explored temporal vi-
brations for conveying uncertainty. The vibrations oscillate between values fast enough
to be pre-attentive [114], causing a shimmering effect that implies uncertainty about
its true position. Figure 2.16 shows frames from a movie of the luminance oscillation
technique, with a region of high uncertainty framed within the dashed rectangle. This
technique can also be applied in stereographic displays, where it induces a lack of binocular
fusion [10].
Figure 2.16: Two Frames from a Luminosity Oscillation Animation
Probability distributions are often graphed as 2D line graphs. Kao et al. [52, 53,
54] explored showing multiple data points, each of which is subject to a probability
distribution. In one example they overlaid the probability density functions for points
of interest, as shown in Figure 2.17. Other approaches included using color, texture,
and heightmaps to indicate uncertainty. Luo et al. [72] plotted many small histograms
in a small multiples [115] technique.
Figure 2.17: Visualization With Probability Density Functions Over Associated Data
Points
The Geographic Information Systems (GIS) field has had a particular interest in
information uncertainty visualization. MacEachren et al. [74] and Slocum et al. [107]
methodically review the state of play with respect to uncertainty visualization in GIS.
Outside of this field, the development of visualization techniques for information un-
certainty is typically ad hoc, being created for a specific modeling technique or appli-
cation. All of these represent important steps forward, however, an integrated frame-
work to manage the modeling and visualization of information uncertainty is currently
missing.
2.5 Summary
In summary, the information uncertainty modeling and general visualization fields
have separately been well studied. Several visualization techniques have been created
for the various information uncertainty models. However, open questions remain: which
technique is best suited to the task at hand? How can uncertainty modeling and propagation
be tracked and interpreted properly? What happens when the information uncertainty changes? In-
formation uncertainty modeling represents our knowledge and expectation about the
behavior of a variable under uncertainty, and this knowledge may be subject to change
over time, particularly as new information comes to light. Currently, there is no inte-
grated framework for the modeling, propagation, and visual mapping of information
uncertainty. Furthermore, there is no framework that can adapt to changes in informa-
tion uncertainty.
CHAPTER 3
Framework for Integrated Uncertainty
Modeling and Visualization
3.1 A New Approach to Information Uncertainty
Traditional visualization systems, which typically do not deal with information uncer-
tainty, can still be subject to dynamic data. For these systems the dynamism refers to
changes in value. The result is that the visualization needs to be recalculated, which is
a straightforward process. Changes in information uncertainty, on the other hand, provide
a unique challenge: the actual modeling technique used can change in response to
changing information. Therefore, the data-type of the variable can be dynamic. This
is clearly illustrated by the case of prediction: before the event comes to pass, the un-
certainty can be modeled using a number of techniques; once the event has passed, the
prediction can be updated with the actual outcome.¹ Thus the visualization must not
only be recalculated, but must also adapt to this new data-type.
Adapting to new data-types for a visualization is not a straightforward process.
¹ Assuming the outcome is known, the data-type then becomes one of absolute certainty.
Visualization techniques are designed for a particular data-type and may not support
the new data-type without modification. For example, line graphs rely on a series of
values between which line segments are connected. Should the source of information
be defined by a series of intervals, then the traditional line graph is no longer appro-
priate. One suitable modification turns the line segments into convex polygons, whose
edges are defined by the upper and lower bounds of the interval.
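This modification can be sketched directly: the polygon is built by tracing the upper bounds left to right and the lower bounds back again, replacing the single polyline of a traditional line graph. The function name is an illustrative assumption.

```python
def interval_band(xs, lowers, uppers):
    """Turn a series of intervals into the vertex list of a closed polygon:
    walk along the upper bounds, then return along the lower bounds in
    reverse. The polygon's height at each x reflects the interval width."""
    top = list(zip(xs, uppers))
    bottom = list(zip(reversed(xs), reversed(lowers)))
    return top + bottom
```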
It is widely recognized that visualization systems should convey uncertainty [83].
Many problems are subject to uncertainty and, as a consequence, visualization research
has produced several modifications of visualization techniques to support it.
For example, transparent volumes have been added to volume renderings to indicate
potential error (e.g. [51, pp.9]), and parallel plots were extended into the third dimen-
sion to handle fuzzy variables [93]. This type of work continues, and the outcomes
continue to be data-type specific.
The objective of this thesis is to integrate the process of modeling and visualizing
information uncertainty into an extensible and adaptive visualization framework. Such
a framework will provide greater uniformity for the field and enable both practitioners
and researchers to reduce the data-type and visualization technique dependency. The
process that the user follows when armed with such a tool can therefore change.
The typical process that a user follows when dealing with uncertainty consists of
the following steps.
1. Decide on variables
2. Decide on uncertainty data-type(s)
3. Build the data model, propagating uncertainty manually
4. Construct visualization(s), incorporating uncertainty where techniques are avail-
able and appropriate
In practice, steps 3 and 4 will be repeated as information changes. However, step 2 will
rarely be revisited. The significant point of this process is that the uncertainty model
is decided upon before the user’s data model is built. This can be unintuitive, as the
amount of uncertainty can change depending on how the model pans out. If it were
easy to add, change, or remove uncertainty details at any point in the process, then the
typical process changes, as follows.
1. Decide on variables
2. Build an initial data model
3. Construct visualization(s)
4. Add/remove/change uncertainty information
Step 4 can occur anywhere after step 1, and can be repeated as often as is
necessary. Under such a process the uncertainty information is viewed as a refinement
on details that does not fundamentally change the data model.
This chapter describes an integrated framework for the modeling and visualization
of information uncertainty. This framework is adaptive to changes in uncertainty in-
formation, allowing the user to select the appropriate techniques for the task at hand.
In Section 3.2 we consider the issues that must be overcome, from which we derive re-
quirements of this framework. Section 3.3 describes the components of the framework
to meet the requirements. Section 3.4 provides a summary of key points.
3.2 Analysis of Issues and Requirements
This section examines the issues that confront users when they seek to visualize infor-
mation uncertainty. From a theoretical perspective there are three main issues. Firstly,
visualization techniques are based around specific uncertainty data-types. Thus, vi-
sualization techniques tend to be ad hoc. Secondly, there is incoherence between in-
formation uncertainty modeling techniques. This locks users into a particular model-
ing technique, the appropriateness of which may change as the information evolves.
Thirdly, information uncertainty modeling and visualization is hampered by an artifi-
cial separation between the value of a variable and the uncertainty model of that value.
This poses problems that affect the robustness of user models and the effort required
to maintain them.
From a practical point of view, the user is required to have both a comprehen-
sive understanding of uncertainty as well as sophistication with visualization tools.
Comprehensive understanding is required, because the user must manually encode and
propagate uncertainty information; sophistication with visualization tools is required
to allow for the unusual demands of mapping uncertainty to visual elements. Many
tools lack support for information uncertainty modeling and visualization, leading de-
termined users to cobble together multiple tools.
3.2.1 Ad hoc Visualization Techniques
Sensemaking Cycle and Changes in Uncertainty Information
Visualization is “the bringing out of meaning in information” [56]. It is performed it-
eratively and usually as part of the sensemaking cycle [102, 17]. The iterative looping
is not exclusive to mapping data into visual form; instead, users sometimes return to
the data model to gather or transform data. This is particularly true for information un-
certainty. For example: uncertainty details can be deemed to be more important later,
once the basic model is in place; or the uncertainty details may change as more be-
comes known about the variables. Therefore, frameworks for information uncertainty
visualization should ideally allow the user to go back to make changes with minimal
effort.
Flexibility
Visualization of information uncertainty differs from visualizing other forms of in-
formation for two main reasons. Firstly, information uncertainty is always associated
with a particular unit of information. This means that the uncertainty cannot be freely
visualized without regard to its interpretation relative to the information to which it
belongs. Secondly, information uncertainty is usually mapped differently to visual el-
ements. For example, uncertainty is commonly mapped to intrinsic properties, such as
transparency or color; or by adding a dimension to geometry, such as using a surface
where there would otherwise be a line. Therefore, a visualization system for informa-
tion uncertainty requires the flexibility to allow users to map uncertainty to compound
visual elements, including intrinsic properties and adding dimensions to geometry.
Figure 3.1 demonstrates how information uncertainty is associated with informa-
tion, but typically mapped differently to visual elements. Four graph visualizations of
historical and predicted employment rates in California are shown. The first graph (a)
assumes that growth will continue at the average growth rate of the past 15 years and
is therefore visualized using traditional means. While the information in graph (a) is
modeled as not being subject to uncertainty, it rests on an unreliable assumption about
employment rates. The graph in (b) estimates that the growth will continue
at the average rate. The fact that the predictions are estimates is indicated by the line
stippling, an intrinsic property of the line. The graph in (c) shows the possible range
within the maximum and minimum growth rates experienced in the past 15 years. The
uncertainty is indicated by extending the one dimensional line into a two dimensional
polygon. The graph in (d) uses a normal distribution centered on the average growth
rate. The uncertainty is indicated both by extending the dimensionality of the line and
by mapping to the intrinsic property of opacity.
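The combined mapping used in graph (d) can be sketched as follows, assuming a normal distribution per predicted point. The function name and the normalization are illustrative: the band provides the dimensional extension, and the opacity at its edge follows the normal density, scaled so that the mean is fully opaque.

```python
import math

def normal_band(mean, sigma, k):
    """Extent of the +/- k standard-deviation band around a predicted mean
    (dimensional extension), together with the opacity at the band's edge
    (intrinsic-property mapping)."""
    lower, upper = mean - k * sigma, mean + k * sigma
    alpha = math.exp(-0.5 * k * k)  # density at k sigma, relative to the mean
    return (lower, upper), alpha
```

Drawing nested bands for increasing k then yields a polygon that both widens and fades as the prediction becomes less likely.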
Heterogeneity in Uncertainty Information
Several uncertainty visualization techniques have been developed for particular uncer-
tainty types. However, in an environment where the uncertainty type can change to bet-
ter suit the needs of the user, such restrictive preconditions for visualization techniques
provide a return to the tyranny of uncertainty type lock-in. Therefore the approach to
visualization of information uncertainty requires visualization to provide greater con-
sistency across different uncertainty modeling techniques.
Figure 3.1: Visualizations of Employment Numbers in California. Years 2005-2010
are Predicted. (a) Assuming Average Growth (b) Indicating Growth is Estimated (c)
Possible Growth (d) Likely Growth. (Data Source: California Employment Develop-
ment Department)
Homogeneous Access
To enable the visual mappings that expose the uncertainty in variables, it is necessary
to have access to the associated uncertainty details. However, there are numerous un-
certainty modeling techniques that use different methods for encoding the uncertainty.
This creates a barrier to visualizing uncertain information because visual mappings
that work with one uncertainty modeling technique may not be easily transferable to
another. Such inconsistency creates a strong dependency between visualizations and
the data-types used in the model, limiting the user’s ability to update the data model.
Therefore, a generalized means for accessing uncertainty information should be sought
to enable a consistent environment for information uncertainty visualization.
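One way to picture such generalized access is a common interface that every uncertainty data-type implements, so that a visual mapping written once (for example, drawing an interval band) works regardless of how the uncertainty is encoded. The class and method names below are hypothetical, not part of any existing framework.

```python
class UncertainValue:
    """Common interface hiding how uncertainty is encoded."""
    def interval(self):
        """Return the (lower, upper) extent of the value."""
        raise NotImplementedError
    def expected(self):
        """Return a single representative value."""
        raise NotImplementedError

class Interval(UncertainValue):
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
    def interval(self):
        return (self.lo, self.hi)
    def expected(self):
        return (self.lo + self.hi) / 2.0

class Gaussian(UncertainValue):
    def __init__(self, mean, sigma):
        self.mean, self.sigma = mean, sigma
    def interval(self, k=3.0):
        # practical extent: +/- k standard deviations
        return (self.mean - k * self.sigma, self.mean + k * self.sigma)
    def expected(self):
        return self.mean
```

A band-drawing routine that calls only `interval()` and `expected()` is then indifferent to whether the underlying model is an interval, a distribution, or something else.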
Plurality of Values
Fundamental to the concept of information uncertainty is the ability for a variable to
hold multiple values simultaneously; in other words, the variable has multiple possible
collapses. This plurality of values represents the deferral of the approximation decision
- the true value of a variable may be one of multiple candidates, each of which should
be considered a possibility.
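A plural-valued variable can be pictured as a set of weighted candidates. The sketch below uses hypothetical names and makes no commitment to a particular uncertainty calculus; the point is that the variable carries all candidates until the user explicitly commits to one.

```python
class PluralValue:
    """A variable that defers the approximation decision by holding every
    candidate value together with a weight (possibility or probability)."""
    def __init__(self, candidates):
        self.candidates = list(candidates)  # [(value, weight), ...]
    def most_plausible(self):
        return max(self.candidates, key=lambda vw: vw[1])[0]
    def collapse(self):
        """Commit to a single value, discarding the alternatives."""
        return self.most_plausible()
```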
3.2.2 Incoherence of Uncertainty Models
Uncertainty Data Type Lock-in
There is usually no support for changing from one uncertainty modeling technique to
another. Adding uncertainty information to data allows the user to specify a greater
level of detail about the data. However, changing the uncertainty data-type typically
requires users to reconstruct the affected portion of the data model, often involving a
fundamental change in form. This makes the data model rigid and, as a consequence,
users will typically need to anticipate their use of uncertainty and build their model
accordingly.