BELIV'10 Keynote: Conceptual and Practical Challenges in InfoViz Evaluations
Learning-Based Evaluation of Visual Analytics Systems
1. Learning-Based Evaluation of Visual
Analytics Systems
Remco Chang, Caroline Ziemkiewicz, Roman Pyzh, Joseph Kielman*, William Ribarsky
UNC Charlotte, Charlotte Visualization Center
*Department of Homeland Security
2. Why Another Evaluation Method?
• Based on a discussion with Joe Kielman (DHS)
– Why is it difficult for agencies like the DHS to adopt
and use visual analytics systems?
• Most existing metrics are not indicative of
successful adoption
– Task completion time
– Errors
– Subjective preferences
– Etc.
3. Current Methods
• Several methods for evaluating visual analytics
systems have been proposed, each with its own
perspective and goal. For example:
– Insight-based Evaluation (North et al.)
– Productivity-based Evaluation (Scholtz)
– MILC -- Multi-dimensional in-depth long-term case
studies (Shneiderman, Plaisant)
– Grounded Evaluation (Isenberg et al.)
4. Our Goal for Evaluation
• What Joe wants is:
– Proof that the user of the visual analytics system can gain
proficiency in solving a problem using the system
– By using the VA system, show that a user can gradually
change from being a “novice” to becoming an “expert”
• In other words, Joe wants proof that by using the VA
system, the user is gaining knowledge…
– The goal of visualization is to gain insight and knowledge
(ViSC report, 1987) (Illuminating the Path)
5. Learning-Based Evaluation
• In light of this goal, we propose a “learning-based
evaluation” that attempts to directly test the
amount of knowledge gained by its user.
• The idea is to try to determine how much the user
has learned after spending time using a VA
system (a minimal sketch follows this slide) by:
– Giving a user a similar but different task.
– Directly testing if the user has gained proficiency in
the subject matter.
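As a concrete illustration, here is a minimal Python sketch of what such a proficiency comparison might look like. The scoring scale, the participant numbers, and the function name are hypothetical illustrations, not part of the proposed method's specification.

```python
# Minimal sketch of a learning-based evaluation protocol.
# All names, numbers, and the 0-100 scoring scale are hypothetical.

def proficiency_gain(baseline_scores, transfer_scores):
    """Average improvement from the original task to a similar but
    different transfer task, as a crude proxy for knowledge gained."""
    assert len(baseline_scores) == len(transfer_scores)
    gains = [t - b for b, t in zip(baseline_scores, transfer_scores)]
    return sum(gains) / len(gains)

# Hypothetical per-participant scores before and after extended use
# of the VA system, measured on the similar-but-different task.
baseline = [42, 55, 38, 61]
transfer = [68, 72, 59, 80]
print(f"mean proficiency gain: {proficiency_gain(baseline, transfer):.1f}")
```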
8. Types of Learning
• In designing either a new task or the
questionnaire, it is important to differentiate
and isolate what is being tested (a tagging sketch
follows this list):
– Knowledge gained about the Interface
– Knowledge gained about the data
– Knowledge gained about the task (domain)
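One hypothetical way to keep these three kinds of knowledge separate is to tag each questionnaire item with the kind it probes and score the kinds independently. The three categories below mirror the list above; the questions and the simulated responses are illustrative placeholders.

```python
from collections import defaultdict

# Hypothetical questionnaire items tagged by the kind of knowledge
# they probe; the categories mirror the slide, the content is made up.
questionnaire = [
    ("Which control resets the projection?",      "interface"),
    ("Which two dimensions are most correlated?", "data"),
    ("When is PCA an appropriate technique?",     "task"),
]

# Simulated correctness of one participant's answers (a stand-in for
# real responses collected during the study).
responses = [True, True, False]

per_kind = defaultdict(list)
for (question, kind), correct in zip(questionnaire, responses):
    per_kind[kind].append(correct)

# Report each kind separately so the three gains don't get conflated.
for kind, answers in per_kind.items():
    print(f"{kind}: {sum(answers)}/{len(answers)} correct")
```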
9. iPCA Example
• iPCA stands for
“interactive Principal
Component Analysis”. By
using it, the user can learn
about (a plain PCA sketch
follows this slide):
– The interface
– The dataset
• relationships within the
data
– The task
• What is principal
component analysis, and
• How can I use principal
component analysis to solve
other problems?
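For readers unfamiliar with the underlying technique, below is a minimal numpy sketch of plain (non-interactive) PCA on synthetic data; iPCA wraps this same computation in an interactive interface. The data and variable names are illustrative.

```python
import numpy as np

# Plain (non-interactive) PCA on synthetic 2-D data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)) @ np.array([[3.0, 1.0], [0.0, 0.5]])

Xc = X - X.mean(axis=0)                 # center the data
cov = Xc.T @ Xc / (len(Xc) - 1)         # sample covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)  # eigendecomposition
order = np.argsort(eigvals)[::-1]       # sort by explained variance

components = eigvecs[:, order]          # principal axes
projected = Xc @ components             # data in PCA coordinates
print("explained variance:", eigvals[order])
```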
10. Application to the VAST Challenge
• Current method:
– Give participants a dataset and a problem
– Ask participants to develop VA systems to solve
the problem
– Ask participants to describe their systems and
analytical methods
– Judges score each submission based on the
developed systems and their applicability to the
problem
11. Application to the VAST Challenge
• Proposed method:
– Give participants a dataset and a problem
– Ask participants to develop VA systems to solve
the problem
– Ask participants to bring their systems to VisWeek
– Give participants a similar, but different dataset and
problem
– Ask participants to solve the new problem using
their VA systems
– Judges score each participant based on the
effectiveness of their system in solving the new
task (a hypothetical scoring sketch follows).
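As a loose illustration of how the proposed format shifts the judging emphasis, the sketch below weights performance on the new, unseen problem more heavily than on the original one. The teams, scores, and weighting are entirely hypothetical assumptions, not part of the proposal.

```python
# Hypothetical judging sketch for the proposed VAST Challenge format:
# score each team primarily on how well its system handles the new,
# similar-but-different problem. All numbers are illustrative.
teams = {
    "team_a": {"original": 0.9, "transfer": 0.5},
    "team_b": {"original": 0.7, "transfer": 0.8},
}

TRANSFER_WEIGHT = 0.75  # assumed emphasis: the unseen task dominates

for name, s in teams.items():
    score = TRANSFER_WEIGHT * s["transfer"] + (1 - TRANSFER_WEIGHT) * s["original"]
    print(f"{name}: {score:.2f}")
```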
13. Discussion/Conclusion
• This learning-based method seems simple and obvious
because it really is. Teachers have been doing this for
ages.
• The method is not unique. There are many aspects of
this proposed method that are similar to existing
methods. In spirit, we are all looking to address the
same problem.
• The difference is the perspective. If we think about the
problem from the perspective of a client (e.g., Joe at
DHS), what they currently look for in evaluation results
is not the same as what we as researchers provide.
14. Future Work
• Integrate the proposed learning-based
method with:
– Grounded Evaluation
– Long-term effects (MILC)
16. The Classroom Analogy
• Say you’re a middle-school math teacher trying to
decide which textbook to use, the blue one or the
red one. You can:
– Ask your friends which book is better
• Analogous to an “expert-based evaluation”. The problem is
that the sample size is typically small, and the results are
difficult to replicate.
– Ask your students which book they like
• Analogous to subjective preferences. The issue here is that
the students may prefer the blue textbook simply because it’s blue.
– Test which textbook is more effective by giving the
students tests
• Analogous to the proposed learning-based evaluation
(a minimal sketch follows).
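To make the third option concrete, here is a minimal sketch comparing end-of-unit test scores between the two classes with a two-sample t-test. The scores are made up, and the choice of scipy and of this particular test are assumptions for illustration.

```python
from scipy import stats

# Hypothetical end-of-unit test scores for two classes, one taught
# from the blue textbook and one from the red; the numbers are made up.
blue_scores = [78, 85, 69, 90, 74, 81, 88, 77]
red_scores  = [72, 70, 65, 83, 68, 75, 71, 74]

# Two-sample t-test: is the difference in mean scores plausibly real?
t_stat, p_value = stats.ttest_ind(blue_scores, red_scores)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```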