Learning-Based Evaluation of Visual
Analytics Systems


   Remco Chang, Caroline Ziemkiewicz, Roman
     Pyzh, Joseph Kielman*, William Ribarsky

    UNC Charlotte, Charlotte Visualization Center
    *Department of Homeland Security
Why Another Evaluation Method?

• Based on a discussion with Joe Kielman (DHS)
   – Why is it difficult for agencies like the DHS to adopt
     and use visual analytics systems?


• Most existing metrics are not indicative of
  successful adoption
   –   Task completion time
   –   Errors
   –   Subjective preferences
   –   Etc.
Current Methods

• Methods for evaluating visual analytics
  systems have been proposed. Each has its
  unique perspective and goal. For example:

  – Insight-based Evaluation (North et al.)
  – Productivity-based Evaluation (Scholtz)
  – MILC -- Multi-dimensional in-depth long-term case
    studies (Shneiderman, Plaisant)
  – Grounded Evaluation (Isenberg et al.)
Our Goal for Evaluation

• What Joe wants is:
   – Proof that the user of the visual analytics system can gain
     proficiency in solving a problem using the system

   – Evidence that, by using the VA system, a user can gradually
     change from a “novice” into an “expert”

• In other words, Joe wants proof that by using the VA
  system, the user is gaining knowledge…
   – The goal of visualization is to gain insight and knowledge
     (ViSC report, 1987) (Illuminating the Path)
Learning-Based Evaluation

• In light of this goal, we propose a “learning-based
  evaluation” that attempts to directly test the
  amount of knowledge gained by its user.

• The idea is to try to determine how much the user
  has learned after spending time using a VA
  system by:
   – Giving the user a similar but different task.
   – Directly testing whether the user has gained proficiency
     in the subject matter (a minimal scoring sketch follows
     below).
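
A minimal sketch of how such a proficiency check could be scored, assuming a
hypothetical setup in which each participant answers the same set of graded
questions about the transfer task before and after using the VA system. The
question IDs, answers, and scoring scheme are illustrative only, not part of
the original proposal:

# Hypothetical scoring for a learning-based evaluation session: each
# participant answers graded questions about a similar-but-different task
# before and after using the VA system.

def proficiency_score(answers, answer_key):
    """Fraction of questions answered correctly (0.0 to 1.0)."""
    correct = sum(1 for q, a in answers.items() if answer_key.get(q) == a)
    return correct / len(answer_key)

def knowledge_gain(pre_answers, post_answers, answer_key):
    """Post-use proficiency minus pre-use proficiency."""
    return (proficiency_score(post_answers, answer_key)
            - proficiency_score(pre_answers, answer_key))

# Illustrative usage with made-up responses:
answer_key = {"q1": "A", "q2": "C", "q3": "B"}
pre  = {"q1": "B", "q2": "C", "q3": "A"}   # before using the VA system
post = {"q1": "A", "q2": "C", "q3": "B"}   # after using the VA system
print(round(knowledge_gain(pre, post, answer_key), 2))   # 0.67 -> proficiency gained

A positive gain suggests the user became more proficient at the subject
matter while working with the system, which is the kind of evidence this
evaluation is after.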
Current Method (figure slide)
Our Proposed Method (figure slide)
Types of Learning

• In designing either a new task or the
  questionnaire, it is important to differentiate
  and isolate what is being tested (a per-category
  scoring sketch follows this list):

  – Knowledge gained about the Interface
  – Knowledge gained about the data
  – Knowledge gained about the task (domain)
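
One illustrative way to keep these three kinds of knowledge separate is to
tag each questionnaire item with the category it probes and report a score
per category. The questions, tags, and answers below are hypothetical and
only show the bookkeeping:

# Hypothetical questionnaire in which every item is tagged with the kind
# of knowledge it tests: "interface", "data", or "task" (domain).
from collections import defaultdict

questions = [
    {"id": "q1", "tag": "interface", "correct": "A"},
    {"id": "q2", "tag": "data",      "correct": "C"},
    {"id": "q3", "tag": "task",      "correct": "B"},
]

def scores_by_knowledge_type(responses):
    """Return the fraction of correct answers for each knowledge category."""
    right, total = defaultdict(int), defaultdict(int)
    for q in questions:
        total[q["tag"]] += 1
        if responses.get(q["id"]) == q["correct"]:
            right[q["tag"]] += 1
    return {tag: right[tag] / total[tag] for tag in total}

print(scores_by_knowledge_type({"q1": "A", "q2": "B", "q3": "B"}))
# {'interface': 1.0, 'data': 0.0, 'task': 1.0}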
iPCA Example
• iPCA stands for “interactive Principal Component
  Analysis”. By using it, the user can learn about:
   – The interface
   – The dataset
      • Relationships within the data
   – The task
      • What is principal component analysis, and
      • How can I use principal component analysis to
        solve other problems? (A short PCA sketch
        follows below.)
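
For readers unfamiliar with the underlying technique, the sketch below shows
the kind of computation iPCA lets users explore interactively: projecting a
dataset onto its principal components. It uses scikit-learn and made-up
numbers purely as an illustration and is not the iPCA implementation itself.

# Illustrative principal component analysis on a tiny made-up dataset,
# the kind of computation that iPCA exposes through interaction.
import numpy as np
from sklearn.decomposition import PCA

# Four samples, three correlated variables (hypothetical data).
X = np.array([
    [2.5, 2.4, 0.5],
    [0.5, 0.7, 2.2],
    [2.2, 2.9, 0.4],
    [1.9, 2.2, 0.9],
])

pca = PCA(n_components=2)
projected = pca.fit_transform(X)       # samples expressed in the PC1/PC2 plane
print(pca.explained_variance_ratio_)   # variance captured by each component
print(projected)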
Application to the VAST Challenge

• Current method:
  – Give participants a dataset and a problem
  – Ask participants to develop VA systems to solve
    the problem
  – Ask participants to describe their systems and
    analytical methods
  – Judges score each submission based on the
    developed systems and their applicability to the
    problem
Application to the VAST Challenge

• Proposed method:
  – Give participants a dataset and a problem
  – Ask participants to develop VA systems to solve the
    problem
  – Ask participants to bring their systems to VisWeek
  – Give participants a similar, but different dataset and
    problem
  – Ask participants to solve the new problem using
    their VA systems
  – Judges score each participant based on the
    effectiveness of their system in solving the new task
    (a toy scoring sketch follows below).
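
The toy sketch below shows one way the judging step could be tallied under
this proposal, assuming each judge rates how effectively a team's existing
system handled the new, unseen task; the rating scale, team names, and
numbers are hypothetical:

# Hypothetical tally of judge ratings for the transfer task: each judge
# rates (1-5) how effectively a team's existing VA system solved the new,
# previously unseen problem.
judge_ratings = {
    "TeamA": [4, 5, 4],
    "TeamB": [3, 3, 4],
}

def mean_rating(ratings):
    return sum(ratings) / len(ratings)

ranking = sorted(judge_ratings,
                 key=lambda team: mean_rating(judge_ratings[team]),
                 reverse=True)
for team in ranking:
    print(team, round(mean_rating(judge_ratings[team]), 2))
# TeamA 4.33
# TeamB 3.33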
Types of Learning

• In designing either a new task or the
  questionnaire, it is important to differentiate
  and isolate what is being tested:

  – Knowledge gained about the Interface
  – Knowledge gained about the data
  – Knowledge gained about the task (domain)
Discussion/Conclusion
• This learning-based method seems simple and obvious
  because it really is. Teachers have been doing this for
  ages.

• The method is not unique. There are many aspects of
  this proposed method that are similar to existing
  methods. In spirit, we are all looking to address the
  same problem.

• The difference is the perspective. If we think about the
  problem from the perspective of a client (e.g., Joe at
  DHS), what they look for in evaluation results is not
  the same as what we as researchers currently give them.
Future Work

• Integrate the proposed learning-based
  method with:
  – Grounded Evaluation
  – Long-term effects (MILC)
Thank you!




         rchang@uncc.edu
http://www.viscenter.uncc.edu/~rchang
The Classroom Analogy

• Say you’re a middle-school math teacher trying to
  decide which textbook to use, the blue one or the
  red one. You can:
  – Ask your friends which book is better
     • Analogous to an “expert-based evaluation”. The problem is
       that the sample size is typically small and the results are
       difficult to replicate.
  – Ask your students which book they like
     • Analogous to subjective preferences. The issue here is that
       the students may prefer the blue textbook simply because
       it’s blue.
  – Test which textbook is more effective by giving the
    students tests (a comparison sketch follows below).
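
A minimal sketch of that third option, assuming hypothetical exam scores from
one class that used the blue book and one that used the red book, compared
with a standard two-sample t-test via SciPy:

# Hypothetical comparison of two textbooks by testing the students: one
# class used the blue book, the other the red book, and both took the
# same exam.  A two-sample t-test checks whether the difference in mean
# scores is plausibly more than chance.
from scipy import stats

blue_book_scores = [78, 85, 90, 72, 88, 81, 76, 84]
red_book_scores  = [70, 75, 80, 68, 74, 79, 71, 77]

t_stat, p_value = stats.ttest_ind(blue_book_scores, red_book_scores)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
# A small p-value would suggest the blue-book class's higher mean score is
# unlikely to be due to chance alone (illustrative data only).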
