Krist Wongsuphasawat's Dissertation Proposal Slides: Interactive Exploration of Event Sequences in Temporal Categorical Data
1. INTERACTIVE EXPLORATION
OF EVENT SEQUENCES
IN TEMPORAL CATEGORICAL DATA
KRIST WONGSUPHASAWAT
DEPARTMENT OF COMPUTER SCIENCE &
HUMAN-COMPUTER INTERACTION LAB
UNIVERSITY OF MARYLAND al!
pos
at ion Pro
Dissert 2010!
,
April 30
2. INTERACTIVE EXPLORATION
OF EVENT SEQUENCES
IN TEMPORAL CATEGORICAL DATA
KRIST WONGSUPHASAWAT
DEPARTMENT OF COMPUTER SCIENCE &
HUMAN-COMPUTER INTERACTION LAB
UNIVERSITY OF MARYLAND al!
pos
at ion Pro
Dissert 2010!
,
April 30
3. TEMPORAL CATEGORICAL DATA
â˘âŻ A type of time series Category!
Numerical Categorical
Event!
Stock: Microsoft Patient ID: 45851737
04/26/2010&10:00 &31.03& 12/02/2008&14:26 &Admit&
04/26/2010&10:15 &31.01& 12/02/2008&14:26 &Emergency&
04/26/2010&10:30 &31.02& 12/02/2008&22:44 &ICU&
04/26/2010&10:45 &31.08& 12/05/2008&05:07 &Floor&
04/26/2010&11:00 &31.16& 12/08/2008&10:02 &Floor&
04/26/2010&11:15 &31.15& 12/14/2008&06:19 &Exit&
& Time
Admit
Emergency
ICU
Floor
Exit
4. TEMPORAL CATEGORICAL DATA
â˘âŻ Electronic Health Records: symptoms, treatment, lab test
â˘âŻ Traffic incident logs: arrival/departure time of each unit
â˘âŻ Student records: course, paper, proposal, defense, etc.
â˘âŻ Usability study logs
â˘âŻ Etc.
6. INTERACTIVE EXPLORATION
OF EVENT SEQUENCES
IN TEMPORAL CATEGORICAL DATA
KRIST WONGSUPHASAWAT
DEPARTMENT OF COMPUTER SCIENCE &
HUMAN-COMPUTER INTERACTION LAB
UNIVERSITY OF MARYLAND al!
pos
at ion Pro
Dissert 2010!
,
April 30
8. RESEARCH STATEMENT
âmy research aims to design
effective visualization
and interaction techniques
to support users in exploring
event sequences in temporal categorical data.â
â˘âŻ a flexible temporal search approach
â˘âŻ a novel visualization that provides an overview
of multiple records
9. RESEARCH
QUESTION#1
PRELIM. + PROPOSED
MOTIVATION WORK CONCLUSION
RESEARCH RESEARCH
QUESTIONS QUESTION#2
PRELIM. + PROPOSED
WORK
10. TRADITIONAL SEARCH MODEL
Know exactly what they want
Query Data *
How to specify the query? How to display the data?
SQL, TSQL, GUI, ...
How to process the query?
How to make the search efficient?
Indexing methods, algorithms
11. EXPLORATORY SEARCH MODEL
Uncertain about what they
are looking for
Know exactly what they want
Insight, Hypotheses *
search
Query Data
refine,
formulate
12. UNCERTAIN
Floor ICU Exit-Alive
QUERY within 2 days
Find something
useful and display
RESULTS
Frustrated!
found 0 record
13. EXPLORATORY SEARCH MODEL
Uncertain about what they
are looking for
Know exactly what they want
Insight, Hypotheses *
search
Query Data
refine,
formulate
How to display the data?
Difficult to understand the
event sequences in the data
quickly
14. NEEDS A STARTING POINT
Show overview
or summary
Where should I start?
15. RESEARCH QUESTIONS
1.⯠How to support the users when they are
uncertain about what they are looking for?
2.⯠How to provide an overview of event
sequences for temporal categorical data?
16. RESEARCH QUESTIONS
1.⯠How to support the users when they are
uncertain about what they are looking for?
2.⯠How to provide an overview of event
sequences for temporal categorical data?
17. APPROACHES
Exact Search! Similarity-based Search!
MUST have A, B, C SHOULD have A, B, C
Query Query
Record#1 Record#2
more
Record#2 Record#1
similar
Record#3 Record#3
18. SIMILARITY-BASED SEARCH
How to specify query?
What is âsimilarâ?
How to display results?
User Interface!
Similarity Measure
& Visualization
19. SIMILARITY MEASURE
Gives a score that shows how much a record is similar to the query.
Min = 0 / Max = 1
Query Record#1 0.80
Record#2 0.63
Record#3 0.96
Record#4 0.77
20. SIMILARITY MEASURE (2)
Query Record#3 0.96
Record#1 0.80
more
similar
Record#4 0.77
Record#2 0.63
21. CHALLENGE
â˘âŻ What is âsimilarâ? depends&on&users/tasks&
Record#1
(Query) A B C
Record#2 missing
A B C
Record#3 extra
A B C D
Record#4
Time difference
A B C
time
22. MATCH & MISMATCH MEASURE
time
Record#1
(Query) A C B C
Record#2
A B C B
Matched Events Missing Event Extra Event
}
Time Difference
Number of Swapping Total Score
Number of Missing Events 0 to 1
Number of Extra Events
23. RELATED WORK
â˘âŻ Similarity measure
â⯠For numerical time series
â˘âŻ [Ding et al. 2008], [Liu et al. 2006], [Berndt and Clifford 1994], ...
â⯠Edit distance
â˘âŻ [Levenshtein 1966], [Winkler 1999], [Chen 2004], ...
â⯠Biological sequence searching
â˘âŻ [Pearson and Lipman 1988], [Altschul et al. 1990], ...
â⯠Approximate graph matching
â˘âŻ [Tian et al. 2007], [Tian et al. 2008] , ...
â⯠Temporal Categorical Data
â˘âŻ [Sherkat 2006], ...
25. DEMO - DATA
â˘âŻ Patient transfers in the emergency department
â˘âŻ Real data from hospital (Jan â Mar 2010)
ADMIT Admission time
EMERGENCY Emergency room
ICU Intensive Care Unit
INTERMEDIATE Intermediate Medical Care
FLOOR Normal room
EXIT-ALIVE Leave the hospital alive
EXIT-DEAD Leave the hospital dead
28. X = Exact Search , S = Similarity-based Search
RESULTS
Sequence Counting Time const. Uncertainty
Sequence CountSpeed (seconds)
Time Uncertain
60 15 150 150
40 10 100 100
20 5 50 50
0 0 0 0
X S X S
R X S X S
Preference (7-points likert scale)
8 8 8 8
6 6 6 6
4 4 4 4
2 2 2 2
0 0 0 0
X S X S X S
R X S
30. PROPOSED WORK: FTS
â˘âŻ Draw an example to specify query
â˘âŻ Specify flexible and inflexible constraints
MUST HAVE RANGE ?
MUST NOT HAVE optional
â˘âŻ Algorithms
31. PROPOSED WORK: FTS (2)
â˘âŻ Combine exact and similarity-based search
Primary Bin
similar
Pass all constraints
Secondary Bin
similar
The rest
32. RESEARCH QUESTIONS
1.⯠How to support the users when they are
uncertain about what they are looking for?
2.⯠How to provide an overview of event
sequences for temporal categorical data?
33. WHEN YOU READ A PAPERâŚ
â˘âŻ What is it about?
INFORMATION VISUALIZATION MANTRA
OVERVIEW FIRST,
ZOOM AND FILTER,
THEN DETAILS ON DEMAND
34. RELATED WORK
â˘âŻ Single-record visualizations
Patient ID: 45851737
12/02/2008&14:26 &Admit&
12/02/2008&14:26 &Emergency& Single-record"
12/02/2008&22:44 &ICU& Visualization
12/05/2008&05:07 &Floor&
12/08/2008&10:02 &Floor&
12/14/2008&06:19 &Exit&
&
â˘âŻ E.g. LifeLines, MIDGAARD, etc.
39. LifeLines2âs Temporal Summary [Wang et al. 2009]
Continuumâs Histogram [Andre 2007]
Not so useful with many event types
Does not help exploring event sequences
40. OVERVIEW OF EVENT SEQUENCES
â˘âŻ Scalability vs. Loss of information
Patient ID: 45851737
Patient ID: 45851737
Patient ID: 45851737
12/02/2008&14:26 &Admit&
12/02/2008&14:26 &Admit&
12/02/2008&14:26 &Emergency&
?
12/02/2008&14:26 &Admit&
12/02/2008&22:44 &Emergency&
12/02/2008&14:26 &ICU&
12/02/2008&14:26 &Emergency&
12/02/2008&22:44 &ICU&
12/05/2008&05:07 &Floor&
12/02/2008&22:44 &ICU&
12/08/2008&10:02 &Floor&
12/05/2008&05:07 &Floor&
12/05/2008&05:07 &Floor&
12/08/2008&10:02 &Floor&
12/14/2008&06:19 &Exit&
12/08/2008&10:02 &Floor&
12/14/2008&06:19 &Exit&
& 12/14/2008&06:19 &Exit&
&
&
â˘âŻ Goal: Show common trends, outliers
41. LIFEFLOW
AGGREGATE
Merge multiple records into tree
VISUALIZE
Display the tree
56. EXPLORATORY SEARCH MODEL
Uncertain about what they
are looking for
Know exactly what they want
Insight, Hypotheses *
search
Query Data
refine,
formulate
How to display the data?
Difficult to understand the
event sequences in the data
quickly
57. RESEARCH QUESTIONS
1.⯠How to support the users when they are
uncertain about what they are looking for?
Flexible Temporal Search
2.⯠How to provide an overview of event
sequences for temporal categorical data?
LifeFlow: an overview visualization
58. EXPECTED CONTRIBUTIONS
1.⯠Design of visual representations, user
interfaces and interaction techniques
2.⯠Algorithms for flexible temporal search
3.⯠Evaluation results
4.⯠Open new directions for exploring
temporal categorical data
59. ACKNOWLEDGEMENT
DR. PHUONG HO, DR. MARK SMITH, DAVID ROSEMAN
WASHINGTON HOSPITAL CENTER
http://www.whcenter.org
NATIONAL INSTITUTES OF HEALTH (NIH)
http://www.nih.gov
MICHAEL PACK, MICHAEL VANDANIKER
CENTER FOR ADVANCED TRANPORTATION TECHNOLOGY LAB
(CATT LAB)
http://www.cattlab.umd.edu