Visualization plays two essential roles in data-driven scientific discovery. First, it is a key tool for data exploration and hypothesis generation. Second, it facilitates communication of insights and findings. In a typical analysis scenario, however, visualization for exploration and visualization for communication are two separate processes that often involve different software tools and data representations. Even though sophisticated interactive visualization tools are available to explore data sets, findings are usually shared in the form of static images or functionally limited interactive visualizations. While these capture a particular state, they do not include any information about the exploration process that led to the finding.
In this talk I will describe how capturing the visual exploration process makes visualizations reproducible and shareable. My collaborators and I leverage such data about the analysis process to allow analysts to create "vistories": interactive, annotated figures that communicate insights and findings.
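The idea of capturing the exploration process can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation behind the tools in this talk: the `Action` and `ProvenanceLog` classes, the handler functions, and the action names are all hypothetical. It shows that if every interaction is recorded as a replayable action, any intermediate state of a visualization can be reconstructed and annotated.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional

State = Dict[str, Any]
Handler = Callable[[State, Dict[str, Any]], State]


@dataclass
class Action:
    """One recorded interaction (hypothetical structure)."""
    name: str
    params: Dict[str, Any]
    note: str = ""  # optional annotation shown when the step is presented


@dataclass
class ProvenanceLog:
    """Ordered log of actions that can be replayed to rebuild a view state."""
    handlers: Dict[str, Handler]
    actions: List[Action] = field(default_factory=list)

    def record(self, name: str, params: Dict[str, Any], note: str = "") -> None:
        self.actions.append(Action(name, params, note))

    def replay(self, upto: Optional[int] = None) -> State:
        # Re-apply the first `upto` actions (all of them if None) to an empty state.
        state: State = {}
        for action in self.actions[:upto]:
            state = self.handlers[action.name](state, action.params)
        return state


# Hypothetical handlers: each returns a new state instead of mutating the old one.
def add_column(state: State, params: Dict[str, Any]) -> State:
    return {**state, "columns": state.get("columns", []) + [params["stratification"]]}


def select_cluster(state: State, params: Dict[str, Any]) -> State:
    return {**state, "selection": params["cluster"]}


log = ProvenanceLog(handlers={"add_column": add_column, "select_cluster": select_cluster})
log.record("add_column", {"stratification": "mRNA clusters"})
log.record("add_column", {"stratification": "mutation status"},
           note="Compare expression clusters with mutation status")
log.record("select_cluster", {"cluster": 2})

final_state = log.replay()          # state after all recorded steps
intermediate = log.replay(upto=2)   # state before the selection was made
```

Because the log stores actions rather than screenshots, a shared figure could in principle be reopened at any step and explored further from there.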
A Unified Approach to Exploration, Authoring, and Communication with Reproducible Visualizations
1. Nils Gehlenborg, PhD
Department of Biomedical Informatics
Harvard Medical School
http://gehlenborglab.org @ngehlenborg
17. Nature asked 1,576 researchers if there
is a reproducibility crisis in science.
M Baker, Nature 533, 452-454, 2016
18. Nature asked 1,576 researchers if there
is a reproducibility crisis in science.
Significant crisis (52%)
Slight crisis (38%)
Don’t know (7%)
No crisis (3%)
M Baker, Nature 533, 452-454, 2016
21. SOCIAL ISSUE
Intentional?
TECHNICAL ISSUES
Inability to capture everything?
Inability to communicate everything?
M Baker, Nature 533, 452-454, 2016
23. Discovery of Tumor Subtypes
PROBLEM 1
Visualize overlap of patient sets across two or more stratifications.
PROBLEM 2
Visualize characteristics of patient sets within a stratification of interest.
StratomeX: Exploratory Data Visualization
M Streit, A Lex, S Gratzl, C Partl, D Schmalstieg, H Pfister, P Park, N Gehlenborg, Nature Methods (2014)
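As a rough illustration of Problem 1, the overlap between two stratifications can be tabulated by counting how many patients fall into each pair of groups. The sketch below is not StratomeX code; the function name `overlap_counts`, the patient IDs, and the group labels are made up for the example. Counts like these are the kind of quantity a visualization of overlapping patient sets would encode.

```python
from collections import Counter
from typing import Dict, Tuple


def overlap_counts(strat_a: Dict[str, str], strat_b: Dict[str, str]) -> Counter:
    """Count patients per (group in A, group in B) pair.

    Each stratification maps a patient ID to a group label. Only patients
    present in both stratifications are counted.
    """
    shared_patients = strat_a.keys() & strat_b.keys()
    pairs: Counter = Counter()
    for patient in shared_patients:
        pair: Tuple[str, str] = (strat_a[patient], strat_b[patient])
        pairs[pair] += 1
    return pairs


# Toy data (hypothetical patients and labels).
mrna = {"P1": "cluster 1", "P2": "cluster 1", "P3": "cluster 2", "P4": "cluster 2"}
mutation = {"P1": "mutated", "P2": "wild type", "P3": "mutated", "P4": "mutated"}

counts = overlap_counts(mrna, mutation)
for (group_a, group_b), n in sorted(counts.items()):
    print(f"{group_a} / {group_b}: {n} patients")
```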
26. Discovery of Tumor Subtypes
PROBLEM 1
Visualize overlap of patient sets across two or more stratifications.
PROBLEM 2
Visualize characteristics of patient sets within a stratification of interest.
PROBLEM 3
Identify relevant stratifications, pathways, and clinical variables.
StratomeX: Exploratory Data Visualization
M Streit, A Lex, S Gratzl, C Partl, D Schmalstieg, H Pfister, P Park, N Gehlenborg, Nature Methods (2014)
27. Guided Exploration
Is there a mutation that overlaps with this mRNA cluster?
Is there a CNV that affects survival?
Is there a pathway that is enriched in this cluster?
Query: Stratifications, Clinical Params, Pathways
M Streit, A Lex, S Gratzl, C Partl, D Schmalstieg, H Pfister, P Park, N Gehlenborg, Nature Methods (2014)
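A question such as "Is there a mutation that overlaps with this mRNA cluster?" can be read as a ranking task: score every candidate stratification against the selected patient set and show the best matches first. The sketch below assumes the Jaccard index as the overlap score, which is a common choice for comparing sets; the slide itself does not say which score is used. The function names and toy data are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_by_overlap(
    query: Set[str],
    candidates: Dict[str, Dict[str, Set[str]]],
) -> List[Tuple[str, float]]:
    """Rank candidate stratifications by their best-matching group.

    `candidates` maps a stratification name to its groups, and each group
    maps a label to a set of patient IDs.
    """
    scores = []
    for name, groups in candidates.items():
        best = max((jaccard(query, members) for members in groups.values()), default=0.0)
        scores.append((name, best))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy data: the query is the patient set of a selected mRNA cluster.
query = {"P1", "P2", "P3"}
candidates = {
    "mutation A": {"mutated": {"P1", "P2", "P3", "P4"}, "wild type": {"P5", "P6"}},
    "mutation B": {"mutated": {"P1", "P5"}, "wild type": {"P2", "P3", "P4", "P6"}},
}

ranking = rank_by_overlap(query, candidates)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Questions about survival or pathway enrichment would need different scores (for example a survival test or an enrichment statistic), but the same rank-and-present pattern would apply.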