Visual Analytics talk at ISMB2013
1. - Visual Analytics -
The human back in the loop
Jan Aerts
Biodata Analysis and Visualization
Stadius Group, ESAT
Leuven University, Belgium
jan.aerts@esat.kuleuven.be
@jandot
http://orcid.org/0000-0002-6416-2717
2. hypothesis-driven -> data-driven
Scientific Research Paradigms (Jim Gray, Microsoft)
Hypothesis-driven: I have a hypothesis -> need to generate data to (dis)prove it.
Data-driven: I have data -> need to find hypotheses that I can test.
• 1st: thousands of years ago, empirical
• 2nd: hundreds of years ago, theoretical
• 3rd: last few decades, computational
• 4th: today, data exploration
3. What does this mean?
• immense re-use of existing datasets
• much of initial analysis is exploratory in nature
• biologically interesting signals may be too poorly understood to be analyzed
in automated fashion
• visualization is very effective in facilitating human reasoning about complex
data
• automated algorithms often act as black boxes => biologists must have blind
faith in the bioinformatician (and the bioinformatician in his/her own skills)
12. Types of interaction (Yi et al., IEEE Transactions on Visualization and Computer Graphics, 2007)
• select -> mark something as interesting
• explore -> show me something else
• reconfigure -> show me a different arrangement
• encode -> show me a different representation
• abstract/elaborate -> show me less/more detail
• filter -> show me something conditionally
• connect -> show me connected items
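A few of the interaction types above can be sketched on a toy table of records. This is only an illustration of the categories, not code from Yi et al. or from any tool in this talk; the `View` class, its method names and the records are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class View:
    """Toy view state: the records shown and the ones marked as interesting."""
    records: list
    selected: set = field(default_factory=set)

    def select(self, index):
        # select -> mark something as interesting
        self.selected.add(index)

    def filter(self, predicate):
        # filter -> show me something conditionally
        return [r for r in self.records if predicate(r)]

    def reconfigure(self, key):
        # reconfigure -> show me a different arrangement (here: a new sort order)
        return sorted(self.records, key=lambda r: r[key])

    def abstract(self, key):
        # abstract -> show me less detail (here: counts per category)
        counts = {}
        for r in self.records:
            counts[r[key]] = counts.get(r[key], 0) + 1
        return counts


# Made-up records, purely for illustration
view = View(records=[
    {"gene": "g1", "chrom": "chr1", "expr": 2.5},
    {"gene": "g2", "chrom": "chr2", "expr": 0.5},
    {"gene": "g3", "chrom": "chr1", "expr": 1.0},
])
view.select(0)
high = view.filter(lambda r: r["expr"] > 0.9)
by_expr = view.reconfigure("expr")
per_chrom = view.abstract("chrom")
```

In a real tool each of these calls would be triggered by a mouse or keyboard action and would redraw the display; here they only return the data that would be drawn.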
14. Visualization for biological hypothesis generation
• example: eQTL data (IEEE BioVis visualization challenge 2011)
• 500 patients (affected + non-affected)
• 7500 SNPs; gene expression data for 15 genes
• PLINK one-locus/two-locus
17. HiTSee
Bertini E et al. IEEE Symposium on Biological Data Visualization (2011)
19. Visualization for algorithm development
when do I know that my algorithm is “correct”? -> peek into the black box
[Diagram: pipeline from input through filters 1, 2 and 3, producing outputs A, B and C]
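One way to "peek into the black box" is to keep every intermediate result of a pipeline so that it can be inspected or plotted. The sketch below is a minimal, linear version of that idea (the slide's diagram branches into several outputs); `run_pipeline`, the two filters and their thresholds are hypothetical and chosen only for illustration.

```python
def run_pipeline(data, steps):
    """Apply each (name, function) step in order and keep every intermediate result."""
    trace = [("input", list(data))]
    current = list(data)
    for name, step in steps:
        current = step(current)
        trace.append((name, list(current)))
    return current, trace


# Hypothetical filters; the conditions are arbitrary
steps = [
    ("filter 1", lambda xs: [x for x in xs if x >= 0]),
    ("filter 2", lambda xs: [x for x in xs if x % 2 == 0]),
]

result, trace = run_pipeline([-3, 4, 7, 10, -2, 5], steps)

# Show how many items survive each stage, and which ones
for name, values in trace:
    print(f"{name:>8}: {len(values)} items {values}")
```

The `trace` list is what a visualization would consume: a sudden drop in item count after one filter is an immediate hint about where to look.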
25. ParCoord
Boogaerts T et al. IEEE International Conference on Bioinformatics & Bioengineering (2012)
Thomas Boogaerts
Endeavour gene prioritization
27. Visualization for (live) interaction with analysis
• alternating between visual and automatic methods -> continuous
refinement and verification of preliminary results
• misleading results: discovered at early stage
• leverage user’s (biologist’s) insights
• no black box
29. Data filtering (visual parameter setting)
TrioVis
Ryo Sakai
Sakai R et al. Bioinformatics (2013)
30. User-guided analysis
Spark
Nielsen et al. Genome Research (2012)
[Figure labels: clustering; chromatin modification, DNA methylation, RNA-Seq; data samples; regions of interest]
31. BaobabView
decision trees
van den Elzen S & van Wijk J. IEEE Conference on Visual Analytics Science and Technology (2011)
32. Galaxy Trackster
Goecks J et al. Nature Biotechnology (2012)
34. Many challenges remain
• scalability (data processing + perception), uncertainty, “interestingness”,
interaction, evaluation
• infrastructure & architecture
• fast imprecise answers with progressive refinement
• incremental re-computation
• steering computation towards data regions of interest
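"Fast imprecise answers with progressive refinement" and "incremental re-computation" can be illustrated with a running estimate that is updated chunk by chunk instead of being recomputed from scratch. This is a minimal sketch under stated assumptions: `progressive_mean` is a hypothetical helper, the data are synthetic, and early estimates are only representative if the data arrive in random order.

```python
import random


def progressive_mean(values, chunk_size):
    """Yield (items_seen, running_mean) after each chunk, reusing the running sum."""
    total = 0.0
    seen = 0
    for start in range(0, len(values), chunk_size):
        chunk = values[start:start + chunk_size]
        total += sum(chunk)   # incremental: only the new chunk is processed
        seen += len(chunk)
        yield seen, total / seen


# Synthetic data, for illustration only
random.seed(0)
data = [random.gauss(10.0, 2.0) for _ in range(10_000)]

estimates = list(progressive_mean(data, chunk_size=1_000))
for seen, mean in estimates:
    print(f"after {seen:>6} items: mean ~ {mean:.3f}")
```

A display driven by such a generator can draw a first approximate answer after the first chunk and refine it as later chunks arrive; the final estimate equals the mean over all the data.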
35. Acknowledgments
• Bioinformatics Group at Stadius, Leuven University
• in particular: Ryo Sakai, Georgios Pavlopoulos
• visualization community for examples
• Jeremy for Trackster video