R tools for HiC data visualization
Nathalie Vialaneix, INRAE/MIAT
Chrocogen, January 31st 2020
First of all... a bit of bibliography
What did I use to make this presentation?
a GitHub repository with a bunch of references, classified by theme:
https://github.com/mdozmorov/HiC_tools#visualization
two reviews on the topic: [Yardimci & Noble, 2017] (5 tools, no R package),
[Ing-Simmons & Vaquerizas, CB 2019] (12 tools, 2/3 R packages, half of the
tools are interactive)
the Bioconductor search tool
and I identified...
HiTC - HiCBricks - DNA_Rchitect (with a shiny interface) - GENOVA - Gviz
and GenomicInteractions - Sushi - HiCeekR (the last one sent by a colleague
the day after I finished my slides)
(+ Pierre's package adjclust, which is on CRAN)
What did I learn from that?
a large number of interactive tools already exist
in another review [Lin et al, WSBM 2019], you can also find tools for 3D
visualization of Hi-C data
however, most tools, even the interactive ones, seem to offer very common
approaches to Hi-C data visualization
a problem still remains: finding appropriate standards to store the data and
load them into the software (multiple standards currently exist, with no
clear consensus yet)
Format specifications
Import and format of the different tools
HiTC (Bioconductor, 7.5 years): (function importC)
input: mandatory: a CSV (tab separated) file with bin pairs and a BED
file describing the bins (chr | start | end | bin nb) ⇒ outputs of HiC-Pro
class: HTCexp (for submatrices) or HTClist (for all matrices) with
slots intdata (interaction matrix; can be sparse) and xgi/ygi
(GRanges objects describing the bins) ⇒ can be used directly
HiCBricks (Bioconductor, 1 year): (functions Create_many_Bricks +
Brick_load_matrix / Create_many_Bricks_from_mcool)
input: mandatory TXT (space separated) files with the count matrices
for every chromosome and a BED file describing the bins (chr | start |
end) by order of appearance in the matrix OR .mcool files AND soon
available .hic files
class: BrickContainer that does not incorporate the data
themselves but only information on the chromosomes (names and
lengths) and on the files in which the information (bin description and
interactions) is stored. When creating this object, a directory is
created with HDF (Hierarchical Data Format) files with the data in
them
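The bin-pair plus BED layout described above for HiTC can be turned into a plain interaction matrix with a few lines of base R. This is only a minimal sketch under stated assumptions: the toy values, the column names and the helper triplets_to_matrix are hypothetical (they are not part of HiTC or HiC-Pro), and a dense matrix is used for readability, whereas real-sized data would rather call for a sparse representation.

```r
# Toy stand-ins for the two HiC-Pro-style inputs (values are made up).
# In practice both tables would be read from tab-separated files.
bins <- data.frame(
  chr   = c("chr7", "chr7", "chr7"),
  start = c(0L, 40000L, 80000L),
  end   = c(40000L, 80000L, 120000L),
  bin   = 1:3,
  stringsAsFactors = FALSE
)
pairs <- data.frame(
  bin1  = c(1L, 1L, 2L),
  bin2  = c(1L, 2L, 3L),
  count = c(15, 55, 19)
)

# Hypothetical helper: expand (bin1, bin2, count) triplets into a full
# symmetric matrix whose rows/columns follow the order of the BED table.
triplets_to_matrix <- function(pairs, bins) {
  i <- match(pairs$bin1, bins$bin)
  j <- match(pairs$bin2, bins$bin)
  stopifnot(!anyNA(i), !anyNA(j))  # every pair must refer to a known bin
  mat <- matrix(0, nrow = nrow(bins), ncol = nrow(bins))
  mat[cbind(i, j)] <- pairs$count
  mat[cbind(j, i)] <- pairs$count  # assumes only one triangle is stored
  # format() avoids scientific notation in the coordinate labels
  labels <- paste0(bins$chr, ":",
                   format(bins$start, scientific = FALSE, trim = TRUE), "-",
                   format(bins$end, scientific = FALSE, trim = TRUE))
  dimnames(mat) <- list(labels, labels)
  mat
}

hic <- triplets_to_matrix(pairs, bins)
print(hic)
```

The resulting matrix carries the genomic coordinates in its row and column names, which is also the kind of matrix input mentioned for some of the other tools.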
Import and format of the different tools
DNA_Rchitect (web shiny interface at
http://shiny.immgen.org/DNARchitect/)
input: TXT file, separated by commas, semicolons or tabs, with
the following columns (chrom1 | start1 | end1 | chrom2 | start2 |
end2 | score | samplenumber) ⇒ BEDPE files
GENOVA (GitHub repository https://github.com/robinweide/GENOVA, not
properly documented and full of bugs): (functions read_bedpe,
read.hicpro.matrix)
input: mandatory: a CSV (tab separated) file with bin pairs and a BED
file describing the bins (chr | start | end | bin nb) ⇒ outputs of HiC-Pro
OR BEDPE files. It is said that it can handle .cool files OR .hic
files but I haven't found where
class(?): contacts that contains the slots MAT (triplet interaction
matrix), IDX (bin descriptions, BED), CHRS (chr description),
CENTROMERES (location of the centromeres, BED)
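Since several tools expect BEDPE-like input while HiC-Pro produces bin pairs plus a BED file, a conversion step is often needed. The sketch below shows one possible way to do it in base R. The helper to_bedpe, the toy values and the choice to write a header line are assumptions made for illustration; whether a given tool expects a header or the samplenumber column should be checked against its own documentation.

```r
# Toy bin description and bin pairs (values are made up).
bins <- data.frame(
  chr   = c("chr7", "chr7", "chr7"),
  start = c(0L, 40000L, 80000L),
  end   = c(40000L, 80000L, 120000L),
  bin   = 1:3,
  stringsAsFactors = FALSE
)
pairs <- data.frame(
  bin1  = c(1L, 1L, 2L),
  bin2  = c(1L, 2L, 3L),
  count = c(15, 55, 19)
)

# Hypothetical helper: look up the coordinates of both bins of each pair
# and lay them out with the BEDPE-like column names listed on the slide.
to_bedpe <- function(pairs, bins, samplenumber = 1L) {
  i <- match(pairs$bin1, bins$bin)
  j <- match(pairs$bin2, bins$bin)
  stopifnot(!anyNA(i), !anyNA(j))
  data.frame(
    chrom1 = bins$chr[i], start1 = bins$start[i], end1 = bins$end[i],
    chrom2 = bins$chr[j], start2 = bins$start[j], end2 = bins$end[j],
    score = pairs$count, samplenumber = samplenumber,
    stringsAsFactors = FALSE
  )
}

bedpe <- to_bedpe(pairs, bins)

# Write a tab-separated file and read it back to check the round trip.
out <- tempfile(fileext = ".bedpe")
write.table(bedpe, out, sep = "\t", quote = FALSE, row.names = FALSE)
reread <- read.table(out, sep = "\t", header = TRUE, stringsAsFactors = FALSE)
print(reread)
```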
Import and format of the different tools
GenomicInteractions (Bioconductor, 5 years, based on Gviz): (functions
makeGenomicInteractionsFromFile or directly using
GenomicInteractions)
input: mandatory: BEDPE files OR HOMER files (TXT files with 20
columns;
http://homer.ucsd.edu/homer/interactions/HiCinteractions.html)
class: GenomicInteractions that contains two GRanges objects (bin
pairs) and a count object (numeric vector) ⇒ can be used directly
Sushi (Bioconductor, 5.5 years):
input: BEDPE files or interaction matrix (with genomic coordinates in
row/column names) as TXT files. No dedicated import function; data
passed to the package functions as simple data.frame
Import and format of the different tools
HiCeekR (GitHub repository https://github.com/lucidif/HiCeekR, 1 year,
well documented, shiny application to run locally)
input: BAM file and FASTA reference. Does all the processing,
creates local files and stores intermediate results (a report is also created)
adjclust (CRAN, 2 years)
input: a CSV (tab separated) file with bin pairs OR an interaction
matrix OR HTCexp objects. No dedicated import function; data
passed to the package functions as simple (sparse) matrices
Summary
X: possible
XX: tested (by myself)
~: possible but not quite direct
TXT file (bin pairs) | TXT file (matrix) | BEDPE | .cool | .hic | custom
HiTC XX ~XX ~X XX
HiCBricks X X X?
DNA_Rchitect X
GENOVA X X X X ?
GenomicInteractions ~XX X XX
Sushi X X X X
adjclust XX XX ~X XX
Only very recent (and still immature) tools handle HiC-specific formats like
.cool and .hic. HiCeekR handles only raw BAM files.
Visualization of HiC data
organized by types of visualization and tasks
Heatmaps
# using GENOVA
maria_90_chr7 <- load_contacts(signal_path = "../../data/forTests/ch
indices_path = "../../data/forTests/c
sample_name = "dg90-chr7", colour =
hic.matrixplot(maria_90_chr7, chr = 7, start = 0, end = 5000000)
Recommendations for heatmaps
whole-genome heatmaps are used to highlight genomic rearrangements /
zoomed heatmaps are used to highlight TADs and loops
colour coding should scale with $\log_{10}$ rather than linearly and should
be made with a colour scale consisting of only one colour to avoid artificial
transitions (also use multiple hues for colour-blind readers). Two colour scales can
be used to represent a correlation matrix (compartments) or a comparison
between matrices (see below)
comparisons can be made with side-by-side heatmaps or (better) with a
heatmap of the $\log_2$ ratio
linear tracks can be added to heatmaps and in this case triangular
heatmaps should be preferred (the tracks are then placed below)
tools that contain heatmaps: HiTC, HiCBricks, GENOVA, Sushi and
adjclust (no heatmaps in DNA_Rchitect or in GenomicInteractions)
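The two colour-scale recommendations above (log10 scaling with a single-hue palette, and a log2 ratio with a two-colour palette for comparisons) can be sketched in base R without any Hi-C package. Everything here is an assumption made for illustration: the counts are simulated, the helper names are hypothetical, and the pseudo-count of 1 is just one common way to avoid taking the log of zero.

```r
set.seed(1)

# Simulate a symmetric count matrix whose signal decays with bin distance.
simulate_counts <- function(n, strength) {
  d <- abs(outer(seq_len(n), seq_len(n), "-"))
  m <- matrix(rpois(n * n, lambda = strength / (1 + d)), n, n)
  m[lower.tri(m)] <- t(m)[lower.tri(m)]  # mirror the upper triangle
  m
}
counts_a <- simulate_counts(50, strength = 200)
counts_b <- simulate_counts(50, strength = 150)

# Transformations (the pseudo-count is an assumption of this sketch).
log10_counts <- function(m, pseudo = 1) log10(m + pseudo)
log2_ratio <- function(a, b, pseudo = 1) log2((a + pseudo) / (b + pseudo))

# One colour for a single matrix, two colours around white for a ratio.
one_colour <- colorRampPalette(c("white", "darkred"))(100)
two_colours <- colorRampPalette(c("darkblue", "white", "darkred"))(101)

plot_heatmaps <- function(a, b) {
  op <- par(mfrow = c(1, 2))
  on.exit(par(op))
  image(log10_counts(a), col = one_colour, axes = FALSE,
        main = "log10(count + 1)")
  r <- log2_ratio(a, b)
  lim <- max(abs(r))  # symmetric limits so that white means "no change"
  image(r, col = two_colours, zlim = c(-lim, lim), axes = FALSE,
        main = "log2 ratio A / B")
}

# Only draw when run interactively.
if (interactive()) plot_heatmaps(counts_a, counts_b)
```

Using symmetric limits for the ratio keeps the neutral colour centred on zero, so that gains and losses are read on the same scale.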
Features for heatmaps
Columns: rectangular | triangular | custom colors | zoom | comparison | linear tracks
(marks listed in column order; empty cells omitted)
HiTC: XX (genome); XX (chr); log, pos/neg col; prior to plot (start/end); X (2, triangular); X (only genomic int.)
HiCBricks: X; X; X (palette and log); X (start/end/dist); X (2)
GENOVA: XX; X(?); X (but limited); X (2); X (?)
Sushi: X; X; X (palette); X (start/end/dist)
HiCeekR: X; X (start/end); X (numeric/2)
adjclust: XX; XX (palette and log)
Features for heatmaps
In addition: HiCBricks and adjclust can show TADs on the heatmap (maybe
also GENOVA) and GENOVA can highlight loops with circles on the maps.
Example of visualization with annotation tracks
Critical assessment of the tools
The simplest, most complete and nicest visualization function for heatmaps is
in HiCBricks (even if it cannot display linear tracks) but unfortunately, the
import format of the tool is rather hard to use.
GENOVA is promising (including many functions to extract features (IS, TADs,
loops, ...) from HiC matrices) but impossible to use at this stage because of the ...
Interactions as arcs (or networks)
# with GenomicInteractions: how to create the data?
genomic_pos <- read.table("../../data/forTests/chr7_index.bed", sep
genomic_pos <- GRanges(genomic_pos[ ,1],
IRanges(genomic_pos[ ,2], width = 40000,
names = genomic_pos[ ,4]))
bin_pairs <- read.table("../../data/forTests/chr7_90.matrix", sep =
bins1 <- match(bin_pairs[ ,1], as.numeric(names(ranges(genomic_pos)
bins1 <- genomic_pos[bins1, ]
bins2 <- match(bin_pairs[ ,2], as.numeric(names(ranges(genomic_pos)
bins2 <- genomic_pos[bins2, ]
maria_90_chr7 <- GenomicInteractions(bins1, bins2, counts = bin_pai
Interactions as arcs (or networks)
maria_90_chr7
## GenomicInteractions object with 905906 interactions and 1 metadata column
## seqnames1 ranges1 seqnames2 ranges
## <Rle> <IRanges> <Rle> <IRanges
## [1] 7 0-39999 --- 7 0-3999
## [2] 7 0-39999 --- 7 40000-7999
## [3] 7 0-39999 --- 7 80000-11999
## [4] 7 0-39999 --- 7 120000-15999
## [5] 7 0-39999 --- 7 160000-19999
## ... ... ... ... ... .
## [905902] 7 134640000-134679999 --- 7 134680000-13471999
## [905903] 7 134640000-134679999 --- 7 134720000-13475999
## [905904] 7 134680000-134719999 --- 7 134680000-13471999
## [905905] 7 134680000-134719999 --- 7 134720000-13475999
## [905906] 7 134720000-134759999 --- 7 134720000-13475999
## | counts
## | <integer>
## [1] | 15
## [2] | 55
## [3] | 19
## [4] | 8
Interactions as arcs (or networks)
interaction_track <- InteractionTrack(maria_90_chr7, name = "HiC",
chromosome = "7")
plotTracks(interaction_track, chromosome = "7", from = 0, to = 50000
Interactions as arcs (or networks)
plotTracks(interaction_track, chromosome = "7", from = 0, to = 50000
Recommendations for arcs
useful mainly to superimpose annotations or qualitative/quantitative
tracks (Gviz offers plenty of solutions to do so)
but becomes unreadable for large regions and is unable to show the
interaction intensity (a solution would be to threshold the interaction
intensity beforehand)
alternatives display the data as networks (but the linear structure of the genome
is lost and it is also restricted to very small regions) or as circos plots
(thresholding the interactions to keep only the strongest is mandatory, even
for a single chromosome)
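The thresholding step suggested above can be sketched in base R: keep only the strongest interactions, then draw each remaining pair as a half-circle between the two bin midpoints. This is a minimal illustration, not what Gviz, Sushi or GenomicInteractions do internally; the toy data, the helpers keep_strongest and plot_arcs, the 40 kb bin size and the 80% quantile cutoff are all assumptions.

```r
# Toy interactions between 40 kb bins on one chromosome (values are made up).
inter <- data.frame(
  start1 = c(0, 0, 0, 40000, 40000, 80000, 80000, 120000, 160000, 200000),
  start2 = c(40000, 80000, 400000, 80000, 320000, 120000, 360000, 160000,
             200000, 240000),
  count  = c(10, 4, 1, 9, 2, 8, 3, 7, 6, 5)
)

# Hypothetical helper: keep interactions at or above a count quantile.
keep_strongest <- function(inter, prob = 0.8) {
  cutoff <- quantile(inter$count, probs = prob, names = FALSE)
  inter[inter$count >= cutoff, , drop = FALSE]
}

# Hypothetical helper: one arc per interaction, height proportional to span.
plot_arcs <- function(inter, bin_size = 40000) {
  mid1 <- inter$start1 + bin_size / 2
  mid2 <- inter$start2 + bin_size / 2
  span <- max(abs(mid2 - mid1))
  plot(NULL, xlim = range(c(mid1, mid2)), ylim = c(0, 1),
       xlab = "genomic position (bp)", ylab = "", yaxt = "n", bty = "n")
  theta <- seq(0, pi, length.out = 50)
  for (k in seq_len(nrow(inter))) {
    centre <- (mid1[k] + mid2[k]) / 2
    radius <- abs(mid2[k] - mid1[k]) / 2
    lines(centre + radius * cos(theta), 2 * radius / span * sin(theta))
  }
}

strong <- keep_strongest(inter)
print(strong)

# Only draw when run interactively.
if (interactive()) plot_arcs(strong)
```

A quantile-based cutoff adapts to the sequencing depth of each matrix, whereas a fixed count cutoff would be easier to compare across regions; which one is preferable depends on the use case.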
Critical assessment of tools
DNA_Rchitect, Sushi and GenomicInteractions display the
interactions as arcs
DNA_Rchitect is interactive but I never managed to use it, even on the
example dataset (too much annotation information is required for
proper use)
the other two offer approximately the same types of features
(GenomicInteractions is maybe more complete but Sushi is easier to
customize)
HiCeekR can represent the data as a(n interactive) network, for a whole
chromosome or a selected region and with/without a threshold on the
edge value
Example of visualization with annotation tracks
with circlize (a bit sophisticated to use, similar to Gviz)
Other (quality control) graphics
in HiCeekR: quality control of the alignment (fragment length
distribution, insert size distributions)
in HiTC: inter/intra interaction barplot, interaction versus distance dot
plot, interaction distribution (histogram) for CIS/TRANS
in GenomicInteractions: inter/intra donut graphs (forget them!),
interaction distribution (histogram but cut; also forget them), donut
graphs with annotation of the interactions
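Two of the quality-control summaries listed above (intra versus inter-chromosomal totals, and interaction count versus genomic distance) only need a table of bin pairs. The sketch below computes both in base R; it is not the HiTC implementation, and the toy data and column names are assumptions made for illustration.

```r
# Toy interactions in a BEDPE-like layout (values are made up).
inter <- data.frame(
  chrom1 = c("7", "7", "7", "7", "7"),
  start1 = c(0, 0, 0, 40000, 0),
  chrom2 = c("7", "7", "7", "7", "8"),
  start2 = c(0, 40000, 80000, 80000, 0),
  count  = c(15, 55, 19, 45, 3),
  stringsAsFactors = FALSE
)

# Total counts for intra-chromosomal (cis) and inter-chromosomal (trans) pairs.
is_cis <- inter$chrom1 == inter$chrom2
type <- factor(ifelse(is_cis, "cis", "trans"), levels = c("cis", "trans"))
cis_trans <- tapply(inter$count, type, sum)

# Mean count as a function of the distance between the two bins (cis only).
cis <- inter[is_cis, ]
distance <- abs(cis$start2 - cis$start1)
decay <- tapply(cis$count, distance, mean)

print(cis_trans)
print(decay)

# Only draw when run interactively.
if (interactive()) {
  op <- par(mfrow = c(1, 2))
  barplot(cis_trans, ylab = "total count")
  plot(as.numeric(names(decay)), decay, log = "y",
       xlab = "distance (bp)", ylab = "mean count")
  par(op)
}
```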
References
Ing-Simmons E, Vaquerizas JM (2019) Visualising three-dimensional genome
organisation in two dimensions. Development, 146(19): dev177162.
Lin D, Bonora G, Yardimci GG, Noble WS (2019) Computational methods for
analyzing and modeling genome structure and organization. WIREs Systems
Biology and Medicine, 11: e1435.
Yardimci GG, Noble WS (2017) Software tools for visualizing Hi-C data. Genome
Biology, 18: 26.
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsSumit Kumar yadav
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfSumit Kumar yadav
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfSumit Kumar yadav
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Lokesh Kothari
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfSumit Kumar yadav
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsSérgio Sacani
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTSérgio Sacani
 
DIFFERENCE IN BACK CROSS AND TEST CROSS
DIFFERENCE IN  BACK CROSS AND TEST CROSSDIFFERENCE IN  BACK CROSS AND TEST CROSS
DIFFERENCE IN BACK CROSS AND TEST CROSSLeenakshiTyagi
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Sérgio Sacani
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfmuntazimhurra
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)Areesha Ahmad
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfSumit Kumar yadav
 

Kürzlich hochgeladen (20)

TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
 
VIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PVIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C P
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
fundamental of entomology all in one topics of entomology
fundamental of entomology all in one topics of entomologyfundamental of entomology all in one topics of entomology
fundamental of entomology all in one topics of entomology
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptx
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​
 
Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questions
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdf
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdf
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdf
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
DIFFERENCE IN BACK CROSS AND TEST CROSS
DIFFERENCE IN  BACK CROSS AND TEST CROSSDIFFERENCE IN  BACK CROSS AND TEST CROSS
DIFFERENCE IN BACK CROSS AND TEST CROSS
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdf
 

R tools for HiC data visualization

  • 1. R tools for HiC data visualization. Nathalie Vialaneix, INRAE/MIAT. Chrocogen, January 31st 2020
  • 2. First of all... a bit of bibliography
  • 3. What did I use to make this presentation?
      - a GitHub repository with a bunch of references, classified by theme: https://github.com/mdozmorov/HiC_tools#visualization
      - two reviews on the topic: [Yardimci & Noble, 2017] (5 tools, no R package) and [Ing-Simmons & Vaquerizas, CB 2019] (12 tools, 2/3 R packages, half of the tools are interactive)
      - the Bioconductor search tool
      - and I identified... HiTC, HiCBricks, DNARchitect (with a shiny interface), GENOVA, Gviz and GenomicInteractions, Sushi, HiCeekR (the last one sent by a colleague the day after I finished my slides)
      - (+ Pierre's package adjclust, which is on CRAN)
  • 4. What did I learn from that?
      - a large number of interactive tools already exist
      - another review [Lin et al, WSBM 2019] also lists tools for 3D visualization of Hi-C data
      - however, most tools, even the interactive ones, seem to offer only very common approaches to Hi-C data visualization
      - one problem remains: finding appropriate standards to store the data and load them into the software (multiple standards currently exist, with no clear consensus yet)
  • 5. Format specifications
  • 6. Import and format of the different tools
      - HiTC (Bioconductor, 7.5 years), function importC
          input: mandatory: a CSV (tab separated) file with bin pairs and a BED file describing the bins (chr | start | end | bin nb) ⇒ outputs of HiC-Pro
          class: HTCexp (for submatrices) or HTClist (for all matrices), with slots intdata (interaction matrix; can be sparse) and xgi/ygi (GRanges objects describing the bins) ⇒ can be used directly
      - HiCBricks (Bioconductor, 1 year), functions Create_many_Bricks + Brick_load_matrix / Create_many_Bricks_from_mcool
          input: mandatory: TXT (space separated) files with the count matrices for every chromosome and a BED file describing the bins (chr | start | end), in order of appearance in the matrix, OR .mcool files AND, soon available, .hic files
          class: BrickContainer, which does not hold the data themselves but only information on the chromosomes (names and lengths) and on the files in which the information (bin description and interactions) is stored. When this object is created, a directory is created with HDF (Hierarchical Data Format) files containing the data
  • 7. Import and format of the different tools
      - DNA_Rchitect (web shiny interface at http://shiny.immgen.org/DNARchitect/)
          input: TXT file, separated by commas, semicolons or tabs, with the following columns (chrom1 | start1 | end1 | chrom2 | start2 | end2 | score | samplenumber) ⇒ BEDPE files
      - GENOVA (GitHub repository https://github.com/robinweide/GENOVA, not properly documented and full of bugs), functions read_bedpe, read.hicpro.matrix
          input: mandatory: a CSV (tab separated) file with bin pairs and a BED file describing the bins (chr | start | end | bin nb) ⇒ outputs of HiC-Pro OR BEDPE files. It is said to handle .cool files OR .hic files, but I haven't found where
          class(?): contacts, which contains the slots MAT (triplet interaction matrix), IDX (bin descriptions, BED), CHRS (chr description), CENTROMERES (location of the centromeres, BED)
  • 8. Import and format of the different tools
      - GenomicInteractions (Bioconductor, 5 years, based on Gviz), functions makeGenomicInteractionsFromFile or directly GenomicInteractions
          input: mandatory: BEDPE files OR HOMER files (TXT files with 20 columns; http://homer.ucsd.edu/homer/interactions/HiCinteractions.html)
          class: GenomicInteractions, which contains two GRanges objects (bin pairs) and a count object (numeric vector) ⇒ can be used directly
      - Sushi (Bioconductor, 5.5 years)
          input: BEDPE files or interaction matrix (with genomic coordinates in row/column names) as TXT files. No dedicated import function; data are passed to the package functions as a simple data.frame
  • 9. Import and format of the different tools
      - HiCeekR (GitHub repository https://github.com/lucidif/HiCeekR, 1 year, well documented, shiny application to run locally)
          input: BAM file and FASTA reference. It performs all the processing, creates local files and stores intermediate results (a report is also created)
      - adjclust (CRAN, 2 years)
          input: a CSV (tab separated) file with bin pairs OR an interaction matrix OR HTCexp objects. No dedicated import function; data are passed to the package functions as simple (sparse) matrices
  • 10. Summary
      Legend: X: possible; XX: tested (by myself); ~: possible but not quite direct
      Formats (columns): TXT file (bin pairs) | TXT file (matrix) | BEDPE | .cool | .hic | custom
      Entries per tool, in order of appearance:
          HiTC: XX ~XX ~X XX
          HiCBricks: X X X?
          DNA_Rchitect: X
          GENOVA: X X X X ?
          GenomicInteractions: ~XX X XX
          Sushi: X X X X
          adjclust: XX XX ~X XX
      Only very recent (and still immature) tools handle HiC-specific formats like .cool and .hic. HiCeekR handles only raw BAM files.
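Since the "bin pairs" text layout is the format most tools in this summary accept, here is a minimal sketch of how such triplets can be turned into a symmetric sparse matrix, the kind of object that can be passed as a plain (sparse) matrix. The toy data and the column names `bin1`, `bin2` and `count` are assumptions made for illustration; they are not taken from any of the packages above.

```r
library(Matrix)  # recommended package shipped with standard R installations

# Hypothetical toy data in the "bin pairs" layout: bin1 | bin2 | count
# (only the upper triangle is stored, i.e. bin1 <= bin2)
bin_pairs <- data.frame(bin1  = c(1, 1, 2, 3),
                        bin2  = c(1, 2, 3, 3),
                        count = c(15, 55, 19, 8))
n_bins <- 3

# Upper-triangular triplets -> symmetric sparse contact matrix
contact_mat <- sparseMatrix(i = bin_pairs$bin1, j = bin_pairs$bin2,
                            x = bin_pairs$count,
                            dims = c(n_bins, n_bins), symmetric = TRUE)
print(contact_mat)
```

With real data, `bin_pairs` would be read with `read.table()` from the bin-pair file and `n_bins` taken from the number of rows of the BED file describing the bins.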
  • 11. Visualization of HiC data, organized by types of visualization and tasks
  • 12. Heatmaps
      # using GENOVA
      maria_90_chr7 <- load_contacts(signal_path = "../../data/forTests/ch
                                     indices_path = "../../data/forTests/c
                                     sample_name = "dg90-chr7", colour =
      hic.matrixplot(maria_90_chr7, chr = 7, start = 0, end = 5000000)
  • 13. Recommendations for heatmaps
      - whole-genome heatmaps are used to highlight genomic rearrangements; zoomed heatmaps are used to highlight TADs and loops
      - colour coding should scale with log10 rather than linearly, and should use a colour scale consisting of only one color to avoid artificial transitions (also use multiple hues for colorblind readers). Two-colour scales can be used to represent a correlation matrix (compartments) or a comparison between matrices (see below)
      - comparisons can be made with side-by-side heatmaps or (better) with a heatmap of the log2 ratio
      - linear tracks can be added to heatmaps; in this case triangular heatmaps should be preferred (the tracks are then placed below)
      - tools that contain heatmaps: HiTC, HiCBricks, GENOVA, Sushi and adjclust (no heatmaps in DNA_Rchitect or in GenomicInteractions)
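The colour recommendations above can be sketched with base R graphics only. This is an illustration on simulated matrices, not the plotting code of any of the packages listed: the toy data, the pseudo-count of 1 (used to avoid taking the log of zero) and the palettes are all assumptions.

```r
set.seed(1)
n <- 20
# Hypothetical toy contact matrices: expected counts decay with bin distance
dist_mat <- abs(outer(seq_len(n), seq_len(n), "-"))
make_counts <- function() {
  m <- matrix(rpois(n * n, lambda = 200 / (1 + dist_mat)), n, n)
  m[lower.tri(m)] <- t(m)[lower.tri(m)]  # copy upper triangle to enforce symmetry
  m
}
mat_a <- make_counts()
mat_b <- make_counts()

# Single matrix: log10 scale with a one-colour scale (white -> red)
log_a <- log10(mat_a + 1)  # pseudo-count of 1 is an assumption
one_hue <- colorRampPalette(c("white", "red"))(100)
image(log_a, col = one_hue, axes = FALSE, main = "log10(counts + 1)")

# Comparison: log2 ratio with a two-colour scale centred on zero
log_ratio <- log2((mat_a + 1) / (mat_b + 1))
lim <- max(abs(log_ratio))
two_hue <- colorRampPalette(c("blue", "white", "red"))(101)
image(log_ratio, col = two_hue, zlim = c(-lim, lim), axes = FALSE,
      main = "log2 ratio A / B")
```

Setting `zlim` symmetrically around zero keeps white mapped to "no difference", so enrichment and depletion are read off the two hues directly.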
  • 14. Features for heatmaps
      Features (columns): rectangular | triangular | custom colors | zoom | comparison | linear tracks
      Entries per tool, in order of appearance:
          HiTC: XX (genome) | XX (chr) | log, pos/neg col | prior to plot (start/end) | X (2, triangular) | X (only genomic int.)
          HiCBricks: X | X | X (palette and log) | X (start/end/dist) | X (2)
          GENOVA: XX | X(?) | X (but limited) | X (2) | X (?)
          Sushi: X | X | X (palette) | X (start/end/dist)
          HiCeekR: X | X (start/end) | X (numeric/2)
          adjclust: XX | XX (palette and log)
  • 15. Features for heatmaps
      In addition: HiCBricks and adjclust can show TADs on the heatmap (maybe also GENOVA), and GENOVA can highlight loops with circles on the maps.
  • 16. Example of visualization with annotation tracks
  • 17. Critical assessment of the tools
      - The simplest, most complete and nicest visualization function for heatmaps is in HiCBricks (even if it cannot display linear tracks) but, unfortunately, the import format of the tool is rather hard to use.
      - GENOVA is promising (it includes many functions to extract features (IS, TADs, loops, ...) from HiC matrices) but impossible to use at that stage because of the ...
  • 18. Interactions as arcs (or networks)
      # with GenomicInteractions: how to create the data?
      genomic_pos <- read.table("../../data/forTests/chr7_index.bed", sep
      genomic_pos <- GRanges(genomic_pos[ ,1], IRanges(genomic_pos[ ,2],
                             width = 40000, names = genomic_pos[ ,4]))
      bin_pairs <- read.table("../../data/forTests/chr7_90.matrix", sep =
      bins1 <- match(bin_pairs[ ,1], as.numeric(names(ranges(genomic_pos)
      bins1 <- genomic_pos[bins1, ]
      bins2 <- match(bin_pairs[ ,2], as.numeric(names(ranges(genomic_pos)
      bins2 <- genomic_pos[bins2, ]
      maria_90_chr7 <- GenomicInteractions(bins1, bins2, counts = bin_pai
  • 19. Interactions as arcs (or networks)
      maria_90_chr7
      ## GenomicInteractions object with 905906 interactions and 1 metadata column
      ##          seqnames1 ranges1                 seqnames2 ranges
      ##          <Rle>     <IRanges>               <Rle>     <IRanges
      ## [1]      7         0-39999             --- 7         0-3999
      ## [2]      7         0-39999             --- 7         40000-7999
      ## [3]      7         0-39999             --- 7         80000-11999
      ## [4]      7         0-39999             --- 7         120000-15999
      ## [5]      7         0-39999             --- 7         160000-19999
      ## ...      ...       ...                 ... ...       .
      ## [905902] 7         134640000-134679999 --- 7         134680000-13471999
      ## [905903] 7         134640000-134679999 --- 7         134720000-13475999
      ## [905904] 7         134680000-134719999 --- 7         134680000-13471999
      ## [905905] 7         134680000-134719999 --- 7         134720000-13475999
      ## [905906] 7         134720000-134759999 --- 7         134720000-13475999
      ##     | counts
      ##     | <integer>
      ## [1] | 15
      ## [2] | 55
      ## [3] | 19
      ## [4] | 8
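The bin-matching step used to build the object above (looking up each bin identifier of a pair in the bin description) can be sketched in base R alone, here to produce a BEDPE-like table, the layout accepted by DNA_Rchitect, GenomicInteractions and Sushi. The toy data, the column names and the half-open end coordinates are assumptions made for illustration; this is not the code from the slides.

```r
# Hypothetical bin description: chr | start | end | bin id (40 kb bins)
bins <- data.frame(chr   = "7",
                   start = c(0, 40000, 80000),
                   end   = c(40000, 80000, 120000),
                   id    = 1:3)

# Hypothetical bin pairs: bin1 | bin2 | count
bin_pairs <- data.frame(bin1  = c(1, 1, 2),
                        bin2  = c(1, 2, 3),
                        count = c(15, 55, 19))

# Look up the row of each bin id in the bin description
idx1 <- match(bin_pairs$bin1, bins$id)
idx2 <- match(bin_pairs$bin2, bins$id)

# Assemble a BEDPE-like table: one row per interacting bin pair
bedpe <- data.frame(chrom1 = bins$chr[idx1],
                    start1 = bins$start[idx1],
                    end1   = bins$end[idx1],
                    chrom2 = bins$chr[idx2],
                    start2 = bins$start[idx2],
                    end2   = bins$end[idx2],
                    score  = bin_pairs$count)
print(bedpe)
```

A `match()` result containing `NA` would indicate a bin identifier missing from the bin description, which is worth checking before building any interaction object.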
  • 20. Interactions as arcs (or networks)
      interaction_track <- InteractionTrack(maria_90_chr7, name = "HiC",
                                            chromosome = "7")
      plotTracks(interaction_track, chromosome = "7", from = 0, to = 50000
  • 21. Interactions as arcs (or networks)
      plotTracks(interaction_track, chromosome = "7", from = 0, to = 50000
  • 22. Recommendations for arcs
      - useful mainly to superimpose annotations or qualitative/quantitative tracks (Gviz offers plenty of solutions to do so), but they become unreadable for large regions and cannot show the interaction intensity (a solution would be to threshold the interaction intensity beforehand)
      - alternatives display the data as networks (but the linear structure of the genome is lost and this is also restricted to very small regions) or as a circos plot (thresholding the interactions to keep only the strongest is mandatory, even for a single chromosome)
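The thresholding suggested above can be sketched in base R as follows. The toy bin pairs and the choice of the median count as threshold are assumptions made for illustration; the arcs are drawn with plain base graphics rather than with Gviz or Sushi.

```r
# Hypothetical bin pairs (bin start coordinates in bp) with contact counts
pairs <- data.frame(start1 = c(0, 0, 0, 40000, 80000),
                    start2 = c(40000, 80000, 160000, 120000, 160000),
                    count  = c(55, 19, 8, 30, 12))

# Keep only the strongest interactions (threshold choice is an assumption)
threshold <- quantile(pairs$count, probs = 0.5)
strong <- pairs[pairs$count >= threshold, ]

# Draw each retained interaction as a half-circle above the genomic axis
max_rad <- max(strong$start2 - strong$start1) / 2
plot(0, 0, type = "n", xlim = range(c(pairs$start1, pairs$start2)),
     ylim = c(0, 1), xlab = "position (bp)", ylab = "", yaxt = "n")
theta <- seq(0, pi, length.out = 50)
for (k in seq_len(nrow(strong))) {
  mid <- (strong$start1[k] + strong$start2[k]) / 2
  rad <- (strong$start2[k] - strong$start1[k]) / 2
  lines(mid + rad * cos(theta), sin(theta) * rad / max_rad)
}
```

On real data the threshold would more likely be a high quantile of the counts, since a full chromosome at 40 kb resolution can hold several hundred thousand bin pairs.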
  • 23. Critical assessment of tools
      - DNA_Rchitect, Sushi and GenomicInteractions display the interactions as arcs
      - DNA_Rchitect is interactive, but I never managed to use it, even on the example dataset (too much annotation information is required for proper use)
      - the other two offer approximately the same types of features (GenomicInteractions is maybe more complete, but Sushi is easier to customize)
      - HiCeekR can represent the data as a(n interactive) network, for a whole chromosome or a selected region, with or without a threshold on the edge value
  • 24. Example of visualization with annotation tracks
  • 25. Example of visualization with annotation tracks
      with circlize (a bit sophisticated to use, similar to Gviz)
  • 26. Other (quality control) graphics
      - in HiCeekR: quality control of the alignment (fragment length distribution, insert size distributions)
      - in HiTC: inter/intra interaction barplot, interaction versus distance dot plot, interaction distribution (histogram) for CIS/TRANS
      - in GenomicInteractions: inter/intra donut graphs (forget them!), interaction distribution (histogram, but cut; also forget them), donut graphs with annotation of the interactions
  • 27. References
      - Ing-Simmons E, Vaquerizas JM (2019) Visualising three-dimensional genome organisation in two dimensions. Development, 146(19): dev177162.
      - Lin D, Bonora G, Yardimci GG, Noble WS (2019) Computational methods for analyzing and modeling genome structure and organization. WIREs Systems Biology and Medicine, 11: e1435.
      - Yardimci GG, Noble WS (2017) Software tools for visualizing Hi-C data. Genome Biology, 18: 26.