Visualizing specificity in coexpression networks
Jesse Gillis*, Anton Zoubarev, Cameron McDonald, Thea Van Rossum, Paul Pavlidis
Department of Psychiatry and Centre for High-throughput Biology, University of British Columbia, British Columbia, Canada
*Current address: Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Woodbury, New York, USA


Network diagrams are frequently derided as “hairballs”, suffering from poor interpretability. Here we describe an attempt to improve the situation with an idea inspired by function prediction considerations.

The context for our study is coexpression network analysis and visualization of interaction specificity using Gemma (http://chibi.ubc.ca/Gemma). Gemma is a database and analysis software system with analyzed data from over 4000 expression profiling studies (Zoubarev et al., 2012).

Node degree is highly informative

Previously we showed that gene function can be predicted from networks without using interactions directly (Gillis and Pavlidis, 2011). We observed that ranking genes by their node degrees results in surprisingly good “guilt-by-association” performance; about one-half of performance could be attributed entirely to node degree effects.

Figure: axes labeled “Increasing number of neighbours” and “Increasing number of GO annotations”, with “hubs” marked.

Node degree is predictive because genes that have high node degree tend to have many functions (e.g. GO terms). Thus for any given prediction task, algorithms that assign any given function to high node-degree genes appear to perform well without using information on which genes are associated with which. More concretely, when studying any biological process, simply assuming P53 (for example) is implicated will go a surprisingly long way, and networks encode this completely generic information in their node degree.

Network views can be misleading

Because of the limitations of network visualization tools, and the difficulty of interpreting large graphs, Gemma, like other tools that use network visualization, limits the size of the network that can be displayed. This makes it easy to forget that the data being visualized is part of a vastly larger network. This necessary reduction may give the false impression that the portion of the network being visualized somehow represents information specific to the genes shown. As the above discussion of node degree biases suggests, this is potentially a very misleading assumption.

Figure: This coexpression-derived network might be considered highly suggestive of functional relations between Ankyrin 2 and Synaptotagmin 4, but these genes are embedded in a much larger network. Edge thickness indicates the level of support (thicker means more experiments exhibit this link). Nodes with red circles were initial query genes.

To combat this problem within Gemma we use the rank statistics of gene node degrees to visually up-weight genes by their interaction specificity in the full network (effectively down-weighting by the prior probability of the interaction occurring generically, as determined from the node degrees). Importantly, the network sparsification is accomplished without any use of functional information or additional network data. Variations on this approach are not uncommon in function prediction algorithms but, to our knowledge, have not been implemented to customize visualization in gene network analyses.

Figure: Nodes are shaded inversely to their node degree in the full network, not just the visualized fragment. The query was mouse semaphorin genes.

Figure: Another example, the coexpression network for CBWD5.

Coexpression analysis method

The method was previously described by Lee et al. (2004). For each data set, pairs of probes showing a statistically significant Pearson correlation are identified using stringent multiple test correction criteria and stored. No more than 1% of the correlations for any data set were stored in any case. The stored “links” are available for queries by end-users. The number of data sets in which a link is observed is referred to as the support. In general, only links that occur in at least two data sets are retrieved, though the choice of data sets to be searched can be changed by the user. The idea is that the more data sets support a link, the less likely the link is due to technical artifacts.

Node degree estimation

The current approach is based on counting how many coexpression pairs (“links”) the gene is involved in, across all experiments, with the constraint that the edge must occur in at least two experiments. This corresponds to the data that is stored during the analysis phase. This value is then re-expressed as a relative rank for the genes of that organism. Thus the gene with the largest number of coexpression partners has a score of 1.0, and that with the lowest 0.0 (there can be ties).

This method has the benefit of simplicity, and it is intuitive because it is tied directly to the data users will be able to access in the system. However, users should be aware that the measure is potentially sensitive to sources of variance that are not biological in nature, most importantly the number of data sets in which the gene is tested. Genes which are tested more often will tend to have more links. A more complex measure that takes that source of variance into account is correlated with the simpler measure (Spearman rho of approximately 0.7), but is not as readily understandable. In addition, the fact that a gene has a high node degree, whether or not it is due to frequent testing, is still of interpretational significance.

Conclusions

We suggest that the approach outlined here is a way to improve the utility of network views by minimizing the impact of “non-specific” interactions. It is easy to implement because node degrees can be computed ahead of time and network visualization tools (e.g. Cytoscape Web, as used here) have support for modifying node color. There are likely to be many ways to improve on our scheme. For example, we have not yet attempted to “optimize” our mapping between node degree and “visibility”.

References

Visualizations were built using Cytoscape Web (http://cytoscapeweb.cytoscape.org/)

Zoubarev A., Hamer K.M., Keshav K., McCarthy E.L.M., Santos J.R.C., Van Rossum T., McDonald C., Hall A., Wan X., Lim R., Gillis J., Pavlidis P. (2012) Gemma: A resource for the re-use, sharing and meta-analysis of expression profiling data. Bioinformatics, in press.

Lee H.K., et al. (2004) Coexpression analysis of human genes across many microarray data sets. Genome Research 14: 1085-1094.

Gillis J., Pavlidis P. (2011) The impact of multifunctional genes on “guilt by association” analysis. PLoS ONE 6(2): e17258.
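
To make the node degree estimation and the inverse shading concrete, here is a minimal Python sketch. It is not Gemma's implementation. The function names (`node_degrees`, `relative_ranks`, `node_shade`), the input layout (one list of gene pairs per experiment), the tie handling, and the grey-level mapping with its `min_visibility` floor are all assumptions made for illustration; the poster only specifies the support threshold of two experiments and the 0.0 to 1.0 rank scale.

```python
from bisect import bisect_left
from collections import Counter


def node_degrees(links_by_experiment, min_support=2):
    """Count each gene's links that occur in at least `min_support` experiments."""
    support = Counter()
    genes = set()
    for links in links_by_experiment:
        # Treat links as unordered pairs and count each pair once per experiment.
        pairs = {frozenset(link) for link in links}
        for pair in pairs:
            if len(pair) == 2:  # ignore self-links
                support[pair] += 1
                genes.update(pair)
    # Genes whose links all fall below the threshold keep a degree of 0.
    degrees = dict.fromkeys(genes, 0)
    for pair, n_experiments in support.items():
        if n_experiments >= min_support:
            for gene in pair:
                degrees[gene] += 1
    return degrees


def relative_ranks(degrees):
    """Rescale degrees to ranks in [0, 1]; tied genes share the same rank.

    Assumed tie handling: rank = (genes with strictly lower degree) /
    (genes with degree below the maximum), so the top genes score 1.0
    and the bottom genes score 0.0.
    """
    values = sorted(degrees.values())
    if not values:
        return {}
    below_max = bisect_left(values, values[-1])
    if below_max == 0:  # every gene has the same degree
        return {gene: 0.0 for gene in degrees}
    return {gene: bisect_left(values, d) / below_max for gene, d in degrees.items()}


def node_shade(rank, min_visibility=0.2):
    """Map a rank to a grey hex color: low degree is dark, high degree is faded."""
    level = round(255 * rank * (1.0 - min_visibility))
    return "#{0:02x}{0:02x}{0:02x}".format(level)


if __name__ == "__main__":
    # Hypothetical toy data: three experiments, each a list of coexpressed pairs.
    experiments = [
        [("A", "B"), ("A", "C"), ("B", "C")],
        [("A", "B"), ("A", "C"), ("C", "D")],
        [("B", "A"), ("A", "D")],
    ]
    ranks = relative_ranks(node_degrees(experiments))
    for gene in sorted(ranks):
        print(gene, round(ranks[gene], 2), node_shade(ranks[gene]))
```

Because the ranks depend only on the full network, they can be computed once ahead of time and looked up whenever a fragment is drawn, which is the property the Conclusions rely on. The linear grey scale is only one possible mapping between node degree and “visibility”.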

NetBioSIG2014-Talk by Tijana MilenkovicNetBioSIG2014-Talk by Tijana Milenkovic
NetBioSIG2014-Talk by Tijana Milenkovic
 
NetBioSIG2014-Talk by Yu Xia
NetBioSIG2014-Talk by Yu XiaNetBioSIG2014-Talk by Yu Xia
NetBioSIG2014-Talk by Yu Xia
 
NetBioSIG2014-Keynote by Marian Walhout
NetBioSIG2014-Keynote by Marian WalhoutNetBioSIG2014-Keynote by Marian Walhout
NetBioSIG2014-Keynote by Marian Walhout
 
NetBioSIG2014-Talk by Ashwini Patil
NetBioSIG2014-Talk by Ashwini PatilNetBioSIG2014-Talk by Ashwini Patil
NetBioSIG2014-Talk by Ashwini Patil
 
NetBioSIG2014-Talk by David Amar
NetBioSIG2014-Talk by David AmarNetBioSIG2014-Talk by David Amar
NetBioSIG2014-Talk by David Amar
 
NetBioSIG2014-Talk by Hyunghoon Cho
NetBioSIG2014-Talk by Hyunghoon ChoNetBioSIG2014-Talk by Hyunghoon Cho
NetBioSIG2014-Talk by Hyunghoon Cho
 
NetBioSIG2014-Talk by Gerald Quon
NetBioSIG2014-Talk by Gerald QuonNetBioSIG2014-Talk by Gerald Quon
NetBioSIG2014-Talk by Gerald Quon
 

NetBioSIG2012 paulpavlidis

Visualizing specificity in coexpression networks

Jesse Gillis*, Anton Zoubarev, Cameron McDonald, Thea Van Rossum, Paul Pavlidis
Department of Psychiatry and Centre for High-throughput Biology, University of British Columbia, British Columbia, Canada
*Current address: Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Woodbury, New York, USA

Network diagrams are frequently derided as “hairballs”, suffering from poor interpretability. Here we describe an attempt to improve the situation with an idea inspired by function prediction considerations.

The context for our study is coexpression network analysis and visualization of interaction specificity using Gemma (http://chibi.ubc.ca/Gemma). Gemma is a database and analysis software system with analyzed data from over 4000 expression profiling studies (Zoubarev et al., 2012).

Node degree is highly informative

Previously we showed that gene function can be predicted from networks without using interactions directly (Gillis and Pavlidis, 2011). We observed that ranking genes by their node degrees results in surprisingly good “guilt-by-association” performance; about one-half of performance could be attributed entirely to node degree effects.

[Figure: axes labelled “Increasing number of neighbours (“hubs”)” and “Increasing number of GO annotations”.]

Node degree is predictive because genes that have high node degree tend to have many functions (e.g. GO terms). Thus for any given prediction task, algorithms that assign any given function to high node-degree genes appear to perform well without using information on which genes are associated with which. More concretely, when studying any biological process, simply assuming P53 (for example) is implicated will go a surprisingly long way, and networks encode this completely generic information in their node degree.

Network views can be misleading

Because of the limitations of network visualization tools, and the difficulty of interpreting large graphs, Gemma, like other tools that use network visualization, limits the size of the network that can be displayed. This makes it easy to forget that the data being visualized is part of a vastly larger network. This necessary reduction may give the false impression that the portion of the network being visualized is somehow representing information specific to the genes shown. As the above discussion of node degree biases suggests, this is potentially a very misleading assumption.

[Figure: This coexpression-derived network might be considered highly suggestive of functional relations between Ankyrin 2 and Synaptotagmin 4, but these genes are embedded in a much larger network. Edge thickness indicates the level of support (thicker means more experiments exhibit this link). Nodes with red circles were initial query genes.]

To combat this problem within Gemma we use the rank statistics of gene node degrees to visually up-weight genes by their interaction specificity in the full network (effectively down-weighting by the prior probability of the interaction occurring generically, as determined from the node degrees). Importantly, the network sparsification is accomplished without any use of functional information or additional network data. Variations on this approach are not uncommon in function prediction algorithms but, to our knowledge, have not been implemented to customize visualization in gene network analyses.

[Figure: Another example, the coexpression network for CBWD5.]

[Figure: Nodes are shaded inversely to their node degree in the full network, not just the visualized fragment. The query was mouse semaphorin genes.]

Coexpression analysis method

The method was previously described by Lee et al. (2004). For each data set, pairs of probes showing a statistically significant Pearson correlation are identified using stringent multiple test correction criteria and stored. No more than 1% of the correlations for any data set were stored in any case. The stored “links” are available for queries by end-users. The number of data sets in which a link is observed is referred to as the support. In general, only links that occur in at least two data sets are retrieved, though the choice of data sets to be searched can be changed by the user. The idea is that the more data sets support a link, the less likely the link is due to technical artifacts.

Node degree estimation

The current approach is based on counting how many coexpression pairs (“links”) the gene is involved in, across all experiments, with the constraint that the edge must occur in at least two experiments. This corresponds to the data that is stored during the analysis phase. This value is then re-expressed as a relative rank for the genes of that organism. Thus the gene with the largest number of coexpression partners has a score of 1.0, and that with the lowest 0.0 (there can be ties).

This method has the benefit of simplicity, and it is intuitive because it is tied directly to the data users will be able to access in the system. However, users should be aware that the measure is potentially sensitive to sources of variance that are not biological in nature, most importantly the number of data sets in which the gene is tested. Genes which are tested more often will tend to have more links. A more complex measure that takes that source of variance into account is correlated with the simpler measure (Spearman rho=~0.7), but is not as readily understandable. In addition, the fact that a gene has a high node degree, whether it is due to frequent testing or not, is still of interpretational significance.

Conclusions

We suggest that the approach outlined here is a way to improve the utility of network views by minimizing the impact of “non-specific” interactions. It is easy to implement because node degrees can be computed ahead of time and network visualization tools (e.g. Cytoscape Web, as used here) have support for modifying node color. There are likely to be many ways to improve on our scheme. For example, we have not yet attempted to “optimize” our mapping between node degree and “visibility”.

References

Visualizations were built using Cytoscape Web (http://cytoscapeweb.cytoscape.org/).

Zoubarev A., Hamer K.M., Keshav K., McCarthy E.L.M., Santos J.R.C., Van Rossum T., McDonald C., Hall A., Wan X., Lim R., Gillis J., Pavlidis P. (2012) Gemma: A resource for the re-use, sharing and meta-analysis of expression profiling data. Bioinformatics, in press.

Lee, H.K., et al. (2004) Coexpression analysis of human genes across many microarray data sets. Genome Research 14: p. 1085-1094.

Gillis J., Pavlidis P. (2011) The impact of multifunctional genes on “guilt by association” analysis. PLoS ONE 6(2):e17258.
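
The node degree estimation and shading steps described in this poster can be illustrated with a minimal Python sketch. It is not Gemma's implementation. The function names, the input layout (a dictionary mapping each data set to its list of gene pairs), the dense-rank handling of ties and the linear mapping from rank to opacity (including the `floor` parameter) are all assumptions made for illustration. The poster only states that links need support in at least two data sets, that degrees are re-expressed as a relative rank between 0.0 and 1.0, and that nodes are shaded inversely to their node degree.

```python
from collections import Counter


def supported_links(links_by_dataset, min_support=2):
    """Count the data sets in which each link occurs and keep the supported ones.

    links_by_dataset: hypothetical layout {data set id: [(gene_a, gene_b), ...]}.
    Returns {frozenset({gene_a, gene_b}): support}.
    """
    support = Counter()
    for pairs in links_by_dataset.values():
        # A link counts once per data set, regardless of pair order or repeats.
        for pair in {frozenset(p) for p in pairs}:
            if len(pair) == 2:  # ignore self-links
                support[pair] += 1
    return {pair: n for pair, n in support.items() if n >= min_support}


def node_degrees(links, genes):
    """Number of supported links each gene takes part in (0 if it has none)."""
    degree = {g: 0 for g in genes}
    for pair in links:
        for g in pair:
            degree[g] = degree.get(g, 0) + 1
    return degree


def relative_ranks(degree):
    """Re-express degrees as ranks in [0, 1]; tied genes share a score.

    Assumption: dense ranking over distinct degree values, so the lowest
    degree maps to 0.0 and the highest to 1.0.
    """
    distinct = sorted(set(degree.values()))
    if len(distinct) < 2:
        return {g: 0.0 for g in degree}
    score = {d: i / (len(distinct) - 1) for i, d in enumerate(distinct)}
    return {g: score[d] for g, d in degree.items()}


def node_opacity(rank, floor=0.15):
    """Shade inversely to rank: hubs fade toward `floor`, specific genes stay opaque.

    The linear mapping and the floor value are illustrative choices only.
    """
    return floor + (1.0 - floor) * (1.0 - rank)
```

With these helpers, the per-gene opacity could be computed once for the whole organism and then passed to the visualization tool as a node attribute when a fragment of the network is drawn.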