8. Load a metagenomeSeq class object
This loads an example object and demonstrates that it uses the
default show method defined for eSet. A custom show method
could be defined if desired.
suppressPackageStartupMessages(library(metagenomeSeq))
data(lungData)
lungData
## MRexperiment (storageMode: environment)
## assayData: 51891 features, 78 samples
## element names: counts
## protocolData: none
## phenoData
## sampleNames: CHK_6467_E3B11_BRONCH2_PREWASH_V1V2
## CHK_6467_E3B11_OW_V1V2 ... CHK_6467_E3B09_BAL_A_V1V2 (78
## total)
## varLabels: SampleType SiteSampled SmokingStatus
## varMetadata: labelDescription
## featureData
## featureNames: 1 2 ... 51891 (51891 total)
## fvarLabels: taxa
## fvarMetadata: labelDescription
## experimentData: use 'experimentData(object)'
## pubMedIds: 21680950
## Annotation:
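As an illustration of the custom show method mentioned above, here is a minimal sketch using only the methods package. ToyExperiment and its counts slot are hypothetical names invented for this example; they are not part of metagenomeSeq or Biobase.

```r
library(methods)

# Hypothetical toy class, used only for illustration.
setClass("ToyExperiment", representation(counts = "matrix"))

# A custom show method replaces the default printout for this class.
setMethod("show", "ToyExperiment", function(object) {
  cat("ToyExperiment with", nrow(object@counts), "features and",
      ncol(object@counts), "samples\n")
})

toy <- new("ToyExperiment", counts = matrix(0, nrow = 4, ncol = 3))
toy
## ToyExperiment with 4 features and 3 samples
```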
9. Load a phyloseq class object
Do the same for a phyloseq class example data object, and
demonstrate its custom show method:
suppressPackageStartupMessages(library(phyloseq))
data(GlobalPatterns)
GlobalPatterns
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 19216 taxa and 26 samples ]
## sample_data() Sample Data: [ 26 samples by 7 sample variables ]
## tax_table() Taxonomy Table: [ 19216 taxa by 7 taxonomic ranks ]
## phy_tree() Phylogenetic Tree: [ 19216 tips and 19215 internal nodes ]
10. Inheritance of core methods: example 1
Since the metagenomeSeq class (MRexperiment) extends eSet, it
automatically inherits core methods like dim(). These would have to
be defined separately for the phyloseq class, since it does not
extend a core class.
dim(lungData)
## Features Samples
## 51891 78
dim(GlobalPatterns)
## NULL
Note that neither the phyloseq nor the metagenomeSeq package
defines a dim() method, but metagenomeSeq got it for free by
extending eSet.
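The same effect can be reproduced with two toy S4 classes. This is a sketch under assumptions: DerivedCounts and StandaloneCounts are hypothetical names, and the base matrix class stands in for a core class such as eSet.

```r
library(methods)

# Hypothetical class that extends an existing class (here: matrix).
setClass("DerivedCounts", contains = "matrix")
# Hypothetical class that only stores a matrix in a slot.
setClass("StandaloneCounts", representation(counts = "matrix"))

m <- matrix(1:6, nrow = 2)
derived <- new("DerivedCounts", m)
standalone <- new("StandaloneCounts", counts = m)

dim(derived)                    # inherited from matrix: 2 3
dim_before <- dim(standalone)   # NULL, no dim() method is available

# The standalone class needs its own method to catch up.
setMethod("dim", "StandaloneCounts", function(x) dim(x@counts))
dim_after <- dim(standalone)    # now 2 3
```

Every further method (dimnames(), [, head(), ...) would need the same manual treatment for the standalone class.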
11. Inheritance of core methods: example 2
For core Bioconductor objects, $ generally accesses the sample
data, but for phyloseq objects the sample data must be explicitly
extracted first:
head(lungData$SampleType)
## CHK_6467_E3B11_BRONCH2_PREWASH_V1V2 CHK_6467_E3B11_OW_V1V2
## Bronch2.PreWash OW
## CHK_6467_E3B08_OW_V1V2 CHK_6467_E3B07_BAL_A_V1V2
## OW BAL.A
## CHK_6467_E3B11_BAL_A_V1V2 CHK_6467_E3B09_OP_V1V2
## BAL.A OP.Swab
## 12 Levels: BAL.1stReturn BAL.A BAL.B Bronch1.PostWash ... PSB
head(sample_data(GlobalPatterns)$SampleType)
## [1] Soil Soil Soil Feces Feces Skin
## 9 Levels: Feces Freshwater Freshwater (creek) Mock ... Tongue
12. Inheritance of core methods: example 3
subset(), [, and head() are core methods: they are defined
for eSet and other core classes, so these familiar operations
work “out of the box”:
subset(lungData, lungData$SampleType=="OW")
lungData[, lungData$SampleType=="OW"]
lungData[, 1:5]
head(lungData)
13. Inheritance of core methods: example 3 (cont’d)
phyloseq cannot use these, so it defines a custom
subset_samples() function instead:
subset_samples(GlobalPatterns, SampleType=="Ocean")
But square bracket subsetting, subset(), and head() are not
defined for phyloseq objects, and have no parent class to inherit
them from.
GlobalPatterns[, 1:5]
## Error in GlobalPatterns[, 1:5]: object of type 'S4' is not subsettable
subset(GlobalPatterns, 1:5)
## Error in subset.default(GlobalPatterns, 1:5): 'subset' must be logical
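For comparison, selecting the first five samples of a phyloseq object goes through package-specific functions. The sketch below assumes phyloseq's sample_names(), prune_samples(), and nsamples(); the variable names are illustrative.

```r
suppressPackageStartupMessages(library(phyloseq))
data(GlobalPatterns)

# Rough equivalent of GlobalPatterns[, 1:5]: keep samples by name.
keep <- sample_names(GlobalPatterns)[1:5]
first_five <- prune_samples(keep, GlobalPatterns)

nsamples(first_five)
```

This works, but it is an interface users must learn separately rather than one shared with other Bioconductor classes.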
14. Relevance to multi-omics data analysis
The MultiAssayExperiment core class allows coordinated
representation and management of an open-ended set of assays,
as long as their data class provides basic methods:
dimnames()
[ subsetting
dim()
and preferably assay()
MultiAssayExperiment data management is modeled on
SummarizedExperiment but allows for multiple assays of
different row and column dimensions.
15. Relevance to multi-omics data analysis (cont’d)
With no special accommodations, the lungData object “just works”
in a MultiAssayExperiment:
suppressPackageStartupMessages(library(MultiAssayExperiment))
MultiAssayExperiment(list(lung=lungData))
## A MultiAssayExperiment object of 1 listed
## experiment with a user-defined name and respective class.
## Containing an ExperimentList class object of length 1:
## [1] lung: MRexperiment with 51891 rows and 78 columns
## Features:
## experiments() - obtain the ExperimentList instance
## colData() - the primary/phenotype DataFrame
## sampleMap() - the sample availability DataFrame
## `$`, `[`, `[[` - extract colData columns, subset, or experiment
## *Format() - convert into a long or wide DataFrame
## assays() - convert ExperimentList to a SimpleList of matrices
But GlobalPatterns does not, because it is not derived from a core class:
MultiAssayExperiment(list(global=GlobalPatterns))
## Error in if (dim(object)[2] > 0 && is.null(colnames(object))) {: missing value where TRUE/FALSE needed
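One possible workaround is to copy the phyloseq data into a core class first. The sketch below is an assumption about how this could be done, not a conversion provided by phyloseq: it keeps only the OTU counts and the sample data, and drops the taxonomy table and the tree.

```r
suppressPackageStartupMessages({
  library(phyloseq)
  library(SummarizedExperiment)
  library(MultiAssayExperiment)
})
data(GlobalPatterns)

# Extract the count matrix with taxa in rows and samples in columns.
counts <- as(otu_table(GlobalPatterns), "matrix")
if (!taxa_are_rows(GlobalPatterns)) counts <- t(counts)

# Extract the sample annotations as a plain data.frame.
sample_df <- as(sample_data(GlobalPatterns), "data.frame")

# Wrap both in a SummarizedExperiment, which provides dim(), dimnames(), and [.
se <- SummarizedExperiment(
  assays = list(counts = counts),
  colData = S4Vectors::DataFrame(sample_df)
)

mae <- MultiAssayExperiment(list(global = se))
```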
16. Inheritance of core methods
These are not isolated examples. Full-time, professional software
developers have developed many methods for core classes.
Classes containing core classes get all of these for free,
providing future advantages that you cannot possibly imagine
in advance.
For example, SummarizedExperiment has more than 100
methods defined!
17. Inheritance of core methods (cont’d)
suppressPackageStartupMessages(library(SummarizedExperiment))
methods(class="SummarizedExperiment")
## [1] != [ [[ [[<- [<-
## [6] %in% < <= == >
## [11] >= $ $<- aggregate anyNA
## [16] append as.character as.complex as.data.frame as.env
## [21] as.integer as.list as.logical as.matrix as.numeric
## [26] as.raw assay assay<- assayNames assayNames<-
## [31] assays assays<- by cbind coerce
## [36] coerce<- colData colData<- countOverlaps dim
## [41] dimnames dimnames<- duplicated elementMetadata elementMetadata<-
## [46] eval expand expand.grid extractROWS findOverlaps
## [51] head intersect is.na length lengths
## [56] match mcols mcols<- merge metadata
## [61] metadata<- mstack names names<- NROW
## [66] overlapsAny parallelSlotNames pcompare rank rbind
## [71] realize relist rename rep rep.int
## [76] replaceROWS rev rowData rowData<- ROWNAMES
## [81] rowRanges<- seqlevelsInUse setdiff setequal shiftApply
## [86] show sort split split<- subset
## [91] subsetByOverlaps table tail tapply transform
## [96] union unique updateObject values values<-
## [101] window window<- with xtabs
## see '?methods' for accessing help and source code
SummarizedExperiment also provides great functionality like
out-of-the-box compatibility with on-disk data representation.
USE AND DERIVE FROM THESE CLASSES!
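Deriving from a core class can be very short. The following sketch defines a hypothetical ToySE class that contains SummarizedExperiment; the class name and the toy data are invented for illustration.

```r
suppressPackageStartupMessages(library(SummarizedExperiment))

# Hypothetical derived class; setClass() returns a generator function.
.ToySE <- setClass("ToySE", contains = "SummarizedExperiment")

counts <- matrix(1:12, nrow = 4,
                 dimnames = list(paste0("f", 1:4), paste0("s", 1:3)))
se <- SummarizedExperiment(
  assays = list(counts = counts),
  colData = DataFrame(group = c("a", "b", "a"), row.names = colnames(counts))
)
toy <- .ToySE(se)

# No methods were written for ToySE, yet these all work through inheritance.
dim(toy)
group_a <- toy[, toy$group == "a"]
ncol(group_a)
```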
18. What are the “core” classes?
• Rectangular feature x sample data (RNAseq count matrix, microarray, …)
– SummarizedExperiment::SummarizedExperiment()
• Genomic coordinates (1-based, closed interval)
– GenomicRanges::GRanges()
• DNA / RNA / AA sequences
– Biostrings::*StringSet()
• Gene sets
– GSEABase::GeneSet()
– GSEABase::GeneSetCollection()
• Multi-omics data
– MultiAssayExperiment::MultiAssayExperiment()
• Single cell data
– SingleCellExperiment::SingleCellExperiment()
• Mass spectrometry data
– MSnbase::MSnExp()
https://www.bioconductor.org/developers/how-to/commonMethodsAndClasses/
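As a small illustration of the "1-based, closed interval" convention listed above, the sketch below builds a single range with GenomicRanges. The chromosome name and coordinates are arbitrary example values.

```r
suppressPackageStartupMessages(library(GenomicRanges))

# One range covering positions 1 through 10, both ends included.
gr <- GRanges(seqnames = "chr1",
              ranges = IRanges(start = 1, end = 10),
              strand = "+")

start(gr)
end(gr)
width(gr)   # 10, because both endpoints belong to the interval
```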