SlideShare a Scribd company logo
1 of 46
Download to read offline
Computational pipeline Global analysis Characterization of 50
Circular RNAs are a large class of animal RNAs with
regulatory potency
S Memczak∗
, M Jens∗
, A Elefsinioti∗
, F Torti∗
, J Krueger, A Rybak, L Maier, S
D Mackowiak, L H Gregersen, M Munschauer, A Loewer, U Ziebold, M
Landthaler, C Kocks, F le Noble, and N Rajewsky
April 30, 2013
ciRNA April 30, 2013 1 / 47
Computational pipeline Global analysis Characterization of 50
Basic principles
used 2x76bp run data generated after ribosomal RNA depletion and random
priming
screened RNA sequencing reads for splice junctions formed by an acceptor
splice site at the 5’ end of an exon and a donor site at a downstream 3’ end
(head-to-tail)
ciRNA April 30, 2013 2 / 47
Computational pipeline Global analysis Characterization of 50
More details
1 filtered out reads that aligned (with bowtie2) contiguously and full-length to
the genome
2 From unmapped reads (including normal spliced reads) extracted 20mers
from both ends and aligned them independently to find unique anchor
positions within spliced exons. Anchors that aligned in the reversed
orientation (head-to-tail) indicated circRNA splicing (normal splicing was also
found by consecutive anchors).
3 Extended the anchor alignments such that the complete read aligns and the
breakpoints were flanked by GU/AG splice sites. Ambiguous breakpoints were
discarded.
4 The resulting alignments were read by another custom script that jointly
evaluates consecutive anchor alignments belonging to the same original read,
performs extensions of the anchor alignments, and collects statistics on splice
sites. After the run completes, the script outputs all detected splice junctions
(linear and circular) in a UCSC BED-like format with extra columns holding
quality statistics, read counts etc.
ciRNA April 30, 2013 3 / 47
Computational pipeline Global analysis Characterization of 50
Filtering demands
GU/AG flanking the splice sites (built in);
unambiguous breakpoint detection
a maximum of two mismatches in the extension procedure
the breakpoint cannot reside more than 2 nucleotides inside an anchor
at least two independent reads (each distinct sequence only counted once per
sample) support the junction
unique anchor alignments with a safety margin to the next-best alignment of
at least one anchor above 35 points (approximately equivalent to more than
two extra mismatches in high-quality bases)
a genomic distance between the two splice sites of no more than 100 kb (only
a small percentage of the data).
As the ribosomal DNA cluster is part of the C. elegans genome assembly and
ribosomal pre-RNAs could give rise to circular RNAs by mechanisms
independent of the spliceosome, we discarded 130 candidates that mapped to
the rDNA cluster on chrI:15,060,286-15,071,020.
ciRNA April 30, 2013 4 / 47
Computational pipeline Global analysis Characterization of 50
Permutation testing
reversed either anchor
reversed the complete read
randomly reassigned anchors between reads
reverse complemented the read (as a positive control).
the reverse complement recovered the same output as expected, the various
permutations led to only very few candidate predictions, well below 0.2% of
the output with unpermuted reads and in excel-lent agreement with the results
from simulated reads
sensitivity (.75%) and FDR (0.2%) using simulated reads and permutations
of real sequencing data
the efficiency of ribominus protocols to extract and sequence circRNAs is
limited, reducing overall sensitivity
ciRNA April 30, 2013 5 / 47
Computational pipeline Global analysis Characterization of 50
Data generated
generated ribominus data for HEK293 cells and combined with human
leukocyte data
detected 1,950 circRNAs with support from at least two independent
junction-spanning reads
ciRNA April 30, 2013 6 / 47
Computational pipeline Global analysis Characterization of 50
Jeck 2012: cANRIL, or how they got interested
Our group discovered a circular RNA species, circular ANRIL, whose
expression is associated with that of products of the human INK4a/ARF
locus and is correlated with the risk of human atherosclerosis
Production of cANRIL in humans is associated with common SNPs predicted
to affect cANRIL splicing, suggesting the possibility that cANRIL production
influences PcG-mediated repression of the INK4a/ARF locus to influence
atherosclerosis risk
ciRNA April 30, 2013 7 / 47
Computational pipeline Global analysis Characterization of 50
CircleSeq: The RNAse R approach - Jeck 2012
ciRNA April 30, 2013 8 / 47
Computational pipeline Global analysis Characterization of 50
CircleSeq: The RNase R approach - Jeck 2012
Method was optimized to allow for >10-fold enrichment of cANRIL in cDNA prepared from RNase
R-treated vs. untreated samples
Used approx. 300 million 100-bp reads per sample aligned to the human genome using a de novo splice
mapping algorithm, MapSplice
compiled the list of all fusion splice junctions where splice donor and acceptor occur within 2 Mb but in
the non-colinear ordering; term these junctions backsplices.
New metric - SRPBM - spliced reads per billion mapping
Counts of reads mapping across an identified backsplice in untreated samples, normalized by read
length and number of reads mapping,– to permit quantitative comparisons between backsplices.
ciRNA April 30, 2013 9 / 47
Computational pipeline Global analysis Characterization of 50
Using RNase R works
CircleSeq enriches for backsplice junctions. Two-dimensional histograms showing normalized backspliced read count (SRPBM) or normalized exon
coverage (RPKM) between two samples or replicates. (A) Coverage of backsplice reads in RNase R-treated replicates over all distinct backsplice
species (R2 = 0.579). (B) Coverage of exons in mock treated replicates (R2 = 0.91). (C) Average backsplice coverage in RNase R-treated
against mock treated RNA-seq showing enrichment of most backsplice species by RNase R. (D) Mean normalized exon coverage in annotated
exon sequences in RNase R-treated against mock treated RNA-seq showing depletion of the majority of species by RNase R.
ciRNA April 30, 2013 10 / 47
Computational pipeline Global analysis Characterization of 50
And generates some interesting results...
identified >100,000 unique backsplice events throughout the genome.
Of these, 25,166 were present in both RNase R-treated biological replicates
and were enriched by RNase R treatment as compared with mock treatment.
31% of backsplice species observed in untreated controls were not enriched
by RNase R (Fig. 2C). These species likely represent mapping artifacts or
nonsequential exons harbored in linear products, resulting from either RNA
trans-splicing or cleavage of ecircRNAs.
See “positive controls” cANRIL, cETS-1; not able to detect other known
circular species (e.g., SRY and DCC), due to low expression of these genes in
Hs68 cells
the 25,166 replicated backsplice events detected by CircleSeq included the
large majority (1025 of 1319) of putative circles identified through a
previously described bioinformatic approach (Salzman 2012) - in different cell
types!!! [T-cells vs fibroblasts]
Of the 25,166 unique backsplicing events that reproducibly enriched by RNase R
digestion in both biological replicates, most were only found in RNase R-treated
samples and were not observed in the absence of exonuclease digestion.
many of these events represent rare ecircRNAs arising from pervasive
background levels of RNA circularization, which may result from an occasional
error in splicing (Hsu and Hertel 2009).
ciRNA April 30, 2013 11 / 47
Computational pipeline Global analysis Characterization of 50
Filtering on confidence
LOW stringency set requiring a single backsplice read in the control data and
HIGH stringency requiring coverage on a par with splices in a moderately
expressed gene
ciRNA April 30, 2013 12 / 47
Computational pipeline Global analysis Characterization of 50
Data generated - Memzack 2013
generated ribominus data for HEK293 cells and combined with human
leukocyte data
detected 1,950 circRNAs with support from at least two independent
junction-spanning reads
ciRNA April 30, 2013 13 / 47
Computational pipeline Global analysis Characterization of 50
Expression of genes predicted to give rise to circRNAs was
slightly shifted towards higher expression values
“indicating that circRNAs are not just rare mistakes of the spliceosome”
Histograms of gene expression levels obtained from
polyA+ RNA sequencing in HEK293 cells. The number of
reads per kilobase of exon per million mapped reads
(RPKM) reflects mRNA abundance. Genes that are
predicted to give rise to circRNAs (red circles) are not
specifically enriched for high expression, (solid line: all
genes). circRNAs from lowly expressed genes are detected
less frequently, comparable to the loss of sensitivity
observed for linear splicing (black dashed line: genes with
> 75% of annotated splice sites recovered, gray dashed
line: >50%, light gray:>10%)
ciRNA April 30, 2013 14 / 47
Computational pipeline Global analysis Characterization of 50
Mouse and nematode
identified 1,903 circRNAs in mouse (brains, fetal
head, differentiation-induced embryonic stem cells;
81 of these mapped to human circRNAs
mapped mouse circRNAs were compared with independently
identified human circRNAs using liftOver, yielding 229
circRNAs with precisely orthologous splice sites between human
and mouse. Of these, 223 were composed exclusively of coding
exons and were subsequently used for our conservation analysis
(Fig. 1f).
When intersecting the reported sets of circRNAs supported by
two independent reads in each species, we found 81 conserved
circRNAs (supported by at least 4 reads in total).
used sequencing data from various C. elegans
developmental stages (Stoeckius, M. manuscript in
prep) and detected 724 circRNAs, with at least two
independent reads.
Numerous circRNAs seem to be specifically
expressed in a cell type or developmental stage
(Fig. 1b,c, S1e)
hsa-circRNA 2149 is supported by 13 unique, head-to-tail
spanning reads in CD191 leukocytes but is not detected in
CD341 leukocytes, neutrophils or HEK293 cells
a number of nematode circRNAs seem to be expressed in
oocytes but absent in 1- or 2-cell embryos
ciRNA April 30, 2013 15 / 47
Computational pipeline Global analysis Characterization of 50
Where do these circRNAs come from?
Intersection of circRNAs with known transcripts
computational screen identifies only the splice sites that lead to
circularization but not the internal exon/ intron structure of circular RNAs
=> inferred as much as possible from annotated transcripts
conservative assumption was that as little as possible should be spliced out
coincidence of circRNA splice sites with exonic boundaries inside a transcript
were considered as an indicator for relevant agreement and internal introns
appear to be spliced out
ciRNA April 30, 2013 16 / 47
Computational pipeline Global analysis Characterization of 50
Where do these circRNAs come from?
annotated human
circRNAs using the
RefSeq database and a
catalogue of non-coding
RNAs
85% of human circRNAs
align sense to known
genes
10% of all circRNAs align
antisense to known
transcripts, smaller
fractions align to UTRs,
introns, unannotated
regions of the genome.
ciRNA April 30, 2013 17 / 47
Computational pipeline Global analysis Characterization of 50
Where do these circRNAs come from?
Sorted all overlapping transcripts hierarchically by
splice-site coincidence (2, 1, or 0)
total amount of exonic sequence between the splice sites
total amount of coding sequence
Latter was used to break ties only and helped the annotation process.
if one or both splice sites fell into an exon of the best matching transcript,
the corresponding exon boundary was trimmed.
if it fell into an intron or beyond transcript bounds, the closest exon was
extended to match the circRNA boundaries.
circRNA start/end coordinates were never altered.
If no annotated exons overlapped the circRNA we assumed a single-exon
circRNA.
The resulting annotation of circRNAs is based on the best matching
transcript and may in some cases not represent the ideal choice. Changing
the annotation rules, however, did not substantially change the numbers in
Fig. 1d
ciRNA April 30, 2013 18 / 47
Computational pipeline Global analysis Characterization of 50
How many exons form part of circRNA?
Their splice sites typically
span one to five exons
and overlap coding exons
(84%), but only in 65% of
these cases are both splice
sites that participate in
the circularization known
splice sites
Jeck: introns are spliced
from most circular forms
Jeck: observed many
single exon ecircRNAs,
that is, where the donor
site splices to the
acceptor of the same exon
(e.g., KIAA0182)
ciRNA April 30, 2013 19 / 47
Computational pipeline Global analysis Characterization of 50
Jeck : ratio of forward to back-spliced products
compared the abundance of backsplice-spanning reads to reads spanning
traditionally spliced junctions.
The relative rate of backsplicing over forward splicing for these sites varied
enormously, from <0.1% to >3200%, with no forward splicing products
observed in some cases
using the LOW stringency list, 14.4% of genes expressed in human fibroblasts
produced circular RNA species, suggesting at least one in eight human genes
produces abundant circular as well as linear transcripts.
ciRNA April 30, 2013 20 / 47
Computational pipeline Global analysis Characterization of 50
Jeck validation: circRNA-generating transcripts are
non-polyadenylated
oligo-dT priming for reverse
transcription significantly reduced the
levels of backspliced products relative
to poly(A)- containing transcripts
To exclude the possibility that these
transcripts were the results of
trans-spliced products, used a virtual
Northern approach, which identifies the
size of transcripts by fractionating on a
denatur-ing agarose gel, followed by
qPCR.
Backsplice-containing transcripts
appeared in faster migrating fractions as
compared to the associated full-length
linear transcripts, as would be predicted
of circular RNA species composed of
only a subset of exons. Trans-spliced
products, in contrast, would be
expected to be longer than full-length,
as they contain repeated exons.
ciRNA April 30, 2013 21 / 47
Computational pipeline Global analysis Characterization of 50
Examples
The AFF1 intron is spliced
out (Supplementary Fig.
2e). Sequence
conservation: placental
mammals phyloP score,
scale bar, 200nucleotides.
ciRNA April 30, 2013 22 / 47
Computational pipeline Global analysis Characterization of 50
Jeck examples
ciRNA April 30, 2013 23 / 47
Computational pipeline Global analysis Characterization of 50
Conservation of intergenic and intronic circRNAs
Intergenic and a few intronic circRNAs display a mild but significant enrichment
of conserved nucleotides
ciRNA April 30, 2013 24 / 47
Computational pipeline Global analysis Characterization of 50
Conservation of intergenic and intronic: approach
downloaded genome-wide human (hg19) phyloP conservation score58 tracks derived from genome
alignments of placental mammals from UCSC
read out the conservation scores along the complete circRNA and searched for blocks of at least
6-nucleotide length that exceeded a conservation score of 0.3 for intergenic and 0.5 for intronic
circRNAs.
The different cutoffs empirically adjust for the different background levels of conservation and
were also used on the respective controls.
For each circRNA, we computed the cumulative length of all such blocks and normalized it by the
genomic length of the circRNA.
Artefacts of constant positive conservation scores in the phyloP profile, apparently caused by missing
alignment data, were removed with an entropy filter (this did not qualitatively affect the results).
circRNAs annotated as intronic by the best-match procedure explained above that had any overlap with
exons in alternative transcripts on either strand (5 cases) were removed from the analysis.
phastCons score takes neighboring bases into account, estimating the probability that each nucleotide belongs to a conserved element. phyloP
score is a separate measurement of conservation at each base, ignoring neighboring bases in its calculation. It can measure acceleration (faster
evolution than expected under neutral drift) as well as conservation (slower than expected evolution). PhyloP is useful for evaluating signatures of
selection at particular nucleotides (e.g. third codon positions, or first positions of miRNA target sites). In the phyloP plots, sites predicted to be
conserved are assigned positive scores, while sites predicted to be fast-evolving are assigned negative scores. The absolute values of the scores
represent -log p-values under a null hypothesis of neutral evolution (|phyloP| = -log(p-value) under a null hypothesis of neutral evolution)
ciRNA April 30, 2013 25 / 47
Computational pipeline Global analysis Characterization of 50
circRNAs from coding loci conservation - approach
To analyse circRNAs composed of coding sequence and thus high overall
conservation, we selected 223 human circRNAs with circular orthologues in mouse
and entirely composed of coding sequence. Control (linear) exons were randomly
selected to match the level of conservation observed in first and second codon
positions (Methods, Fig. 1f inset and Supplementary Fig. 1k for conservation of
the remaining coding sequence (CDS))
ciRNA April 30, 2013 26 / 47
Computational pipeline Global analysis Characterization of 50
circRNAs from coding loci conservation - results
circRNAs with conserved circularization were significantly more conserved in the
third codon position than controls, indicating evolutionary constraints at the
nucleotide level, in addition to selection at the protein level (Fig. 1f and
Supplementary Fig. 1j, k)
Coding sequence phyloP conservation score distributions of first
and second codon positions match between circRNAs and
controls, in contrast the 3rd codon position is significantly more
conserved in circRNAs (P ¡ 3e-10 n=223 Mann-Whitney-U
(mwu)(also main Fig1. f). k, The conservation score
distributions in the remaining parts of the CDS (outside the
circRNA or control) do not differ significantly for codon positions
two and three. For the first codon position, the controls are
actually more conserved, P ¡ 2e-3 n=223 Mann- Whitney-U
(mwu)), therefore conservative
ciRNA April 30, 2013 27 / 47
Computational pipeline Global analysis Characterization of 50
How they did coding conservation - the gritty details
used the best-match strategy outlined above to construct an estimated exon- chain for the circRNAs
that overlapped exclusively coding sequence.
Using this chain we in silico spliced out the corresponding blocks of the conservation score profile.
kept track of the frame and sorted the conservation scores into separate bins for each codon position.
also recorded conservation scores in the remaining pieces of coding sequence (outside the circRNA) as a
control.
However, we observed that the level of conservation is systematically different between internal parts of
the coding sequence and the amino- or carboxy-terminal parts (not shown).
We therefore randomly generated chains of internal exons, mimicking the exon-number
distribution of real circRNAs, as a control.
When analysing the circRNAs conserved between human and mouse, it became furthermore apparent
that we also needed to adjust for the higher level of overall conservation. High expression generally
correlates with conservation and thus, an expression cutoff was enforced on the transcripts used to
generate random controls. This resulted in a good to conservative match with the actual circRNAs.
ciRNA April 30, 2013 28 / 47
Computational pipeline Global analysis Characterization of 50
Conservation - Jeck - HIPK2 and HIPK3
the paralogous kinases, HIPK2 and
HIPK3, both produced abundant
circRNA, and so do the mouse
orthologues
sufficiently diverged to allow
unique mapping but retain a
similar genomic structure: a large
second exon that contains the
start codon flanked by large
introns on either side
increased coverage of exon 2 was not
seen in polyAplus libraries, consistent
with the predominant exon 2 species
being circular (Encode Data, not
shown). Based on RPKM and
qRT-PCR, the circular exon 2 transcript
of HIPK3 was apparently 5fold more
abundant than the linear form.
Consistent with transcripts being
ecircRNAs originating from the murine
Hipk2/3 genes, the amplified fragments
were of the expected size and sequence
and were enriched by RNase R digestion
ciRNA April 30, 2013 29 / 47
Computational pipeline Global analysis Characterization of 50
Jeck - more global conservation: compare to murine testis
As was the case for human cells, a high number (646 of 1477) of circles
found through a prior bioinformatic analysis (Salzman et al. 2012) of murine
brain were identified by CircleSeq of murine testis.
Of 2121 human circles from the MEDIUM stringency list (human fibroblasts)
that could be readily mapped to the murine genome, 457 mapped to genes
that produced a murine circular RNA (in testis...).
identified 69 murine circular species (including Hipk3) with exactly
homologous start and stop points of RNA circularization
ciRNA April 30, 2013 30 / 47
Computational pipeline Global analysis Characterization of 50
Experimentally tested circRNA predictions in HEK293 cells
Head-to-tail splicing assayed
by RT-qPCR with divergent
primers and Sanger
sequencing.
Predicted head-to-tail
junctions of 19/23 could be
validated.
5/7 candidates exclusively
predicted in leukocytes could
not be detected in HEK293
cells, validating
cell-type-specific expression.
ciRNA April 30, 2013 31 / 47
Computational pipeline Global analysis Characterization of 50
circRNAs are insensitive(ish) to RNAse R
Jeck 2012: Backsplices for CDR1-AS were abundant in the control samples (mean SRPBM of 198), but both the backsplice reads and nonsplicing
reads within the gene were depleted by exonuclease digestion (mean SRPBM of 16). These observations are most consistent with linear
trans-splicing products rather than circular RNAs or with the cleavage of this circular RNA, as has been reported
Head-to-tail splicing could be produced by
trans-splicing or genomic rearrangements. To
rule out these possibilities and PCR artefacts,
validated the insensitivity of human circRNA
candidates to digestion with RNase R (degrades
linear RNA molecules) by northern blotting with
probes which span the head-to-tail junctions
Quantified RNase R resistance for 21 candidates with confirmed head-to-tail
splicing by qPCR. All of these were at least 10-fold more resistant than GAPDH
ciRNA April 30, 2013 32 / 47
Computational pipeline Global analysis Characterization of 50
circRNAs turn over more slowly than linear RNA
24 h after blocking
transcription circRNAs
were highly stable,
exceeding the stability of
the housekeeping gene
GAPDH
ciRNA April 30, 2013 33 / 47
Computational pipeline Global analysis Characterization of 50
circRNAs turn over more slowly than linear RNA
ciRNA April 30, 2013 34 / 47
Computational pipeline Global analysis Characterization of 50
Validation in other organisms
3/3 tested mouse circRNAs with human orthologues in
mouse brains
C. elegans
ciRNA April 30, 2013 35 / 47
Computational pipeline Global analysis Characterization of 50
circRNAs are not translated
Engineered circular RNAs
have previously been shown
to have coding potential
(Chen and Sarnow 1995).
Linear products were
significantly enriched in the
ribosome-bound fraction for
the genes assayed: HIPK3,
KIAA0182, and MYO9B.
Circular products, in contrast,
were abundant in the
unbound fractions but not
detected in the bound
fractions, indicating that
these AUG- containing
ecircRNAs are not translated
ciRNA April 30, 2013 36 / 47
Computational pipeline Global analysis Characterization of 50
circRNAs can be targeted by RNAi
transfected Hs68 cells with
siRNA targeting HIPK3 and
ZFY that produce both linear
and circularized transcripts; 3
siRNAs per gene
one targeting sequence
only in the linear transcript
targeting the backsplice
sequence
targeting sequence in a
circularized exon shared by
both linear and circular
species
It was even possible to design an siRNA to the
backsplice junction of ZFY that specifically targeted the
circular, but not linear, transcript.
ciRNA April 30, 2013 37 / 47
Computational pipeline Global analysis Characterization of 50
circRNAs are probably miRNA sponges
screened for occurrences of conserved miRNA
family seed matches (Methods).
counting repetitions of conserved matches to the
same miRNA family, circRNAs were significantly
enriched compared to coding sequences
(P<2.96x10−22, MannWhitney U-test,
n=3873) or 3’ UTR sequences
(P<2.76x10−21, MannWhitney U-test,
n=3182)
ciRNA April 30, 2013 38 / 47
Computational pipeline Global analysis Characterization of 50
circRNAs are cytoplasmic
ecircRNAs either undergo nuclear export or are released to the cytoplasm
during mitosis, where they enjoy extraordinary stability, likely as a result of
resistance to debranching enzymes and RNA exonucleases.
ciRNA April 30, 2013 39 / 47
Computational pipeline Global analysis Characterization of 50
CDR1as is localized in the cytoplasm
CDR1as RNA is cytoplasmic and disperse (white spots;
single-molecule RNA FISH; maximum intensity merges of
Z-stacks). siSCR, positive; siRNA1, negative control. Blue,
nuclei (DAPI); scale bar, 5mkm
ciRNA April 30, 2013 40 / 47
Computational pipeline Global analysis Characterization of 50
Bioinformatic analysis
DAVID revealed an enrichment of protein kinases and related proteins among
the set of genes producing ecircRNAs
no specific subfamily of kinase was particularly associated with ecircRNA
production.
sought to identify cis-sequence elements proximal to backsplice events.
Sequences in the 200 bp preceding or following backsplice sites were analyzed
for enriched motifs compared to similar windows flanking noncircularized,
expressed exons.
the highest information- bearing motif was shared by both the upstream
and downstream introns and was identified as the canonical ALU repeat
the intronic flanks adjacent to circularized exons were approximately twofold
more likely to contain an ALU repeat than noncircularized exons. Circularized
exons were sixfold more likely to contain complementary ALUs than control,
noncircularized exons.
Pairs of ALU elements taken from introns flanking circularized exons were
significantly more likely to be complementary (in in- verted orientation) than
noncomplementary.
Equally likely in single-exon and multi-exon circRNAs
ciRNA April 30, 2013 41 / 47
Computational pipeline Global analysis Characterization of 50
Bioinformatic analysis 2: circRNA exons are large
the upstream and downstream introns flanking circularized exons tended to
be large: on average more than approximately threefold longer than introns
flanking control exons
circularized exons were larger than expected: when restricted to an analysis of
single exon ecircRNAs, we noted that circularized exons were approximately
threefold longer than expressed exons overall, at an average length of 690 nt
ciRNA April 30, 2013 42 / 47
Computational pipeline Global analysis Characterization of 50
How are circRNAs formed?
ciRNA April 30, 2013 43 / 47
Computational pipeline Global analysis Characterization of 50
Overlap of identified circRNAs with published circular
RNAs.
exons of DCC, ETS1 and a non-coding RNA from the human INK4/ARF
locus and the CDR1as locus
circRNAs from exons of the genes CAMSAP1, FBXW4, MAN1A2, REXO4,
RNF220 and ZKSCAN1 have been recently experimentally validated10. For
the four genes from this study, where we had ribominus data from the tissues
in which these circRNAs were predicted (leukocytes), we recovered validated
circRNAs from all of them (ZKSCAN1, CAMSAP1, FBXW4, MAN1A2).
ciRNA April 30, 2013 44 / 47
Computational pipeline Global analysis Characterization of 50
Known before
DCC
scrambled transcripts were estimated to comprise less than one
one-thousandth of transcripts
Nigro, J. M., Cho, K. R., Fearon, E. R., Kern, S. E., Ruppert, J. M., Oliner, J.
D., et al. (1991). Scrambled exons. Cell, 64(3), 607613.
MLL
Caldas, C., So, C. W., MacGregor, A., Ford, A. M., McDonald, B., Chan, L.
C., Wiedemann, L. M. (1998). Exon scrambling of MLL transcripts occur
commonly and mimic partial genomic duplication of the gene. Gene, 208(2),
167176.
ETS-1
Cocquerelle, C., Mascrez, B., Hetuin, D., Bailleul, B. (1993). Mis-splicing
yields circular RNA molecules. FASEB Journal, 7(1), 155160.
ciRNA April 30, 2013 45 / 47
Computational pipeline Global analysis Characterization of 50
Best studied: mouse SRY
consists of a single exon.
During development, the RNA exists as a linear transcript that is translated
into protein.
In the adult testes, the RNA exists primarily as a circular product that is
predominantly localized to the cytoplasm and is apparently not translated
inverted repeats in the genomic sequence flanking the SRY exon direct
transcript circularization
ciRNA April 30, 2013 46 / 47

More Related Content

What's hot

Johanna_Edlund-Thesis-final
Johanna_Edlund-Thesis-finalJohanna_Edlund-Thesis-final
Johanna_Edlund-Thesis-final
Johanna Edlund
 
Fusion Gene Detection and Gene Expression Analysis of Circulating RNA in Plas...
Fusion Gene Detection and Gene Expression Analysis of Circulating RNA in Plas...Fusion Gene Detection and Gene Expression Analysis of Circulating RNA in Plas...
Fusion Gene Detection and Gene Expression Analysis of Circulating RNA in Plas...
Thermo Fisher Scientific
 
Advances in Breast Tumor Biomarker Discovery Methods
Advances in Breast Tumor Biomarker Discovery MethodsAdvances in Breast Tumor Biomarker Discovery Methods
Advances in Breast Tumor Biomarker Discovery Methods
Thermo Fisher Scientific
 
Clinical Validation of an NGS-based (CE-IVD) Kit for Targeted Detection of Ge...
Clinical Validation of an NGS-based (CE-IVD) Kit for Targeted Detection of Ge...Clinical Validation of an NGS-based (CE-IVD) Kit for Targeted Detection of Ge...
Clinical Validation of an NGS-based (CE-IVD) Kit for Targeted Detection of Ge...
Thermo Fisher Scientific
 
New methods for high-throughput nucleic sequencing and diagnostics using a th...
New methods for high-throughput nucleic sequencing and diagnostics using a th...New methods for high-throughput nucleic sequencing and diagnostics using a th...
New methods for high-throughput nucleic sequencing and diagnostics using a th...
Douglas Wu
 
single-cell-sequencing-research-review
single-cell-sequencing-research-reviewsingle-cell-sequencing-research-review
single-cell-sequencing-research-review
Swati Kadam Ph.D.
 

What's hot (20)

Johanna_Edlund-Thesis-final
Johanna_Edlund-Thesis-finalJohanna_Edlund-Thesis-final
Johanna_Edlund-Thesis-final
 
Fusion Gene Detection and Gene Expression Analysis of Circulating RNA in Plas...
Fusion Gene Detection and Gene Expression Analysis of Circulating RNA in Plas...Fusion Gene Detection and Gene Expression Analysis of Circulating RNA in Plas...
Fusion Gene Detection and Gene Expression Analysis of Circulating RNA in Plas...
 
Journal club slides to discuss "Differential analysis of gene regulation at t...
Journal club slides to discuss "Differential analysis of gene regulation at t...Journal club slides to discuss "Differential analysis of gene regulation at t...
Journal club slides to discuss "Differential analysis of gene regulation at t...
 
2014 Zorniak JNS
2014 Zorniak JNS2014 Zorniak JNS
2014 Zorniak JNS
 
Gene Editing for everyone
Gene Editing for everyoneGene Editing for everyone
Gene Editing for everyone
 
Advances in Breast Tumor Biomarker Discovery Methods
Advances in Breast Tumor Biomarker Discovery MethodsAdvances in Breast Tumor Biomarker Discovery Methods
Advances in Breast Tumor Biomarker Discovery Methods
 
Visualizing Human Stem Cell Dynamics by Multicolor, Multiday High-Content Mic...
Visualizing Human Stem Cell Dynamics by Multicolor, Multiday High-Content Mic...Visualizing Human Stem Cell Dynamics by Multicolor, Multiday High-Content Mic...
Visualizing Human Stem Cell Dynamics by Multicolor, Multiday High-Content Mic...
 
20160219 - S. De Toffol - Dal Sanger al NGS nello studio delle mutazioni BRCA
20160219 - S. De Toffol -  Dal Sanger al NGS nello studio delle mutazioni BRCA �20160219 - S. De Toffol -  Dal Sanger al NGS nello studio delle mutazioni BRCA �
20160219 - S. De Toffol - Dal Sanger al NGS nello studio delle mutazioni BRCA
 
Examining gene expression and methylation with next gen sequencing
Examining gene expression and methylation with next gen sequencingExamining gene expression and methylation with next gen sequencing
Examining gene expression and methylation with next gen sequencing
 
Clinical Validation of an NGS-based (CE-IVD) Kit for Targeted Detection of Ge...
Clinical Validation of an NGS-based (CE-IVD) Kit for Targeted Detection of Ge...Clinical Validation of an NGS-based (CE-IVD) Kit for Targeted Detection of Ge...
Clinical Validation of an NGS-based (CE-IVD) Kit for Targeted Detection of Ge...
 
sequencing-methods-review
sequencing-methods-reviewsequencing-methods-review
sequencing-methods-review
 
A decade into Next Generation Sequencing on marine non-model organisms: curre...
A decade into Next Generation Sequencing on marine non-model organisms: curre...A decade into Next Generation Sequencing on marine non-model organisms: curre...
A decade into Next Generation Sequencing on marine non-model organisms: curre...
 
New methods for high-throughput nucleic sequencing and diagnostics using a th...
New methods for high-throughput nucleic sequencing and diagnostics using a th...New methods for high-throughput nucleic sequencing and diagnostics using a th...
New methods for high-throughput nucleic sequencing and diagnostics using a th...
 
New Generation Sequencing Technologies: an overview
New Generation Sequencing Technologies: an overviewNew Generation Sequencing Technologies: an overview
New Generation Sequencing Technologies: an overview
 
Next Generation Sequencing & Transcriptome Analysis
Next Generation Sequencing & Transcriptome AnalysisNext Generation Sequencing & Transcriptome Analysis
Next Generation Sequencing & Transcriptome Analysis
 
JClinChem_2003
JClinChem_2003JClinChem_2003
JClinChem_2003
 
Speeding up sequencing: Sequencing in an hour enables sample to answer in a w...
Speeding up sequencing: Sequencing in an hour enables sample to answer in a w...Speeding up sequencing: Sequencing in an hour enables sample to answer in a w...
Speeding up sequencing: Sequencing in an hour enables sample to answer in a w...
 
Genome Sequencing in Finger Millet
Genome Sequencing in Finger MilletGenome Sequencing in Finger Millet
Genome Sequencing in Finger Millet
 
NGS and the molecular basis of disease: a practical view
NGS and the molecular basis of disease: a practical viewNGS and the molecular basis of disease: a practical view
NGS and the molecular basis of disease: a practical view
 
single-cell-sequencing-research-review
single-cell-sequencing-research-reviewsingle-cell-sequencing-research-review
single-cell-sequencing-research-review
 

Similar to Comparing the early ciRNA papers

Next generation sequencing
Next generation sequencingNext generation sequencing
Next generation sequencing
Dayananda Salam
 
Unilag workshop complex genome analysis
Unilag workshop   complex genome analysisUnilag workshop   complex genome analysis
Unilag workshop complex genome analysis
Dr. Olusoji Adewumi
 
Nextgenerationsequencing 120202015950-phpapp02
Nextgenerationsequencing 120202015950-phpapp02Nextgenerationsequencing 120202015950-phpapp02
Nextgenerationsequencing 120202015950-phpapp02
t7260678
 
Lecture bioinformatics Part2.next generation
Lecture bioinformatics Part2.next generationLecture bioinformatics Part2.next generation
Lecture bioinformatics Part2.next generation
MohamedHasan816582
 
Whole Transcriptome Analysis of Testicular Germ Cell Tumors
Whole Transcriptome Analysis of Testicular Germ Cell TumorsWhole Transcriptome Analysis of Testicular Germ Cell Tumors
Whole Transcriptome Analysis of Testicular Germ Cell Tumors
Thermo Fisher Scientific
 
16S Ribosomal DNA Sequence Analysis
16S Ribosomal DNA Sequence Analysis16S Ribosomal DNA Sequence Analysis
16S Ribosomal DNA Sequence Analysis
Abdulrahman Muhammad
 
The Vertebrate Genome Annotation Database
The Vertebrate Genome Annotation DatabaseThe Vertebrate Genome Annotation Database
The Vertebrate Genome Annotation Database
EBI
 

Similar to Comparing the early ciRNA papers (20)

RNA sequencing: advances and opportunities
RNA sequencing: advances and opportunities RNA sequencing: advances and opportunities
RNA sequencing: advances and opportunities
 
Next generation sequencing
Next generation sequencingNext generation sequencing
Next generation sequencing
 
Ashg poster sp_compressed
Ashg poster sp_compressedAshg poster sp_compressed
Ashg poster sp_compressed
 
cvw250
cvw250cvw250
cvw250
 
Unilag workshop complex genome analysis
Unilag workshop   complex genome analysisUnilag workshop   complex genome analysis
Unilag workshop complex genome analysis
 
Nextgenerationsequencing 120202015950-phpapp02
Nextgenerationsequencing 120202015950-phpapp02Nextgenerationsequencing 120202015950-phpapp02
Nextgenerationsequencing 120202015950-phpapp02
 
Lecture bioinformatics Part2.next generation
Lecture bioinformatics Part2.next generationLecture bioinformatics Part2.next generation
Lecture bioinformatics Part2.next generation
 
Processing Raw scRNA-Seq Sequencing Data
Processing Raw scRNA-Seq Sequencing DataProcessing Raw scRNA-Seq Sequencing Data
Processing Raw scRNA-Seq Sequencing Data
 
Whole Transcriptome Analysis of Testicular Germ Cell Tumors
Whole Transcriptome Analysis of Testicular Germ Cell TumorsWhole Transcriptome Analysis of Testicular Germ Cell Tumors
Whole Transcriptome Analysis of Testicular Germ Cell Tumors
 
16S Ribosomal DNA Sequence Analysis
16S Ribosomal DNA Sequence Analysis16S Ribosomal DNA Sequence Analysis
16S Ribosomal DNA Sequence Analysis
 
High throughput sequencing with Thermostable Group II Intron Reverse Transcri...
High throughput sequencing with Thermostable Group II Intron Reverse Transcri...High throughput sequencing with Thermostable Group II Intron Reverse Transcri...
High throughput sequencing with Thermostable Group II Intron Reverse Transcri...
 
Rna seq and chip seq
Rna seq and chip seqRna seq and chip seq
Rna seq and chip seq
 
Dna microarray mehran- u of toronto
Dna microarray  mehran- u of torontoDna microarray  mehran- u of toronto
Dna microarray mehran- u of toronto
 
The Vertebrate Genome Annotation Database
The Vertebrate Genome Annotation DatabaseThe Vertebrate Genome Annotation Database
The Vertebrate Genome Annotation Database
 
Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured
Microbial Phylogenomics (EVE161) Class 17: Genomes from UnculturedMicrobial Phylogenomics (EVE161) Class 17: Genomes from Uncultured
Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured
 
High-throughput RNA sequencing with Thermostable Group II Intron Reverse Tran...
High-throughput RNA sequencing with Thermostable Group II Intron Reverse Tran...High-throughput RNA sequencing with Thermostable Group II Intron Reverse Tran...
High-throughput RNA sequencing with Thermostable Group II Intron Reverse Tran...
 
Microarray biotechnologg ppy dna microarrays
Microarray biotechnologg ppy dna microarraysMicroarray biotechnologg ppy dna microarrays
Microarray biotechnologg ppy dna microarrays
 
MASSIVELY PARELLEL SIGNATURE SEQUENCING
MASSIVELY PARELLEL SIGNATURE SEQUENCINGMASSIVELY PARELLEL SIGNATURE SEQUENCING
MASSIVELY PARELLEL SIGNATURE SEQUENCING
 
The Infobiotics Contact Map predictor at CASP9
The Infobiotics Contact Map predictor at CASP9The Infobiotics Contact Map predictor at CASP9
The Infobiotics Contact Map predictor at CASP9
 
Grindberg - PNAS
Grindberg - PNASGrindberg - PNAS
Grindberg - PNAS
 

More from Darya Vanichkina

More from Darya Vanichkina (6)

Sharing & Reusing Training Materials
Sharing & Reusing Training MaterialsSharing & Reusing Training Materials
Sharing & Reusing Training Materials
 
Jumping into digital: Lessons learned while moving live coding machine learni...
Jumping into digital: Lessons learned while moving live coding machine learni...Jumping into digital: Lessons learned while moving live coding machine learni...
Jumping into digital: Lessons learned while moving live coding machine learni...
 
ANDS_TrainingTheTrainer
ANDS_TrainingTheTrainerANDS_TrainingTheTrainer
ANDS_TrainingTheTrainer
 
Grammar of Graphics - Darya Vanichkina
Grammar of Graphics - Darya VanichkinaGrammar of Graphics - Darya Vanichkina
Grammar of Graphics - Darya Vanichkina
 
Big data biology for pythonistas: getting in on the genomics revolution
Big data biology for pythonistas: getting in on the genomics revolutionBig data biology for pythonistas: getting in on the genomics revolution
Big data biology for pythonistas: getting in on the genomics revolution
 
Activity-dependent transcriptional dynamics in mouse primary cortical and hum...
Activity-dependent transcriptional dynamics in mouse primary cortical and hum...Activity-dependent transcriptional dynamics in mouse primary cortical and hum...
Activity-dependent transcriptional dynamics in mouse primary cortical and hum...
 

Recently uploaded

Call Girls Aurangabad Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Aurangabad Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Aurangabad Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Aurangabad Just Call 8250077686 Top Class Call Girl Service Available
Dipal Arora
 
Call Girls in Gagan Vihar (delhi) call me [🔝 9953056974 🔝] escort service 24X7
Call Girls in Gagan Vihar (delhi) call me [🔝  9953056974 🔝] escort service 24X7Call Girls in Gagan Vihar (delhi) call me [🔝  9953056974 🔝] escort service 24X7
Call Girls in Gagan Vihar (delhi) call me [🔝 9953056974 🔝] escort service 24X7
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 

Recently uploaded (20)

Call Girls Service Jaipur {9521753030} ❤️VVIP RIDDHI Call Girl in Jaipur Raja...
Call Girls Service Jaipur {9521753030} ❤️VVIP RIDDHI Call Girl in Jaipur Raja...Call Girls Service Jaipur {9521753030} ❤️VVIP RIDDHI Call Girl in Jaipur Raja...
Call Girls Service Jaipur {9521753030} ❤️VVIP RIDDHI Call Girl in Jaipur Raja...
 
Call Girls Aurangabad Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Aurangabad Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Aurangabad Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Aurangabad Just Call 8250077686 Top Class Call Girl Service Available
 
Call Girls Ooty Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Ooty Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Ooty Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Ooty Just Call 8250077686 Top Class Call Girl Service Available
 
Call Girls Siliguri Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Siliguri Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Siliguri Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Siliguri Just Call 8250077686 Top Class Call Girl Service Available
 
Best Rate (Patna ) Call Girls Patna ⟟ 8617370543 ⟟ High Class Call Girl In 5 ...
Best Rate (Patna ) Call Girls Patna ⟟ 8617370543 ⟟ High Class Call Girl In 5 ...Best Rate (Patna ) Call Girls Patna ⟟ 8617370543 ⟟ High Class Call Girl In 5 ...
Best Rate (Patna ) Call Girls Patna ⟟ 8617370543 ⟟ High Class Call Girl In 5 ...
 
Call Girls in Gagan Vihar (delhi) call me [🔝 9953056974 🔝] escort service 24X7
Call Girls in Gagan Vihar (delhi) call me [🔝  9953056974 🔝] escort service 24X7Call Girls in Gagan Vihar (delhi) call me [🔝  9953056974 🔝] escort service 24X7
Call Girls in Gagan Vihar (delhi) call me [🔝 9953056974 🔝] escort service 24X7
 
Call Girls Coimbatore Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Coimbatore Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Coimbatore Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Coimbatore Just Call 9907093804 Top Class Call Girl Service Available
 
Call Girls Jabalpur Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Jabalpur Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Jabalpur Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Jabalpur Just Call 8250077686 Top Class Call Girl Service Available
 
Top Quality Call Girl Service Kalyanpur 6378878445 Available Call Girls Any Time
Top Quality Call Girl Service Kalyanpur 6378878445 Available Call Girls Any TimeTop Quality Call Girl Service Kalyanpur 6378878445 Available Call Girls Any Time
Top Quality Call Girl Service Kalyanpur 6378878445 Available Call Girls Any Time
 
Russian Call Girls Service Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...
Russian Call Girls Service  Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...Russian Call Girls Service  Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...
Russian Call Girls Service Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...
 
Call Girls Tirupati Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Tirupati Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Tirupati Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Tirupati Just Call 8250077686 Top Class Call Girl Service Available
 
Call Girls in Delhi Triveni Complex Escort Service(🔝))/WhatsApp 97111⇛47426
Call Girls in Delhi Triveni Complex Escort Service(🔝))/WhatsApp 97111⇛47426Call Girls in Delhi Triveni Complex Escort Service(🔝))/WhatsApp 97111⇛47426
Call Girls in Delhi Triveni Complex Escort Service(🔝))/WhatsApp 97111⇛47426
 
Call Girls Faridabad Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Faridabad Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Faridabad Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Faridabad Just Call 9907093804 Top Class Call Girl Service Available
 
Call Girls Nagpur Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Nagpur Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Nagpur Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Nagpur Just Call 9907093804 Top Class Call Girl Service Available
 
Call Girls Guntur Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Guntur  Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Guntur  Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Guntur Just Call 8250077686 Top Class Call Girl Service Available
 
All Time Service Available Call Girls Marine Drive 📳 9820252231 For 18+ VIP C...
All Time Service Available Call Girls Marine Drive 📳 9820252231 For 18+ VIP C...All Time Service Available Call Girls Marine Drive 📳 9820252231 For 18+ VIP C...
All Time Service Available Call Girls Marine Drive 📳 9820252231 For 18+ VIP C...
 
Best Rate (Hyderabad) Call Girls Jahanuma ⟟ 8250192130 ⟟ High Class Call Girl...
Best Rate (Hyderabad) Call Girls Jahanuma ⟟ 8250192130 ⟟ High Class Call Girl...Best Rate (Hyderabad) Call Girls Jahanuma ⟟ 8250192130 ⟟ High Class Call Girl...
Best Rate (Hyderabad) Call Girls Jahanuma ⟟ 8250192130 ⟟ High Class Call Girl...
 
Premium Bangalore Call Girls Jigani Dail 6378878445 Escort Service For Hot Ma...
Premium Bangalore Call Girls Jigani Dail 6378878445 Escort Service For Hot Ma...Premium Bangalore Call Girls Jigani Dail 6378878445 Escort Service For Hot Ma...
Premium Bangalore Call Girls Jigani Dail 6378878445 Escort Service For Hot Ma...
 
Call Girls Cuttack Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Cuttack Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Cuttack Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Cuttack Just Call 9907093804 Top Class Call Girl Service Available
 
VIP Hyderabad Call Girls Bahadurpally 7877925207 ₹5000 To 25K With AC Room 💚😋
VIP Hyderabad Call Girls Bahadurpally 7877925207 ₹5000 To 25K With AC Room 💚😋VIP Hyderabad Call Girls Bahadurpally 7877925207 ₹5000 To 25K With AC Room 💚😋
VIP Hyderabad Call Girls Bahadurpally 7877925207 ₹5000 To 25K With AC Room 💚😋
 

Comparing the early ciRNA papers

  • 1. Computational pipeline Global analysis Characterization of 50 Circular RNAs are a large class of animal RNAs with regulatory potency S Memczak∗ , M Jens∗ , A Elefsinioti∗ , F Torti∗ , J Krueger, A Rybak, L Maier, S D Mackowiak, L H Gregersen, M Munschauer, A Loewer, U Ziebold, M Landthaler, C Kocks, F le Noble, and N Rajewsky April 30, 2013 ciRNA April 30, 2013 1 / 47
  • 2. Computational pipeline Global analysis Characterization of 50 Basic principles used 2x76bp run data generated after ribosomal RNA depletion and random priming screened RNA sequencing reads for splice junctions formed by an acceptor splice site at the 5’ end of an exon and a donor site at a downstream 3’ end (head-to-tail) ciRNA April 30, 2013 2 / 47
  • 3. Computational pipeline Global analysis Characterization of 50 More details 1 filtered out reads that aligned (with bowtie2) contiguously and full-length to the genome 2 From unmapped reads (including normal spliced reads) extracted 20mers from both ends and aligned them independently to find unique anchor positions within spliced exons. Anchors that aligned in the reversed orientation (head-to-tail) indicated circRNA splicing (normal splicing was also found by consecutive anchors). 3 Extended the anchor alignments such that the complete read aligns and the breakpoints were flanked by GU/AG splice sites. Ambiguous breakpoints were discarded. 4 The resulting alignments were read by another custom script that jointly evaluates consecutive anchor alignments belonging to the same original read, performs extensions of the anchor alignments, and collects statistics on splice sites. After the run completes, the script outputs all detected splice junctions (linear and circular) in a UCSC BED-like format with extra columns holding quality statistics, read counts etc. ciRNA April 30, 2013 3 / 47
  • 4. Computational pipeline Global analysis Characterization of 50 Filtering demands GU/AG flanking the splice sites (built in); unambiguous breakpoint detection a maximum of two mismatches in the extension procedure the breakpoint cannot reside more than 2 nucleotides inside an anchor at least two independent reads (each distinct sequence only counted once per sample) support the junction unique anchor alignments with a safety margin to the next-best alignment of at least one anchor above 35 points (approximately equivalent to more than two extra mismatches in high-quality bases) a genomic distance between the two splice sites of no more than 100 kb (only a small percentage of the data). As the ribosomal DNA cluster is part of the C. elegans genome assembly and ribosomal pre-RNAs could give rise to circular RNAs by mechanisms independent of the spliceosome, we discarded 130 candidates that mapped to the rDNA cluster on chrI:15,060,286-15,071,020. ciRNA April 30, 2013 4 / 47
  • 5. Computational pipeline Global analysis Characterization of 50 Permutation testing reversed either anchor reversed the complete read randomly reassigned anchors between reads reverse complemented the read (as a positive control). the reverse complement recovered the same output as expected, the various permutations led to only very few candidate predictions, well below 0.2% of the output with unpermuted reads and in excel-lent agreement with the results from simulated reads sensitivity (.75%) and FDR (0.2%) using simulated reads and permutations of real sequencing data the efficiency of ribominus protocols to extract and sequence circRNAs is limited, reducing overall sensitivity ciRNA April 30, 2013 5 / 47
  • 6. Computational pipeline Global analysis Characterization of 50 Data generated generated ribominus data for HEK293 cells and combined with human leukocyte data detected 1,950 circRNAs with support from at least two independent junction-spanning reads ciRNA April 30, 2013 6 / 47
  • 7. Computational pipeline Global analysis Characterization of 50 Jeck 2012: cANRIL, or how they got interested Our group discovered a circular RNA species, circular ANRIL, whose expression is associated with that of products of the human INK4a/ARF locus and is correlated with the risk of human atherosclerosis Production of cANRIL in humans is associated with common SNPs predicted to affect cANRIL splicing, suggesting the possibility that cANRIL production influences PcG-mediated repression of the INK4a/ARF locus to influence atherosclerosis risk ciRNA April 30, 2013 7 / 47
  • 8. Computational pipeline Global analysis Characterization of 50 CircleSeq: The RNAse R approach - Jeck 2012 ciRNA April 30, 2013 8 / 47
  • 9. Computational pipeline Global analysis Characterization of 50 CircleSeq: The RNase R approach - Jeck 2012 Method was optimized to allow for >10-fold enrichment of cANRIL in cDNA prepared from RNase R-treated vs. untreated samples Used approx. 300 million 100-bp reads per sample aligned to the human genome using a de novo splice mapping algorithm, MapSplice compiled the list of all fusion splice junctions where splice donor and acceptor occur within 2 Mb but in the non-colinear ordering; term these junctions backsplices. New metric - SRPBM - spliced reads per billion mapping Counts of reads mapping across an identified backsplice in untreated samples, normalized by read length and number of reads mapping,– to permit quantitative comparisons between backsplices. ciRNA April 30, 2013 9 / 47
  • 10. Computational pipeline Global analysis Characterization of 50 Using RNase R works CircleSeq enriches for backsplice junctions. Two-dimensional histograms showing normalized backspliced read count (SRPBM) or normalized exon coverage (RPKM) between two samples or replicates. (A) Coverage of backsplice reads in RNase R-treated replicates over all distinct backsplice species (R2 = 0.579). (B) Coverage of exons in mock treated replicates (R2 = 0.91). (C) Average backsplice coverage in RNase R-treated against mock treated RNA-seq showing enrichment of most backsplice species by RNase R. (D) Mean normalized exon coverage in annotated exon sequences in RNase R-treated against mock treated RNA-seq showing depletion of the majority of species by RNase R. ciRNA April 30, 2013 10 / 47
  • 11. Computational pipeline Global analysis Characterization of 50 And generates some interesting results... identified >100,000 unique backsplice events throughout the genome. Of these, 25,166 were present in both RNase R-treated biological replicates and were enriched by RNase R treatment as compared with mock treatment. 31% of backsplice species observed in untreated controls were not enriched by RNase R (Fig. 2C). These species likely represent mapping artifacts or nonsequential exons harbored in linear products, resulting from either RNA trans-splicing or cleavage of ecircRNAs. See “positive controls” cANRIL, cETS-1; not able to detect other known circular species (e.g., SRY and DCC), due to low expression of these genes in Hs68 cells the 25,166 replicated backsplice events detected by CircleSeq included the large majority (1025 of 1319) of putative circles identified through a previously described bioinformatic approach (Salzman 2012) - in different cell types!!! [T-cells vs fibroblasts] Of the 25,166 unique backsplicing events that reproducibly enriched by RNase R digestion in both biological replicates, most were only found in RNase R-treated samples and were not observed in the absence of exonuclease digestion. many of these events represent rare ecircRNAs arising from pervasive background levels of RNA circularization, which may result from an occasional error in splicing (Hsu and Hertel 2009). ciRNA April 30, 2013 11 / 47
  • 12. Computational pipeline Global analysis Characterization of 50 Filtering on confidence LOW stringency set requiring a single backsplice read in the control data and HIGH stringency requiring coverage on a par with splices in a moderately expressed gene ciRNA April 30, 2013 12 / 47
  • 13. Computational pipeline Global analysis Characterization of 50 Data generated - Memzack 2013 generated ribominus data for HEK293 cells and combined with human leukocyte data detected 1,950 circRNAs with support from at least two independent junction-spanning reads ciRNA April 30, 2013 13 / 47
  • 14. Computational pipeline Global analysis Characterization of 50 Expression of genes predicted to give rise to circRNAs was slightly shifted towards higher expression values “indicating that circRNAs are not just rare mistakes of the spliceosome” Histograms of gene expression levels obtained from polyA+ RNA sequencing in HEK293 cells. The number of reads per kilobase of exon per million mapped reads (RPKM) reflects mRNA abundance. Genes that are predicted to give rise to circRNAs (red circles) are not specifically enriched for high expression, (solid line: all genes). circRNAs from lowly expressed genes are detected less frequently, comparable to the loss of sensitivity observed for linear splicing (black dashed line: genes with > 75% of annotated splice sites recovered, gray dashed line: >50%, light gray:>10%) ciRNA April 30, 2013 14 / 47
  • 15. Computational pipeline Global analysis Characterization of 50 Mouse and nematode identified 1,903 circRNAs in mouse (brains, fetal head, differentiation-induced embryonic stem cells; 81 of these mapped to human circRNAs mapped mouse circRNAs were compared with independently identified human circRNAs using liftOver, yielding 229 circRNAs with precisely orthologous splice sites between human and mouse. Of these, 223 were composed exclusively of coding exons and were subsequently used for our conservation analysis (Fig. 1f). When intersecting the reported sets of circRNAs supported by two independent reads in each species, we found 81 conserved circRNAs (supported by at least 4 reads in total). used sequencing data from various C. elegans developmental stages (Stoeckius, M. manuscript in prep) and detected 724 circRNAs, with at least two independent reads. Numerous circRNAs seem to be specifically expressed in a cell type or developmental stage (Fig. 1b,c, S1e) hsa-circRNA 2149 is supported by 13 unique, head-to-tail spanning reads in CD191 leukocytes but is not detected in CD341 leukocytes, neutrophils or HEK293 cells a number of nematode circRNAs seem to be expressed in oocytes but absent in 1- or 2-cell embryos ciRNA April 30, 2013 15 / 47
  • 16. Computational pipeline Global analysis Characterization of 50 Where do these circRNAs come from? Intersection of circRNAs with known transcripts computational screen identifies only the splice sites that lead to circularization but not the internal exon/ intron structure of circular RNAs => inferred as much as possible from annotated transcripts conservative assumption was that as little as possible should be spliced out coincidence of circRNA splice sites with exonic boundaries inside a transcript were considered as an indicator for relevant agreement and internal introns appear to be spliced out ciRNA April 30, 2013 16 / 47
  • 17. Computational pipeline Global analysis Characterization of 50 Where do these circRNAs come from? annotated human circRNAs using the RefSeq database and a catalogue of non-coding RNAs 85% of human circRNAs align sense to known genes 10% of all circRNAs align antisense to known transcripts, smaller fractions align to UTRs, introns, unannotated regions of the genome. ciRNA April 30, 2013 17 / 47
  • 18. Computational pipeline Global analysis Characterization of 50 Where do these circRNAs come from? Sorted all overlapping transcripts hierarchically by splice-site coincidence (2, 1, or 0) total amount of exonic sequence between the splice sites total amount of coding sequence Latter was used to break ties only and helped the annotation process. if one or both splice sites fell into an exon of the best matching transcript, the corresponding exon boundary was trimmed. if it fell into an intron or beyond transcript bounds, the closest exon was extended to match the circRNA boundaries. circRNA start/end coordinates were never altered. If no annotated exons overlapped the circRNA we assumed a single-exon circRNA. The resulting annotation of circRNAs is based on the best matching transcript and may in some cases not represent the ideal choice. Changing the annotation rules, however, did not substantially change the numbers in Fig. 1d ciRNA April 30, 2013 18 / 47
  • 19. Computational pipeline Global analysis Characterization of 50 How many exons form part of circRNA? Their splice sites typically span one to five exons and overlap coding exons (84%), but only in 65% of these cases are both splice sites that participate in the circularization known splice sites Jeck: introns are spliced from most circular forms Jeck: observed many single exon ecircRNAs, that is, where the donor site splices to the acceptor of the same exon (e.g., KIAA0182) ciRNA April 30, 2013 19 / 47
  • 20. Computational pipeline Global analysis Characterization of 50 Jeck : ratio of forward to back-spliced products compared the abundance of backsplice-spanning reads to reads spanning traditionally spliced junctions. The relative rate of backsplicing over forward splicing for these sites varied enormously, from <0.1% to >3200%, with no forward splicing products observed in some cases using the LOW stringency list, 14.4% of genes expressed in human fibroblasts produced circular RNA species, suggesting at least one in eight human genes produces abundant circular as well as linear transcripts. ciRNA April 30, 2013 20 / 47
  • 21. Computational pipeline Global analysis Characterization of 50 Jeck validation: circRNA-generating transcripts are non-polyadenylated oligo-dT priming for reverse transcription significantly reduced the levels of backspliced products relative to poly(A)- containing transcripts To exclude the possibility that these transcripts were the results of trans-spliced products, used a virtual Northern approach, which identifies the size of transcripts by fractionating on a denatur-ing agarose gel, followed by qPCR. Backsplice-containing transcripts appeared in faster migrating fractions as compared to the associated full-length linear transcripts, as would be predicted of circular RNA species composed of only a subset of exons. Trans-spliced products, in contrast, would be expected to be longer than full-length, as they contain repeated exons. ciRNA April 30, 2013 21 / 47
  • 22. Computational pipeline Global analysis Characterization of 50 Examples The AFF1 intron is spliced out (Supplementary Fig. 2e). Sequence conservation: placental mammals phyloP score, scale bar, 200nucleotides. ciRNA April 30, 2013 22 / 47
  • 23. Computational pipeline Global analysis Characterization of 50 Jeck examples ciRNA April 30, 2013 23 / 47
  • 24. Computational pipeline Global analysis Characterization of 50 Conservation of intergenic and intronic circRNAs Intergenic and a few intronic circRNAs display a mild but significant enrichment of conserved nucleotides ciRNA April 30, 2013 24 / 47
  • 25. Computational pipeline Global analysis Characterization of 50 Conservation of intergenic and intronic: approach downloaded genome-wide human (hg19) phyloP conservation score58 tracks derived from genome alignments of placental mammals from UCSC read out the conservation scores along the complete circRNA and searched for blocks of at least 6-nucleotide length that exceeded a conservation score of 0.3 for intergenic and 0.5 for intronic circRNAs. The different cutoffs empirically adjust for the different background levels of conservation and were also used on the respective controls. For each circRNA, we computed the cumulative length of all such blocks and normalized it by the genomic length of the circRNA. Artefacts of constant positive conservation scores in the phyloP profile, apparently caused by missing alignment data, were removed with an entropy filter (this did not qualitatively affect the results). circRNAs annotated as intronic by the best-match procedure explained above that had any overlap with exons in alternative transcripts on either strand (5 cases) were removed from the analysis. phastCons score takes neighboring bases into account, estimating the probability that each nucleotide belongs to a conserved element. phyloP score is a separate measurement of conservation at each base, ignoring neighboring bases in its calculation. It can measure acceleration (faster evolution than expected under neutral drift) as well as conservation (slower than expected evolution). PhyloP is useful for evaluating signatures of selection at particular nucleotides (e.g. third codon positions, or first positions of miRNA target sites). In the phyloP plots, sites predicted to be conserved are assigned positive scores, while sites predicted to be fast-evolving are assigned negative scores. The absolute values of the scores represent -log p-values under a null hypothesis of neutral evolution (|phyloP| = -log(p-value) under a null hypothesis of neutral evolution) ciRNA April 30, 2013 25 / 47
  • 26. Computational pipeline Global analysis Characterization of 50 circRNAs from coding loci conservation - approach To analyse circRNAs composed of coding sequence and thus high overall conservation, we selected 223 human circRNAs with circular orthologues in mouse and entirely composed of coding sequence. Control (linear) exons were randomly selected to match the level of conservation observed in first and second codon positions (Methods, Fig. 1f inset and Supplementary Fig. 1k for conservation of the remaining coding sequence (CDS)) ciRNA April 30, 2013 26 / 47
  • 27. Computational pipeline Global analysis Characterization of 50 circRNAs from coding loci conservation - results circRNAs with conserved circularization were significantly more conserved in the third codon position than controls, indicating evolutionary constraints at the nucleotide level, in addition to selection at the protein level (Fig. 1f and Supplementary Fig. 1j, k) Coding sequence phyloP conservation score distributions of first and second codon positions match between circRNAs and controls, in contrast the 3rd codon position is significantly more conserved in circRNAs (P ¡ 3e-10 n=223 Mann-Whitney-U (mwu)(also main Fig1. f). k, The conservation score distributions in the remaining parts of the CDS (outside the circRNA or control) do not differ significantly for codon positions two and three. For the first codon position, the controls are actually more conserved, P ¡ 2e-3 n=223 Mann- Whitney-U (mwu)), therefore conservative ciRNA April 30, 2013 27 / 47
  • 28. Computational pipeline Global analysis Characterization of 50 How they did coding conservation - the gritty details used the best-match strategy outlined above to construct an estimated exon- chain for the circRNAs that overlapped exclusively coding sequence. Using this chain we in silico spliced out the corresponding blocks of the conservation score profile. kept track of the frame and sorted the conservation scores into separate bins for each codon position. also recorded conservation scores in the remaining pieces of coding sequence (outside the circRNA) as a control. However, we observed that the level of conservation is systematically different between internal parts of the coding sequence and the amino- or carboxy-terminal parts (not shown). We therefore randomly generated chains of internal exons, mimicking the exon-number distribution of real circRNAs, as a control. When analysing the circRNAs conserved between human and mouse, it became furthermore apparent that we also needed to adjust for the higher level of overall conservation. High expression generally correlates with conservation and thus, an expression cutoff was enforced on the transcripts used to generate random controls. This resulted in a good to conservative match with the actual circRNAs. ciRNA April 30, 2013 28 / 47
  • 29. Computational pipeline Global analysis Characterization of 50 Conservation - Jeck - HIPK2 and HIPK3 the paralogous kinases, HIPK2 and HIPK3, both produced abundant circRNA, and so do the mouse orthologues sufficiently diverged to allow unique mapping but retain a similar genomic structure: a large second exon that contains the start codon flanked by large introns on either side increased coverage of exon 2 was not seen in polyAplus libraries, consistent with the predominant exon 2 species being circular (Encode Data, not shown). Based on RPKM and qRT-PCR, the circular exon 2 transcript of HIPK3 was apparently 5fold more abundant than the linear form. Consistent with transcripts being ecircRNAs originating from the murine Hipk2/3 genes, the amplified fragments were of the expected size and sequence and were enriched by RNase R digestion ciRNA April 30, 2013 29 / 47
  • 30. Computational pipeline Global analysis Characterization of 50 Jeck - more global conservation: compare to murine testis As was the case for human cells, a high number (646 of 1477) of circles found through a prior bioinformatic analysis (Salzman et al. 2012) of murine brain were identified by CircleSeq of murine testis. Of 2121 human circles from the MEDIUM stringency list (human fibroblasts) that could be readily mapped to the murine genome, 457 mapped to genes that produced a murine circular RNA (in testis...). identified 69 murine circular species (including Hipk3) with exactly homologous start and stop points of RNA circularization ciRNA April 30, 2013 30 / 47
  • 31. Computational pipeline Global analysis Characterization of 50 Experimentally tested circRNA predictions in HEK293 cells Head-to-tail splicing assayed by RT-qPCR with divergent primers and Sanger sequencing. Predicted head-to-tail junctions of 19/23 could be validated. 5/7 candidates exclusively predicted in leukocytes could not be detected in HEK293 cells, validating cell-type-specific expression. ciRNA April 30, 2013 31 / 47
  • 32. Computational pipeline Global analysis Characterization of 50 circRNAs are insensitive(ish) to RNAse R Jeck 2012: Backsplices for CDR1-AS were abundant in the control samples (mean SRPBM of 198), but both the backsplice reads and nonsplicing reads within the gene were depleted by exonuclease digestion (mean SRPBM of 16). These observations are most consistent with linear trans-splicing products rather than circular RNAs or with the cleavage of this circular RNA, as has been reported Head-to-tail splicing could be produced by trans-splicing or genomic rearrangements. To rule out these possibilities and PCR artefacts, validated the insensitivity of human circRNA candidates to digestion with RNase R (degrades linear RNA molecules) by northern blotting with probes which span the head-to-tail junctions Quantified RNase R resistance for 21 candidates with confirmed head-to-tail splicing by qPCR. All of these were at least 10-fold more resistant than GAPDH ciRNA April 30, 2013 32 / 47
  • 33. Computational pipeline Global analysis Characterization of 50 circRNAs turn over more slowly than linear RNA 24 h after blocking transcription circRNAs were highly stable, exceeding the stability of the housekeeping gene GAPDH ciRNA April 30, 2013 33 / 47
  • 34. Computational pipeline Global analysis Characterization of 50 circRNAs turn over more slowly than linear RNA ciRNA April 30, 2013 34 / 47
  • 35. Computational pipeline Global analysis Characterization of 50 Validation in other organisms 3/3 tested mouse circRNAs with human orthologues in mouse brains C. elegans ciRNA April 30, 2013 35 / 47
  • 36. Computational pipeline Global analysis Characterization of 50 circRNAs are not translated Engineered circular RNAs have previously been shown to have coding potential (Chen and Sarnow 1995). Linear products were significantly enriched in the ribosome-bound fraction for the genes assayed: HIPK3, KIAA0182, and MYO9B. Circular products, in contrast, were abundant in the unbound fractions but not detected in the bound fractions, indicating that these AUG- containing ecircRNAs are not translated ciRNA April 30, 2013 36 / 47
  • 37. Computational pipeline Global analysis Characterization of 50 circRNAs can be targeted by RNAi transfected Hs68 cells with siRNA targeting HIPK3 and ZFY that produce both linear and circularized transcripts; 3 siRNAs per gene one targeting sequence only in the linear transcript targeting the backsplice sequence targeting sequence in a circularized exon shared by both linear and circular species It was even possible to design an siRNA to the backsplice junction of ZFY that specifically targeted the circular, but not linear, transcript. ciRNA April 30, 2013 37 / 47
  • 38. Computational pipeline Global analysis Characterization of 50 circRNAs are probably miRNA sponges screened for occurrences of conserved miRNA family seed matches (Methods). counting repetitions of conserved matches to the same miRNA family, circRNAs were significantly enriched compared to coding sequences (P<2.96x10−22, MannWhitney U-test, n=3873) or 3’ UTR sequences (P<2.76x10−21, MannWhitney U-test, n=3182) ciRNA April 30, 2013 38 / 47
  • 39. Computational pipeline Global analysis Characterization of 50 circRNAs are cytoplasmic ecircRNAs either undergo nuclear export or are released to the cytoplasm during mitosis, where they enjoy extraordinary stability, likely as a result of resistance to debranching enzymes and RNA exonucleases. ciRNA April 30, 2013 39 / 47
  • 40. Computational pipeline Global analysis Characterization of 50 CDR1as is localized in the cytoplasm CDR1as RNA is cytoplasmic and disperse (white spots; single-molecule RNA FISH; maximum intensity merges of Z-stacks). siSCR, positive; siRNA1, negative control. Blue, nuclei (DAPI); scale bar, 5mkm ciRNA April 30, 2013 40 / 47
  • 41. Computational pipeline Global analysis Characterization of 50 Bioinformatic analysis DAVID revealed an enrichment of protein kinases and related proteins among the set of genes producing ecircRNAs no specific subfamily of kinase was particularly associated with ecircRNA production. sought to identify cis-sequence elements proximal to backsplice events. Sequences in the 200 bp preceding or following backsplice sites were analyzed for enriched motifs compared to similar windows flanking noncircularized, expressed exons. the highest information- bearing motif was shared by both the upstream and downstream introns and was identified as the canonical ALU repeat the intronic flanks adjacent to circularized exons were approximately twofold more likely to contain an ALU repeat than noncircularized exons. Circularized exons were sixfold more likely to contain complementary ALUs than control, noncircularized exons. Pairs of ALU elements taken from introns flanking circularized exons were significantly more likely to be complementary (in in- verted orientation) than noncomplementary. Equally likely in single-exon and multi-exon circRNAs ciRNA April 30, 2013 41 / 47
  • 42. Computational pipeline Global analysis Characterization of 50 Bioinformatic analysis 2: circRNA exons are large the upstream and downstream introns flanking circularized exons tended to be large: on average more than approximately threefold longer than introns flanking control exons circularized exons were larger than expected: when restricted to an analysis of single exon ecircRNAs, we noted that circularized exons were approximately threefold longer than expressed exons overall, at an average length of 690 nt ciRNA April 30, 2013 42 / 47
  • 43. Computational pipeline Global analysis Characterization of 50 How are circRNAs formed? ciRNA April 30, 2013 43 / 47
  • 44. Computational pipeline Global analysis Characterization of 50 Overlap of identified circRNAs with published circular RNAs. exons of DCC, ETS1 and a non-coding RNA from the human INK4/ARF locus and the CDR1as locus circRNAs from exons of the genes CAMSAP1, FBXW4, MAN1A2, REXO4, RNF220 and ZKSCAN1 have been recently experimentally validated10. For the four genes from this study, where we had ribominus data from the tissues in which these circRNAs were predicted (leukocytes), we recovered validated circRNAs from all of them (ZKSCAN1, CAMSAP1, FBXW4, MAN1A2). ciRNA April 30, 2013 44 / 47
  • 45. Computational pipeline Global analysis Characterization of 50 Known before DCC scrambled transcripts were estimated to comprise less than one one-thousandth of transcripts Nigro, J. M., Cho, K. R., Fearon, E. R., Kern, S. E., Ruppert, J. M., Oliner, J. D., et al. (1991). Scrambled exons. Cell, 64(3), 607613. MLL Caldas, C., So, C. W., MacGregor, A., Ford, A. M., McDonald, B., Chan, L. C., Wiedemann, L. M. (1998). Exon scrambling of MLL transcripts occur commonly and mimic partial genomic duplication of the gene. Gene, 208(2), 167176. ETS-1 Cocquerelle, C., Mascrez, B., Hetuin, D., Bailleul, B. (1993). Mis-splicing yields circular RNA molecules. FASEB Journal, 7(1), 155160. ciRNA April 30, 2013 45 / 47
  • 46. Computational pipeline Global analysis Characterization of 50 Best studied: mouse SRY consists of a single exon. During development, the RNA exists as a linear transcript that is translated into protein. In the adult testes, the RNA exists primarily as a circular product that is predominantly localized to the cytoplasm and is apparently not translated inverted repeats in the genomic sequence flanking the SRY exon direct transcript circularization ciRNA April 30, 2013 46 / 47