3. • 21-nucleotide-long virus-derived small
interfering RNAs (vsiRNAs) are produced by cleavage
of viral double-stranded RNA (dsRNA)
• vsiRNAs contain the viral genomic imprint
• This property can be used to reconstruct the
viral genome from vsiRNAs
4. Method
• Model organism for study: Drosophila melanogaster
(Workflow diagram) Naïve D. melanogaster S2 tissue, infected; S2R+ cells, infected → total RNA extracted → deep sequencing of RNA (Illumina) → RNA reads → genome assembly → contigs → BLAST against the NCBI nucleotide database. Also shown: RNA sequencing (Sanger method).
5. BLAST results
• Viral contigs aligned to different regions of these viruses:
1. Flock house virus (FHV) variant American nodavirus (ANV)
2. Drosophila C virus (DCV)
3. Drosophila X virus (DXV)
• However, the contigs obtained differed significantly from the
NCBI sequences
• Possibility: the viruses present in these supernatants were
different from those previously published
• These “new” viral sequences were labeled as
1. FHVS2R+
2. DCVS2R+
3. DXVS2R+
6. • Introduces a Perl script, “Paparazzi”
• It accurately reconstitutes viral genomes through
an iterative alignment/consensus call procedure,
using an initial reference sequence as a scaffold
• The resulting full-length consensus sequence is then
reused to profile vsiRNAs
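The iterative alignment/consensus idea can be illustrated with a toy sketch in Python. This is not the Paparazzi script (which is written in Perl), and it makes no claim about Paparazzi's actual criteria. The function names `best_hit`, `call_consensus`, and `reconstruct`, the ungapped brute-force alignment, the `max_mismatches` and `min_depth` thresholds, and the choice to keep the scaffold base where depth is too low are all assumptions made for illustration.

```python
from collections import Counter


def best_hit(read, ref, max_mismatches):
    """Return the start of the best ungapped placement of read on ref.

    Returns None when no placement has at most max_mismatches mismatches.
    """
    best_pos, best_mm = None, max_mismatches + 1
    for start in range(len(ref) - len(read) + 1):
        window = ref[start:start + len(read)]
        mm = sum(1 for a, b in zip(read, window) if a != b)
        if mm < best_mm:
            best_pos, best_mm = start, mm
    return best_pos


def call_consensus(reads, ref, max_mismatches=1, min_depth=2):
    """Align every read to ref, then call one base per position."""
    piles = [Counter() for _ in ref]
    for read in reads:
        pos = best_hit(read, ref, max_mismatches)
        if pos is None:
            continue  # read does not align well enough; ignore it
        for offset, base in enumerate(read):
            piles[pos + offset][base] += 1
    called = []
    for ref_base, pile in zip(ref, piles):
        if sum(pile.values()) >= min_depth:
            called.append(pile.most_common(1)[0][0])  # majority base
        else:
            called.append(ref_base)  # assumption: keep the scaffold base
    return "".join(called)


def reconstruct(reads, scaffold, max_rounds=10, max_mismatches=1, min_depth=2):
    """Repeat align-and-call until the consensus stops changing."""
    current = scaffold
    for _ in range(max_rounds):
        updated = call_consensus(reads, current, max_mismatches, min_depth)
        if updated == current:
            break
        current = updated
    return current


# Toy data (hypothetical): the scaffold differs from the reads at one base.
truth = "ACGTTGCAAGGC"
scaffold = "ACGTTACAAGGC"
reads = [truth[i:i + 10] for i in range(3)]
rebuilt = reconstruct(reads, scaffold)
```

In this toy case the single-base difference is corrected in the first round; the second round leaves the consensus unchanged, so the loop stops.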
8. • Paparazzi reconstructed the 3,107-nucleotide-long genome
sequence of FHVS2R+ RNA1
• However, the first four nucleotides were unresolved (i.e. the criteria
to call the consensus were not met)
• Also, Paparazzi reconstructed the same sequence for FHVS2R+
RNA1 when FHVNCBI was taken as the reference sequence
• The reconstructed genome differed from FHVNCBI by 9.14%
• So Paparazzi can also be used when a distantly related
sequence is taken as the reference
9. • The Paparazzi-reconstructed genome was compared with the Sanger
sequencing-determined sequence of FHVS2R+ RNA1
• They differ by only a single nucleotide (0.03%),
whereas the sequences of ANVNCBI and FHVS2R+ RNA1 differ by
2.83% over 3,107 nucleotides
• Paparazzi can be used as a substitute for Sanger sequencing in
this context.
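The percentages quoted here can be checked with simple arithmetic. The sketch below assumes that differences are counted as mismatching positions between two equal-length, ungapped sequences; the slides do not say how the differences were counted, and `percent_difference` is a hypothetical helper name.

```python
def percent_difference(seq_a, seq_b):
    """Percentage of positions that differ between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length (ungapped comparison)")
    mismatches = sum(a != b for a, b in zip(seq_a, seq_b))
    return 100.0 * mismatches / len(seq_a)


GENOME_LENGTH = 3107  # length of FHVS2R+ RNA1 given on the slide

# One differing nucleotide out of 3,107 positions.
single_nt_pct = 100.0 * 1 / GENOME_LENGTH

# Roughly how many positions a 2.83% difference corresponds to.
approx_sites = round(2.83 / 100 * GENOME_LENGTH)
```

Rounded to two decimals, `single_nt_pct` gives the 0.03% quoted above, and under the same assumption 2.83% corresponds to roughly 88 differing positions.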
10. Genome breakpoint identification
• FHVS2R+ RNA2
• The reconstructed sequence had 64 differences compared with the
Sanger-determined sequences
• These introduced premature stop codons
• and two artifactual internal duplications (absent from the Sanger
sequences)
• The reconstructed sequence did not represent the actual
functional genome
• The RNA2 segment of FHV is prone to internal deletions
• which give rise to defective interfering particles (DIs)
• The existence of DIs was confirmed by PCR and Sanger sequence data
11. Genome breakpoint identification
• Comparison of the Sanger sequences obtained for DIs with that
of ANVNCBI RNA2 identified three breakpoints
• All differences observed between the Paparazzi-reconstructed
and Sanger sequencing-determined FHVS2R+ RNA2 sequences
resided in the regions covered by DIs or around DI breakpoints
12. Genome breakpoint identification
• Paparazzi was instructed to
• identify breakpoints in viral genomes
• eliminate reads aligning against these breakpoints
• At each cycle, the contigs
obtained were aligned, using BLAT,
against the generated virus
consensus sequence
• This allowed the identification
of the breakpoints
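A minimal sketch of how breakpoints could be read off contig alignments and used to discard reads is given below. It does not reproduce Paparazzi's implementation. The `(contig_start, consensus_start, size)` block tuples are a simplified stand-in for the block lists in BLAT's PSL output (`qStarts`, `tStarts`, `blockSizes`), and the helper names, the `min_gap` parameter, and the rule for deciding that a read spans a breakpoint are assumptions made for illustration.

```python
def find_breakpoints(blocks, min_gap=1):
    """Infer deletion breakpoints from the alignment blocks of one contig.

    blocks: list of (contig_start, consensus_start, size), sorted by contig_start.
    Returns (left_end, right_start) consensus coordinates for every place where
    the contig is continuous but the consensus is skipped.
    """
    breakpoints = []
    for (q1, t1, n1), (q2, t2, _) in zip(blocks, blocks[1:]):
        contig_gap = q2 - (q1 + n1)
        consensus_gap = t2 - (t1 + n1)
        if contig_gap == 0 and consensus_gap >= min_gap:
            breakpoints.append((t1 + n1, t2))
    return breakpoints


def junction_sequence(consensus, breakpoint, flank):
    """Sequence obtained by joining the two sides of a breakpoint."""
    left_end, right_start = breakpoint
    left = consensus[max(0, left_end - flank):left_end]
    right = consensus[right_start:right_start + flank]
    return left + right


def drop_junction_reads(reads, consensus, breakpoints, read_len=21):
    """Keep only reads that do not span a breakpoint junction."""
    flank = read_len - 1
    junctions = [junction_sequence(consensus, bp, flank) for bp in breakpoints]
    kept = []
    for read in reads:
        # A read found in a junction but not in the intact consensus
        # must cross the join itself.
        spans = any(read in j and read not in consensus for j in junctions)
        if not spans:
            kept.append(read)
    return kept


# Toy data (hypothetical): a contig lacking consensus positions 6..11.
consensus = "ACGTACCATGGTTCAGGCTA"
blocks = [(0, 0, 6), (6, 12, 8)]
breakpoints = find_breakpoints(blocks)
reads = ["GTACTC", "ACCATG", "TCAGGC"]
kept = drop_junction_reads(reads, consensus, breakpoints, read_len=6)
```

In the toy data, `GTACTC` exists only across the join of the two blocks, so it is removed, while the two reads that match the intact consensus are kept.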
13. • Paparazzi successfully reconstructed the genomes of the other
detected viruses, DCVS2R+ and DXVS2R+
(Graph generated from Supplementary Table 1)
Figure: Number of vsiRNA (21 nt) reads obtained from naïve S2 cells infected by the supernatant of S2R+ cells
that align against NCBI sequences and against those reconstructed by Paparazzi.
Y-axis: no. of reads aligned (0 to 800,000). X-axis: ANV RNA1, ANV RNA2, DCV, DXV segment A, DXV segment B.
Series: reads aligned on NCBI sequences with 0 mismatches; reads aligned on NCBI sequences with 1 mismatch; reads aligned on reconstructed sequences with 0 mismatches.
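The quantity plotted in this graph, the number of reads that align to a sequence within a given mismatch tolerance, can be sketched as follows. The brute-force ungapped search and the names `min_mismatches` and `count_aligned` are assumptions for illustration; the slides do not state which aligner produced the counts, and the sequences below are invented toy data rather than values from the supplementary table.

```python
def min_mismatches(read, reference):
    """Fewest mismatches over all ungapped placements of read on reference."""
    best = None
    for start in range(len(reference) - len(read) + 1):
        window = reference[start:start + len(read)]
        mm = sum(a != b for a, b in zip(read, window))
        if best is None or mm < best:
            best = mm
    return best  # None when the read is longer than the reference


def count_aligned(reads, reference, max_mismatches):
    """Count reads that align with at most max_mismatches mismatches."""
    total = 0
    for read in reads:
        mm = min_mismatches(read, reference)
        if mm is not None and mm <= max_mismatches:
            total += 1
    return total


# Toy data (hypothetical): the database sequence differs from the
# reconstructed one at a single base.
database_seq = "ACGTTACAAGGC"
reconstructed_seq = "ACGTTGCAAGGC"
reads = ["GTTGCA", "CAAGGC", "TTGCAA"]

db_exact = count_aligned(reads, database_seq, 0)
db_one_mismatch = count_aligned(reads, database_seq, 1)
reconstructed_exact = count_aligned(reads, reconstructed_seq, 0)
```

With these toy sequences, only one read matches the database sequence exactly, while all three match the reconstructed sequence exactly, which is the kind of gain the graph is meant to show.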
14. Conclusion
• Paparazzi provides an effective tool for
• viral genome reconstruction
• accurate vsiRNA profiling
• studying RNAi processing
• It is a powerful tool for polishing the results obtained from virus
discovery pipelines.
15. Critique
• This approach does not work well when some regions of the
genome are covered less than others (cold spots)
• observed in the profile of Semliki Forest virus (SFV)
• insensitive to Dicer processing
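One way to see where such cold spots lie is to compute a per-position coverage profile and flag low-depth stretches. The sketch below is an illustration only: the function names, the use of read start positions as input, and the `min_depth` threshold are assumptions, not part of Paparazzi.

```python
def coverage_profile(genome_length, read_starts, read_len=21):
    """Per-position read depth, given 0-based read start positions."""
    depth = [0] * genome_length
    for start in read_starts:
        for pos in range(start, min(start + read_len, genome_length)):
            depth[pos] += 1
    return depth


def cold_spots(depth, min_depth=1):
    """Half-open (start, end) intervals where depth falls below min_depth."""
    spots, run_start = [], None
    for pos, d in enumerate(depth):
        if d < min_depth and run_start is None:
            run_start = pos  # a low-coverage run begins
        elif d >= min_depth and run_start is not None:
            spots.append((run_start, pos))  # the run ends before pos
            run_start = None
    if run_start is not None:
        spots.append((run_start, len(depth)))
    return spots


# Toy data (hypothetical): a 10-nt genome covered by two 3-nt reads.
depth = coverage_profile(10, [0, 6], read_len=3)
spots = cold_spots(depth, min_depth=1)
```

Positions inside the reported intervals have too few reads for a consensus call, so a reconstruction there would have to fall back on the reference or stay unresolved.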