An introduction to RNA-seq data analysis
Sonika Tyagi
Australian Genome Research Facility
1 August 2012
Outline
• Transcriptomics using RNA-seq:
  Applications
• Gene expression profiling workflows
• Design Challenges
RNA sequencing (mRNA-seq or RNA-seq)
“An experimental protocol that uses
next-generation sequencing
technologies to sequence the RNA
molecules within a biological sample in
an effort to determine the primary
sequence and relative abundance of
each RNA”
A typical RNA-seq experiment
[Figure: library preparation and sequencing, followed by bioinformatics analysis. Source: Nature Reviews Genetics, November 2008; doi:10.1038/nrg2484]
RNA-seq applications
• Allele-specific expression: prevalence
  of transcribed SNPs
• Fusion transcripts: e.g., in cancer
• Abundance estimation: alternative
  splicing, RNA-editing, novel
  transcripts
• Gene expression profiling
1. Raw sequences (FASTQ files)
2. Quality control (QC)
3. Spliced read alignment
4. Transcripts reconstruction
5. Differential expression analysis
6. Biology
RNA-seq workflows
Reference available?
• Annotated genome
  – Reads mapping
  – Transcripts reconstruction
• Assembled/predicted transcriptome
  – Reads mapping
  – Transcripts reconstruction
• Annotated de novo transcriptome assembly
  – De novo assembly
  – Reference assisted
Downstream steps:
1. Summarization (by CDS, exon, gene, splice junctions)
2. Tables of counts (digital expression)
3. DE analysis
4. Biology (GO/Pathways)
QC tools
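As one illustration of what a QC step checks, the sketch below computes the mean Phred quality of each read in a FASTQ stream and keeps only reads above a threshold. It assumes Phred+33 quality encoding and strict four-line records; the function names and the threshold of 20 are hypothetical choices for illustration, not taken from any particular QC tool.

```python
import io

def parse_fastq(handle):
    """Yield (name, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of input
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header[1:], seq, qual  # drop the leading '@'

def mean_phred(qual, offset=33):
    """Mean Phred score of a quality string (assumes Phred+33 encoding)."""
    return sum(ord(c) - offset for c in qual) / len(qual)

def filter_reads(handle, min_mean_q=20.0):
    """Keep reads whose mean quality is at least min_mean_q (assumed cutoff)."""
    return [(name, seq)
            for name, seq, qual in parse_fastq(handle)
            if mean_phred(qual) >= min_mean_q]

# Two toy reads: 'I' encodes Q40, '!' encodes Q0 under Phred+33.
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nACGT\n+\n!!!!\n")
kept = filter_reads(example)
print(kept)
```

Real QC reports usually also look at per-position quality, adapter content and duplication levels; this sketch covers only the per-read mean.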
Alignments / mapping splice junctions
• Unspliced read aligners
  – Methods: seed methods; Burrows-Wheeler transform methods
  – Examples: MAQ, Stampy, ELAND; BWA, Bowtie
  – Ideal for mapping reads against cDNA databases.
  – Splice junctions/events are not picked up.
• Spliced read aligners
  – Methods: exon-first; seed-extend
  – Examples: TopHat, MapSplice, SpliceMap; GSNAP, QPALMA, ELANDv2e
  – Novel splice junctions can be detected.
  – Perform better for polymorphic regions and aligning pseudogenes.
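Spliced aligners that write SAM/BAM output typically record an intron gap as an `N` operation in the CIGAR string. The sketch below, under that assumption, extracts junction coordinates from a single alignment's start position and CIGAR string. The function name is hypothetical, and coordinates are 1-based inclusive intron positions.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance the position on the reference sequence.
REF_CONSUMING = set("MDN=X")

def splice_junctions(pos, cigar):
    """Return (intron_start, intron_end) for every N operation.

    pos is the 1-based leftmost reference position of the alignment.
    """
    junctions = []
    ref = pos
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            # The skipped region spans ref .. ref + length - 1.
            junctions.append((ref, ref + length - 1))
        if op in REF_CONSUMING:
            ref += length
    return junctions

print(splice_junctions(100, "50M200N50M"))
print(splice_junctions(100, "100M"))
```

An unspliced alignment such as `100M` yields no junctions, which is why reads spanning exon boundaries are lost when an unspliced aligner is run against a genome.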
Transcripts reconstruction
• Genome guided
  – Examples: G-Mo.R-Se (short reads), Cufflinks and Scripture (for long reads)
• Genome independent
  – Examples: Trans-ABySS, Velvet + Oases, MIRA, Cufflinks*
Genome guided transcriptome assembly
[Figure: genome guided transcriptome assembly. Source: Martin J and Wang Z, Nat Rev Genet 2011; doi:10.1038/nrg3068]
Normalisation and DE
• Normalisation: library size, RPKM, FPKM, TMM, upper quartile
  – Examples: ERANGE, Cuffdiff, edgeR, Myrna
• DE models: Poisson GLM, negative binomial
  – Examples: DEGseq, Myrna, edgeR, baySeq, Cuffdiff
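To make two of the listed normalisations concrete, the sketch below scales raw counts by library size (counts per million) and by the upper quartile of nonzero counts. It is a simplified illustration: the nearest-rank quantile and the function names are assumptions, and the listed packages implement more refined versions.

```python
import math

def counts_per_million(counts):
    """Library-size normalisation: scale counts to a total of one million."""
    total = sum(counts)
    return [c * 1e6 / total for c in counts]

def upper_quartile(counts):
    """75th percentile of the nonzero counts (nearest-rank definition)."""
    nonzero = sorted(c for c in counts if c > 0)
    rank = math.ceil(0.75 * len(nonzero))
    return nonzero[rank - 1]

def upper_quartile_normalise(counts):
    """Divide every count by the sample's upper quartile."""
    uq = upper_quartile(counts)
    return [c / uq for c in counts]

# Toy data: sample_b is sample_a sequenced at twice the depth.
sample_a = [10, 20, 30, 40, 0]
sample_b = [20, 40, 60, 80, 0]

print(counts_per_million(sample_a), counts_per_million(sample_b))
print(upper_quartile_normalise(sample_a), upper_quartile_normalise(sample_b))
```

In this toy case both methods remove the depth difference. They diverge when a few highly expressed genes dominate the library, because the total is sensitive to those genes while the upper quartile is less so.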
Quantification and
normalisation
1. Digital expression or raw
   count: number of reads
   mapping to a region (exon/
   transcript/novel region)
2. Normalised counts: number
   of reads per kilobase per
   million mapped reads (RPKM)
3. Splice junction detection
4. Compare to existing gene
   models
        Nat Meth 2008 ; DOI:10.1038/NMETH.1226
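Steps 1 and 2 can be sketched as a small calculation: take the raw read count for a region and divide by its length in kilobases and by the number of mapped reads in millions. The function name and the example numbers are hypothetical.

```python
def rpkm(read_count, region_length_bp, total_mapped_reads):
    """Reads per kilobase of region per million mapped reads."""
    # Equivalent to count / (length / 1e3) / (total / 1e6).
    return read_count * 1e9 / (region_length_bp * total_mapped_reads)

# 500 reads on a 2 kb region in a library of 10 million mapped reads.
value = rpkm(500, 2000, 10_000_000)
print(value)
```

The same formula gives FPKM when fragments (read pairs) are counted instead of individual reads.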
Differential expression
• Normalised gene expression value as RPKM:
  – reads per kilobase of exon model per million mapped reads

• Or FPKM:
  – fragments per kilobase of exon model per million mapped reads

• Compare RPKM/FPKM across conditions or tissues




                                           Nat Meth DOI:10.1038/NMETH.1226
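A simple way to compare normalised values across two conditions is a log2 fold change. The sketch below is descriptive only: it performs no statistical test and ignores replicate variability. The pseudocount that avoids division by zero is an assumption, as is the function name.

```python
import math

def log2_fold_change(value_a, value_b, pseudocount=1.0):
    """log2 of condition B over condition A, with an assumed pseudocount."""
    return math.log2((value_b + pseudocount) / (value_a + pseudocount))

# A gene at 25 RPKM in condition A and 100 RPKM in condition B.
lfc = log2_fold_change(25.0, 100.0)
print(lfc)
```

A positive value means higher expression in condition B. Deciding whether a change is significant needs replicates and a count-based model.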
Systems biology: beyond the
       list of DE genes
• Ontologies: GO enrichment, goseq
  (R package)
• DAVID (http://david.abcc.ncifcrf.gov)
• Pathway analysis
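GO enrichment is commonly framed as an over-representation test: given a list of DE genes, is a GO term annotated to more of them than chance would predict? The sketch below uses a plain one-sided hypergeometric test with hypothetical argument names. It does not correct for transcript length bias, which goseq was designed to address, and it applies no multiple-testing correction.

```python
from math import comb

def hypergeom_enrichment_p(n_universe, n_annotated, n_de, n_de_annotated):
    """P(X >= n_de_annotated) for X ~ Hypergeometric.

    n_universe      genes tested in total
    n_annotated     genes in the universe carrying the GO term
    n_de            differentially expressed genes
    n_de_annotated  DE genes carrying the GO term
    """
    upper = min(n_annotated, n_de)
    total = comb(n_universe, n_de)
    tail = sum(
        comb(n_annotated, k) * comb(n_universe - n_annotated, n_de - k)
        for k in range(n_de_annotated, upper + 1)
    )
    return tail / total

# Toy case: 10 genes, 5 carry the term, 5 are DE, and all 5 DE genes carry it.
p = hypergeom_enrichment_p(10, 5, 5, 5)
print(p)
```

In practice the test is repeated for every GO term, so the resulting p-values need adjusting for multiple comparisons.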
RNA-seq experiment design challenges
• NGS biases:
    – Library prep (GC content, 5’ or 3’
      depletion, random hexamer primers,
      RNA species, bias towards 3’ end …).
    – Transcript length
• Sequencing depth
• Single or paired end
• Biological or technical replicates
• Validation
Briefings in Bioinformatics, vol. 12, no. 3, 280-287
RNA-seq and other transcriptomics methods
[Figure: comparison of RNA-seq with other transcriptomics methods. Source: Nature Reviews Genetics, November 2008; doi:10.1038/nrg2484]
Summary
• RNA-seq: more versatile and comprehensive than other
  transcriptomics methods, with superior reproducibility and resolution.
• Not dependent on prior sequence information:
  suitable for non-model organisms.
• Potentially provides information for all RNA
  species in the cell and allows discovery of novel
  ones.
• Still an actively developing field, with
  research areas that need refinement.
• Gold standards for experimental design and
  validation are yet to be set.
TopHat Cufflinks pipeline reference
Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protoc 7(3), 562-578.
R/Bioconductor-based RNA-seq packages
• edgeR
• voom
• DESeq

http://bioconductor.org/packages/release/BiocViews.html#___Software