Rice Galaxy : an open resource
for rice science
13th International Conference on Genomics
October 24, 2018
Venice Margarette B. Juanillas
Bioinformatics Cluster
Strategic Innovation Platform
Open-source and free bioinformatics software is …
• typically run in a command-line environment (no mouse, no graphics!)
• Example: BLAT (BLAST-Like Alignment Tool), which performs local alignment between two (multi-)FASTA files…
Type in your command…
$ blat
blat - Standalone BLAT v. 34 fast sequence search command line tool
usage:
   blat database query [-tileSize=8] [-maxIntron=3000] [-out=psl] output.psl
where:
   database and query are each either a .fa , .nib or .2bit file,
   or a list these files one file name per line.
   -maxIntron=N  Sets maximum intron size. Default is 750000
   -out=type     Controls output file format. Type is one of:
                    psl  - Default. Tab separated format, no sequence
                    pslx - Tab separated format with sequence
   -tileSize=N   sets the size of match that triggers an alignment.
                 Usually between 8 and 12
                 Default is 11 for DNA and 5 for protein.
   output.psl is where to put the output.
& so many other parameters to set …
What if you could design a GUI for blat?
$ blat database query [-tileSize=8] [-maxIntron=3000] [-out=psl] output.psl
Which is what Galaxy does, in fact…
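A graphical front end for a tool like blat essentially collects form values and assembles them into the command line shown above. The Python sketch below illustrates that idea only; it is not how Galaxy is implemented (Galaxy describes each tool in its own wrapper file), and the function name build_blat_command and the file names are hypothetical. Only the options that appear in the usage text above are used.

```python
import shlex

# Output types listed in the usage text above; blat supports others,
# but this sketch deliberately accepts only these two.
KNOWN_OUTPUT_TYPES = ("psl", "pslx")


def build_blat_command(database, query, output,
                       tile_size=11, max_intron=750000, out_format="psl"):
    """Turn 'form values' into a blat argument list (hypothetical helper)."""
    if out_format not in KNOWN_OUTPUT_TYPES:
        raise ValueError(f"unsupported output type in this sketch: {out_format}")
    return [
        "blat", database, query,
        f"-tileSize={tile_size}",
        f"-maxIntron={max_intron}",
        f"-out={out_format}",
        output,
    ]


if __name__ == "__main__":
    # Placeholder file names; the values mirror the example command above.
    cmd = build_blat_command("database.fa", "query.fa", "output.psl",
                             tile_size=8, max_intron=3000)
    print(shlex.join(cmd))
```

Returning a list rather than a single string keeps each argument separate, so the result could be handed to subprocess.run without shell quoting issues.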
Analyses are often sequential…
Commonly called an analysis workflow or pipeline
– Use software1 with its own input file to generate <outfile1>, then…
– Manipulate the text of <outfile1> so that it can be used as the input file <outfile2> of software2
– Use software2 with <outfile2> as input to generate <outfile3>
– Use <outfile3> as input for software3, which generates the final output of the analysis…
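The chain of steps above can be sketched as a small Python pipeline in which each step reads the previous step's output file. Everything here is illustrative: software1, software2 and software3 are stand-in functions that do trivial text processing, not real bioinformatics tools, and the file names are assumptions.

```python
import tempfile
from pathlib import Path


def software1(src: Path, dst: Path) -> None:
    """Stand-in tool 1: number each input record (tab-separated)."""
    lines = src.read_text().splitlines()
    dst.write_text("".join(f"{i}\t{line}\n" for i, line in enumerate(lines, 1)))


def manipulate(src: Path, dst: Path) -> None:
    """Text manipulation: keep only the second tab-separated column."""
    rows = (line.split("\t") for line in src.read_text().splitlines())
    dst.write_text("".join(f"{row[1]}\n" for row in rows))


def software2(src: Path, dst: Path) -> None:
    """Stand-in tool 2: upper-case every record."""
    dst.write_text(src.read_text().upper())


def software3(src: Path, dst: Path) -> None:
    """Stand-in tool 3: report the number of records."""
    count = len(src.read_text().splitlines())
    dst.write_text(f"records={count}\n")


def run_pipeline(first_input: Path, workdir: Path) -> Path:
    """Run the steps in order, feeding each output file to the next step."""
    current = first_input
    for number, step in enumerate([software1, manipulate, software2, software3], 1):
        output = workdir / f"outfile{number}.txt"
        step(current, output)
        current = output
    return current


def demo() -> str:
    """Run the pipeline on two toy records and return the final output text."""
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp)
        source = workdir / "input.txt"
        source.write_text("acgt\nttga\n")
        return run_pipeline(source, workdir).read_text()


if __name__ == "__main__":
    print(demo(), end="")
```

Keeping every intermediate file is what makes such a pipeline inspectable and repeatable, which is the bookkeeping a workflow system takes over from the user.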
Galaxy allows you to graphically create
workflows!
https://galaxyproject.org
The Galaxy Project is supported in part by NSF,
NHGRI, The Huck Institutes of the Life Sciences, The
Institute for CyberScience at Penn State, and Johns
Hopkins University.
Galaxy has features that fit our needs
“Open, web-based platform for accessible,
reproducible, and transparent computational
biomedical research”
• Accessible: Users without programming experience can
easily specify parameters and run tools and workflows
• Reproducible: Galaxy captures information so that any user
can repeat and understand a complete computational
analysis
• Transparent: Users share and publish analyses via
the web and create interactive, web-based documents
that describe a complete analysis.
Galaxy Objects
• Anything that can be saved and shared
– Histories
– Workflows
– Datasets
– Pages
Community Galaxy : usegalaxy.org
Rice Galaxy Project
Rice Galaxy: Bioinformatics tools, datasets, and
reusable workflows for rice genomic and genetic
analyses
Collaboration of Institutions
IRRI : Philippines
IRD, CIRAD : France
Colorado State University, Texas A&M University, Indiana University: USA
Advanced Institute of Science and Technology: Japan
RiceGalaxy: https://galaxy.irri.org
Data accessible/integrated into Rice Galaxy
• 3,000 genomes SNP / indel & phenotype data
• Rice HDRA genotyping and phenotyping data
• 7 (+2 older Nipponbare) published rice genomes
and annotations
Tools Integrated into Rice Galaxy
Workflows and tools dedicated to rice from both
bioinformatics platforms (South Green
Bioinformatics and the IRRI platform):
1. 3k RG and HDRA Toolkit (IRRI)
2. SNP Data Analysis Tools
3. TASSEL bioinformatics (South Green, IRRI) for GBS data management
4. OGHMA genomic prediction tool (IRRI)
5. RAVE (Rapid Allelic Variant Extractor) to extract variants from the 3,000
genomes
6. SNiPlay workflows (7) for diversity and population structure analysis and
GWAS studies (South Green)
7. Uniqprimer microbial pathogen diagnostic design toolkit (CSU/USDA/South
Green/IRRI)
Genomic Prediction Tool suite
Aim: Tools that use machine learning algorithms to decipher
genotypes and understand how they affect phenotype in rice
Genomic Prediction Workflow
Several prediction methods in OGHMA:
- LASSO
- Random Forest
- SVM
- rrBLUP
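To give a feel for what a genomic prediction method does, the sketch below fits ridge regression on marker genotypes, the model that rrBLUP (ridge regression BLUP) is built on: marker effects are shrunk toward zero by a penalty lam. This is a pure-Python illustration on made-up toy data, not OGHMA's implementation; the function names, the -1/0/1 genotype coding and the fixed penalty value are all assumptions (rrBLUP software typically estimates the penalty from variance components rather than taking it as an input).

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        rest = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - rest) / m[i][i]
    return x


def ridge_marker_effects(genotypes, phenotypes, lam):
    """Return (mean phenotype, marker effects) solving (Z'Z + lam*I) u = Z'(y - mean)."""
    n, p = len(genotypes), len(genotypes[0])
    mean_y = sum(phenotypes) / n
    centered = [y - mean_y for y in phenotypes]
    ztz = [[sum(genotypes[k][i] * genotypes[k][j] for k in range(n))
            + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    zty = [sum(genotypes[k][i] * centered[k] for k in range(n)) for i in range(p)]
    return mean_y, solve_linear_system(ztz, zty)


def predict(mean_y, effects, genotype):
    """Predicted phenotype for one line from its marker genotype."""
    return mean_y + sum(e * g for e, g in zip(effects, genotype))


if __name__ == "__main__":
    # Toy data: 4 lines, 2 markers coded -1/0/1 (made up for illustration).
    Z = [[1, 0], [-1, 0], [0, 1], [0, -1]]
    y = [3.0, 1.0, 2.5, 1.5]
    mean_y, effects = ridge_marker_effects(Z, y, lam=2.0)
    print(mean_y, effects, predict(mean_y, effects, [1, 1]))
```

A larger lam shrinks the estimated marker effects more strongly, which is what keeps the model usable when there are far more markers than phenotyped lines.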
Genomic Prediction in Rice Galaxy
http://galaxy.southgreen.fr/galaxy/
Rice Variant Analysis Tool Suite (RAVE)
Basic Use Case
• Map a gene position from Nipponbare to IR8
Nipponbare: chr01 11218-12435
What about in IR8?
1. Get the gene sequence from Nipponbare (GD -> Get Gene Sequence)
2. Align it to another reference genome, IR8 (SDT -> Find-seq)
3. Post-process and clean up the alignment
– Cut columns 14, 16, 17 (TM -> cut columns)
– Remove the first 5 lines (TM -> remove beginning…)
4. Extract lift-over sequences (SDT -> batch-get-subseq)
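Step 3 above can be sketched in Python as follows. The sketch assumes the alignment output is tab-separated with five header lines, as in BLAT's PSL format, where columns 14, 16 and 17 hold the target sequence name, target start and target end; the slides do not state the format, so treat that as an assumption. The function name is hypothetical and the sample row uses placeholder values rather than real alignment results. The header is dropped before the columns are cut, which gives the same rows as the order listed above.

```python
def postprocess_alignment(text, header_lines=5, columns=(14, 16, 17)):
    """Drop the header lines, then keep the given 1-based columns of each row."""
    kept = []
    for row in text.splitlines()[header_lines:]:
        if not row.strip():
            continue  # skip blank lines
        fields = row.split("\t")
        kept.append(tuple(fields[c - 1] for c in columns))
    return kept


if __name__ == "__main__":
    # Placeholder input: five header lines, then one 21-column row whose
    # cells are simply labelled c1..c21 so the selected columns are obvious.
    header = "\n".join(f"header line {i}" for i in range(1, 6))
    row = "\t".join(f"c{i}" for i in range(1, 22))
    for name, start, end in postprocess_alignment(header + "\n" + row + "\n"):
        print(name, start, end)
```

The resulting (name, start, end) triples are the kind of input a sub-sequence extraction step would need for step 4.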
SNP lift-over (i.e. Nipponbare-> IR8) workflow
Rice Galaxy Open Access
All meaningful data objects must have a globally unique
and persistent identifier (PID) for CGIAR open access
compliance
Rice Galaxy Tool shed
• Allow other researchers to use the tools
• Allow tool shed enrichment by hosting tools from other
researchers in the rice community
• Rice Galaxy Tool shed: http://52.76.88.51:8081/
Conclusion
• Rice Galaxy is a federated Galaxy resource tailored for
rice genetics, genomics and breeding
• Rice Galaxy integrates publicly available rice datasets
and tools from other researchers in the rice community
Thank You!
• Alexis Dereeper
• Nicolas Beaume
• Gaetan Droc
• Joshua Dizon
• John Robert Mendoza
• Jon Peter Perdon
• Locedie Mansueto
• Lindsay Triplett
• Jillian Lang
• Gabriel Zhou
• Jay Santos
• Dennis Diaz
• DOST-ASTI
• Kunalan Ratharanjan
• Beth Plale
• Jason Haga
• Jan E. Leach
• Manuel Ruiz
• Michael Thomson
• Nickolai Alexandrov
• Pierre Larmande
• Ramil P. Mauleon