Proteomics Bioinformatics
WTAC
13-17 December 2010
Rafael Jimenez
rafael@ebi.ac.uk
EnCORE
presentation
DAS
Distributed Annotation System
Table of contents
• DAS
 What is it?
 Commands and queries
 Why should I use it?
 Documentation
 Clients and servers
What is it?
DAS, The Distributed Annotation System
The Distributed Annotation System is…
– A network of biological data sources
– A Service Oriented Architecture (SOA)
– RESTful web service
– An example of federation
• Uniform access to multiple repositories of biological data.
• Repositories distributed in different geographical locations.
The DAS Protocol is…
– An integration platform
– A client-server protocol
– An agreed standard for web services
23.08.18 5
DAS data types
[Figure: data types served through DAS, grouped under alignment servers, annotation servers and reference servers]
• Genome sequence
• Protein sequence
• Sequence alignments
• Protein structure
• Protein-protein interaction
• Gel 2D
• Mass spectrometry
• EMAP
• 3DM
• Epigenetics
• Phenotype
• Functional genomics
• Structural genomics
The Distributed Annotation System, 2001 Dowell et al;
BMC Bioinformatics. 2001; 2: 7. Published online 2001 October 10.
DAS, Architectural Overview
[Illustration]
Service Oriented Architecture
[Diagram: a service provider publishes a service contract to a service broker; a service consumer finds the service through the broker and then interacts with the provider.]
DAS implementation
[Diagram: the same layout with DAS components. The DAS Registry sits in the broker position, DAS clients in the consumer position, and reference, annotation and alignment sources in the provider position, connected by the DAS protocol.]
Example client behaviour
Standardization allows clients to connect to different DAS sources without additional programming
Andy Jenkinson
Commands and queries
DAS – Andy Jenkinson
Query model
Structured REST URL
– http://server/das/source/command?arguments
– servers, data sources, commands, parameters
Reference object
– e.g. “chromosome X”
Reference servers provide sequence
– http://server/das/source/sequence?segment=X:1,500
Annotation servers provide features
– http://server/das/source/features?segment=X:1,500
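The URL pattern above can be sketched in a few lines of Python. This is only an illustration of the query model: the helper names `das_url` and `segment` are made up for this example, and `server` and `source` are the same placeholders used on the slide, not real hosts.

```python
from urllib.parse import urlencode


def segment(reference, start=None, stop=None):
    """Format a segment argument such as 'X:1,500' (or just 'X' for the whole object)."""
    if start is None or stop is None:
        return str(reference)
    return f"{reference}:{start},{stop}"


def das_url(server, source, command, **arguments):
    """Build a URL of the form http://server/das/source/command?arguments."""
    url = f"http://{server}/das/{source}/{command}"
    if arguments:
        # Leave ':' and ',' unescaped so the segment reads as on the slide.
        url += "?" + urlencode(arguments, safe=":,")
    return url


# A reference server is asked for sequence, an annotation server for features.
print(das_url("server", "source", "sequence", segment=segment("X", 1, 500)))
print(das_url("server", "source", "features", segment=segment("X", 1, 500)))
```

Because only the command name changes between the two requests, a client can reuse the same code path for any source that follows the protocol.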
Data model
Lightweight XML
http://server/das/source/features?segment=X:1,500
<SEGMENT id="X" start="1" stop="500">
  <FEATURE id="…">
    <TYPE id="…" category="…">…</TYPE>
    <METHOD id="…">…</METHOD>
    <START>…</START>
    <END>…</END>
  </FEATURE>
  <FEATURE id="…">
    …
  </FEATURE>
</SEGMENT>
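As a rough sketch of how a client might read this lightweight XML, the following Python uses the standard library's `xml.etree.ElementTree`. The sample document only mirrors the element names shown on the slide; every id, category and coordinate in it is invented for illustration, and a real server response may wrap the segment in further elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical response body: all values below are made up for this example.
SAMPLE = """
<SEGMENT id="X" start="1" stop="500">
  <FEATURE id="f1">
    <TYPE id="exon" category="transcription">exon</TYPE>
    <METHOD id="curated">curated</METHOD>
    <START>10</START>
    <END>120</END>
  </FEATURE>
  <FEATURE id="f2">
    <TYPE id="repeat" category="similarity">repeat</TYPE>
    <METHOD id="predicted">predicted</METHOD>
    <START>300</START>
    <END>450</END>
  </FEATURE>
</SEGMENT>
"""


def parse_features(xml_text):
    """Return one dict per FEATURE element, tagged with its parent segment id."""
    root = ET.fromstring(xml_text.strip())
    features = []
    # iter() also matches the root itself, so this works with or without a wrapper.
    for seg in root.iter("SEGMENT"):
        for feat in seg.findall("FEATURE"):
            type_el = feat.find("TYPE")
            method_el = feat.find("METHOD")
            features.append({
                "segment": seg.get("id"),
                "id": feat.get("id"),
                "type": type_el.get("id") if type_el is not None else None,
                "category": type_el.get("category") if type_el is not None else None,
                "method": method_el.get("id") if method_el is not None else None,
                "start": int(feat.findtext("START", default="0")),
                "end": int(feat.findtext("END", default="0")),
            })
    return features


for feature in parse_features(SAMPLE):
    print(feature["id"], feature["type"], feature["start"], feature["end"])
```

Flat dictionaries are used here only to keep the sketch short; a real client would probably keep more of the response, such as labels and notes.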
DAS Annotation source - Protein Feature Request
Non-positional feature
Positional feature
http://www.ebi.ac.uk/das-srv/uniprot/das/uniprot/features?segment=Q12345
DAS Reference source - Protein Sequence Request
http://www.ebi.ac.uk/das-srv/uniprot/das/uniprot/sequence?segment=Q12345
More DAS Commands
• Alignment, Structure and Interaction
• More …
http://server/das/source/entry_points
– entry_points: List of available “chromosomes | contigs | proteins | …”
http://server/das/source/types
– types – provides a summary of the feature types for a segment.
http://server/das/source/stylesheet
– stylesheet – gives hints to the DAS client about how to display the
feature types. Can be ignored of course.
http://server/das/sources
– sources – list of available sources in one DAS server. Replaces the
original, underspecified dsn command.
http://www.biodas.org/wiki/DAS1.6
Why should I use it?
DAS Design Principles
Data remains distributed
• “live” data
• data providers retain responsibility
• good for changing data
• spreads resources
Easy for data providers to implement
• simple protocol
• lots of data providers
DAS Design Principles
Principally for display
• should be responsive (fast)
• region-targeted queries
• lightweight infrastructure
Downsides
• Rigid data model
• Weak semantics
Documentation
BioDAS
http://www.biodas.org
Tutorials
http://www.biodas.org/wiki/DASWorkshop2010
Versions of DAS
[Timeline figure, 2001 to 2011. Version labels: DAS 1.01, DAS 1.53, DAS 1.53E and DAS 1.6 (DAS 1); DAS 2.0 and DAS 2.1 (DAS/2). Source-count labels: ~8, ~250, ~380, ~650 and ~1300 sources.]
DAS Specification 1.6
http://www.biodas.org/wiki/DAS1.6
Clients and servers
List of DAS Servers
DAS Client libraries
• Bio::Das::Lite (Perl)
• Dasobert (Java)
List of DAS Clients
• Ensembl uses DAS to pull in genomic, gene and protein annotations. It also
provides data via DAS.
• Gbrowse is a generic genome browser, and is both a consumer and provider
of DAS.
• IGB is a desktop application for viewing genomic data.
• SPICE is an application for projecting protein annotations onto 3D structures.
• Dasty2 is a web-based viewer for protein annotations.
• Jalview is a multiple alignment editor.
• PeppeR is a graphical viewer for 3D electron microscopy data.
• DASMI is an integration portal for protein interaction data.
• DASher is a Java-based viewer for protein annotations.
• EpiC presents structure-function summaries for antibody design.
• STRAP is a STRucture-based sequence Alignment Program.
Protein sequence data
Dasty2
Genome sequence data
Ensembl
Protein structure data
Spice-Sisyphus
Protein-protein interaction data
iPfam
Sequence alignment data
Pfam
EMAP data
EMAP: The Edinburgh Mouse Atlas Project
Gene expression databases (EMAGE & GXD)
• DAS reference server
– EMAP - Ontology
• DAS annotation servers
– EMAGE
– GXD
Thank you!
Questions?
Proteomics Services Team

Weitere ähnliche Inhalte

Ähnlich wie DAS, the Distributed Annotation System

A Real World Guide to Building Highly Available Fault Tolerant SharePoint Farms
A Real World Guide to Building Highly Available Fault Tolerant SharePoint FarmsA Real World Guide to Building Highly Available Fault Tolerant SharePoint Farms
A Real World Guide to Building Highly Available Fault Tolerant SharePoint FarmsEric Shupps
 
Overview of oracle database
Overview of oracle databaseOverview of oracle database
Overview of oracle databaseSamar Prasad
 
Overview of oracle database
Overview of oracle databaseOverview of oracle database
Overview of oracle databaseSamar Prasad
 
SingleLecture.pdf
SingleLecture.pdfSingleLecture.pdf
SingleLecture.pdfMastroQUU
 
MUSYOP: Towards a Query Optimization for Heterogeneous Distributed Database S...
MUSYOP: Towards a Query Optimization for Heterogeneous Distributed Database S...MUSYOP: Towards a Query Optimization for Heterogeneous Distributed Database S...
MUSYOP: Towards a Query Optimization for Heterogeneous Distributed Database S...Institute of Information Systems (HES-SO)
 
databasesystemsconollyslide1-151102101031-lva1-app6892.pptx
databasesystemsconollyslide1-151102101031-lva1-app6892.pptxdatabasesystemsconollyslide1-151102101031-lva1-app6892.pptx
databasesystemsconollyslide1-151102101031-lva1-app6892.pptxsalutiontechnology
 
Denodo Partner Connect: Technical Webinar - Ask Me Anything
Denodo Partner Connect: Technical Webinar - Ask Me AnythingDenodo Partner Connect: Technical Webinar - Ask Me Anything
Denodo Partner Connect: Technical Webinar - Ask Me AnythingDenodo
 
Complex Er[jl]ang Processing with StreamBase
Complex Er[jl]ang Processing with StreamBaseComplex Er[jl]ang Processing with StreamBase
Complex Er[jl]ang Processing with StreamBasedarach
 
Limits of RDBMS and Need for NoSQL in Bioinformatics
Limits of RDBMS and Need for NoSQL in BioinformaticsLimits of RDBMS and Need for NoSQL in Bioinformatics
Limits of RDBMS and Need for NoSQL in BioinformaticsDan Sullivan, Ph.D.
 
Sullivan GBCB Seminar Fall 2014 - Limits of RDMS for Bioinformatics v2
Sullivan GBCB Seminar Fall 2014 - Limits of RDMS for Bioinformatics v2Sullivan GBCB Seminar Fall 2014 - Limits of RDMS for Bioinformatics v2
Sullivan GBCB Seminar Fall 2014 - Limits of RDMS for Bioinformatics v2Dan Sullivan, Ph.D.
 
Events and metrics the Lifeblood of Webops
Events and metrics the Lifeblood of WebopsEvents and metrics the Lifeblood of Webops
Events and metrics the Lifeblood of WebopsDatadog
 
Patterns & Practices of Microservices
Patterns & Practices of MicroservicesPatterns & Practices of Microservices
Patterns & Practices of MicroservicesWesley Reisz
 
History of database processing module 1 (2)
History of database processing module 1 (2)History of database processing module 1 (2)
History of database processing module 1 (2)chottu89
 
Prague data management meetup 2018-03-27
Prague data management meetup 2018-03-27Prague data management meetup 2018-03-27
Prague data management meetup 2018-03-27Martin Bém
 
Running a Megasite on Microsoft Technologies
Running a Megasite on Microsoft TechnologiesRunning a Megasite on Microsoft Technologies
Running a Megasite on Microsoft Technologiesgoodfriday
 
Heterogenous data base
Heterogenous data baseHeterogenous data base
Heterogenous data baseHaqnawaz Ch
 
Using Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SFUsing Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SFAmazon Web Services
 

Ähnlich wie DAS, the Distributed Annotation System (20)

Lee oracle
Lee oracleLee oracle
Lee oracle
 
A Real World Guide to Building Highly Available Fault Tolerant SharePoint Farms
A Real World Guide to Building Highly Available Fault Tolerant SharePoint FarmsA Real World Guide to Building Highly Available Fault Tolerant SharePoint Farms
A Real World Guide to Building Highly Available Fault Tolerant SharePoint Farms
 
Overview of oracle database
Overview of oracle databaseOverview of oracle database
Overview of oracle database
 
Overview of oracle database
Overview of oracle databaseOverview of oracle database
Overview of oracle database
 
SingleLecture.pdf
SingleLecture.pdfSingleLecture.pdf
SingleLecture.pdf
 
MUSYOP: Towards a Query Optimization for Heterogeneous Distributed Database S...
MUSYOP: Towards a Query Optimization for Heterogeneous Distributed Database S...MUSYOP: Towards a Query Optimization for Heterogeneous Distributed Database S...
MUSYOP: Towards a Query Optimization for Heterogeneous Distributed Database S...
 
databasesystemsconollyslide1-151102101031-lva1-app6892.pptx
databasesystemsconollyslide1-151102101031-lva1-app6892.pptxdatabasesystemsconollyslide1-151102101031-lva1-app6892.pptx
databasesystemsconollyslide1-151102101031-lva1-app6892.pptx
 
Using Data Lakes
Using Data Lakes Using Data Lakes
Using Data Lakes
 
Denodo Partner Connect: Technical Webinar - Ask Me Anything
Denodo Partner Connect: Technical Webinar - Ask Me AnythingDenodo Partner Connect: Technical Webinar - Ask Me Anything
Denodo Partner Connect: Technical Webinar - Ask Me Anything
 
Complex Er[jl]ang Processing with StreamBase
Complex Er[jl]ang Processing with StreamBaseComplex Er[jl]ang Processing with StreamBase
Complex Er[jl]ang Processing with StreamBase
 
Limits of RDBMS and Need for NoSQL in Bioinformatics
Limits of RDBMS and Need for NoSQL in BioinformaticsLimits of RDBMS and Need for NoSQL in Bioinformatics
Limits of RDBMS and Need for NoSQL in Bioinformatics
 
Sullivan GBCB Seminar Fall 2014 - Limits of RDMS for Bioinformatics v2
Sullivan GBCB Seminar Fall 2014 - Limits of RDMS for Bioinformatics v2Sullivan GBCB Seminar Fall 2014 - Limits of RDMS for Bioinformatics v2
Sullivan GBCB Seminar Fall 2014 - Limits of RDMS for Bioinformatics v2
 
Events and metrics the Lifeblood of Webops
Events and metrics the Lifeblood of WebopsEvents and metrics the Lifeblood of Webops
Events and metrics the Lifeblood of Webops
 
Oracle archi ppt
Oracle archi pptOracle archi ppt
Oracle archi ppt
 
Patterns & Practices of Microservices
Patterns & Practices of MicroservicesPatterns & Practices of Microservices
Patterns & Practices of Microservices
 
History of database processing module 1 (2)
History of database processing module 1 (2)History of database processing module 1 (2)
History of database processing module 1 (2)
 
Prague data management meetup 2018-03-27
Prague data management meetup 2018-03-27Prague data management meetup 2018-03-27
Prague data management meetup 2018-03-27
 
Running a Megasite on Microsoft Technologies
Running a Megasite on Microsoft TechnologiesRunning a Megasite on Microsoft Technologies
Running a Megasite on Microsoft Technologies
 
Heterogenous data base
Heterogenous data baseHeterogenous data base
Heterogenous data base
 
Using Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SFUsing Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SF
 

Mehr von Rafael C. Jimenez

BMB Resource Integration Workshop
BMB Resource Integration WorkshopBMB Resource Integration Workshop
BMB Resource Integration Workshop Rafael C. Jimenez
 
Proteomics repositories integration using EUDAT resources
Proteomics repositories integration using EUDAT resourcesProteomics repositories integration using EUDAT resources
Proteomics repositories integration using EUDAT resourcesRafael C. Jimenez
 
Summary of Technical Coordinators discussions
Summary of Technical Coordinators discussionsSummary of Technical Coordinators discussions
Summary of Technical Coordinators discussionsRafael C. Jimenez
 
The European life-science data infrastructure: Data, Computing and Services ...
The European life-science data infrastructure: Data, Computing and Services ...The European life-science data infrastructure: Data, Computing and Services ...
The European life-science data infrastructure: Data, Computing and Services ...Rafael C. Jimenez
 
Standardisation in BMS European infrastructures
Standardisation in BMS European infrastructuresStandardisation in BMS European infrastructures
Standardisation in BMS European infrastructuresRafael C. Jimenez
 
An introduction to programmatic access
An introduction to programmatic accessAn introduction to programmatic access
An introduction to programmatic accessRafael C. Jimenez
 
Life science requirements from e-infrastructure: initial results from a joint...
Life science requirements from e-infrastructure:initial results from a joint...Life science requirements from e-infrastructure:initial results from a joint...
Life science requirements from e-infrastructure: initial results from a joint...Rafael C. Jimenez
 
Technical activities in ELIXIR Europe
Technical activities in ELIXIR EuropeTechnical activities in ELIXIR Europe
Technical activities in ELIXIR EuropeRafael C. Jimenez
 
Challenges of big data. Summary day 1.
Challenges of big data. Summary day 1.Challenges of big data. Summary day 1.
Challenges of big data. Summary day 1.Rafael C. Jimenez
 
Challenges of big data. Aims of the workshop.
Challenges of big data. Aims of the workshop.Challenges of big data. Aims of the workshop.
Challenges of big data. Aims of the workshop.Rafael C. Jimenez
 
Data submissions and archiving raw data in life sciences. A pilot with Proteo...
Data submissions and archiving raw data in life sciences. A pilot with Proteo...Data submissions and archiving raw data in life sciences. A pilot with Proteo...
Data submissions and archiving raw data in life sciences. A pilot with Proteo...Rafael C. Jimenez
 
ELIXIR and data grand challenges in life sciences
ELIXIR and data grand challenges in life sciencesELIXIR and data grand challenges in life sciences
ELIXIR and data grand challenges in life sciencesRafael C. Jimenez
 
SASI, A lightweight standard for exchanging course information
SASI, A lightweight standard for exchanging course informationSASI, A lightweight standard for exchanging course information
SASI, A lightweight standard for exchanging course information Rafael C. Jimenez
 

Mehr von Rafael C. Jimenez (20)

BMB Resource Integration Workshop
BMB Resource Integration WorkshopBMB Resource Integration Workshop
BMB Resource Integration Workshop
 
ELIXIR
ELIXIRELIXIR
ELIXIR
 
Proteomics repositories integration using EUDAT resources
Proteomics repositories integration using EUDAT resourcesProteomics repositories integration using EUDAT resources
Proteomics repositories integration using EUDAT resources
 
ELIXIR
ELIXIRELIXIR
ELIXIR
 
Summary of Technical Coordinators discussions
Summary of Technical Coordinators discussionsSummary of Technical Coordinators discussions
Summary of Technical Coordinators discussions
 
ELIXIR
ELIXIRELIXIR
ELIXIR
 
The European life-science data infrastructure: Data, Computing and Services ...
The European life-science data infrastructure: Data, Computing and Services ...The European life-science data infrastructure: Data, Computing and Services ...
The European life-science data infrastructure: Data, Computing and Services ...
 
Standardisation in BMS European infrastructures
Standardisation in BMS European infrastructuresStandardisation in BMS European infrastructures
Standardisation in BMS European infrastructures
 
ELIXIR
ELIXIRELIXIR
ELIXIR
 
ELIXIR
ELIXIRELIXIR
ELIXIR
 
Standards
StandardsStandards
Standards
 
ELIXIR TCG update
ELIXIR TCG updateELIXIR TCG update
ELIXIR TCG update
 
An introduction to programmatic access
An introduction to programmatic accessAn introduction to programmatic access
An introduction to programmatic access
 
Life science requirements from e-infrastructure: initial results from a joint...
Life science requirements from e-infrastructure:initial results from a joint...Life science requirements from e-infrastructure:initial results from a joint...
Life science requirements from e-infrastructure: initial results from a joint...
 
Technical activities in ELIXIR Europe
Technical activities in ELIXIR EuropeTechnical activities in ELIXIR Europe
Technical activities in ELIXIR Europe
 
Challenges of big data. Summary day 1.
Challenges of big data. Summary day 1.Challenges of big data. Summary day 1.
Challenges of big data. Summary day 1.
 
Challenges of big data. Aims of the workshop.
Challenges of big data. Aims of the workshop.Challenges of big data. Aims of the workshop.
Challenges of big data. Aims of the workshop.
 
Data submissions and archiving raw data in life sciences. A pilot with Proteo...
Data submissions and archiving raw data in life sciences. A pilot with Proteo...Data submissions and archiving raw data in life sciences. A pilot with Proteo...
Data submissions and archiving raw data in life sciences. A pilot with Proteo...
 
ELIXIR and data grand challenges in life sciences
ELIXIR and data grand challenges in life sciencesELIXIR and data grand challenges in life sciences
ELIXIR and data grand challenges in life sciences
 
SASI, A lightweight standard for exchanging course information
SASI, A lightweight standard for exchanging course informationSASI, A lightweight standard for exchanging course information
SASI, A lightweight standard for exchanging course information
 

Kürzlich hochgeladen

Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfmuntazimhurra
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptxanandsmhk
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencySheetal Arora
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...Sérgio Sacani
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000Sapana Sha
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxkessiyaTpeter
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPirithiRaju
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptxRajatChauhan518211
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsSérgio Sacani
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhousejana861314
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfSumit Kumar yadav
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )aarthirajkumar25
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxgindu3009
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)Areesha Ahmad
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...ssifa0344
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)PraveenaKalaiselvan1
 
VIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PVIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PPRINCE C P
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoSérgio Sacani
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)Areesha Ahmad
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...RohitNehra6
 

Kürzlich hochgeladen (20)

Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptx
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhouse
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdf
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptx
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
VIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PVIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C P
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on Io
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 

DAS, the Distributed Annotation System

  • 1. Proteomics Bioinformatics WTAC 13-17 December 2010 Rafael Jimenez rafael@ebi.ac.uk EnCORE presentation DAS Distributed Annotation System
  • 2. Table of contents • DAS  What is it?  Commands and queries  Why should I use it?  Documentation  Clients and servers
  • 4. DAS, The Distributed Annotation System The Distributed Annotation System is… – A network of biological data sources – A Service Oriented Architecture (SOA) – RESTful web service – An example of federation • Uniform access to multiple repositories of biological data. • Repositories distributed in different geographical locations. The DAS Protocol is… – An integration platform – A client-server protocol – An agreed standard for web services
  • 5. 23.08.18 5 DAS data types Genome sequence Sequence alignments Protein sequence Protein-protein interaction Gel 2D EMAP 3DM Protein structure Protein structure EMAP 3DM Protein-protein interaction Protein structure Gel 2D Mass spectrometry Epigenetics Phenotype Functional genomics Structural genomics Protein sequence Alignment servers Annotation servers Reference servers
  • 6. The Distributed Annotation System, 2001 Dowell et al; BMC Bioinformatics. 2001; 2: 7. Published online 2001 October 10. DAS, Architectural Overview illustration
  • 7. Service broker Service consumer Service provider Service Contract ... ... Interact PublishFind Service Oriented Architecture DAS implementation DAS ... ... ... DAS Registry DAS Clients Annotation sources Reference source Alignment sources Alignment sources Alignment sources Annotation sources Annotation sources DAS Clients DAS Clients Protocol
  • 10. Example client behaviour Standardization allows clients to connect to different DAS sources without additional programming Andy Jenkinson
  • 12. DAS – Andy Jenkinson 23.08.1812 Query model Structured REST URL – http://server/das/source/command?arguments – servers, data sources, commands, parameters Reference object – e.g. “chromosome X” Reference servers provide sequence – http://server/das/source/sequence?segment=X:1,500 Annotation servers provide features – http://server/das/source/features?segment=X:1,500
  • 13. DAS – Andy Jenkinson 23.08.1813 Data model Lightweight XML http://server/das/source/features?segment=X:1,500 <SEGMENT id=“X” start=“1” stop=“500”> <FEATURE id=“…”> <TYPE id=“…” category=“…”>…</TYPE> <METHOD id=“…”>…</METHOD> <START>…</START> <END>…</END> </FEATURE> <FEATURE id=“…”> … </FEATURE> </SEGMENT> http://server/das/source/features?segment=X:1,500 <SEGMENT id=“X” start=“1” stop=“500”> <FEATURE id=“…”> <TYPE id=“…” category=“…”>…</TYPE> <METHOD id=“…”>…</METHOD> <START>…</START> <END>…</END> </FEATURE> <FEATURE id=“…”> … </FEATURE> </SEGMENT>
  • 14. DAS Annotation source - Protein Feature Request Non-positional feature Positional feature http://www.ebi.ac.uk/das-srv/uniprot/das/uniprot/features?segment=Q12345
  • 15. DAS Reference source - Protein Sequence Request http://www.ebi.ac.uk/das-srv/uniprot/das/uniprot/sequence?segment=Q12345
  • 16. More DAS Commands • Alignment, Structure and Interaction • More … http://server/das/source/entry_points – entry_points: List of available “chromosomes | contigs | proteins | …” http://server/das/source/types – types – provides a summary of the feature types for a segment. http://server/das/source/stylesheet – stylesheet – gives hints to the DAS client about how to display the feature types. Can be ignored of course. http://server/das/sources – sources – list of available sources in one DAS server. Replaces the original, underspecified dsn command. http://www.biodas.org/wiki/DAS1.6
  • 17. Why should I use it?
  • 18. DAS design principles. Data remains distributed: "live" data; data providers retain responsibility; good for changing data; spreads resources. Easy for data providers to implement: simple protocol; lots of data providers.
  • 19. DAS design principles (continued). Principally for display: should be responsive (fast); region-targeted queries; lightweight infrastructure. Downsides: rigid data model; weak semantics.
  • 23. Versions of DAS (timeline figure, 2001 to 2011). Specifications shown: DAS 1.01, DAS 1.53, DAS 1.53E and DAS 1.6 (DAS 1); DAS 2.0 and DAS 2.1 (DAS/2). Estimated source counts shown: ~8, ~250, ~380, ~650 and ~1300 sources.
  • 26. List of DAS Servers
  • 27. DAS client libraries: Bio::Das::Lite (Perl), Dasobert (Java)
  • 28. List of DAS Clients • Ensembl uses DAS to pull in genomic, gene and protein annotations. It also provides data via DAS. • Gbrowse is a generic genome browser, and is both a consumer and provider of DAS. • IGB is a desktop application for viewing genomic data. • SPICE is an application for projecting protein annotations onto 3D structures. • Dasty2 is a web-based viewer for protein annotations. • Jalview is a multiple alignment editor. • PeppeR is a graphical viewer for 3D electron microscopy data. • DASMI is an integration portal for protein interaction data. • DASher is a Java-based viewer for protein annotations. • EpiC presents structure-function summaries for antibody design. • STRAP is a STRucture-based sequence Alignment Program.
  • 31. Protein structure data: Spice-Sisyphus
  • 34. EMAP data. EMAP: The Edinburgh Mouse Atlas Project. Gene expression databases (EMAGE & GXD). DAS reference server: EMAP (ontology). DAS annotation servers: EMAGE, GXD.
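
The query model and data model from slides 12 and 13 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `das_url` and `parse_features` are hypothetical, and the sample response fills the elided "…" fields of the slide with made-up values so that the parser has something to read. No network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET


def das_url(server, source, command, segment=None):
    """Build a URL of the form http://server/das/source/command?arguments.

    Hypothetical helper; only the URL layout comes from the slides.
    """
    url = f"{server.rstrip('/')}/das/{source}/{command}"
    if segment is not None:
        # Keep ':' and ',' unescaped so the segment reads as on the slide.
        url += "?" + urlencode({"segment": segment}, safe=":,")
    return url


# Invented sample data shaped like the XML on slide 13.
SAMPLE_RESPONSE = """\
<SEGMENT id="X" start="1" stop="500">
  <FEATURE id="feat1">
    <TYPE id="exon" category="transcription">exon</TYPE>
    <METHOD id="curated">curated</METHOD>
    <START>10</START>
    <END>90</END>
  </FEATURE>
  <FEATURE id="feat2">
    <TYPE id="repeat" category="similarity">repeat</TYPE>
    <METHOD id="predicted">predicted</METHOD>
    <START>120</START>
    <END>180</END>
  </FEATURE>
</SEGMENT>
"""


def parse_features(xml_text):
    """Return one dict per FEATURE element found in the response."""
    root = ET.fromstring(xml_text)
    features = []
    # iter() also finds FEATURE elements nested below a wrapper element.
    for feature in root.iter("FEATURE"):
        type_el = feature.find("TYPE")
        features.append({
            "id": feature.get("id"),
            "type": type_el.get("id") if type_el is not None else None,
            "start": int(feature.findtext("START", default="0")),
            "end": int(feature.findtext("END", default="0")),
        })
    return features


if __name__ == "__main__":
    print(das_url("http://server", "source", "features", "X:1,500"))
    for feat in parse_features(SAMPLE_RESPONSE):
        print(feat)
```

A real client would fetch the URL over HTTP and pass the response body to the parser; that step is left out so the sketch stays independent of any particular server.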

Editor's notes

  1. An integration platform for biological data: a way of bringing together data from different providers. Federation unifies data sources that differ from one another.
  2. The annotations are stored locally in a database or on file and are served to a DAS client from a DAS server. The real power of DAS comes from the fact that a DAS client can request information from many DAS servers about the same molecule and integrate this information into a single view or analysis.
  3. The communication between the DAS client and the DAS server is done using standard HTTP requests that return simple XML responses. The DAS client pulls annotations from data sources on one or several DAS annotation servers and displays them on sequence obtained from a common reference server that is considered to be the 'authority' for the sequence.
  4. The URL is a well-formed hierarchy: each server has one or more sources, and each source implements one or more commands. The sequence command provides sequence, and the features command provides sequence annotations. The stylesheet command allows the server to govern how features are rendered by the client; it works by specifying the type and colour of glyph to use for each type of feature. For instance, the COSMIC cancer mutation database DAS server specifies that substitutions should be drawn as crosses, whereas insertions are drawn as triangles.
  5. Live: warehouses allow fast access, but their data is often out of sync with the source database. Providers are responsible for the data, and clients are shielded from database changes. This suits rapidly changing data such as ENCODE (compare warehouses). It makes a lot of sense to spread resources given the topology of the network. The protocol is intrinsically simple. Dumb server: all it has to do is adapt its data medium to XML, and existing implementations make that easy. Clever client: it handles the presentation of the data.
  6. Fast: user-driven applications have to be fast, as users are only prepared to wait a couple of seconds for content. The rigid data model means data providers do not have the freedom to put all of their data in, but it ensures the system is generic, so clients get additional data at zero cost. Semantics are weak, though this is being addressed with the ontology.
  7. Graphic representation of the evolution of "Versions of DAS". It gives a rough idea of when the different specifications were adopted and when DAS/2 started as an independent specification. It also shows an estimate of the available DAS sources per year for DAS 1 and DAS/2.
  8. Integration of biological data of various types and development of adapted bioinformatics tools represent critical objectives to enable research at the systems level. The European Network of Excellence ENFIN is engaged in developing an adapted infrastructure to connect databases and platforms, to enable both the generation of new bioinformatics tools and the experimental validation of computational predictions. Beyond the use of common standards to format individual datasets, there is a need for sophisticated informatics platforms to enable mining data across various domains, sources, formats and types. The aim of the EnCORE project is to integrate, across different disciplines, an extensive list of database resources and analysis tools in a computationally accessible and extensible manner, facilitating automated data retrieval and processing with a special focus on systems biology. The EnCORE platform is available as a collection of web services with a common standard format that is easy to integrate into workflow management software such as Taverna. Additionally, EnCORE services are accessible through EnVISION, a graphical web user interface providing elaborated information such as molecular interactions, biological pathways and computational models of pathways.
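
Notes 2 and 3 describe a client that pulls annotations from several annotation servers and displays them on sequence from a common reference server. A minimal sketch of that integration step follows. The function name `merge_annotations`, the source names and all of the data are hypothetical, and the code assumes 1-based, inclusive coordinates; features would in practice come from parsed DAS responses rather than literals.

```python
def merge_annotations(reference_sequence, annotations_by_source):
    """Project features from several sources onto one reference sequence.

    Returns one "track" per source. Features that fall outside the
    reference are dropped. Coordinates are assumed 1-based and inclusive.
    """
    tracks = {}
    length = len(reference_sequence)
    for source_name, features in annotations_by_source.items():
        track = []
        for feat in features:
            start, end = feat["start"], feat["end"]
            if 1 <= start <= end <= length:
                # Attach the residues the feature covers on the reference.
                covered = reference_sequence[start - 1:end]
                track.append({**feat, "residues": covered})
        tracks[source_name] = track
    return tracks


if __name__ == "__main__":
    # Invented reference sequence and features, for illustration only.
    reference = "MKTAYIAKQR"
    annotations = {
        "source_a": [{"id": "a1", "start": 2, "end": 4}],
        "source_b": [{"id": "b1", "start": 8, "end": 12}],  # out of range
    }
    for name, track in merge_annotations(reference, annotations).items():
        print(name, track)
```

Because every source answers in the same data model, the client needs no source-specific code: adding another annotation server only adds another entry to the dictionary.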