Reactome2JSON-LD
Converting the Reactome database to JSON-LD
and exploring its potential in Elasticsearch/Siren Investigate
François Belleau @ Arnaud Droit Lab
19 January 2021
Initial goal
Create a BioMart-like user interface built around the Reactome,
ChEBI, UniProt and GO databases, all interlinked with URIs.
Project’s final architecture
[Architecture diagram. Data source access methods: REST/JSON, REST/JSON, EBI REST/JSON, WSDL/XML, TXT file, XML file]
1 500 000 documents from 6 data sources
Initial project’s data model
Siren Investigate final data model
How we did it
For datasource in [reactome, chebi, uniprot, go, resid, intact]:
1. ETL for data transformation
   a. Get the list of object IDs
   b. For each document, using Python:
      i. Use the API (REST, WSDL, file) to fetch the object
      ii. Load the document into an Elasticsearch index
   c. Convert each document to JSON-LD format using an Elasticsearch pipeline
   d. Export the index to the Siren Investigate server using elasticsearch-dump
2. With Siren Investigate
   a. For each datasource
      i. Create visualizations
      ii. Combine them to create a dashboard
   b. Configure a relational model
   c. Explore Reactome data in the graph visualizer
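The outer loop of step 1 can be sketched in a few lines of Python. This is only an illustration of the control flow described above, not the project's actual code: `run_etl` and the `list_ids`, `fetch_one` and `load` callables are hypothetical names, and the real fetchers would wrap the REST, WSDL or file access of each datasource.

```python
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical signatures; the slides do not name these functions.
ListIds = Callable[[], Iterable[str]]
FetchOne = Callable[[str], dict]
LoadOne = Callable[[str, str, dict], None]


def run_etl(datasources: Dict[str, Tuple[ListIds, FetchOne]], load: LoadOne) -> Dict[str, int]:
    """Run steps 1.a and 1.b for every datasource and count loaded documents.

    `datasources` maps an index name to a (list_ids, fetch_one) pair.
    """
    counts: Dict[str, int] = {}
    for name, (list_ids, fetch_one) in datasources.items():
        loaded = 0
        for object_id in list_ids():      # step 1.a: list of object IDs
            doc = fetch_one(object_id)    # step 1.b.i: fetch one object
            load(name, object_id, doc)    # step 1.b.ii: load it into the index
            loaded += 1
        counts[name] = loaded
    return counts
```

Passing the fetch and load functions as arguments keeps the loop identical for all six datasources; only the per-source callables change.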
Simple python scripts to fetch API and load into ES
available here https://github.com/fbelleau/swat4ls-2021/tree/main/stuff2json-ld
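For readers who have not opened the repository, here is a minimal sketch of the loading half, assuming documents are sent through the Elasticsearch `_bulk` endpoint (the linked scripts may well index documents one at a time instead). `build_bulk_body` is a hypothetical helper; it only prepares the newline-delimited JSON payload and does not contact a server.

```python
import json
from typing import Iterable, Tuple


def build_bulk_body(index: str, docs: Iterable[Tuple[str, dict]]) -> str:
    """Return an NDJSON payload for the Elasticsearch `_bulk` endpoint.

    Each document contributes an action line followed by its source line,
    and a non-empty payload ends with a newline, as the bulk format requires.
    """
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source, ensure_ascii=False))
    return ("\n".join(lines) + "\n") if lines else ""
```

The resulting string would be POSTed to `/_bulk` with the `Content-Type: application/x-ndjson` header.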
ES pipelines to transform JSON to JSON-LD
at warp speed in the 4 nodes ES cluster
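The pipeline definitions themselves are not reproduced in the slides. As a rough illustration of what such a JSON to JSON-LD step amounts to, the same transformation is written here in plain Python rather than as an Elasticsearch ingest pipeline. Everything specific is an assumption: the base URI, the `@vocab` context, and the `stId` and `schemaClass` field names are placeholders chosen for the example.

```python
from copy import deepcopy

# Assumed values for illustration only; the slides do not show the real ones.
BASE_URI = "http://example.org/reactome/"
CONTEXT = {"@vocab": "http://example.org/vocab/"}


def to_jsonld(doc: dict, id_field: str = "stId", type_field: str = "schemaClass") -> dict:
    """Return a copy of `doc` with @context, @id and (if available) @type added."""
    out = deepcopy(doc)
    out["@context"] = CONTEXT
    out["@id"] = BASE_URI + str(doc[id_field])
    if type_field in doc:
        out["@type"] = doc[type_field]
    return out
```

The original keys are kept untouched, so the document remains ordinary JSON for Elasticsearch while also being readable as JSON-LD.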
JSON-LD validation @ https://json-ld.org/playground/
Build Siren visualizations and dashboards
Using the web user interface:
1. Create a visualization
2. Edit the visualization parameters
3. Add the visualization to a dashboard
Create relational data model in Siren
The BioMart-like user
interface, and more...
Using the free Siren Investigate Community Edition
https://siren.io/siren-platform-editions/
Slicing and dicing a dataset with facets and text search
Main feature of BioMart
Full text search over 6 data sources
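Behind a search box like this sits an ordinary Elasticsearch search request. The sketch below builds one such request spanning several indices with a `query_string` query. `build_search_request` is a hypothetical helper, and the index names are assumed to match the datasource names listed earlier; the actual index names are not given in the slides.

```python
import json
from typing import List, Tuple


def build_search_request(indices: List[str], text: str, size: int = 10) -> Tuple[str, str]:
    """Return (path, JSON body) for a full-text search across several indices."""
    path = "/" + ",".join(indices) + "/_search"
    body = {"size": size, "query": {"query_string": {"query": text}}}
    return path, json.dumps(body)


# Assumed index names, taken from the datasource list.
SOURCES = ["reactome", "chebi", "uniprot", "go", "resid", "intact"]
path, body = build_search_request(SOURCES, "glycolysis")
```

Here `path` is `/reactome,chebi,uniprot,go,resid,intact/_search`, and the body can be sent with any HTTP client.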
Full text search in Reactome using relations
BioMart was relational
148 pathways
222 physical entities
Visual dashboard with facet browsing
Siren Investigate graph browser
ChEBI molecules involved in process GO:0006096 according to Reactome
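In Siren Investigate this question is answered by following relations between indexes in the graph browser. The idea of joining documents through shared URIs can be illustrated with a small in-memory sketch. The field names (`goBiologicalProcess`, `participants`, `chebi`) and all identifiers other than GO:0006096 are made-up placeholders, not the project's data model.

```python
from typing import Dict, List, Set


def molecules_for_process(go_uri: str, pathways: List[dict], entities: Dict[str, dict]) -> Set[str]:
    """Follow pathway -> physical entity -> ChEBI links for one GO process URI."""
    found: Set[str] = set()
    for pathway in pathways:
        if pathway.get("goBiologicalProcess") != go_uri:
            continue
        for entity_uri in pathway.get("participants", []):
            chebi_uri = entities.get(entity_uri, {}).get("chebi")
            if chebi_uri:
                found.add(chebi_uri)
    return found


# Toy data with placeholder identifiers, for illustration only.
pathways = [
    {"@id": "reactome:P1", "goBiologicalProcess": "go:0006096", "participants": ["reactome:E1", "reactome:E2"]},
    {"@id": "reactome:P2", "goBiologicalProcess": "go:0000000", "participants": ["reactome:E3"]},
]
entities = {
    "reactome:E1": {"chebi": "chebi:EXAMPLE_A"},
    "reactome:E2": {},
    "reactome:E3": {"chebi": "chebi:EXAMPLE_B"},
}
result = molecules_for_process("go:0006096", pathways, entities)
```

With this toy data, `result` contains only `chebi:EXAMPLE_A`, since the second pathway is annotated with a different process and `reactome:E2` has no ChEBI link.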
Conclusions
Lessons learned
● Crawling existing REST/JSON APIs is a simple and fast way to access life
science data sources
● Converting JSON to JSON-LD is simple and efficient
● Elasticsearch is a good container for storing JSON-LD
○ with dereferenceable URIs for free
○ fast text search and query language
○ efficient disk storage
● Siren Investigate Community Edition works
○ to browse data with facets (the BioMart way)
○ to build visual dashboards
○ to query a relational data model composed of ES indexes
○ to explore the knowledge graph using a high-performance graph browser
● We now know how to store RDF in Elasticsearch.
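One reading of the "dereferenceable URI for free" point is that every indexed document can be retrieved over HTTP through Elasticsearch's get-document endpoint, `GET /<index>/_doc/<id>`. The helper below builds such a URL; `document_url`, the host and the example identifiers are assumptions made for illustration.

```python
from urllib.parse import quote


def document_url(es_host: str, index: str, doc_id: str) -> str:
    """Build the URL of the Elasticsearch get-document endpoint for one id."""
    return f"{es_host.rstrip('/')}/{quote(index, safe='')}/_doc/{quote(doc_id, safe='')}"


# Assumed local cluster and example identifiers.
url = document_url("http://localhost:9200", "reactome", "R-HSA-70171")
```

Identifiers containing reserved characters are percent-encoded, so an id such as `CHEBI:15422` becomes `CHEBI%3A15422` in the path.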
Next
We are building the Kibio.science project at ADLab. It is
a new approach to data and knowledge
management that will provide a powerful solution
by integrating and connecting healthcare and
research.
It is built with Elasticsearch and Siren
Investigate and will be hosted on our own
infrastructure.
We are open to collaboration with the SWAT4LS
community.
Arnaud Droit Lab, the computational biology platform
of the research center of Quebec
https://compbio.ca/
Thanks to my colleagues at ADLab
● Dr Arnaud Droit (PI)
● Mickael Leclercq
● Regis Ongaro-Carcy
and support from the Siren team
● Manu Agarwal
● Andrew Winter

Driving Behavioral Change for Information Management through Data-Driven Gree...
 

Reactome2JSON-LD: Converting REACTOME Database to JSON-LD

  • 1. Reactome2JSON-LD: converting the Reactome database to JSON-LD and exploring its potential in Elasticsearch/Siren Investigate. François Belleau @ Arnaud Droit Lab, 19 January 2021
  • 3. Initial goal: create a BioMart-like user interface built around the Reactome, ChEBI, UniProt and GO databases, all interlinked with URIs.
  • 4. Project’s final architecture: 1 500 000 documents from 6 data sources, fetched through REST/JSON, EBI REST/JSON, WSDL/XML, TXT file and XML file interfaces
  • 7. How did we do it?
  • 8. For datasource in [reactome, chebi, uniprot, go, resid, intact]:
      1. ETL for data transformation
         a. Get the list of object IDs
         b. For each document, using Python
            i. Use the API (REST, WSDL, file) to fetch each object
            ii. Load the document into an Elasticsearch index
         c. Convert each document to JSON-LD format using an Elasticsearch pipeline
         d. Export the index to the Siren Investigate server using elasticsearch-dump
      2. With Siren Investigate
         a. For each datasource
            i. Create visualisations
            ii. Combine them to create a dashboard
         b. Configure a relational model
         c. Explore Reactome data in the graph visualizer
  • 9. Simple Python scripts to fetch from each API and load into ES, available at https://github.com/fbelleau/swat4ls-2021/tree/main/stuff2json-ld
  • 10. ES pipelines transform JSON to JSON-LD at warp speed on the 4-node ES cluster
  • 11. JSON-LD validation @ https://json-ld.org/playground/
  • 12. Build Siren visualizations and dashboards using a web user interface:
      1. Create a visualization
      2. Edit the visualization parameters
      3. Add the visualization to a dashboard
  • 13. Create relational data model in Siren
  • 14. The BioMart-like user interface, and more...
  • 15. Using the free Siren Investigate Community edition: https://siren.io/siren-platform-editions/
  • 16. Slicing and dicing a dataset with facets and text search (the main feature of BioMart)
  • 17. Full text search over 6 data sources
  • 18. Full-text search in Reactome using relations (BioMart was relational): 148 pathways, 222 physical entities
  • 19. Visual dashboard with facet browsing
  • 20. Siren Investigate graph browser: ChEBI molecules involved in process GO:0006096 according to Reactome
  • 22. Lessons learned
      ● Crawling existing REST/JSON APIs is a simple and fast way to access Life Science data sources
      ● Converting JSON to JSON-LD is simple and efficient
      ● Elasticsearch is a good container to store JSON-LD
         ○ with dereferenceable URIs for free
         ○ fast text search and query language
         ○ efficient disk storage
      ● Siren Investigate Community edition works
         ○ to browse data with facets (the BioMart way)
         ○ to build visual dashboards
         ○ to query a relational data model composed of ES indexes
         ○ to explore the knowledge graph using a performant graph browser
      ● We now know how to store RDF in Elasticsearch.
  • 23. Next
  • 24. We are building the Kibio.science project at ADLab. It is a new approach to data and knowledge management that will provide a powerful solution by integrating and connecting healthcare and research. It is built with Elasticsearch and Siren Investigate and will be hosted on our own infrastructure. We are open to collaboration with the SWAT4LS community.
  • 25. Arnaud Droit Lab, the computational biology platform of the research center of Quebec: https://compbio.ca/
      Thanks to my colleagues at ADLab
      ● Dr Arnaud Droit (PI)
      ● Mickael Leclercq
      ● Regis Ongaro-Carcy
      and support from the Siren team
      ● Manu Agarwal
      ● Andrew Winter
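
The ETL loop on slide 8 and the "dereferenceable URI" lesson on slide 22 can be illustrated with a minimal Python sketch. It is not the code from the linked repository. The talk performs the JSON to JSON-LD step inside Elasticsearch pipelines, whereas this sketch swaps in a plain Python function to show what such a step has to produce. The names `to_jsonld`, `bulk_actions`, `document_url` and `fake_fetch`, the `identifier` field, the `demo` index, and the `example.org` and `localhost` URLs are all hypothetical. The action dictionaries use the `_index`/`_id`/`_source` shape that the `elasticsearch.helpers.bulk` helper of the official Python client accepts, but nothing here contacts a cluster.

```python
from typing import Any, Callable, Dict, Iterable, Iterator
from urllib.parse import quote

Doc = Dict[str, Any]


def to_jsonld(doc: Doc, base_uri: str, id_field: str, type_name: str,
              context: Doc) -> Doc:
    """Return a copy of `doc` with JSON-LD keywords added (slide 8, step 1c)."""
    out = dict(doc)  # shallow copy so the fetched record is left untouched
    out["@context"] = context
    out["@id"] = base_uri + str(doc[id_field])  # URI that interlinks sources
    out["@type"] = type_name
    return out


def bulk_actions(index: str, ids: Iterable[str],
                 fetch: Callable[[str], Doc]) -> Iterator[Doc]:
    """Yield one bulk-index action per object ID (slide 8, steps 1a and 1b)."""
    for object_id in ids:
        yield {"_index": index, "_id": object_id, "_source": fetch(object_id)}


def document_url(es_base: str, index: str, doc_id: str) -> str:
    """Build the get-document URL (/<index>/_doc/<id>) for a stored record."""
    return f"{es_base.rstrip('/')}/{index}/_doc/{quote(doc_id, safe='')}"


def fake_fetch(object_id: str) -> Doc:
    """Hypothetical stand-in for a real REST, WSDL or file fetcher."""
    return {"identifier": object_id, "name": f"record {object_id}"}


actions = list(bulk_actions("demo", ["A1", "GO:0006096"], fake_fetch))
linked = to_jsonld(
    actions[1]["_source"],
    base_uri="http://example.org/go/",                # assumed URI prefix
    id_field="identifier",                            # assumed field name
    type_name="Term",                                 # assumed type
    context={"@vocab": "http://example.org/vocab/"},  # assumed vocabulary
)
print(linked["@id"])
print(document_url("http://localhost:9200", "demo", actions[1]["_id"]))
```

Passing the fetcher in as a function keeps the loop identical across the six data sources, so only `fetch` would change between a REST, WSDL or file-based source. With a real client, the generator could be handed to the bulk helper instead of being collected into a list.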