Semantic Trilogy Bio2RDF tutorial

The Bio2RDF project aims to transform silos of life science data into a globally distributed network of linked data for biological knowledge discovery. Bio2RDF creates and provides machine-understandable descriptions of biological entities using the RDF/RDFS/OWL Semantic Web languages. Using both syntactic and semantic data integration techniques, Bio2RDF integrates diverse biological data and enables SPARQL-based services across its globally distributed knowledge bases. The project has released 28 public databases in RDF format, all available on the internet through a SPARQL endpoint or by dereferencing URIs. Now that major data providers such as NCBO, UniProt, KEGG, PDB and EBI also expose their data as Linked Data, we need a framework to ease the building of mashup applications, and designing a workflow is a well-known approach to doing so. The tutorial proposes using an open source professional ETL software, Talend, to help with the RDFization of existing data and to automate the fetching of triples to populate a mashup in the OpenLink Virtuoso triplestore. How can we build a specific database to answer a very specialized question? How can we build a mashup by fetching linked data from the web? How can we merge our own lab results with the publicly available knowledge from the semantic web? Those are the questions we answer in the tutorial by proposing tools and methods to the participants. In this tutorial you will learn how to install and administer the Virtuoso triplestore; then we will show you how to load RDF triples directly from the web or from your own data, converted to RDF using Talend. Now that the Life Sciences semantic web is a reality, we need to make it answer our questions.

How to produce and consume
Linked Data the Bio2RDF way
(Using Virtuoso triplestore and Talend ETL)
François Belleau, Arnaud Droit
Centre de recherche du CHUQ, Laval University
Québec, Canada

Download stuff...
Virtuoso Triplestore
http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VOSDownload
Talend Software
http://www.talend.com/products/data-integration
Bio2RDF Talend jobs
http://sourceforge.net/p/bio2rdf/git/ci/master/tree/

Program
● Presentation of the Bio2RDF project and other public RDF
data providers such as NCBO, UniProt and KEGG
○ 15 minutes
● Virtuoso triplestore installation and administration
○ 30 minutes
● Talend Open Studio installation and basic introduction
○ 30 minutes
● Hands-on part of the tutorial
○ 90 minutes

Virtuoso triplestore installation and
administration (30 min.)
● Basic server configuration
● Installing the facet browser
● Loading RDF into the triplestore
● Submitting SPARQL queries
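
As an illustration of the last item, the sketch below builds a SPARQL Protocol GET request without sending it. It is not taken from the tutorial material: the endpoint URL (a local Virtuoso instance on port 8890), the graph name, and the function name `build_sparql_request` are assumptions chosen for the example.

```python
from urllib.parse import urlencode, urlsplit, parse_qs
from urllib.request import Request

# Assumption: a local Virtuoso instance exposing its SPARQL endpoint here.
ENDPOINT = "http://localhost:8890/sparql"


def build_sparql_request(query, endpoint=ENDPOINT, default_graph=None):
    """Build (but do not send) a SPARQL Protocol GET request."""
    params = {"query": query}
    if default_graph is not None:
        # Standard SPARQL Protocol parameter restricting the default graph.
        params["default-graph-uri"] = default_graph
    url = endpoint + "?" + urlencode(params)
    # Ask for the standard JSON serialization of SELECT results.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


# Hypothetical query and graph name, for illustration only.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"
request = build_sparql_request(QUERY, default_graph="http://example.org/mashup")
print(request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would submit the query, provided a server is actually listening at the assumed address.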

Talend Open Studio installation and
basic introduction (30 min.)
● Concept of JOB and Component
● Java compilation and exporting package
● How to access and transform data from SQL
databases or in XML, JSON or text format
● How to access the web and consume SOAP
service
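
The idea of a job as a chain of components can be pictured outside Talend as well. The sketch below is only a Python analogy, not Talend code: each function plays the role of a component, and `run_job` chains an input component into a mapping step. All function names, field names, and sample rows are invented for the example.

```python
import csv
import io
import json


def read_delimited(text, delimiter="\t"):
    """Input component: yield one dict per row of delimited text."""
    yield from csv.DictReader(io.StringIO(text), delimiter=delimiter)


def read_json(text):
    """Input component: yield one dict per object in a JSON array."""
    yield from json.loads(text)


def map_rows(rows, mapping):
    """Mapping component: rename fields according to `mapping`."""
    for row in rows:
        yield {new: row[old] for old, new in mapping.items()}


def run_job(rows, mapping):
    """Job: chain an input component into the mapping and collect the rows."""
    return list(map_rows(rows, mapping))


# Hypothetical sample inputs in two of the formats listed above.
TSV = "gene_id\tsymbol\n1017\tCDK2\n"
JSON_TEXT = '[{"gene_id": "7157", "symbol": "TP53"}]'
MAPPING = {"gene_id": "id", "symbol": "label"}

rows = run_job(read_delimited(TSV), MAPPING) + run_job(read_json(JSON_TEXT), MAPPING)
print(rows)
```

Both inputs end up in the same row structure, which is the point of the mapping step: downstream components no longer need to know where a row came from.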

Hands on part of the tutorial (90 minutes)
● Learning basic Talend techniques
● Fetching data from the web
● Creating triples in N-Triples format
● Parsing an XML document
● Accessing Virtuoso triplestore via JDBC API
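
Two of the steps above, parsing an XML document and creating triples in N-Triples format, can be sketched in a few lines of Python. This is a minimal illustration rather than the tutorial's Talend job: the XML layout, the `http://example.org/` URIs, and the function names are placeholders invented for the example, and the literal escaping covers only the most common characters.

```python
import xml.etree.ElementTree as ET

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def escape_literal(text):
    """Escape backslash, double quote, and line breaks for an N-Triples literal."""
    return (text.replace("\\", "\\\\").replace('"', '\\"')
                .replace("\n", "\\n").replace("\r", "\\r"))


def triple(subject, predicate, obj, literal=False):
    """Format one N-Triples line; `obj` is a URI unless `literal` is True."""
    o = '"%s"' % escape_literal(obj) if literal else "<%s>" % obj
    return "<%s> <%s> %s ." % (subject, predicate, o)


# Hypothetical input document; the element and attribute names are made up.
SAMPLE_XML = """<genes>
  <gene id="1017"><symbol>CDK2</symbol></gene>
  <gene id="7157"><symbol>TP53</symbol></gene>
</genes>"""


def xml_to_ntriples(xml_text, base="http://example.org/gene/"):
    """Emit a type triple and a label triple for every <gene> element."""
    lines = []
    for gene in ET.fromstring(xml_text).iter("gene"):
        uri = base + gene.get("id")  # placeholder URI scheme
        lines.append(triple(uri, RDF_TYPE, "http://example.org/vocab/Gene"))
        lines.append(triple(uri, RDFS_LABEL, gene.findtext("symbol"), literal=True))
    return "\n".join(lines) + "\n"


print(xml_to_ntriples(SAMPLE_XML), end="")
```

Because N-Triples is line-based, the output can be written to a file one record at a time and loaded into a triplestore afterwards.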

Would you like to contribute?
https://github.com/fbelleau/talend4sw
The project goal is to build Talend
components for Semantic Web.
