SlideShare ist ein Scribd-Unternehmen logo
1 von 22
Downloaden Sie, um offline zu lesen
SCALING OUT FEDERATED QUERIES
FOR LIFE SCIENCES DATA IN PRODUCTION
Dieter De Witte, Laurens De Vocht, et al.
dieter.dewitte@ugent.be
• IMEC– IDLAB – GHENT UNIVERSITY
• ONTOFORCE
Catch 22!?
A. No Semantic Web Applications
because no Semantic Data
B. No Semantic Data
because no applications
A. The LOD Cloud for Life Sciences...
Ontoforce’s
DISQOVER
covers
> 110
Life Sciences
Datasets
B. DISQOVER is an Exploratory
Semantic Search UI (faceted browsing)
To Click = To SPARQL
The missing link in our catch 22?
“How to run federated queries?”
Direct ETL
• Cloud Instances
• PAGO amis:
Scientific Benchmark
= Reproducible Benchmark
Benchmark Client
• 1 single-threaded warm-up
run (all 1,223 queries)
• 1 multi-threaded (8) run
• (8 x randomized order)
Database Node(s)
How to evaluate
an RDF Database solution?
Performance (
Data store,
Dataset,
Configuration,
Number of nodes,
Hardware (RAM)
)
Performance (
NoSQL Triple stores,
Watdiv 10M, 100M, 1000M,
Standard Configs,
Single Node,
32 GB RAM
)
SIGMOD 2016:
Single Node SOTA on artificial data
More data, more problems
Timeout
Query performance:
Virtuoso Leads, Blazegraph follows
Timeout
SWAT4LS 2016:
Multi-node SOTA on real data
Performance (
Scale out systems,
DISQOVER data,
Optimized Configs,
Multi-Node, Compression
64 GB RAM
)
How to deal with Big Linked Data?
1. Vertical Scaling: bigger box
2. Compression: smaller content
3. Horizontal Scaling: more boxes, 1 location
4. Federation: more boxes, more locations
V1, Bla1 (single node Virtuoso, Blazegraph)
V1_32 (32GB Virtuoso)
Fu1 (Fuseki + HDT)
V3 (Virtuoso cluster 3 nodes)
Fl3 (FluidOps, aka FedX)
DISQOVER dataset ...
and queries
Count, Union, Sort, Aggregations
Example Query 1:
Nesting, FILTERs, unbound triples
Example Query 4:
Aggregations, Optionals
Initial performance results were
counter-intuitive... and incorrect!!!
Worse hardware,
better performance?
Only Virtuoso-backed systems survive
multi-threaded benchmark
marks last successful query (no timeout)
1 x 1,223 queries 8 x 1,223 queries
No errors but incorrect #results!!!
FILTERs, UNIONs are challenging but
ORDER + GROUP + OPTIONAL dominate
COUNT DISTINCT
600 – 1,223
BGPs
Conclusions & Future Work
• Additional diagnostics for RDF solutions!
• Extend benchmarking software with query
correctness assessment!
• Multi-node RDF solutions???
• Towards Full paper:
– NoSQL for Ontoforce Data
– Scale out approaches for Watdiv + test LDF
– Release reusable end-to-end benchmark software:
• Setup AND Postprocessing
Thanks for your attention!!
SCALING OUT FEDERATED QUERIES
FOR LIFE SCIENCES DATA IN PRODUCTION
Dieter De Witte, Laurens De Vocht, et al.
contact: dieter.dewitte@ugent.be
slideshare:
• IMEC– IDLAB – GHENT UNIVERSITY
• ONTOFORCE

Weitere ähnliche Inhalte

Was ist angesagt?

FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...
FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...
FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...
Carole Goble
 
Reproducibility, Research Objects and Reality, Leiden 2016
Reproducibility, Research Objects and Reality, Leiden 2016Reproducibility, Research Objects and Reality, Leiden 2016
Reproducibility, Research Objects and Reality, Leiden 2016
Carole Goble
 

Was ist angesagt? (7)

ACS 248th Paper 146 VIVO/ScientistsDB Integration into Eureka
ACS 248th Paper 146 VIVO/ScientistsDB Integration into EurekaACS 248th Paper 146 VIVO/ScientistsDB Integration into Eureka
ACS 248th Paper 146 VIVO/ScientistsDB Integration into Eureka
 
Advances in Scientific Workflow Environments
Advances in Scientific Workflow EnvironmentsAdvances in Scientific Workflow Environments
Advances in Scientific Workflow Environments
 
BioSolr - Searching the stuff of life - Lucene/Solr Revolution 2015
BioSolr - Searching the stuff of life - Lucene/Solr Revolution 2015BioSolr - Searching the stuff of life - Lucene/Solr Revolution 2015
BioSolr - Searching the stuff of life - Lucene/Solr Revolution 2015
 
FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...
FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...
FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...
 
Reproducibility, Research Objects and Reality, Leiden 2016
Reproducibility, Research Objects and Reality, Leiden 2016Reproducibility, Research Objects and Reality, Leiden 2016
Reproducibility, Research Objects and Reality, Leiden 2016
 
247th ACS Meeting: The Eureka Research Workbench
247th ACS Meeting: The Eureka Research Workbench247th ACS Meeting: The Eureka Research Workbench
247th ACS Meeting: The Eureka Research Workbench
 
Reproducibility of model-based results: standards, infrastructure, and recogn...
Reproducibility of model-based results: standards, infrastructure, and recogn...Reproducibility of model-based results: standards, infrastructure, and recogn...
Reproducibility of model-based results: standards, infrastructure, and recogn...
 

Andere mochten auch

MEME – An Integrated Tool For Advanced Computational Experiments
MEME – An Integrated Tool For Advanced Computational ExperimentsMEME – An Integrated Tool For Advanced Computational Experiments
MEME – An Integrated Tool For Advanced Computational Experiments
GIScRG
 
LDOW2013 r&wbase: git for triples
LDOW2013 r&wbase: git for triplesLDOW2013 r&wbase: git for triples
LDOW2013 r&wbase: git for triples
Miel Vander Sande
 
Querying federations 
of Triple Pattern Fragments
Querying federations 
of Triple Pattern FragmentsQuerying federations 
of Triple Pattern Fragments
Querying federations 
of Triple Pattern Fragments
Ruben Verborgh
 
Opportunistic Linked Data Querying through Approximate Membership Metadata
Opportunistic Linked Data Querying through Approximate Membership MetadataOpportunistic Linked Data Querying through Approximate Membership Metadata
Opportunistic Linked Data Querying through Approximate Membership Metadata
Miel Vander Sande
 
Using EPUB 3 and the Open Web Platform for Enhanced Presentation and Machine-...
Using EPUB 3 and the Open Web Platform for Enhanced Presentation and Machine-...Using EPUB 3 and the Open Web Platform for Enhanced Presentation and Machine-...
Using EPUB 3 and the Open Web Platform for Enhanced Presentation and Machine-...
Pieter Heyvaert
 

Andere mochten auch (20)

ComparativeMotifFinding
ComparativeMotifFindingComparativeMotifFinding
ComparativeMotifFinding
 
Presentation Data Science Challenge
Presentation Data Science ChallengePresentation Data Science Challenge
Presentation Data Science Challenge
 
ResearchMeeting15April2015
ResearchMeeting15April2015ResearchMeeting15April2015
ResearchMeeting15April2015
 
Gemoda
GemodaGemoda
Gemoda
 
MEME – An Integrated Tool For Advanced Computational Experiments
MEME – An Integrated Tool For Advanced Computational ExperimentsMEME – An Integrated Tool For Advanced Computational Experiments
MEME – An Integrated Tool For Advanced Computational Experiments
 
Machines are the new Digital Natives
Machines are the new Digital NativesMachines are the new Digital Natives
Machines are the new Digital Natives
 
LDOW2013 r&wbase: git for triples
LDOW2013 r&wbase: git for triplesLDOW2013 r&wbase: git for triples
LDOW2013 r&wbase: git for triples
 
Towards an Interface for User-Friendly Linked Data Generation Administration
Towards an Interface for User-Friendly Linked Data Generation AdministrationTowards an Interface for User-Friendly Linked Data Generation Administration
Towards an Interface for User-Friendly Linked Data Generation Administration
 
Querying federations 
of Triple Pattern Fragments
Querying federations 
of Triple Pattern FragmentsQuerying federations 
of Triple Pattern Fragments
Querying federations 
of Triple Pattern Fragments
 
Opportunistic Linked Data Querying through Approximate Membership Metadata
Opportunistic Linked Data Querying through Approximate Membership MetadataOpportunistic Linked Data Querying through Approximate Membership Metadata
Opportunistic Linked Data Querying through Approximate Membership Metadata
 
Situation of open data in Flanders
Situation of open data in FlandersSituation of open data in Flanders
Situation of open data in Flanders
 
iRail: History & current issues
iRail: History & current issuesiRail: History & current issues
iRail: History & current issues
 
Querying Heterogeneous Linked Date Interfaces through Reasoning
Querying Heterogeneous Linked Date Interfaces through ReasoningQuerying Heterogeneous Linked Date Interfaces through Reasoning
Querying Heterogeneous Linked Date Interfaces through Reasoning
 
Using EPUB 3 and the Open Web Platform for Enhanced Presentation and Machine-...
Using EPUB 3 and the Open Web Platform for Enhanced Presentation and Machine-...Using EPUB 3 and the Open Web Platform for Enhanced Presentation and Machine-...
Using EPUB 3 and the Open Web Platform for Enhanced Presentation and Machine-...
 
ESWC2015 - Query Optimization for Clients of Linked Data Fragments
ESWC2015 - Query Optimization for Clients of Linked Data FragmentsESWC2015 - Query Optimization for Clients of Linked Data Fragments
ESWC2015 - Query Optimization for Clients of Linked Data Fragments
 
Time travelling through DBpedia
Time travelling through DBpediaTime travelling through DBpedia
Time travelling through DBpedia
 
Towards a Uniform User Interface for Editing Mapping Definitions
Towards a Uniform User Interface for Editing Mapping DefinitionsTowards a Uniform User Interface for Editing Mapping Definitions
Towards a Uniform User Interface for Editing Mapping Definitions
 
DBpedia Mappings Quality Assessment
DBpedia Mappings Quality AssessmentDBpedia Mappings Quality Assessment
DBpedia Mappings Quality Assessment
 
RMLEditor: A Graph-based Mapping Editor for Linked Data Mappings
RMLEditor: A Graph-based Mapping Editor for Linked Data MappingsRMLEditor: A Graph-based Mapping Editor for Linked Data Mappings
RMLEditor: A Graph-based Mapping Editor for Linked Data Mappings
 
Effect of Heuristics on Serendipity in Path-Based Storytelling with Linked Data
Effect of Heuristics on Serendipity in Path-Based Storytelling with Linked DataEffect of Heuristics on Serendipity in Path-Based Storytelling with Linked Data
Effect of Heuristics on Serendipity in Path-Based Storytelling with Linked Data
 

Ähnlich wie Scaling out federated queries for Life Sciences Data In Production

RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
Joaquin Delgado PhD.
 
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
 RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning... RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
S. Diana Hu
 

Ähnlich wie Scaling out federated queries for Life Sciences Data In Production (20)

Infinispan, a distributed in-memory key/value data grid and cache
 Infinispan, a distributed in-memory key/value data grid and cache Infinispan, a distributed in-memory key/value data grid and cache
Infinispan, a distributed in-memory key/value data grid and cache
 
Static analysis of java enterprise applications
Static analysis of java enterprise applicationsStatic analysis of java enterprise applications
Static analysis of java enterprise applications
 
Introduction to libre « fulltext » technology
Introduction to libre « fulltext » technologyIntroduction to libre « fulltext » technology
Introduction to libre « fulltext » technology
 
Faster Faster Faster! Datamarts with Hive at Yahoo
Faster Faster Faster! Datamarts with Hive at YahooFaster Faster Faster! Datamarts with Hive at Yahoo
Faster Faster Faster! Datamarts with Hive at Yahoo
 
Faster, Faster, Faster: The True Story of a Mobile Analytics Data Mart on Hive
Faster, Faster, Faster: The True Story of a Mobile Analytics Data Mart on HiveFaster, Faster, Faster: The True Story of a Mobile Analytics Data Mart on Hive
Faster, Faster, Faster: The True Story of a Mobile Analytics Data Mart on Hive
 
Making your data work for you: Scratchpads, publishing & the biodiversity dat...
Making your data work for you: Scratchpads, publishing & the biodiversity dat...Making your data work for you: Scratchpads, publishing & the biodiversity dat...
Making your data work for you: Scratchpads, publishing & the biodiversity dat...
 
Big Linked Data ETL Benchmark on Cloud Commodity Hardware
Big Linked Data ETL Benchmark on Cloud Commodity HardwareBig Linked Data ETL Benchmark on Cloud Commodity Hardware
Big Linked Data ETL Benchmark on Cloud Commodity Hardware
 
HDF5 FastQuery
HDF5 FastQueryHDF5 FastQuery
HDF5 FastQuery
 
Mining Whole Museum Collections Datasets for Expanding Understanding of Colle...
Mining Whole Museum Collections Datasets for Expanding Understanding of Colle...Mining Whole Museum Collections Datasets for Expanding Understanding of Colle...
Mining Whole Museum Collections Datasets for Expanding Understanding of Colle...
 
Intro to Big Data - Spark
Intro to Big Data - SparkIntro to Big Data - Spark
Intro to Big Data - Spark
 
Discovery Interfaces
Discovery InterfacesDiscovery Interfaces
Discovery Interfaces
 
The Web of Data: do we actually understand what we built?
The Web of Data: do we actually understand what we built?The Web of Data: do we actually understand what we built?
The Web of Data: do we actually understand what we built?
 
Testing Big Data: Automated Testing of Hadoop with QuerySurge
Testing Big Data: Automated  Testing of Hadoop with QuerySurgeTesting Big Data: Automated  Testing of Hadoop with QuerySurge
Testing Big Data: Automated Testing of Hadoop with QuerySurge
 
Hadoop Tutorial with @techmilind
Hadoop Tutorial with @techmilindHadoop Tutorial with @techmilind
Hadoop Tutorial with @techmilind
 
Scaling the (evolving) web data –at low cost-
Scaling the (evolving) web data –at low cost-Scaling the (evolving) web data –at low cost-
Scaling the (evolving) web data –at low cost-
 
Why databases cry at night
Why databases cry at nightWhy databases cry at night
Why databases cry at night
 
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
 
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
 RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning... RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
 
Facets and Pivoting for Flexible and Usable Linked Data Exploration
Facets and Pivoting for Flexible and Usable Linked Data ExplorationFacets and Pivoting for Flexible and Usable Linked Data Exploration
Facets and Pivoting for Flexible and Usable Linked Data Exploration
 
Tableau on Hadoop Meet Up: Advancing from Extracts to Live Connect
Tableau on Hadoop Meet Up: Advancing from Extracts to Live ConnectTableau on Hadoop Meet Up: Advancing from Extracts to Live Connect
Tableau on Hadoop Meet Up: Advancing from Extracts to Live Connect
 

Kürzlich hochgeladen

Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
amitlee9823
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
amitlee9823
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
AroojKhan71
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
amitlee9823
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
amitlee9823
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
amitlee9823
 

Kürzlich hochgeladen (20)

Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 

Scaling out federated queries for Life Sciences Data In Production

  • 1. SCALING OUT FEDERATED QUERIES FOR LIFE SCIENCES DATA IN PRODUCTION Dieter De Witte, Laurens De Vocht, et al. dieter.dewitte@ugent.be • IMEC– IDLAB – GHENT UNIVERSITY • ONTOFORCE
  • 2. Catch 22!? A. No Semantic Web Applications because no Semantic Data B. No Semantic Data because no applications
  • 3. A. The LOD Cloud for Life Sciences... Ontoforce’s DISQOVER covers > 110 Life Sciences Datasets
  • 4. B. DISQOVER is an Exploratory Semantic Search UI (faceted browsing) To Click = To SPARQL
  • 5. The missing link in our catch 22? “How to run federated queries?” Direct ETL
  • 6. • Cloud Instances • PAGO amis: Scientific Benchmark = Reproducible Benchmark Benchmark Client • 1 single-threaded warm-up run (all 1,223 queries) • 1 multi-threaded (8) run • (8 x randomized order) Database Node(s)
  • 7. How to evaluate an RDF Database solution? Performance ( Data store, Dataset, Configuration, Number of nodes, Hardware (RAM) )
  • 8. Performance ( NoSQL Triple stores, Watdiv 10M, 100M, 1000M, Standard Configs, Single Node, 32 GB RAM ) SIGMOD 2016: Single Node SOTA on artificial data
  • 9. More data, more problems Timeout
  • 10. Query performance: Virtuoso Leads, Blazegraph follows Timeout
  • 11. SWAT4LS 2016: Multi-node SOTA on real data Performance ( Scale out systems, DISQOVER data, Optimized Configs, Multi-Node, Compression 64 GB RAM )
  • 12. How to deal with Big Linked Data? 1. Vertical Scaling: bigger box 2. Compression: smaller content 3. Horizontal Scaling: more boxes, 1 location 4. Federation: more boxes, more locations V1, Bla1 (single node Virtuoso, Blazegraph) V1_32 (32GB Virtuoso) Fu1 (Fuseki + HDT) V3 (Virtuoso cluster 3 nodes) Fl3 (FluidOps, aka FedX)
  • 14. Count, Union, Sort, Aggregations
  • 15. Example Query 1: Nesting, FILTERs, unbound triples
  • 17. Initial performance results were counter-intuitive... and incorrect!!! Worse hardware, better performance?
  • 18. Only Virtuoso-backed systems survive multi-threaded benchmark marks last successful query (no timeout) 1 x 1,223 queries 8 x 1,223 queries
  • 19. No errors but incorrect #results!!!
  • 20. FILTERs, UNIONs are challenging but ORDER + GROUP + OPTIONAL dominate COUNT DISTINCT 600 – 1,223 BGPs
  • 21. Conclusions & Future Work • Additional diagnostics for RDF solutions! • Extend benchmarking software with query correctness assessment! • Multi-node RDF solutions??? • Towards Full paper: – NoSQL for Ontoforce Data – Scale out approaches for Watdiv + test LDF – Release reusable end-to-end benchmark software: • Setup AND Postprocessing
  • 22. Thanks for your attention!! SCALING OUT FEDERATED QUERIES FOR LIFE SCIENCES DATA IN PRODUCTION Dieter De Witte, Laurens De Vocht, et al. contact: dieter.dewitte@ugent.be slideshare: • IMEC– IDLAB – GHENT UNIVERSITY • ONTOFORCE