RDF Database-as-a-Service with S4
Marin Dimitrov, CTO of Ontotext
Apr 27th, 2015
RDF DBaaS with S4 / AKSW Colloquium, Apr 2015
Contents
• Self-Service Semantic Suite (S4)
• RDF DBaaS on AWS
• Demo
About Ontotext
• Provides products & solutions for content enrichment and metadata management
– 70 employees, headquarters in Sofia (Bulgaria)
– Sales presence in London, Washington & Boston
• Major clients and industries
– Media & Publishing
– Health Care & Life Sciences
– Cultural Heritage & Digital Libraries
– Government
– Education
The Self-Service Semantic Suite (S4)
What is S4?
• On-demand capabilities for text analytics, content enrichment and metadata management
– Text analytics for news, life sciences and social media
– RDF graph database as-a-service
– Access to large open knowledge graphs
• Available anytime, anywhere
– Simple RESTful services
• Simple, pay-per-use pricing
– No upfront commitments
S4 benefits
• Enables quick prototyping
– Instantly available, no provisioning & operations required
– Focus on building applications, don’t worry about infrastructure
• Free tier
– Even bigger free quotas for research groups & projects
• Easy to start, shorter learning curve
– Various add-ons, SDKs and demo code
• Based on enterprise semantic technology by Ontotext
Text analytics with S4
• Text analytics services
– News annotation
– News categorisation
– Biomedical
– Twitter
• Entity linking & disambiguation
– Mappings to DBpedia & GeoNames instances
– Mappings to biomedical data sources (LinkedLifeData)
• HTML, MS Word, XML, plain text input
• Simple JSON output
News analytics example
[Figure: sample news text; callout: S4 result]
Self-managed RDF DB in the Cloud
• Available from AWS Marketplace
• Variety of hardware configurations
– 2 to 8 CPU cores / 8 to 61 GB RAM
– IOPS performance & encryption (EBS)
• Manage large data volumes
• Pay-per-hour pricing
Fully managed RDF DB in the Cloud
• Low-cost DBaaS available 24/7
• Ideal for small & moderate data volumes
• Instantly deploy new databases when needed
• Zero administration: automated operations, maintenance & upgrades
• Users pay only for the actual database utilisation
– Number of triples stored + number of queries per month
Knowledge graphs with S4
• SPARQL query endpoint to the FactForge knowledge graph
– 500 million entities / 5 billion triples
• Key LOD datasets integrated
– DBpedia, Freebase, GeoNames, WordNet
– Dublin Core, SKOS, PROTON ontologies and vocabularies
Knowledge graphs with S4
• (available soon)
• Knowledge Graph bundles
– DBpedia, Wikidata, GeoNames, …
– GraphDB RDF database (self-managed @ AWS)
– 3rd party interactive data exploration tool (faceted search, data navigation, dynamic charts)
• Get instant & reliable access to KGs without dealing with provisioning, data import, maintenance, …
S4 for developers
• Java & C# SDKs
• Sample code
– Java, C#, NodeJS, JavaScript, Python, PHP, Groovy
– Curl examples for the most impatient
• GATE & UIMA plugins
• Firefox & Chrome add-ons
• Online documentation
Research projects using S4
• DaPaaS & ProDataMarket
– Goal: Open Data / Linked Data publishing & hosting
– S4 role: scalable Linked Data hosting infrastructure
• KConnect
– Goal: semantic annotation, search & analytics for healthcare data
– S4 role: scalable text analytics & RDF data management infrastructure
Fully Managed RDF Database-as-a-Service
Requirements
• Elastic
– dynamically adapt to data & query volumes
• High availability & resilience
– no single points of failure (SPOFs), “graceful degradation” of performance upon failures
• Cost efficient
– cost-aware architecture
– Key aspect for Open Data scenarios like DaPaaS & ProDataMarket
• Isolation of the multi-tenant databases
• Fair use of shared resources
RDF DBaaS options on S4
• Micro DB
– Up to 1M triples
– FREE, available now
• Extra Small DB (10M triples)
• Small DB (50M)
• Medium DB (250M)
• Large DB (1B)
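The tier list above can be read as a simple lookup: pick the smallest database size whose triple limit covers the dataset. The following is a minimal sketch, not part of S4; the function name is hypothetical and it assumes the limits are inclusive ("up to").

```python
# Triple capacities as listed on the slide, smallest tier first.
TIERS = [
    ("Micro", 1_000_000),
    ("Extra Small", 10_000_000),
    ("Small", 50_000_000),
    ("Medium", 250_000_000),
    ("Large", 1_000_000_000),
]


def smallest_tier(triple_count):
    """Return the name of the smallest tier that can hold triple_count triples.

    Hypothetical helper: returns None when the dataset exceeds the largest tier.
    """
    if triple_count < 0:
        raise ValueError("triple_count must be non-negative")
    for name, capacity in TIERS:
        if triple_count <= capacity:  # assumption: limits are inclusive
            return name
    return None


print(smallest_tier(750_000))     # Micro
print(smallest_tier(42_000_000))  # Small
```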
Implementation
• AWS based
– Storage, compute, load balancing, integration services…
• Ontotext GraphDB for the database instances
• OpenRDF REST services
• Docker for containerisation
• Network-attached volumes (EBS) for data storage
• A DBaaS on S4 is…
– A GraphDB instance
– Running within a Docker container
– With a private EBS data volume
S4 DBaaS architecture
• Routing nodes
– Expose OpenRDF RESTful services to apps
– Access control & quota checks
– Forward client requests to the proper data node
– Temporarily queue requests when necessary
• Data nodes
– Multiple Docker containers (GraphDB + EBS) per node
• Coordinator (single)
– Distribute DB initialisation / creation tasks to data nodes
• Management Console
S4 DBaaS architecture
[Figure: architecture diagram. Components shown: REST apps, 3rd party RDF tools, Quota & Access Control, routers, data nodes, coordinator, EBS, backups, SNS, Docker Repository, account management, quota management, reporting, monitoring & logging, Dynamo, Amazon S3, images]
Normal operations
• CRUD
– Router node receives a request
– Routes it to the proper data node & container
– Receives a response, forwards it back to client app
• Routing updates
– Data nodes push notifications via SNS: “heartbeats” + changes regarding the hosted DBs (if any)
– Each routing node receives the notifications (via SNS) and updates its routing tables
– Coordinator also receives notifications, learns which DBs are operational / down for maintenance
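The routing-update flow above can be sketched as a small in-memory routing table. This is an illustration only, not the S4 implementation: the class name, the notification format (`node`, `databases`) and the 30-second heartbeat timeout are all assumptions made for the example, and the SNS delivery itself is left out.

```python
import time


class Router:
    """Toy routing node: maps database ids to the data node hosting them."""

    def __init__(self, heartbeat_timeout=30.0):
        self.heartbeat_timeout = heartbeat_timeout  # hypothetical value, in seconds
        self.routes = {}     # db_id -> (node_id, container_id)
        self.last_seen = {}  # node_id -> time of the latest notification

    def on_notification(self, message, now=None):
        """Apply one heartbeat / change notification pushed by a data node."""
        now = time.time() if now is None else now
        node_id = message["node"]
        self.last_seen[node_id] = now
        # Drop old routes for this node, then record what it hosts right now.
        self.routes = {db: loc for db, loc in self.routes.items() if loc[0] != node_id}
        for db_id, container_id in message["databases"].items():
            self.routes[db_id] = (node_id, container_id)

    def route(self, db_id, now=None):
        """Return (node, container) for db_id, or None if unknown or the node is silent."""
        now = time.time() if now is None else now
        location = self.routes.get(db_id)
        if location is None:
            return None
        if now - self.last_seen[location[0]] > self.heartbeat_timeout:
            return None
        return location


router = Router()
router.on_notification({"node": "data-1", "databases": {"demo01": "c-7"}}, now=0.0)
print(router.route("demo01", now=10.0))  # ('data-1', 'c-7')
print(router.route("demo01", now=60.0))  # None
```

Treating a node as unavailable once its heartbeats stop is one simple way to notice a data node crash without a separate health check; the slides do not say how S4 detects it.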
Failure case #1 – data node crash
[Figure: architecture diagram (REST apps, 3rd party RDF tools, Quota & Access Control, routers, data nodes, coordinator, EBS, SNS, Docker Repository) annotated with numbered steps 1 to 3]
Recovery from a data node crash
[Figure: architecture diagram (REST apps, 3rd party RDF visualisation, Quota & Access Control, routers, data nodes, Coordinator, EBS, SNS, Docker Repository, Auto Scaling) annotated with numbered steps 1 to 7]
Failure case #2 – router crash & recovery
[Figure: architecture diagram (REST apps, 3rd party RDF tools, Quota & Access Control, routers, data nodes, coordinator, EBS, SNS, Docker Repository, Auto Scaling) annotated with numbered steps 1 to 8]
Recovery from a router crash
• (open connections from client apps to the node are terminated)
• Auto-scaler starts a new router node
– New router subscribes to SNS for heartbeats & updates
• Load balancer starts sending new client requests to router
– Router puts them in the local queue (if routing table is still incomplete)
• Heartbeats from data nodes are received
– Routing information is now complete
– Router starts sending the queued requests to data nodes
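The recovery steps above amount to "queue while the routing table is incomplete, flush once heartbeats arrive". A minimal sketch of that behaviour follows; the class, its method names and the `send` callback are hypothetical, and the real router's queue limits or timeouts are not described in the slides.

```python
from collections import deque


class RecoveringRouter:
    """Toy router that queues requests until it learns where a DB lives."""

    def __init__(self, send):
        self.send = send        # callable(node_id, request), supplied by the caller
        self.routes = {}        # db_id -> node_id
        self.pending = deque()  # (db_id, request) pairs waiting for a route

    def handle(self, db_id, request):
        """Forward the request if a route is known, otherwise queue it locally."""
        node = self.routes.get(db_id)
        if node is None:
            self.pending.append((db_id, request))
            return "queued"
        self.send(node, request)
        return "forwarded"

    def on_heartbeat(self, node_id, db_ids):
        """Record the DBs hosted by node_id, then retry queued requests in order."""
        for db_id in db_ids:
            self.routes[db_id] = node_id
        still_waiting = deque()
        while self.pending:
            db_id, request = self.pending.popleft()
            node = self.routes.get(db_id)
            if node is None:
                still_waiting.append((db_id, request))  # route still unknown
            else:
                self.send(node, request)
        self.pending = still_waiting


sent = []
router = RecoveringRouter(send=lambda node, req: sent.append((node, req)))
print(router.handle("demo01", "SELECT 1"))  # queued
router.on_heartbeat("data-1", ["demo01"])
print(sent)                                 # [('data-1', 'SELECT 1')]
print(router.handle("demo01", "SELECT 2"))  # forwarded
```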
Failure case #3 – coordinator crash & recovery
[Figure: architecture diagram (REST apps, 3rd party RDF tools, Quota & Access Control, routers, data nodes, coordinator, EBS, SNS, Docker Repository, Auto Scaling) annotated with numbered steps 1 to 6; step 1 is labelled "Create DB"]
Failure case #3 – coordinator crash & recovery
• Routers can route requests to data nodes as usual
– … but new DBs cannot be created temporarily
– … and data nodes with free container slots can’t get info on DBs waiting for initialisation
• AWS Auto-scaler starts a new Coordinator node
– Coordinator reads a list of all registered DBs from the metadata store & subscribes to SNS
• Coordinator starts receiving heartbeats & updates from data nodes
– … learns which DBs are operational / pending
– … and resumes distributing new / pending DBs initialisation tasks to the data nodes with free slots
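The coordinator's restart logic described above can be illustrated with a small state table. This is a sketch under stated assumptions, not the S4 code: the state names, the explicit "grace period" step (waiting for heartbeats before declaring a DB pending) and all method names are invented for the example.

```python
class Coordinator:
    """Toy coordinator that rebuilds its view of the DBs after a restart."""

    def __init__(self, registered_dbs):
        # registered_dbs: DB ids read from the metadata store on start-up
        self.status = {db_id: "unknown" for db_id in registered_dbs}
        self.assigned_to = {}  # db_id -> node_id asked to initialise it

    def on_heartbeat(self, hosted_dbs):
        """A data node reported the DBs it currently hosts."""
        for db_id in hosted_dbs:
            self.status[db_id] = "operational"

    def end_grace_period(self):
        """Assumption: DBs that no node has reported by now need initialisation."""
        for db_id, state in self.status.items():
            if state == "unknown":
                self.status[db_id] = "pending"

    def assign(self, node_id, free_slots):
        """Hand out up to free_slots pending DBs to a data node with free slots."""
        assigned = []
        for db_id, state in self.status.items():
            if len(assigned) >= free_slots:
                break
            if state == "pending":
                self.status[db_id] = "initialising"
                self.assigned_to[db_id] = node_id
                assigned.append(db_id)
        return assigned


c = Coordinator(["db-a", "db-b", "db-c"])
c.on_heartbeat(["db-a"])
c.end_grace_period()
print(c.assign("data-2", free_slots=1))  # ['db-b']
print(c.assign("data-3", free_slots=5))  # ['db-c']
```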
Composite failure & recovery
• Combination of coordinator + data node + routing node crash – same as #1 + #2 + #3
• Routers depend on data nodes
• Data nodes depend on Coordinator
• Coordinator does not depend on other nodes
– No heartbeats coming, means all DBs are down
– Start distributing DB initialisation tasks whenever a request comes from a working data node
– Eventually, all data nodes are up, DBs initialised, heartbeats & routing updates start coming
– … and routers can start routing client requests
Management interface
[Screenshot callouts: Micro, XS, S, M, or L; I/O performance; R/O access to Open Data services or open knowledge graphs]
Management interface
[Screenshot callouts: DBaaS endpoint; DB details summary; Backup, export, change settings, delete; Run a test query]
Roadmap
• Gradually introduce XS, S, M and L instances
• Integration with the GraphDB Workbench management UI
• LDF based containers
• Multi-datacenter deployment
• Replication across datacenters (single master)
More Details
• “On-demand Text Analytics and Metadata Management with S4” (ESaaSA @ CLOSER’2015)
• “Text Analytics and Linked Data Management As-a-Service with S4” (Wasabi @ ESWC’2015)
• “Low-cost Open Data As-a-Service in the Cloud” (SemDev @ ESWC’2015)
Demo
Demo scenario
• (create an account & generate an API key pair)
• Create a new DB
• Create a new repository in the DB
– via the REST API / OpenRDF Java SDK / curl
– …or via UI tools like the OpenRDF Workbench
• Import sample data (REST / OpenRDF Workbench)
• Run a query through the public SPARQL endpoint
Demo data – Universities in Saxony
#1 Create a database
#2a Create a repository & load data (curl)
Repository configuration file (config.ttl):
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rep: <http://www.openrdf.org/config/repository#> .
@prefix sr: <http://www.openrdf.org/config/repository/sail#> .
@prefix sail: <http://www.openrdf.org/config/sail#> .
@prefix graphdb: <http://www.ontotext.com/trree/owlim#> .
[] a rep:Repository ;
    rep:repositoryID "test01" ;
    rdfs:label "Description of my repository" ;
    rep:repositoryImpl [
        rep:repositoryType "openrdf:SailRepository" ;
        sr:sailImpl [
            graphdb:ruleset "owl-horst-optimized" ;
            sail:sailType "owlim:Sail" ;
            graphdb:base-URL "http://example.org/graphdb#" ;
            graphdb:repository-type "file-repository"
        ]
    ] .
• Repository name: "test01"
• OWL-Horst reasoning ruleset
#2a Create a repository & load data (curl)
API_KEY=…
KEY_SECRET=…
USER=…
DATABASE=…
REPOSITORY=…
SERVICE_ENDPOINT="https://$API_KEY:$KEY_SECRET@rdf.s4.ontotext.com/$USER/$DATABASE"
# Create a repository: upload config.ttl into a named graph of the SYSTEM repository.
# The "#" in the graph IRI is written as %23, otherwise curl drops it as a URL fragment.
curl -X POST -H "Content-Type:application/x-turtle" \
  -T config.ttl "$SERVICE_ENDPOINT/repositories/SYSTEM/rdf-graphs/service?graph=http://example.com%23g1"
# ...then declare that graph as a repository context.
curl -X POST -H "Content-Type:application/x-turtle" \
  -d "<http://example.com#g1> a <http://www.openrdf.org/config/repository#RepositoryContext>." \
  "$SERVICE_ENDPOINT/repositories/SYSTEM/statements"
# Upload sample data from example.rdf
curl -X POST -H "Content-Type:application/rdf+xml;charset=UTF-8" -T example.rdf \
  "$SERVICE_ENDPOINT/repositories/$REPOSITORY/statements"
• User: 4730361296
• Database: demo01
• Repository: test01
• Configuration: config.ttl
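For readers who prefer a script to curl, the same two kinds of request can be prepared with Python's standard library. This is only a sketch: the helper names are hypothetical, the credentials are placeholders, and nothing is sent (passing a request to `urllib.request.urlopen` would do that). It mirrors the endpoint layout of the curl commands and sends the API key pair as HTTP Basic authentication, which is what curl does with `key:secret@host` URLs.

```python
import base64
from urllib.parse import quote
from urllib.request import Request

HOST = "https://rdf.s4.ontotext.com"  # host name taken from the curl example


def basic_auth(api_key, key_secret):
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{api_key}:{key_secret}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def graph_store_url(user, database, graph_iri):
    """URL for uploading into a named graph of the SYSTEM repository.

    The graph IRI is percent-encoded so that "#" is sent instead of being
    treated as a URL fragment.
    """
    base = f"{HOST}/{user}/{database}/repositories/SYSTEM/rdf-graphs/service"
    return f"{base}?graph={quote(graph_iri, safe='')}"


def upload_request(user, database, repository, rdf_xml, auth):
    """Prepare (but do not send) a POST of RDF/XML data to a repository."""
    return Request(
        f"{HOST}/{user}/{database}/repositories/{repository}/statements",
        data=rdf_xml,  # bytes, e.g. the content of example.rdf
        method="POST",
        headers={
            "Content-Type": "application/rdf+xml;charset=UTF-8",
            "Authorization": auth,
        },
    )


auth = basic_auth("my-api-key", "my-key-secret")  # placeholder credentials
print(graph_store_url("4730361296", "demo01", "http://example.com#g1"))
req = upload_request("4730361296", "demo01", "test01", b"<rdf:RDF/>", auth)
print(req.get_method(), req.full_url)
```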
#2b Create a repository & load data (OpenRDF Workbench)
[Screenshots, slides 39 to 43; callout: DBaaS endpoint]
#3a SPARQL query (OpenRDF Workbench)
[Screenshots, slides 44 and 45]
#3b SPARQL query (from the S4 Management Console)
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX dbp-prop: <http://dbpedia.org/property/>
PREFIX dbp-ont: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?name ?numberOfStudents ?staff ?established
WHERE {
dbpedia:University_of_Leipzig rdfs:label ?name ;
dbp-prop:students ?numberOfStudents ;
dbp-prop:staff ?staff ;
dbp-prop:established ?established .
}
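Outside the Management Console, a query like the one above could be sent with the standard SPARQL protocol (a form-encoded POST with a `query` parameter) and the JSON result flattened into rows. The sketch below uses only the Python standard library; the function names are hypothetical, authentication is omitted, the request is built but not sent, and the sample result contains made-up placeholder values rather than real query output.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def query_request(repository_url, sparql):
    """Prepare (but do not send) a SPARQL protocol query via form-encoded POST."""
    return Request(
        repository_url,
        data=urlencode({"query": sparql}).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        },
    )


def bindings_to_rows(result_text):
    """Flatten a SPARQL JSON result document into a list of plain dicts."""
    result = json.loads(result_text)
    names = result["head"]["vars"]
    return [
        {name: binding[name]["value"] for name in names if name in binding}
        for binding in result["results"]["bindings"]
    ]


# Placeholder response in the SPARQL 1.1 JSON results format (values are made up).
sample = """{
  "head": {"vars": ["name", "established"]},
  "results": {"bindings": [
    {"name": {"type": "literal", "value": "Example University"},
     "established": {"type": "literal", "value": "1900"}}
  ]}
}"""

req = query_request(
    "https://rdf.s4.ontotext.com/4730361296/demo01/repositories/test01",
    "SELECT * WHERE { ?s ?p ?o } LIMIT 1",
)
print(req.get_method(), req.full_url)
print(bindings_to_rows(sample))
```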
Key takeaways
• S4 provides an enterprise RDF DBaaS
• Resilient design, high availability
• Instantly available whenever needed, easy to use, OpenRDF REST services
• Zero administration: automated operations, maintenance & upgrades
• Free DBs up to 1M triples (even more for research teams & projects)
• Check out http://s4.ontotext.com
Thank you!

Boost PC performance: How more available memory can improve productivity
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 

RDF Database-as-a-Service with S4

  • 1. RDF Database-as-a-Service with S4
      Marin Dimitrov, CTO of Ontotext
      Apr 27th, 2015
      RDF DBaaS with S4 / AKSW Colloquium, Apr 2015
  • 2. Contents
      • Self-Service Semantic Suite (S4)
      • RDF DBaaS on AWS
      • Demo
  • 3. About Ontotext
      • Provides products & solutions for content enrichment and metadata management
          – 70 employees, headquarters in Sofia (Bulgaria)
          – Sales presence in London, Washington & Boston
      • Major clients and industries
          – Media & Publishing
          – Health Care & Life Sciences
          – Cultural Heritage & Digital Libraries
          – Government
          – Education
  • 4. The Self-Service Semantic Suite (S4)
  • 5. What is S4?
      • On-demand capabilities for text analytics, content enrichment and metadata management
          – Text analytics for news, life sciences and social media
          – RDF graph database as-a-service
          – Access to large open knowledge graphs
      • Available anytime, anywhere
          – Simple RESTful services
      • Simple, pay-per-use pricing
          – No upfront commitments
  • 6. S4 benefits
      • Enables quick prototyping
          – Instantly available, no provisioning & operations required
          – Focus on building applications, don’t worry about infrastructure
      • Free tier
          – Even bigger free quotas for research groups & projects
      • Easy to start, shorter learning curve
          – Various add-ons, SDKs and demo code
      • Based on enterprise semantic technology by Ontotext
  • 7. Text analytics with S4
      • Text analytics services
          – News annotation
          – News categorisation
          – Biomedical
          – Twitter
      • Entity linking & disambiguation
          – Mappings to DBpedia & GeoNames instances
          – Mappings to biomedical data sources (LinkedLifeData)
      • HTML, MS Word, XML, plain text input
      • Simple JSON output
  • 8. News analytics example
      (screenshot; callout: S4 result)
  • 9. Self-managed RDF DB in the Cloud
      • Available from AWS Marketplace
      • Variety of hardware configurations
          – 2 to 8 CPU cores / 8 to 61 GB RAM
          – IOPS performance & encryption (EBS)
      • Manage large data volumes
      • Pay-per-hour pricing
  • 10. Fully managed RDF DB in the Cloud
      • Low-cost DBaaS available 24/7
      • Ideal for small & moderate data volumes
      • Instantly deploy new databases when needed
      • Zero administration: automated operations, maintenance & upgrades
      • Users pay only for the actual database utilisation
          – Number of triples stored + number of queries per month
  • 11. Knowledge graphs with S4
      • SPARQL query endpoint to the FactForge knowledge graph
          – 500 million entities / 5 billion triples
      • Key LOD datasets integrated
          – DBpedia, Freebase, GeoNames, WordNet
          – Dublin Core, SKOS, PROTON ontologies and vocabularies
  • 12. Knowledge graphs with S4
      • (available soon)
      • Knowledge Graph bundles
          – DBpedia, Wikidata, GeoNames, …
          – GraphDB RDF database (self-managed @ AWS)
          – 3rd party interactive data exploration tool (faceted search, data navigation, dynamic charts)
      • Get instant & reliable access to KGs without dealing with provisioning, data import, maintenance, …
  • 13. S4 for developers
      • Java & C# SDKs
      • Sample code
          – Java, C#, NodeJS, JavaScript, Python, PHP, Groovy
          – Curl examples for the most impatient
      • GATE & UIMA plugins
      • Firefox & Chrome add-ons
      • Online documentation
  • 14. Research projects using S4
      • DaPaaS & ProDataMarket
          – Goal: Open Data / Linked Data publishing & hosting
          – S4 role: scalable Linked Data hosting infrastructure
      • KConnect
          – Goal: semantic annotation, search & analytics for healthcare data
          – S4 role: scalable text analytics & RDF data management infrastructure
  • 15. Fully Managed RDF Database-as-a-Service
  • 16. Requirements
      • Elastic
          – Dynamically adapt to data & query volumes
      • High availability & resilience
          – No single points of failure (SPOFs), “graceful degradation” of performance upon failures
      • Cost efficient
          – Cost-aware architecture
          – Key aspect for Open Data scenarios like DaPaaS & ProDataMarket
      • Isolation of the multi-tenant databases
      • Fair use of shared resources
  • 17. RDF DBaaS options on S4
      • Micro DB
          – Up to 1M triples
          – FREE, available now
      • Extra Small DB (10M triples)
      • Small DB (50M)
      • Medium DB (250M)
      • Large DB (1B)
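
As a rough illustration of the size tiers on this slide, the sketch below picks the smallest tier whose triple limit covers a dataset. The limits are the ones listed on the slide; the function name `smallest_tier` and the table layout are assumptions made for illustration and are not part of any S4 API.

```python
from typing import Optional

# Tier limits as listed on the slide (maximum number of triples).
TIERS = [
    ("Micro", 1_000_000),
    ("Extra Small", 10_000_000),
    ("Small", 50_000_000),
    ("Medium", 250_000_000),
    ("Large", 1_000_000_000),
]


def smallest_tier(triples: int) -> Optional[str]:
    """Return the name of the smallest tier that fits, or None if none does."""
    if triples < 0:
        raise ValueError("triple count must be non-negative")
    for name, limit in TIERS:
        if triples <= limit:
            return name
    return None  # larger than the biggest tier on the slide
```

A dataset that outgrows the Large tier returns `None` here; the slides suggest the self-managed GraphDB option on AWS for large data volumes.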
  • 18. Implementation
      • AWS based
          – Storage, compute, load balancing, integration services…
      • Ontotext GraphDB for the database instances
      • OpenRDF REST services
      • Docker for containerisation
      • Network-attached volumes (EBS) for data storage
      • A DBaaS on S4 is…
          – A GraphDB instance
          – Running within a Docker container
          – With a private EBS data volume
  • 19. S4 DBaaS architecture
      • Routing nodes
          – Expose OpenRDF RESTful services to apps
          – Access control & quota checks
          – Forward client requests to the proper data node
          – Temporarily queue requests when necessary
      • Data nodes
          – Multiple Docker containers (GraphDB + EBS) per node
      • Coordinator (single)
          – Distribute DB initialisation / creation tasks to data nodes
      • Management Console
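
The access control and quota check that a routing node performs before forwarding a request could look roughly like the sketch below. Everything in it is an assumption for illustration: the `Account` record, the in-memory `accounts` dictionary and the choice of status codes (401, 429, 200) are not taken from the slides, which only state that routers perform access control and quota checks.

```python
import hmac
from dataclasses import dataclass
from typing import Dict


@dataclass
class Account:
    """Hypothetical per-key account record held by a routing node."""
    secret: str
    monthly_query_quota: int
    queries_used: int = 0


def check_request(accounts: Dict[str, Account], api_key: str, secret: str) -> int:
    """Return an HTTP-style status: 401 bad credentials, 429 over quota, 200 ok."""
    account = accounts.get(api_key)
    # Constant-time comparison avoids leaking the secret through timing.
    if account is None or not hmac.compare_digest(
        account.secret.encode("utf-8"), secret.encode("utf-8")
    ):
        return 401
    if account.queries_used >= account.monthly_query_quota:
        return 429
    account.queries_used += 1  # count the request that is about to be forwarded
    return 200
```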
  • 20. S4 DBaaS architecture
      (architecture diagram; labelled components: REST apps, 3rd party RDF tools, Quota & Access Control, routers, data nodes, coordinator, EBS, backups, SNS, Docker Repository, account management, quota management, reporting, Monitoring & Logging, Dynamo, Amazon S3, images)
  • 21. Normal operations
      • CRUD
          – Router node receives a request
          – Routes it to the proper data node & container
          – Receives a response, forwards it back to the client app
      • Routing updates
          – Data nodes push notifications via SNS: “heartbeats” plus changes regarding the hosted DBs (if any)
          – Each routing node receives the notifications (via SNS) and updates its routing tables
          – Coordinator also receives notifications, learns which DBs are operational / down for maintenance
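
The routing-update step can be sketched as a routing table that is rebuilt from heartbeat notifications. This is a minimal sketch under stated assumptions: the `Heartbeat` payload shape, the class names and the 30-second expiry are hypothetical, and the SNS delivery itself is left out. The slides only say that data nodes push heartbeats and DB changes and that routers update their routing tables.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Heartbeat:
    """Hypothetical notification payload pushed by a data node."""
    node: str                 # address of the data node
    databases: List[str]      # DBs currently hosted on that node
    sent_at: float = field(default_factory=time.time)


class RoutingTable:
    """Maps a database id to the data node that last reported hosting it."""

    def __init__(self, ttl_seconds: float = 30.0) -> None:
        self.ttl = ttl_seconds  # assumed expiry for entries with no recent heartbeat
        self._routes: Dict[str, Tuple[str, float]] = {}

    def apply(self, hb: Heartbeat) -> None:
        # Drop DBs this node no longer reports, then record the current ones.
        stale = [db for db, (node, _) in self._routes.items()
                 if node == hb.node and db not in hb.databases]
        for db in stale:
            del self._routes[db]
        for db in hb.databases:
            self._routes[db] = (hb.node, hb.sent_at)

    def lookup(self, db: str, now: Optional[float] = None) -> Optional[str]:
        """Return the node for `db`, or None if unknown or not heard from recently."""
        now = time.time() if now is None else now
        entry = self._routes.get(db)
        if entry is None or now - entry[1] > self.ttl:
            return None
        return entry[0]
```

Expiring entries after a missed heartbeat is one way a router could notice a crashed data node without a separate failure detector.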
  • 22. Failure case #1 – data node crash
      (architecture diagram with numbered steps; labelled components: REST apps, 3rd party RDF tools, Quota & Access Control, routers, data nodes, coordinator, EBS, SNS, Docker Repository)
  • 23. Recovery from a data node crash
      (architecture diagram with numbered steps; labelled components: REST apps, 3rd party RDF visualisation, Quota & Access Control, routers, data nodes, coordinator, EBS, SNS, Docker Repository, Auto Scaling)
  • 24. Failure case #2 – router crash & recovery
      (architecture diagram with numbered steps; labelled components: REST apps, 3rd party RDF tools, Quota & Access Control, routers, data nodes, coordinator, EBS, SNS, Docker Repository, Auto Scaling)
  • 25. Recovery from a router crash
      • (open connections from client apps to the node are terminated)
      • Auto-scaler starts a new router node
          – New router subscribes to SNS for heartbeats & updates
      • Load balancer starts sending new client requests to the router
          – Router puts them in the local queue (if the routing table is still incomplete)
      • Heartbeats from data nodes are received
          – Routing information is now complete
          – Router starts sending the queued requests to data nodes
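
The queue-then-flush behaviour of a freshly started router can be sketched as follows. This is an illustration only: the class `RecoveringRouter`, the `send` callback (standing in for the HTTP forwarding step) and the per-database notion of "route known" are assumptions, not details given on the slide.

```python
from collections import deque
from typing import Callable, Deque, Dict, Iterable, Tuple

Request = Tuple[str, str]  # (database id, request payload)


class RecoveringRouter:
    def __init__(self, send: Callable[[str, str], None]) -> None:
        self._send = send                      # send(node, payload)
        self._routes: Dict[str, str] = {}      # database id -> data node
        self._pending: Deque[Request] = deque()

    def handle(self, db: str, payload: str) -> None:
        node = self._routes.get(db)
        if node is None:
            self._pending.append((db, payload))  # route not known yet: queue
        else:
            self._send(node, payload)

    def on_heartbeat(self, node: str, databases: Iterable[str]) -> None:
        for db in databases:
            self._routes[db] = node
        # Retry queued requests in arrival order; keep those still unroutable.
        still_waiting: Deque[Request] = deque()
        while self._pending:
            db, payload = self._pending.popleft()
            target = self._routes.get(db)
            if target is None:
                still_waiting.append((db, payload))
            else:
                self._send(target, payload)
        self._pending = still_waiting

    @property
    def queued(self) -> int:
        return len(self._pending)
```

A production router would also bound the queue and time out requests that wait too long; both are omitted here to keep the sketch short.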
  • 26. Failure case #3 – coordinator crash & recovery
      (architecture diagram with numbered steps; labelled components: REST apps, 3rd party RDF tools, Quota & Access Control, routers, data nodes, coordinator, EBS, SNS, Docker Repository, Auto Scaling, Create DB)
  • 27. Failure case #3 – coordinator crash & recovery
      • Routers can route requests to data nodes as usual
          – … but new DBs cannot be created temporarily
          – … and data nodes with free container slots can’t get info on DBs waiting for initialisation
      • AWS Auto-scaler starts a new Coordinator node
          – Coordinator reads a list of all registered DBs from the metadata store & subscribes to SNS
      • Coordinator starts receiving heartbeats & updates from data nodes
          – … learns which DBs are operational / pending
          – … and resumes distributing new / pending DB initialisation tasks to the data nodes with free slots
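
The coordinator's recovery logic can be sketched as a small state machine: every registered DB starts as pending, heartbeats mark DBs as operational, and pending DBs are handed to nodes that report free slots. The class and method names and the `free_slots` heartbeat field are hypothetical; the slides do not specify the message format or the metadata store API.

```python
from typing import Dict, Iterable, List, Tuple


class Coordinator:
    def __init__(self, registered_dbs: Iterable[str]) -> None:
        # After a restart every registered DB is pending until a heartbeat
        # reports it as hosted somewhere.
        self.pending: List[str] = list(registered_dbs)
        self.operational: Dict[str, str] = {}   # database id -> data node

    def on_heartbeat(self, node: str, hosted: Iterable[str],
                     free_slots: int) -> List[Tuple[str, str]]:
        """Record hosted DBs and return (db, node) initialisation tasks."""
        for db in hosted:
            self.operational[db] = node
            if db in self.pending:
                self.pending.remove(db)
        tasks: List[Tuple[str, str]] = []
        # Hand out pending DBs while the node still reports free container slots.
        # A fuller version would re-queue tasks that are never confirmed.
        while free_slots > 0 and self.pending:
            db = self.pending.pop(0)
            tasks.append((db, node))
            free_slots -= 1
        return tasks
```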
  • 28. Composite failure & recovery
      • Combination of coordinator + data node + routing node crash
          – Same as #1 + #2 + #3
      • Routers depend on data nodes
      • Data nodes depend on Coordinator
      • Coordinator does not depend on other nodes
          – No heartbeats coming means all DBs are down
          – Start distributing DB initialisation tasks whenever a request comes from a working data node
          – Eventually, all data nodes are up, DBs initialised, heartbeats & routing updates start coming
          – … and routers can start routing client requests
  • 29. Management interface
      (screenshot; callouts: Micro, XS, S, M, or L / I/O performance / R/O access to Open Data services or open knowledge graphs)
  • 30. Management interface
      (screenshot; callouts: DBaaS endpoint / DB details summary / Backup, export, change settings, delete / Run a test query)
  • 31. Roadmap
      • Gradually introduce XS, S, M and L instances
      • Integration with the GraphDB Workbench management UI
      • LDF based containers
      • Multi-datacenter deployment
      • Replication across datacenters (single master)
  • 32. More Details
      • “On-demand Text Analytics and Metadata Management with S4” (ESaaSA @ CLOSER’2015)
      • “Text Analytics and Linked Data Management As-a-Service with S4” (Wasabi @ ESWC’2015)
      • “Low-cost Open Data As-a-Service in the Cloud” (SemDev @ ESWC’2015)
  • 33. Demo
  • 34. Demo scenario
      • (create an account & generate an API key pair)
      • Create a new DB
      • Create a new repository in the DB
          – via the REST API / OpenRDF Java SDK / curl
          – …or via UI tools like the OpenRDF Workbench
      • Import sample data (REST / OpenRDF Workbench)
      • Run a query through the public SPARQL endpoint
  • 35. Demo data – Universities in Saxony
  • 36. #1 Create a database
  • 37. #2a Create a repository & load data (curl)
      Repository configuration file config.ttl
      • Repository name: "test01"
      • OWL-Horst reasoning ruleset

      @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
      @prefix rep: <http://www.openrdf.org/config/repository#>.
      @prefix sr: <http://www.openrdf.org/config/repository/sail#>.
      @prefix sail: <http://www.openrdf.org/config/sail#>.
      @prefix graphdb: <http://www.ontotext.com/trree/owlim#>.

      [] a rep:Repository ;
         rep:repositoryID "test01" ;
         rdfs:label "Description of my repository" ;
         rep:repositoryImpl [
           rep:repositoryType "openrdf:SailRepository" ;
           sr:sailImpl [
             graphdb:ruleset "owl-horst-optimized" ;
             sail:sailType "owlim:Sail" ;
             graphdb:base-URL "http://example.org/graphdb#" ;
             graphdb:repository-type "file-repository" ;
           ]
         ].
  • 38. #2a Create a repository & load data (curl)
      • User: 4730361296
      • Database: demo01
      • Repository: test01
      • Configuration: config.ttl

      API_KEY=…
      KEY_SECRET=…
      USER=…
      DATABASE=…
      REPOSITORY=…
      SERVICE_ENDPOINT="https://$API_KEY:$KEY_SECRET@rdf.s4.ontotext.com/$USER/$DATABASE"

      # Create a repository
      curl -X POST -H "Content-Type:application/x-turtle" -T config.ttl \
        "$SERVICE_ENDPOINT/repositories/SYSTEM/rdf-graphs/service?graph=http://example.com#g1"
      curl -X POST -H "Content-Type:application/x-turtle" \
        -d "<http://example.com#g1> a <http://www.openrdf.org/config/repository#RepositoryContext>." \
        "$SERVICE_ENDPOINT/repositories/SYSTEM/statements"

      # Upload sample data from example.rdf
      curl -X POST -H "Content-Type:application/rdf+xml;charset=UTF-8" -T example.rdf \
        "$SERVICE_ENDPOINT/repositories/$REPOSITORY/statements"
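
For readers who prefer Python to curl, the data upload step can be sketched with the standard library alone. The host, URL layout and content type are taken from the curl commands on the slide; the function name `build_upload_request` is an assumption, and the request is only built here, not sent. The key and secret that curl reads from the URL are passed as an HTTP Basic `Authorization` header, which is what curl does with `user:password@host` URLs.

```python
import base64
import urllib.request

S4_HOST = "rdf.s4.ontotext.com"  # host taken from the slide


def build_upload_request(api_key: str, key_secret: str, user: str,
                         database: str, repository: str,
                         rdf_xml: bytes) -> urllib.request.Request:
    """Build (but do not send) the POST that uploads RDF/XML statements."""
    url = (f"https://{S4_HOST}/{user}/{database}"
           f"/repositories/{repository}/statements")
    # Encode "key:secret" as HTTP Basic credentials.
    token = base64.b64encode(
        f"{api_key}:{key_secret}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=rdf_xml,
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/rdf+xml;charset=UTF-8",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the upload, given valid credentials and an existing repository.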
  • 39. #2b Create a repository & load data (OpenRDF Workbench)
      (screenshot; callout: DBaaS endpoint)
  • 40. #2b Create a repository & load data (OpenRDF Workbench)
  • 41. #2b Create a repository & load data (OpenRDF Workbench)
      (screenshot; callout: DBaaS endpoint)
  • 42. #2b Create a repository & load data (OpenRDF Workbench)
  • 43. #2b Create a repository & load data (OpenRDF Workbench)
  • 44. #3a SPARQL query (OpenRDF Workbench)
  • 45. #3a SPARQL query (OpenRDF Workbench)
  • 46. #3b SPARQL query (from the S4 Management Console)

      PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
      PREFIX dbpedia: <http://dbpedia.org/resource/>
      PREFIX dbp-prop: <http://dbpedia.org/property/>
      PREFIX dbp-ont: <http://dbpedia.org/ontology/>

      SELECT ?name ?numberOfStudents ?staff ?established
      WHERE {
        dbpedia:University_of_Leipzig rdfs:label ?name ;
            dbp-prop:students ?numberOfStudents ;
            dbp-prop:staff ?staff ;
            dbp-prop:established ?established .
      }
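
Outside the Management Console, the same kind of query can be sent over HTTP using the SPARQL Protocol, where the query text travels in the `query` URL parameter. The sketch below only builds such a request. The function name `build_query_request` is an assumption, the repository URL is supplied by the caller, and authentication is left out.

```python
import urllib.parse
import urllib.request


def build_query_request(repository_url: str,
                        sparql: str) -> urllib.request.Request:
    """Build (but do not send) a SPARQL Protocol GET request for `sparql`."""
    # urlencode percent-escapes braces, spaces, '#' and other reserved characters.
    params = urllib.parse.urlencode({"query": sparql})
    return urllib.request.Request(
        f"{repository_url}?{params}",
        method="GET",
        # Ask for the standard JSON serialisation of SELECT results.
        headers={"Accept": "application/sparql-results+json"},
    )
```

For a repository created as in the curl slide, `repository_url` would presumably follow the same `…/repositories/<repository>` pattern used for the upload.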
  • 47. Key takeaways
      • S4 provides an enterprise RDF DBaaS
      • Resilient design, high availability
      • Instantly available whenever needed, easy to use, OpenRDF REST services
      • Zero administration: automated operations, maintenance & upgrades
      • Free DBs up to 1M triples (even more for research teams & projects)
      • Check out http://s4.ontotext.com
  • 48. Thank you!