1) GRLC is a tool that generates Linked Data APIs from SPARQL queries stored in a GitHub repository. It automatically builds Swagger specifications and API code by mapping the GitHub repository structure and SPARQL queries.
2) This allows SPARQL queries to be organized and maintained outside applications, under version control. The generated APIs hide the complexity of SPARQL from clients.
3) GRLC was used to build APIs for accessing historical census data, hiding SPARQL from historians. It was also used to reduce coupling between SPARQL and R code for a project analyzing the impact of early life conditions on later outcomes.
grlc Makes GitHub Taste Like Linked Data APIs
1. It begins with an idea
GRLC MAKES GITHUB TASTE
LIKE LINKED DATA APIS
Chefs
Albert Meroño-Peñuela
Rinke Hoekstra
Services and Applications over
Linked APIs and Data (SALAD)
ESWC
29-05-2016
2. Vrije Universiteit Amsterdam
VU University Amsterdam – Computer Science (Knowledge Representation & Reasoning group)
International Institute of Social History (IISG), Amsterdam
CLARIAH – National Infrastructure for Digital Humanities
> DataLegend: Structured Data Hub
Previously incubated by CEDAR – Dutch historical censuses as 5-star LOD
INSTITUTIONAL SLIDE
3. DISCLAIMER
Frustration-driven research
4.
1. LD-CONSUMING APPLICATIONS
5. Publishing Dutch historical censuses as 5-star LD
> Intensive use of RDF Data Cube
> Harmonization rules
> Provenance
1st historical census data as Linked Data (1795-1971)
8 million observations (sex, marital status, occupation position, housing type, residence status)
External links
> Geographical: 2.7M
> Occupations: 350K
> Belief: 250K
High value for social historians
THE CEDAR STORY
6. Historians can’t really write SPARQL
Variety of access interfaces needed
CENSUS DATA QUERYING INTERFACES
7. CLARIAH-WP4: Structured data hub for social historians
IPUMS, NAPP, CEDAR, etc.
> Macro-, micro-, meso-data
> Civil registries, occupation, religion, country-level economic indicators
> National (Netherlands) and international
Mostly CSV tables turned into RDF Data Cube and CSVW
More than 1B triples already
Higher variety of humanities scholars (and so a higher variety of data access requirements)
SCALING VARIETY
[Figure: Structured Data Hub workflow. Recoverable labels: Frequency Table; Variables (existing, or not yet existing); Mappings; Publish; Augment; Existing Variables & Codes; External Datasets; External (Meta)Data, which includes both external Linked Data and standard vocabularies, e.g. World Bank; provenance tracking.]
11. One .rq file per SPARQL query
Good support of query curation processes
> Versioning
> Branching
> Clone-pull-push
Web-friendly features!
> One URI per query
> Uniquely identifiable
> De-referenceable (raw.githubusercontent.com)
GITHUB AS A HUB OF
SPARQL QUERIES
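As a minimal sketch of the "one URI per query" point, the snippet below builds the raw.githubusercontent.com address of a single .rq file. The helper name `raw_query_url`, the placeholder owner, repo and file names, and the default `master` branch are assumptions made for illustration; they are not part of grlc.

```python
from urllib.parse import quote

RAW_HOST = "https://raw.githubusercontent.com"


def raw_query_url(owner: str, repo: str, path: str, branch: str = "master") -> str:
    """Return the de-referenceable raw URL of one .rq file in a GitHub repo.

    Hypothetical helper: one SPARQL query file maps to exactly one URL.
    """
    if not path.endswith(".rq"):
        raise ValueError("expected a path to a .rq file")
    # quote() keeps "/" unescaped by default, so nested paths stay readable.
    return f"{RAW_HOST}/{quote(owner)}/{quote(repo)}/{quote(branch)}/{quote(path)}"


# Placeholder names, not a real repository.
print(raw_query_url("some-owner", "some-repo", "some_query.rq"))
```

Because the URL is derived only from the repository coordinates and the file path, any client that knows those can fetch the same query text.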
12. LESSON 1
Query centralization helps maintain distributed applications
14. Linked Data APIs emerge
RESTful entry point to Linked Data hubs for Web applications
OpenPHACTS
…but the Linked Data API (e.g. Swagger spec, code itself) still needs to be coded and maintained
MEANWHILE IN THE SEMANTIC WEB…
15. Love story – thanks KMi!
Automatically builds Swagger specs and API code
Takes SPARQL queries as input (1 API operation = 1 SPARQL query)
> API call functionality limited to SPARQL expressivity
Makes SPARQL queries uniquely referenceable by using their equivalent LDA operation
> Stores SPARQL internally
> But we already have uniquely referenceable SPARQL…
BASIL
16. FRUSTRATION 2
Copy-pasting 200 queries!!!
&
Organization problem
17. Cousin of BASIL in a SALAD
Same basic principle: 1 SPARQL query = 1 API operation
Automatically builds Swagger spec and UI from SPARQL
But:
External query management
Organization of SPARQL queries in the GitHub repo matches organization of the API
Thin layer – nothing stored server-side
Maps
> GitHub API
> Swagger spec
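The "1 SPARQL query = 1 API operation" mapping can be sketched as follows. This is an illustration under stated assumptions, not grlc's actual code: `build_spec` is a hypothetical function, the input imitates the `name` and `type` fields of a GitHub repository contents listing, and the output is a deliberately minimal Swagger 2.0 document with placeholder metadata.

```python
def build_spec(owner: str, repo: str, listing: list) -> dict:
    """Map every .rq file in a repo listing to one GET operation.

    Hypothetical sketch: nothing is stored, the spec is derived on the fly
    from the repository listing.
    """
    paths = {}
    for entry in listing:
        name = entry["name"]
        # Only plain .rq files become API operations.
        if entry.get("type") != "file" or not name.endswith(".rq"):
            continue
        operation = name[: -len(".rq")]
        paths[f"/{operation}"] = {
            "get": {
                "summary": f"Runs the SPARQL query stored in {name}",
                "responses": {"200": {"description": "Query results"}},
            }
        }
    return {
        "swagger": "2.0",
        "info": {"title": f"{owner}/{repo}", "version": "sketch"},
        "basePath": f"/{owner}/{repo}",
        "paths": paths,
    }


# Placeholder listing, not taken from a real repository.
listing = [
    {"name": "query_a.rq", "type": "file"},
    {"name": "README.md", "type": "file"},
    {"name": "docs", "type": "dir"},
]
spec = build_spec("some-owner", "some-repo", listing)
print(sorted(spec["paths"]))  # ['/query_a']
```

Renaming, adding or deleting a .rq file in the repository changes the derived spec without touching any server-side state, which is the "thin layer" idea.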
20. THE GRLC SERVICE
Assuming your repo is at https://github.com/:owner/:repo
and your grlc instance at :host,
> http://:host/:owner/:repo/spec returns the JSON swagger spec
> http://:host/:owner/:repo/api-docs returns the swagger UI
> http://:host/:owner/:repo/:operation?p_1=v_1...p_n=v_n calls the operation with the specified parameter values
> Uses BASIL’s SPARQL variable name convention for query parameters
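The endpoint patterns listed here can be sketched as a small URL builder. `grlc_endpoints` and `operation_url` are hypothetical helper names, and the host, owner, repo, operation and parameter names in the usage lines are placeholders; parameters are joined with `&` as in any ordinary query string.

```python
from urllib.parse import urlencode


def grlc_endpoints(host: str, owner: str, repo: str) -> dict:
    """Return the spec and UI URLs for one repository on one grlc host."""
    base = f"http://{host}/{owner}/{repo}"
    return {"spec": f"{base}/spec", "api_docs": f"{base}/api-docs"}


def operation_url(host: str, owner: str, repo: str, operation: str, **params: str) -> str:
    """Return the URL that calls one operation with the given parameter values."""
    base = f"http://{host}/{owner}/{repo}/{operation}"
    return f"{base}?{urlencode(params)}" if params else base


# Placeholder values only.
print(grlc_endpoints("grlc.example.org", "some-owner", "some-repo")["spec"])
print(operation_url("grlc.example.org", "some-owner", "some-repo", "some_operation", p_1="v_1"))
```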
Sends requests to
> https://api.github.com/repos/:owner/:repo to look for SPARQL queries and their decorators
> https://raw.githubusercontent.com/:owner/:repo/master/file.rq to dereference queries, get the SPARQL, and parse it
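What the "parse it" step might extract can be sketched for query parameters. This assumes the BASIL-style naming convention works roughly as follows: a variable written `?_name` becomes a required API parameter and `?__name` an optional one. That reading is an assumption to be checked against the BASIL and grlc documentation, and any type or language suffixes are ignored here. `extract_parameters` is a hypothetical name and the query is a made-up example.

```python
import re

# "?" or "$", then one or two underscores, then the parameter name.
PARAM_VAR = re.compile(r"[?$](_{1,2})([A-Za-z][A-Za-z0-9_]*)")


def extract_parameters(sparql: str) -> list:
    """List the API parameters implied by underscore-prefixed variables.

    Assumed convention: one underscore = required, two = optional.
    """
    seen = {}
    for underscores, name in PARAM_VAR.findall(sparql):
        # Keep the first occurrence of each parameter, in query order.
        seen.setdefault(name, {"name": name, "required": underscores == "_"})
    return list(seen.values())


# Made-up query used only to exercise the sketch.
query = """
SELECT ?s ?label WHERE {
  ?s a ?_type .
  OPTIONAL { ?s rdfs:label ?__label }
}
"""
for param in extract_parameters(query):
    print(param)
```

Ordinary variables such as `?s` are left alone, so the same .rq file still runs unchanged against a SPARQL endpoint once the parameter variables are bound.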
22. EVALUATION – USE CASES
CEDAR: Access to census data for historians
> Hides SPARQL
> Allows them to fill query parameters through forms
> Co-existence of SPARQL and non-SPARQL clients
CLARIAH - Born Under a Bad Sign: Do prenatal and early-life conditions have an impact on socioeconomic and health outcomes later in life? (uses 1891 Canada and Sweden Linked Census Data)
> Reduction of coupling between SPARQL libs and R
> Shorter R code – input stream as CSV
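The "input stream as CSV" point can be illustrated from the client side, shown here in Python rather than R. The sketch assumes the API returns CSV when asked with an `Accept: text/csv` header; that behaviour, the helper names `csv_request` and `rows_from_csv`, and the sample body are all assumptions for illustration, and no request is actually sent.

```python
import csv
import io
import urllib.request


def csv_request(url: str) -> urllib.request.Request:
    """Build a GET request that asks the API for CSV results."""
    return urllib.request.Request(url, headers={"Accept": "text/csv"})


def rows_from_csv(body: str) -> list:
    """Turn a CSV response body into a list of dicts, one per result row."""
    return list(csv.DictReader(io.StringIO(body)))


# Made-up response body standing in for what an API call might return.
sample = "region,count\nA,1\nB,2\n"
rows = rows_from_csv(sample)
print(rows[0]["region"], rows[0]["count"])
```

The client needs no SPARQL library at all: it builds a URL, reads a table, and carries on with its analysis.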
23. The spectrum of Linked Data clients: SPARQL-intensive applications vs RESTful API applications
grlc treats decoupling SPARQL from all client applications (including LDAs) as a powerful practice
Separates query curation workflows from everything else
Allows at the same time
> Web-friendly SPARQL queries
> Web-friendly RESTful APIs
Helps you to easily organise your LDA – just organise your SPARQL repository and you’re set
Try it out!
> http://grlc.clariah-sdh.eculture.labs.vu.nl
> https://github.com/CLARIAH/grlc
CONCLUSIONS
24. THANK YOU!
@ALBERTMERONYO
DATALEGEND.NET
CLARIAH.NL
Editor's notes
Organization of the GitHub query repo matches the organization
we want for its equivalent API
Inception of the idea that the repo actually matches the API…