Loupe's Model
1. LOUPE’S MODEL
USE CASES AND REQUIREMENTS
Nandana Mihindukulasooriya, María Poveda Villalón,
Raúl García Castro
Ontology Engineering Group. Departamento de Inteligencia Artificial.
Facultad de Informática, Universidad Politécnica de Madrid.
Campus de Montegancedo s/n.
28660 Boadilla del Monte. Madrid. Spain
{nandana, mpoveda, rgarcia}@fi.upm.es
3. Loupe - Overview
Explore the vocabularies used and the abstract triple patterns in 5+
billion triples, including all DBpedia datasets, Wikidata, LinkedBrainz, and
Bio2RDF.
Loupe helps to understand data, uncover patterns, formulate queries, and detect
quality issues.
No RDF data, No Public API
5. Loupe - Google Analytics
• Users from 86 countries
• Spain (23.76%), US (16.69%), Germany (10.64%), UK (9.14%), Italy (4.51%)
8. Loupe – LOD Laundromat integration
• LOD Laundromat
• 32 billion triples from 650K documents
• cleaned for syntax errors and duplicates
• coverage of smaller documents
• Collaboration with VU University Amsterdam
• Indexing all data from LOD Laundromat
10. Dataset descriptions
• Bridge between publishers and consumers
• A dataset description expresses metadata about
RDF datasets (e.g., DCAT, VoID)
• statistics, vocabularies, structural metadata.
• A dataset profile is a set of dataset
characteristics that makes it possible
• to describe a dataset in the best possible way
• to separate it maximally from other datasets
• Can be used for dataset recommendation
17. UC::ex4 - Automatic Dataset Classification
• Generic vs Domain specific datasets
• size
• number of vocabularies
• number of classes
• number of properties
• Detection of the domain from the vocabularies used
• High-level domains (e.g., cross-domain, life sciences,
publications, government, geographic)
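The vocabulary-based domain detection above could be sketched roughly as follows. This is only an illustration: the `VOCAB_DOMAINS` mapping, the `detect_domain` function, and the "unknown" fallback are assumptions made for this example and are not part of Loupe.

```python
from collections import Counter

# Hypothetical, hand-made mapping from vocabulary namespaces to
# high-level domains; a real mapping would be far larger.
VOCAB_DOMAINS = {
    "http://www.w3.org/2003/01/geo/wgs84_pos#": "geographic",
    "http://www.geonames.org/ontology#": "geographic",
    "http://purl.org/ontology/bibo/": "publications",
}

def detect_domain(vocab_usage):
    """Guess a domain from {namespace: number of triples using it}."""
    scores = Counter()
    for namespace, n_triples in vocab_usage.items():
        domain = VOCAB_DOMAINS.get(namespace)
        if domain is not None:
            # Weight each domain by how heavily its vocabularies are used.
            scores[domain] += n_triples
    if not scores:
        return "unknown"  # no known domain vocabulary was found
    return scores.most_common(1)[0][0]

usage = {
    "http://www.w3.org/2003/01/geo/wgs84_pos#": 900,
    "http://www.geonames.org/ontology#": 300,
    "http://purl.org/ontology/bibo/": 1000,
    "http://xmlns.com/foaf/0.1/": 5000,  # generic vocabulary, not mapped
}
domain = detect_domain(usage)
```

Weighting by triple count rather than by mere presence is one possible design choice; it keeps a single stray property from deciding the domain.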
22. UC::ex6 - Dataset Discovery / Search
• Simple
• I want to find dataset(s) that
• contain information about persons with some concrete
information
• E.g., “give me datasets that have more than 500
instances of foaf:Person that have the dbo:birthPlace
property”
• Advanced
• I want to find dataset(s) that
• can answer a given SPARQL query
• contain data that fits a given W3C RDF data shape
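The simple search case above (datasets with more than 500 instances of foaf:Person that have the dbo:birthPlace property) can be sketched as a filter over precomputed profiles. The profile layout, the dataset names, and `find_datasets` are hypothetical and chosen only for illustration.

```python
def find_datasets(profiles, cls, prop, min_instances):
    """Return names of datasets with more than `min_instances` instances
    of `cls` that have at least one value for `prop`."""
    matches = []
    for name, profile in profiles.items():
        # Assumed profile layout: {(class, property): instance count}.
        count = profile.get((cls, prop), 0)
        if count > min_instances:
            matches.append(name)
    return sorted(matches)

# Made-up profiles for three made-up datasets.
profiles = {
    "dataset-a": {("foaf:Person", "dbo:birthPlace"): 1200},
    "dataset-b": {("foaf:Person", "dbo:birthPlace"): 500},
    "dataset-c": {("foaf:Person", "foaf:name"): 9000},
}
result = find_datasets(profiles, "foaf:Person", "dbo:birthPlace", 500)
```

Here only "dataset-a" qualifies, since "more than 500" is read as a strict comparison.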
23. UC::ex7 - Dataset ranking
• Ranking metrics
• Size
• number of triples (of a given pattern)
• number of instances of a given class
• Richness
• the avg number of properties per instance
• General vs Domain specific dataset
• # of classes, # of properties, # of triples
• Provenance information
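As a rough sketch of the richness metric above, the average number of properties per instance can be computed directly from triples. Treating every subject as an instance, and breaking ties by size, are assumptions of this example rather than Loupe's definition.

```python
def richness(triples):
    """Average number of distinct properties per subject."""
    props = {}
    for s, p, _ in triples:
        props.setdefault(s, set()).add(p)
    if not props:
        return 0.0
    return sum(len(ps) for ps in props.values()) / len(props)

def rank_by_richness(datasets):
    """Order dataset names by richness, then by number of triples."""
    return sorted(
        datasets,
        key=lambda name: (richness(datasets[name]), len(datasets[name])),
        reverse=True,
    )

# Made-up datasets given as lists of (subject, property, object).
datasets = {
    "a": [("a1", "p", "x"), ("a1", "q", "y"), ("a2", "p", "z")],
    "b": [("b1", "p", "x")],
}
ranking = rank_by_richness(datasets)
```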
26. Ontology development UC
• Reuse ontology elements used in datasets
• Look for patterns
• Ontology reuse reports
• Ontology monitoring
• Why are some classes or properties not used?
• Aren’t they relevant?
• Are other classes used for the same purpose?
• Ontology comparison reports
48. Triple patterns
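How many triples are there with a given
subject class, property, and object
class?
< s, P, o >
< s, a, C1 >
< o, a, C2 >
Count: the triples < s, P, o > whose subject s is an instance of C1 and whose object o is an instance of C2 ("a" stands for rdf:type).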
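The count above can be sketched in a few lines over an in-memory list of triples. This is a minimal illustration, not Loupe's implementation; `count_triple_patterns` and the sample data are hypothetical, and prefixed names are written as plain strings.

```python
from collections import Counter

RDF_TYPE = "rdf:type"  # the "a" in the patterns above

def count_triple_patterns(triples):
    """Count (subject class, property, object class) combinations."""
    # First pass: collect the classes of every typed resource.
    types = {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)

    # Second pass: each non-typing triple contributes one count per
    # (class of subject, property, class of object) combination.
    counts = Counter()
    for s, p, o in triples:
        if p == RDF_TYPE:
            continue
        for c1 in types.get(s, ()):
            for c2 in types.get(o, ()):
                counts[(c1, p, c2)] += 1
    return counts

triples = [
    ("ex:alice", RDF_TYPE, "foaf:Person"),
    ("ex:bob", RDF_TYPE, "foaf:Person"),
    ("ex:madrid", RDF_TYPE, "dbo:City"),
    ("ex:alice", "dbo:birthPlace", "ex:madrid"),
    ("ex:bob", "dbo:birthPlace", "ex:madrid"),
    ("ex:alice", "foaf:knows", "ex:bob"),
]
pattern_counts = count_triple_patterns(triples)
```

Triples whose subject or object has no declared class are skipped in this sketch; a fuller version would need a policy for untyped resources and literals.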
51. Languages
How many strings tagged with
a given language are there?
< x, b, “”@lang >
(the language tag is fixed; the strings are counted)
How many triples tagged with
a given language are there?
< s, b, “”@lang >
(the language tag is fixed; the triples are counted)
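Both language counts can be sketched together. The example assumes that language-tagged literals are modelled as (text, lang) tuples and that "strings" means distinct literal values; both are assumptions of this illustration, and `count_languages` is a hypothetical name.

```python
from collections import Counter

def count_languages(triples):
    """Return (distinct strings per language, triples per language)."""
    strings = {}
    triple_counts = Counter()
    for s, p, o in triples:
        # Assumption: a language-tagged literal is a (text, lang) tuple;
        # any other object (IRI, plain literal) is a plain string.
        if isinstance(o, tuple):
            text, lang = o
            strings.setdefault(lang, set()).add(text)
            triple_counts[lang] += 1
    string_counts = {lang: len(texts) for lang, texts in strings.items()}
    return string_counts, triple_counts

triples = [
    ("ex:madrid", "rdfs:label", ("Madrid", "es")),
    ("ex:madrid", "rdfs:label", ("Madrid", "en")),
    ("ex:madrid2", "rdfs:label", ("Madrid", "es")),
    ("ex:spain", "rdfs:label", ("España", "es")),
    ("ex:madrid", "dbo:country", "ex:spain"),
]
string_counts, triple_counts = count_languages(triples)
```

The two numbers differ whenever the same string appears in several triples, as "Madrid"@es does in the sample data.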