1. How much semantic data on small devices? Mathieu d’Aquin, Andriy Nikolov and Enrico Motta, Knowledge Media Institute, The Open University, UK. m.daquin@open.ac.uk @mdaquin
4. Extracting sets of small-scale ontologies: clusters of ontologies having similar characteristics, except for size.
5. Extracting sets of small-scale ontologies: Characteristics of ontologies: size (triples), varying from very small scale to medium scale; class/property ratio, allowing 50% variance; class/instance ratio, allowing 50% variance; DL expressivity (complexity of the language). 99 automatically created clusters; manual selection of 10.
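The slide does not say how the clusters were computed, so the following is only a minimal sketch of one way to group ontologies by these characteristics. The names `OntologyProfile`, `within_variance`, `similar` and `cluster` are hypothetical, the greedy "compare with the first member of each cluster" strategy is an assumption, and "allowing 50% variance" is interpreted here as "the two ratios differ by at most 50% of the larger one".

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class OntologyProfile:
    """Hypothetical summary of one ontology (not the authors' data model)."""
    name: str
    triples: int        # size; deliberately NOT used for similarity
    classes: int
    properties: int
    instances: int
    expressivity: str   # DL expressivity label, e.g. "ALC"


def ratio(a, b):
    # Avoid division by zero: an ontology with no properties/instances
    # gets an infinite ratio.
    return a / b if b else math.inf


def within_variance(x, y, tolerance=0.5):
    # Assumed reading of "allowing 50% variance": the difference is at most
    # `tolerance` times the larger of the two values.
    if math.isinf(x) or math.isinf(y):
        return x == y
    return abs(x - y) <= tolerance * max(x, y)


def similar(a, b, tolerance=0.5):
    # Same expressivity and comparable ratios; size is ignored on purpose.
    return (
        a.expressivity == b.expressivity
        and within_variance(ratio(a.classes, a.properties),
                            ratio(b.classes, b.properties), tolerance)
        and within_variance(ratio(a.classes, a.instances),
                            ratio(b.classes, b.instances), tolerance)
    )


def cluster(profiles, tolerance=0.5):
    # Greedy grouping: each ontology joins the first cluster whose first
    # member it resembles, otherwise it starts a new cluster.
    clusters = []
    for profile in profiles:
        for members in clusters:
            if similar(members[0], profile, tolerance):
                members.append(profile)
                break
        else:
            clusters.append([profile])
    return clusters
```

Because size is left out of the similarity test, a single cluster can hold ontologies ranging from very small to medium scale, which is the property the slide asks for.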
7. Queries: Using real-life ontologies requires domain-independent queries. A set of 8 generic queries of varying complexity, whose results might depend on inference: select all instances of all classes; select all comments; select all labels and comments; select all labels; select all classes (RDFS/OWL/DAML); select all properties by their domain; select all RDFS classes; select all properties applied to instances of all classes.
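The exact query text is not given on the slide. As an illustration only, the eight generic queries could be written in SPARQL roughly as follows; the query bodies, the dictionary keys and the helper `full_query` are assumptions, not the authors' benchmark queries.

```python
# Standard namespace prefixes for RDF, RDFS, OWL and DAML+OIL.
PREFIXES = """\
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX daml: <http://www.daml.org/2001/03/daml+oil#>
"""

# Assumed SPARQL renderings of the eight generic queries (illustrative only).
GENERIC_QUERIES = {
    "instances_of_all_classes":
        "SELECT ?i ?c WHERE { ?i rdf:type ?c }",
    "comments":
        "SELECT ?s ?comment WHERE { ?s rdfs:comment ?comment }",
    "labels_and_comments":
        "SELECT ?s ?label ?comment WHERE "
        "{ ?s rdfs:label ?label . ?s rdfs:comment ?comment }",
    "labels":
        "SELECT ?s ?label WHERE { ?s rdfs:label ?label }",
    "classes_rdfs_owl_daml":
        "SELECT ?c WHERE { { ?c rdf:type rdfs:Class } "
        "UNION { ?c rdf:type owl:Class } "
        "UNION { ?c rdf:type daml:Class } }",
    "properties_by_domain":
        "SELECT ?p ?d WHERE { ?p rdfs:domain ?d }",
    "rdfs_classes":
        "SELECT ?c WHERE { ?c rdf:type rdfs:Class }",
    "properties_applied_to_instances":
        "SELECT DISTINCT ?p WHERE { ?i rdf:type ?c . ?i ?p ?o }",
}


def full_query(name):
    """Return the named query with the prefix declarations prepended."""
    return PREFIXES + GENERIC_QUERIES[name]
```

None of these queries mentions a domain-specific class or property, so they can run against any ontology. With RDFS reasoning enabled, a query such as `rdfs_classes` may return additional inferred results, which is how the answers can depend on inference.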
8. Running the benchmarks – Triple Stores: Jena with TDB persistent storage; as above + RDFS reasoning; Sesame with persistent storage; as above + RDFS reasoning; Mulgara with default configuration.
10. Running the benchmarks – Measures: Loading time: for each ontology, in an empty, re-initialized store. Disk space: of the persistent store right after loading. Memory consumption: of the triple store process right after loading the ontology. Query time: for each ontology, averaged over the 8 queries.
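A rough sketch of how the loading time, disk space and average query time could be collected is given below. The harness used for the actual benchmarks is not described on the slides; `StubStore`, `benchmark`, `time_call` and `dir_size_bytes` are hypothetical names, and a real run would wrap Jena, Sesame or Mulgara behind the same `load`/`query` interface. Memory consumption of the triple store process is not covered, since it would have to be read from outside that process.

```python
import os
import time
from statistics import mean


class StubStore:
    """Hypothetical stand-in for a triple store wrapper; not a real API."""

    def __init__(self):
        self.loaded = []

    def load(self, ontology_path):
        # A real wrapper would parse the file and write to persistent storage.
        self.loaded.append(ontology_path)

    def query(self, sparql):
        # A real wrapper would evaluate the SPARQL query and return bindings.
        return []


def time_call(fn, *args):
    """Return (elapsed seconds, result) for one call to fn(*args)."""
    start = time.perf_counter()
    result = fn(*args)
    return time.perf_counter() - start, result


def dir_size_bytes(path):
    """Total size in bytes of all files under `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def benchmark(make_store, ontology_path, queries, store_dir=None):
    # A fresh store per ontology mirrors "empty, re-initialized store".
    store = make_store()
    load_time, _ = time_call(store.load, ontology_path)
    # Disk space is measured right after loading, before any query runs.
    disk = dir_size_bytes(store_dir) if store_dir else None
    query_times = [time_call(store.query, q)[0] for q in queries]
    return {
        "load_time_s": load_time,
        "disk_bytes": disk,
        "avg_query_time_s": mean(query_times),
    }
```

A call such as `benchmark(StubStore, "example.owl", queries)` returns one record per ontology, where `example.owl` is a placeholder path; repeating it for each ontology in a cluster gives the per-size comparison the measures are meant to support.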
19. Conclusion – on tests: Sesame performs best in almost all aspects, even when including reasoning. Reasoning has a big impact on Jena TDB at query time. Mulgara is clearly not adequate in a small-scale scenario.
20. Conclusion – on small-scale benchmarking: Validates our assumption that small-scale benchmarks give different results than large-scale benchmarks. Points out the need for more work to tackle small-scale scenarios. Results are not always clear-cut in every aspect: benchmarks serve as support for deciding which tool to use, depending on the application constraints.