SSONDE: Semantic Similarity On liNked Data Entities
       (Example/DEMO)
                  Riccardo Albertoni
             ralbertoni@delicias.dia.fi.upm.es
                Ontology Engineering Group
             Departamento de Inteligencia Artificial
                   Facultad de Informática
              Universidad Politécnica de Madrid


Acknowledgment: SSONDE in its current (pre)release results mainly from the
research activity I did at IMATI-CNR in collaboration with M. De Martino
Linked data Crawling architectural pattern

[Figure: layered diagram, listed here from top to bottom]
        • Build analysis services, cluster analysis, explorative search on resources
        • SSONDE
        • LDSpider/Fuseki, LDIF




Riccardo Albertoni                                                             2
Getting started

        SSONDE's code
        • is hosted as a Google Code
         project, http://code.google.com/p/ssonde/
        • is licensed as open source code (GNU GPL v3)

        Information on getting started is available at
         http://code.google.com/p/ssonde/wiki/GettingStarted




Applying SSONDE

Let’s apply SSONDE
• on Linked Data resources exposed by
   third parties
     • Information about CNR provided by
         Gangemi et al., Data.cnr.it
• to work out similarities among
   researchers according to their research
   interests




                                             Date: 15/03/2012
What to download/install for this Example

        - SSONDE
                     -   svn checkout
                         http://ssonde.googlecode.com/svn/SSONDEv1
                         SSONDEV1DEMO


         For crawling data
        - LDSpider
                     - code.google.com/p/ldspider/
        - Fuseki
                     - http://jena.apache.org/documentation/serving_data/index.html




Crawling RDF fragments from data.cnr.it

        Which resources are you interested in?
        1.   Identify a set of seed entities
                     • E.g. researchers,
                       http://www.cnr.it/ontology/cnr/individuo/unitaDiPersonaleEsterno/ID226
                       (Riccardo Albertoni)
        2. Figure out which entity features (i.e., object
        properties and data properties) are interesting
                     • Analyse the schemas deployed to characterize that
                       kind of resource, e.g.,
                         • http://www.cnr.it/ontology/cnr/pubblicazioni.owl
                           (prefix: pub)
                             • pub:autoreCNRDi (the papers written by the
                               authors)
                             • dc:subject (the authors' scientific interests)

Crawling RDF fragments from data.cnr.it

        3. Crawl the metadata of the resources to be analysed, after
        creating a seedDATACNRIT.txt file that lists the seed entities
                     • java -jar ./LDSpider/ldspider-1.1e.jar -s seedDATACNRIT.txt
                       -b 5 -1 -oe http://localhost:3030/XXX/update
        3 bis. If you have a more precise idea of which
          features you need, the list of properties to be
          traversed can be specified with the -f option
                     • java -jar ./LDSpider/ldspider-1.1e.jar -b 5 -1
                       -f http://www.cnr.it/ontology/cnr/pubblicazioni.owl#autoreCNRDi
                       -f http://www.cnr.it/ontology/cnr/pubblicazioni.owl#coautore
                       -f http://purl.org/dc/terms/subject
                       -f http://www.w3.org/2004/02/skos/core#broader
                       -s seedDATACNRIT.txt -oe "http://localhost:3030/XXX/update"
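
The two crawl invocations above differ only in their -f arguments, so they can be assembled from one helper. The following is a minimal Python sketch, not part of SSONDE or LDSpider: the function names are hypothetical, the one-URI-per-line layout of the seed file is an assumption, and only the flags that appear in the commands above are used.

```python
from pathlib import Path

# Values copied from the commands above; XXX is the dataset placeholder used in the slides.
LDSPIDER_JAR = "./LDSpider/ldspider-1.1e.jar"
UPDATE_ENDPOINT = "http://localhost:3030/XXX/update"


def write_seed_file(path, seed_uris):
    """Write the seed URIs to a text file, one per line (assumed layout)."""
    Path(path).write_text("\n".join(seed_uris) + "\n", encoding="utf-8")


def build_crawl_command(seed_file, properties=()):
    """Return the LDSpider command as an argument list.

    With no properties this reproduces step 3; with properties it
    reproduces step 3 bis (one -f per property to be traversed).
    """
    cmd = ["java", "-jar", LDSPIDER_JAR, "-b", "5", "-1"]
    for prop in properties:
        cmd += ["-f", prop]
    cmd += ["-s", seed_file, "-oe", UPDATE_ENDPOINT]
    return cmd


if __name__ == "__main__":
    seeds = ["http://www.cnr.it/ontology/cnr/individuo/unitaDiPersonaleEsterno/ID226"]
    write_seed_file("seedDATACNRIT.txt", seeds)
    print(" ".join(build_crawl_command(
        "seedDATACNRIT.txt",
        ["http://purl.org/dc/terms/subject",
         "http://www.w3.org/2004/02/skos/core#broader"],
    )))
```

The returned list can be passed to subprocess.run once Fuseki is listening on the update endpoint.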



[Figure: SSONDE architecture diagram; its labels are listed below]

Configuration
    • List of instances (Java class to generate the list)
    • Ref. context
    • Kind of store
    • Ref. rules (e.g., Jena rules)

Similarity (SSONDE)
    • Context layer
    • Ontology layer
    • Data layer

Output
    • Similarity matrix in CSV
    • n-most similar entities in JSON

Data wrappers: Jena MEM, Jena SDB, Jena TDB, Virtuoso wrapper, ...

Local data store/cache: RDF dumps, SDB rep., TDB rep., Virtuoso, ...

Crawling architectural pattern (linked data consumption): LDSpider + Fuseki, LDIF

Web of Data (linked datasets served by third parties): RDF dumps, HTTP dereferenceable URIs, SPARQL endpoints
Configuration file 1

        {
            "StoreConfiguration":{
                 "KindOfStore":"JENATDB",
                 "RDFDocumentURIs":[ ],
                 "TDBDirectory":"data/CNRIT/TDB-0.8.9/CNRR/"
            },
            "InstanceConfiguration":{
                 "InstanceURIsClass":"application.dataCNRIt.GetResearcherIMATIplusCoauthor"
            },
            "OutputConfiguration":{
                 "KindOfOutput":"JSONOrderedResult",
                 "NumberOfOrderedResult":"20",
                 "FilePath":"conf/dataCNRIt/ComplexContextResearchInterest/CRRIIntPub.res.json"
            },
            "ContextConfiguration":{
                 "ContextFilePath":"conf/dataCNRIt/ComplexContextResearchInterest/CCRIIntPub.ctx"
            }
        }

        • InstanceConfiguration: list of LOD entity URIs, given as a Java class implementing ListOfInputInstances
        • OutputConfiguration: similarity matrix in CSV, or JSON encoding of the top n-most similar entities
        • ContextConfiguration: context encoded in an in-house text format (hopefully soon in JSON)
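
Because the configuration is plain JSON, it can be checked before SSONDE is launched. The sketch below uses only Python's standard json module; the choice of which keys to treat as required is an assumption drawn from the example above, not a rule documented by SSONDE, and check_config is a hypothetical helper name.

```python
import json

# Sections and keys taken from the example configuration; treating them
# as mandatory is an assumption made for this sketch.
REQUIRED = {
    "StoreConfiguration": ["KindOfStore"],
    "InstanceConfiguration": ["InstanceURIsClass"],
    "OutputConfiguration": ["KindOfOutput", "FilePath"],
    "ContextConfiguration": ["ContextFilePath"],
}


def check_config(text):
    """Parse the JSON text and return the list of missing sections/keys."""
    config = json.loads(text)  # raises json.JSONDecodeError on malformed JSON
    missing = []
    for section, keys in REQUIRED.items():
        if section not in config:
            missing.append(section)
            continue
        for key in keys:
            if key not in config[section]:
                missing.append(f"{section}.{key}")
    return missing


EXAMPLE = """
{
  "StoreConfiguration": {
    "KindOfStore": "JENATDB",
    "RDFDocumentURIs": [],
    "TDBDirectory": "data/CNRIT/TDB-0.8.9/CNRR/"
  },
  "InstanceConfiguration": {
    "InstanceURIsClass": "application.dataCNRIt.GetResearcherIMATIplusCoauthor"
  },
  "OutputConfiguration": {
    "KindOfOutput": "JSONOrderedResult",
    "NumberOfOrderedResult": "20",
    "FilePath": "conf/dataCNRIt/ComplexContextResearchInterest/CRRIIntPub.res.json"
  },
  "ContextConfiguration": {
    "ContextFilePath": "conf/dataCNRIt/ComplexContextResearchInterest/CCRIIntPub.ctx"
  }
}
"""

if __name__ == "__main__":
    print(check_config(EXAMPLE))
```

A typographic quote pasted from a slide in place of a straight double quote is enough to make json.loads raise an error, so a check of this kind catches copy-and-paste damage early.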

Data.cnr.it – defining a context
[Figure: example graph. Nodes: researchers (Res 225, Res 226), publications (pub: 22, pub: 26, pub: 29) and topics (Topic:2, Topic:23, Topic:25, Topic:26, Topic:27). Edge types: pub:autoreCNRDi, dc:subject, skos:broader. One region is marked "Crawled by Data.CNR.it", the other "Crawled by DBPEDIA".]

PREFIX dc: <http://purl.org/dc/terms/>
PREFIX pub: <http://www.cnr.it/ontology/cnr/pubblicazioni.owl#>
[owl:Thing]-> {{}, {(pub:autoreCNRDi, Inter),(dc:subject, Simil)}}
[owl:Thing, dc:subject]-> {{},{(skos:broader, Inter)}}

No data properties are considered in this context.
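
To give a feel for what such a context asks for, here is a toy Python sketch. It rests on loud assumptions: "Inter" is read as comparing two entities by the overlap of their values for a property, and "Simil" as comparing the values themselves with a nested criterion and averaging the best matches. The formulas, the equal weighting and the sample links are all invented for illustration and are not SSONDE's actual measure.

```python
# Toy data: identifiers echo the figure labels, but the links are invented.
PAPERS = {"Res225": {"pub22", "pub26"}, "Res226": {"pub26", "pub29"}}   # pub:autoreCNRDi
TOPICS = {"Res225": {"Topic25"}, "Res226": {"Topic26"}}                 # dc:subject
BROADER = {"Topic25": {"Topic2"}, "Topic26": {"Topic2"}}                # skos:broader


def overlap(xs, ys):
    """Assumed reading of 'Inter': share of xs that also occurs in ys."""
    return len(xs & ys) / len(xs) if xs else 0.0


def topic_similarity(t1, t2):
    """Second rule: topics are compared by their shared broader topics."""
    if t1 == t2:
        return 1.0
    return overlap(BROADER.get(t1, set()), BROADER.get(t2, set()))


def best_match_average(xs, ys, sim):
    """Assumed reading of 'Simil': average, over xs, of the best match in ys."""
    if not xs or not ys:
        return 0.0
    return sum(max(sim(x, y) for y in ys) for x in xs) / len(xs)


def researcher_similarity(r1, r2):
    """First rule: shared papers (Inter) and related interests (Simil)."""
    shared_papers = overlap(PAPERS[r1], PAPERS[r2])
    related_topics = best_match_average(TOPICS[r1], TOPICS[r2], topic_similarity)
    return (shared_papers + related_topics) / 2  # equal weights: an assumption


if __name__ == "__main__":
    print(researcher_similarity("Res225", "Res226"))
```

In this toy data the two researchers share one of two papers and have different topics with a common broader topic, so both criteria contribute to the score.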
run SSONDE



From the directory where you have downloaded SSONDE
  (e.g., SSONDEV1DEMO), run

• java -classpath ./lib/*:./bin SSONDEv1.SemSim
  -conf JSONconfigurationFile1

For example:

• java -classpath ./lib/*:./bin/ SSONDEv1.SemSim -conf
  conf/dataCNRIt/ComplexContextResearchInterest/CCRIIntPub.param.json
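
The commands above use ":" as classpath separator, which is the Unix convention; on Windows the separator is ";". A small Python sketch (the helper name is hypothetical) that builds the same invocation with the separator of the current platform:

```python
import os


def build_ssonde_command(config_path, pathsep=os.pathsep):
    """Return the SSONDE invocation as an argument list.

    The classpath entries and the main class are the ones shown above;
    only the separator between the entries depends on the platform.
    """
    classpath = pathsep.join(["./lib/*", "./bin"])
    return ["java", "-classpath", classpath, "SSONDEv1.SemSim", "-conf", config_path]


if __name__ == "__main__":
    print(build_ssonde_command(
        "conf/dataCNRIt/ComplexContextResearchInterest/CCRIIntPub.param.json"))
```

Passing the list to subprocess.run (with the SSONDE directory as working directory) bypasses the shell, so the lib/* wildcard reaches the Java launcher unexpanded and is resolved by Java itself.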




Examples of tools to interpret SSONDE results
SSONDE does not have its own GUI;
results are interpreted by processing its similarity
  matrix with

1) Excel files
2) Hierarchical Clustering Explorer 3.0, Human-Computer Interaction Lab,
   University of Maryland, http://www.cs.umd.edu/hcil/multi-cluster/
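
A similarity matrix can also be inspected with a few lines of script. The Python sketch below assumes a CSV layout that SSONDE does not necessarily use: a header row with the entity names, then one row per entity whose first cell is the entity name. The sample values are invented and deliberately not symmetric, which is why each row is ranked on its own.

```python
import csv
import io


def most_similar(csv_text, n=3):
    """For each row entity, return its n highest-scoring other entities."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    columns = rows[0][1:]  # entity names from the header row
    ranking = {}
    for row in rows[1:]:
        entity = row[0]
        scores = [float(value) for value in row[1:]]
        # Pair every column entity with its score, skipping the entity itself.
        pairs = [(other, s) for other, s in zip(columns, scores) if other != entity]
        pairs.sort(key=lambda pair: pair[1], reverse=True)
        ranking[entity] = pairs[:n]
    return ranking


# Invented sample in the assumed layout.
SAMPLE = """,A,B,C
A,1.0,0.8,0.2
B,0.6,1.0,0.4
C,0.1,0.5,1.0
"""

if __name__ == "__main__":
    for entity, neighbours in most_similar(SAMPLE, n=2).items():
        print(entity, neighbours)
```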

Hierarchical Clustering




What next?

        (i) semantic similarity optimization:
                (a)    caching of intermediate similarity results;
                (b)    adoption of the MapReduce paradigm to speed up the
                       assessment of semantic similarity;
        (ii) domain-driven extensions at the data layer:
                (a)  defining new data layer measures suited for
                     georeferenced entities;
                (b)  considering the multilingual similarity recently proposed
                     by Kartic and Jorge for inclusion in the
                     data layer of SSONDE;
        (iii) definition of interfaces for sifting entities according to
            their similarity, e.g., exploiting existing visualization
            frameworks (such as Exhibit, Google Visualization and
            JavaScript InfoVis Toolkit).

What I’d like you to think about …

        • Can SSONDE be useful in any of your current
           research activities?
        • Can SSONDE be deployed in some of your future
           projects (proposals)?
        • Are you interested in contributing to
           SSONDE?
        Further details are available in
        • R. Albertoni, M. De Martino, SSONDE: Semantic
          Similarity On liNked Data Entities, 6th Metadata and
          Semantics Research Conference, 28-30 November
          2012 - Cádiz (Spain) [to appear]
        • Framework documentation
                     • http://code.google.com/p/ssonde/wiki/GettingStarted

