SlideShare ist ein Scribd-Unternehmen logo
1 von 17
Multilingual Linked Open Data
                         Patterns
                                       Jose Emilio Labra Gayo
                                            University of Oviedo, Spain


                         Joint work with:
                              Dimitris Kontokostas                    Sören Auer
                                            Universität Leipzig, Germany




                                  More info: http://www.weso.es/MLODPatterns
Jose Emilio Labra Gayo
MLOD Patterns
         From best practices (Dublin) to patterns (Rome)

         We propose a catalog of 20 MLOD patterns
                  Pattern = generic solution to a problem in a context
                         Common vocabulary
                         Patterns can be related to each other
                                                                         Each pattern contains
                                                                            Name
                         Some patterns can contradict other patterns        Description
                  There are already Linked data patterns                    Context
                         We focus on multilingual linked data patterns      Example
                                                                            Discussion
                                                                            See also

         Based on DBPedia I18n experience


                                    More info: http://www.weso.es/MLODPatterns
Jose Emilio Labra Gayo
MLOD Patterns
         Patterns are classified by activity:
                  Naming:
                         URI design, URIs, IRIs, etc.
                  Dereference:
                         How is the content that we return affected by multilingualism
                  Labeling:
                         Handling multilingual labels
                  Longer descriptions:
                         Longer textual descriptions
                  Linking
                         Links between concepts in different languages?
                  Reuse
                         Vocabularies and multilingualism

                                    More info: http://www.weso.es/MLODPatterns
Jose Emilio Labra Gayo
General overview
              Goal                           Pattern                                                 Description

                          Descriptive URIs                        Use descriptive URIs with ASCII characters, % encoding extended characters

                          Opaque URIs                             Use non human-readable URIs

            Naming        Full IRIs                               Use IRIs with unicode characters

                          Internationalized local names           Use Unicode characters only for local names

                          Language in URIs                        Include language information in the URI

                          Return Language independent data        Return the same triples independently of the language
         Dereference
                          Language content negotiation            Return different triples depending on user agent preferences

                          Label everything                        Define labels for all the resources

           Labeling       Multilingual labels                     Add language tags to labels

                          Labels without language tag             Add labels without language tags in a default language

                          Divide longer descriptions              Replace long descriptions by more resources with labels

    Longer descriptions   Add lexical information                 Add lexical information to long descriptions

                          Structured literals                     Use HTML/XML literals for longer descriptions

                          Identity links                          Use owl:sameAs and similar predicates

            Linking       Soft links                              Use predicates with soft semantics

                          Linguistic metadata                     Add linguistic metadata about the dataset terms

                          Monolingual vocabularies                Attach labels to vocabularies in a single language

                          Multilingual vocabularies               Prefer multilingual vocabularies
             Reuse
                          Localize existing vocabularies          Translate labels of existing vocabularies

                                       More info: http://www.weso.es/MLODPatterns ones
                          Create new localized vocabulariesCreate custom vocabularies and link to existing
Jose Emilio Labra Gayo
Motivating example
         Juan is an armenian professor at the University of León.



                                  birthPlace
                                                   Armenia


                                                       Professor
                                      position
                          Juan

                                 worksAt
                                                 University of León




                          More info: http://www.weso.es/MLODPatterns
Jose Emilio Labra Gayo
http://օրինակ.օրգ#Հայաստան
   http://օրինակ.օրգ#Հայաստան



                                          Naming
         Selecting a URI scheme for Armenia

    Descriptive URIs                        http://example.org/Armenia
    Human-readable                        Only ASCII, %-encode non-ASCII characters
    Opaquesupport
    Good tool URIs                          http://example.org/I23AX45
                                               http://example.org/Universidad%20de%Le%C3%B3n

    Independence between concept and          Non Human-readable
                                            http://օրինակ.օրգ#Հայաստան
    Full IRIs
    natural language representation           Difficult to handle by developers

    Internationalized local names
    More natural for non-Latin based        http://example.org#Հայաստան
                                              Subject to visual spoofing attacks
    languages                                 Not so good tool support

   Avoids domain name spoofing              http://hy.example.org#Հայաստան
                                             Visual spoofing attacks can still possible
   Include language in URIs
   More human-friendly identifiers          http://en.example.org#Armenia
                                              Adding language info to the URI can become unwieldy
  Independent development of datasets         Example: languages & sublanguages
  by language                                      hy-Latin-IT-arevela
                                              Where should we put the language tag?
                                More info: http://www.weso.es/MLODPatterns
Jose Emilio Labra Gayo
Dereference
         Language based content negotiation?


          No language content negotiation       http://example.org/Armenia

                                                   Always returns the same data
            Easy to develop                        Clients have to filter triples in other languages
           Language content negotiation
            Consistency of data                  http://example.org/Armenia
                                                   Computation & network overhead
                                                  Returns different data depending on
                                                           Accept-language
                          Accept-language:en                                  Accept-language:hy

           :Armenia rdfs:label "Armenia"@en .           :Armenia rdfs:label "Հայաստան "@hy .
           Improves clients performance           Difficult to implement
           Less network overhead                  Semantic equivalence between data
                                 More info: http://www.weso.es/MLODPatterns
Jose Emilio Labra Gayo
Labeling
            Label everything
                            :Armenia rdfs:label "Armenia" .
                            :juan     rdfs:label "Juan López" .
                            :position rdfs:label "Job position" .

            User agents can show labels instead or URIs
            Multilingual Labels                            Not always feasible, which labels?
            Labels can be used for searching               Labels are for humans
                            :juan :position       "Professor"@en .
                                                           Avoid machine-oriented notations
                            :juan :position "Catedrático"@es .

            Labels without are part of RDF Model
             Multilingual labels language tag              SPARQL can be more difficult
                                                               SELECT * WHERE {
                            :juan :position       "Professor"@en ?x :position "Professor" .
                                                                  .
                            :juan :position       "Catedrático"@es .
                                                               }
                            :juan :position "Professor" .
                    SPARQLing is easier                   Choosing a default language is controversial



                                   More info: http://www.weso.es/MLODPatterns
Jose Emilio Labra Gayo
Longer descriptions
        Divide long descriptions

                   :juan :jobtitle
                        "Professor at the University of León"@en .                    
                   :juan :position :professor .


                                                                                      
                   :juan :workPlace :uniLeón .

                   :professor rdfs:label "Professor"@en .
                   :uniLeón rdfs:label "University of León"@en .

              Fine-grained data is more amenable to semantic web apps   More complexity of the model
              Apps can generate more readable information               Not always possible



                                  More info: http://www.weso.es/MLODPatterns
Jose Emilio Labra Gayo
Longer descriptions
     Provide lexical information
                                       :uniLeón a lemon:LexicalEntry ;
                                         lemon:decomposition (
                                           [ lemon:element :University ]
     More value to dataset                 [ lemon:element :Of ]
     Automatic manipulation                [ lemon:element :León ]
     Can improve message generation     );
                                        rdfs:label "University of León"@en     .

     Complexity overhead               :University a lemon:LexicalEntry ;
     Feasibility                          lexinfo:partOfSpeech lexinfo:commonNoun ;
                                          rdfs:label "University"@en ;
                                          rdfs:label "Universidad"@es .

                                       :Of     a lemon:LexicalEntry ;
                                             lexinfo:partOfSpeech lexinfo:preposition ;
                                             rdfs:label "of"@en ;
                                             rdfs:label "de"@es .

                                       :León a lemon:LexicalEntry ;
                                          lexinfo:partOfSpeech lexinfo:properNoun ;
                                          rdfs:label "León"
                                                               Example provided by J. McCrae
                             More info: http://www.weso.es/MLODPatterns
Jose Emilio Labra Gayo
Longer descriptions
     Structured literals



                         :uniLeón :desc
                           "<p>University of
                              <span translate="no">León</span>,
                              Spain.
                            </p>"^^rdf:XMLLiteral .


         Leverage existing I18N techniques            Interaction between 2 abstracion models
              Bidi, Ruby, Localization notes, ...           RDF vs XML/HTML
                                                      Large portions of structured literals can hinder LD




                                  More info: http://www.weso.es/MLODPatterns
Jose Emilio Labra Gayo
Linking
     Inter-language identity links
                                           <http://hy.example.org#Հայաստան>
                                             owl:sameAs
                                               <http://en.example.org#Armenia> .

     Soft Inter-language links
     owl:sameAs is a well known property              Too strong semantics of owl:sameAs
     Already supported by linked data applications    Concepts may not be the same
                                              <http://hy.example.org#Հայաստան>
                                                            Contradictions
                                             rdfs:seeAlso
                                               <http://en.example.org#Armenia> .
     Link linguistic meta-data
     Several predicates                            No standard property
               rdfs:seeAlso                        No support for inference
                                           :Catedrático
               skos:related
               dbo:wikiPageLanguageLink
                                                lexvo:means wordnet:Professor ;
                                                lexvo:language
                                                  <http://lexvo.org/id/iso639-3/spa> .
       Links between multilingual labels               No standard practice
       Can declare language of a dataset               Semantic equivalence between concepts

                               More info: http://www.weso.es/MLODPatterns
Jose Emilio Labra Gayo
Reuse
     Monolingual vocabularies
                                             FOAF, Dublin Core, OWL, RDF Schema, ... are
                                                            only in English
      Multilingual vocabularies
     Easier to control vocabulary evolution       Monolingual vocabularies in multilingual applications require
                                            :position a owl:DatatypeProperty ;
          Avoid bad translations, ambiguities     a translation layer
                                                  rdfs:domain :UniversityStaff ;
                                                  rdfs:label "Position"@en ;
                                                  rdfs:label "Puesto"@es .

                                            :UniversityStaff a owl:Class ;
                                              rdfs:label "University staff"@en ;
                                              rdfs:label "Trabajador universitario"@es .

      Elegant solution in multilingual contexts       Some common vocabularies are monolingual
           More control over translations             Maintenance is more difficult




                                 More info: http://www.weso.es/MLODPatterns
Jose Emilio Labra Gayo
Reuse
     Localize existing vocabularies
                               dc:contributor           rdfs:label        "Colaborador"@es .


     Transparently select label in preferred language    Polluting well known vocabularies = controversial
     Principle AAA
           Anyone can say Anything about Any topic

     Create new localized vocabularies
                               dc:contributor
                                   owl:equivalentProperty :colaborador .
                               :colaborador
                                   rdfs:label "Colaborador"@es .
   Freedom to taylor vocabulary to specific needs       More difficult to humans/agents to recognize new
                                                        properties/classes



                                More info: http://www.weso.es/MLODPatterns
Jose Emilio Labra Gayo
Future work
                  The catalog is not closed
                         Other issues & patterns
                            Microdata & RDFa
                            Other I18n topics
                            Handle big datasets
                            Localization workflows
                  Feedback from the community
                         Best practices, Patterns, Anti-patterns?




                                More info: http://www.weso.es/MLODPatterns
Jose Emilio Labra Gayo
End of presentation




                          More info: http://www.weso.es/MLODPatterns
Jose Emilio Labra Gayo
DBPedia I18n
         From Descriptive URIs to Internationalized
           local names
         Soft & Identity inter-language links
         Label everything




                         More info: http://www.weso.es/MLODPatterns
Jose Emilio Labra Gayo

Weitere ähnliche Inhalte

Andere mochten auch

A Comparison of NER Tools w.r.t. a Domain-Specific Vocabulary
A Comparison of NER Tools w.r.t. a Domain-Specific VocabularyA Comparison of NER Tools w.r.t. a Domain-Specific Vocabulary
A Comparison of NER Tools w.r.t. a Domain-Specific VocabularyTimm Heuss
 
Linked Data: What’s the Story?
Linked Data: What’s the Story?Linked Data: What’s the Story?
Linked Data: What’s the Story?WiLS
 
Universal Topic Classification - Named Entity Disambiguation (IKS Workshop Pa...
Universal Topic Classification - Named Entity Disambiguation (IKS Workshop Pa...Universal Topic Classification - Named Entity Disambiguation (IKS Workshop Pa...
Universal Topic Classification - Named Entity Disambiguation (IKS Workshop Pa...Olivier Grisel
 
SYNERGY - A Named Entity Recognition System for Resource-scarce Languages suc...
SYNERGY - A Named Entity Recognition System for Resource-scarce Languages suc...SYNERGY - A Named Entity Recognition System for Resource-scarce Languages suc...
SYNERGY - A Named Entity Recognition System for Resource-scarce Languages suc...Guy De Pauw
 
Understanding Named-Entity Recognition (NER)
Understanding Named-Entity Recognition (NER) Understanding Named-Entity Recognition (NER)
Understanding Named-Entity Recognition (NER) Stephen Shellman
 
QER : query entity recognition
QER : query entity recognitionQER : query entity recognition
QER : query entity recognitionDhwaj Raj
 
The named entity recognition (ner)2
The named entity recognition (ner)2The named entity recognition (ner)2
The named entity recognition (ner)2Arabic_NLP_ImamU2013
 
Named Entity Recognition - ACL 2011 Presentation
Named Entity Recognition - ACL 2011 PresentationNamed Entity Recognition - ACL 2011 Presentation
Named Entity Recognition - ACL 2011 PresentationRichard Littauer
 
RDF and other linked data standards — how to make use of big localization data
RDF and other linked data standards — how to make use of big localization dataRDF and other linked data standards — how to make use of big localization data
RDF and other linked data standards — how to make use of big localization dataDave Lewis
 
Scaling up Linked Data
Scaling up Linked DataScaling up Linked Data
Scaling up Linked DataEUCLID project
 
Interaction with Linked Data
Interaction with Linked DataInteraction with Linked Data
Interaction with Linked DataEUCLID project
 
Dynamically Optimizing Queries over Large Scale Data Platforms
Dynamically Optimizing Queries over Large Scale Data PlatformsDynamically Optimizing Queries over Large Scale Data Platforms
Dynamically Optimizing Queries over Large Scale Data PlatformsINRIA-OAK
 
Enhancing Entity Linking by Combining NER Models
Enhancing Entity Linking by Combining NER ModelsEnhancing Entity Linking by Combining NER Models
Enhancing Entity Linking by Combining NER ModelsJulien PLU
 
Natural language procssing
Natural language procssing Natural language procssing
Natural language procssing Rajnish Raj
 
Exploiting Linked Open Data and Natural Language Processing for Classificati...
Exploiting Linked Open Data  and Natural Language Processing for Classificati...Exploiting Linked Open Data  and Natural Language Processing for Classificati...
Exploiting Linked Open Data and Natural Language Processing for Classificati...giuseppe_futia
 
A Vague Sense Classifier for Detecting Vague Definitions in Ontologies
A Vague Sense Classifier for Detecting Vague Definitions in OntologiesA Vague Sense Classifier for Detecting Vague Definitions in Ontologies
A Vague Sense Classifier for Detecting Vague Definitions in OntologiesPanos Alexopoulos
 

Andere mochten auch (20)

A Comparison of NER Tools w.r.t. a Domain-Specific Vocabulary
A Comparison of NER Tools w.r.t. a Domain-Specific VocabularyA Comparison of NER Tools w.r.t. a Domain-Specific Vocabulary
A Comparison of NER Tools w.r.t. a Domain-Specific Vocabulary
 
Linked Data: What’s the Story?
Linked Data: What’s the Story?Linked Data: What’s the Story?
Linked Data: What’s the Story?
 
Universal Topic Classification - Named Entity Disambiguation (IKS Workshop Pa...
Universal Topic Classification - Named Entity Disambiguation (IKS Workshop Pa...Universal Topic Classification - Named Entity Disambiguation (IKS Workshop Pa...
Universal Topic Classification - Named Entity Disambiguation (IKS Workshop Pa...
 
SYNERGY - A Named Entity Recognition System for Resource-scarce Languages suc...
SYNERGY - A Named Entity Recognition System for Resource-scarce Languages suc...SYNERGY - A Named Entity Recognition System for Resource-scarce Languages suc...
SYNERGY - A Named Entity Recognition System for Resource-scarce Languages suc...
 
Understanding Named-Entity Recognition (NER)
Understanding Named-Entity Recognition (NER) Understanding Named-Entity Recognition (NER)
Understanding Named-Entity Recognition (NER)
 
QER : query entity recognition
QER : query entity recognitionQER : query entity recognition
QER : query entity recognition
 
The named entity recognition (ner)2
The named entity recognition (ner)2The named entity recognition (ner)2
The named entity recognition (ner)2
 
Text mining
Text miningText mining
Text mining
 
Named Entity Recognition - ACL 2011 Presentation
Named Entity Recognition - ACL 2011 PresentationNamed Entity Recognition - ACL 2011 Presentation
Named Entity Recognition - ACL 2011 Presentation
 
RDF and other linked data standards — how to make use of big localization data
RDF and other linked data standards — how to make use of big localization dataRDF and other linked data standards — how to make use of big localization data
RDF and other linked data standards — how to make use of big localization data
 
Scaling up Linked Data
Scaling up Linked DataScaling up Linked Data
Scaling up Linked Data
 
Interaction with Linked Data
Interaction with Linked DataInteraction with Linked Data
Interaction with Linked Data
 
Dynamically Optimizing Queries over Large Scale Data Platforms
Dynamically Optimizing Queries over Large Scale Data PlatformsDynamically Optimizing Queries over Large Scale Data Platforms
Dynamically Optimizing Queries over Large Scale Data Platforms
 
Discoverers of Surface Analysis
Discoverers of Surface AnalysisDiscoverers of Surface Analysis
Discoverers of Surface Analysis
 
Enhancing Entity Linking by Combining NER Models
Enhancing Entity Linking by Combining NER ModelsEnhancing Entity Linking by Combining NER Models
Enhancing Entity Linking by Combining NER Models
 
Natural language procssing
Natural language procssing Natural language procssing
Natural language procssing
 
Recipes for PhD
Recipes for PhDRecipes for PhD
Recipes for PhD
 
Exploiting Linked Open Data and Natural Language Processing for Classificati...
Exploiting Linked Open Data  and Natural Language Processing for Classificati...Exploiting Linked Open Data  and Natural Language Processing for Classificati...
Exploiting Linked Open Data and Natural Language Processing for Classificati...
 
A Vague Sense Classifier for Detecting Vague Definitions in Ontologies
A Vague Sense Classifier for Detecting Vague Definitions in OntologiesA Vague Sense Classifier for Detecting Vague Definitions in Ontologies
A Vague Sense Classifier for Detecting Vague Definitions in Ontologies
 
NLP & DBpedia
 NLP & DBpedia NLP & DBpedia
NLP & DBpedia
 

Ähnlich wie Multlingual Linked Data Patterns

Best Practices for multilingual linked open data
Best Practices for multilingual linked open dataBest Practices for multilingual linked open data
Best Practices for multilingual linked open dataJose Emilio Labra Gayo
 
Best Practices for Multilingual Linked Open Data
Best Practices for Multilingual Linked Open DataBest Practices for Multilingual Linked Open Data
Best Practices for Multilingual Linked Open DataJose Emilio Labra Gayo
 
Basic techniques in nlp
Basic techniques in nlpBasic techniques in nlp
Basic techniques in nlpSumit Sony
 
PhiloWeb panel. "Philosophy" of the Web
PhiloWeb panel. "Philosophy" of the WebPhiloWeb panel. "Philosophy" of the Web
PhiloWeb panel. "Philosophy" of the WebPhiloWeb
 
American Language Institute
American Language InstituteAmerican Language Institute
American Language InstituteTiffini Travis
 
Continuous bag of words cbow word2vec word embedding work .pdf
Continuous bag of words cbow word2vec word embedding work .pdfContinuous bag of words cbow word2vec word embedding work .pdf
Continuous bag of words cbow word2vec word embedding work .pdfdevangmittal4
 
2010-04-29-swnj-pcls-presentation
2010-04-29-swnj-pcls-presentation2010-04-29-swnj-pcls-presentation
2010-04-29-swnj-pcls-presentationDouglas Randall
 
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text EditorDynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text EditorWaqas Tariq
 
ISO 25964: Thesauri and Interoperability with Other Vocabularies
ISO 25964: Thesauri and Interoperability with Other VocabulariesISO 25964: Thesauri and Interoperability with Other Vocabularies
ISO 25964: Thesauri and Interoperability with Other VocabulariesMarcia Zeng
 
Lecture4202011 110420175305-phpapp01
Lecture4202011 110420175305-phpapp01Lecture4202011 110420175305-phpapp01
Lecture4202011 110420175305-phpapp01Tarek Koudsi
 
SKOS, RDFa, Microformats, Microdata
SKOS, RDFa, Microformats, MicrodataSKOS, RDFa, Microformats, Microdata
SKOS, RDFa, Microformats, MicrodataBernhard Haslhofer
 
Fao Semantics Related Projects
Fao Semantics Related ProjectsFao Semantics Related Projects
Fao Semantics Related ProjectsMargherita Sini
 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksMLconf
 
5810 day 3 sept 20 2014
5810 day 3 sept 20 2014 5810 day 3 sept 20 2014
5810 day 3 sept 20 2014 SVTaylor123
 

Ähnlich wie Multlingual Linked Data Patterns (20)

Best Practices for multilingual linked open data
Best Practices for multilingual linked open dataBest Practices for multilingual linked open data
Best Practices for multilingual linked open data
 
Best Practices for Multilingual Linked Open Data
Best Practices for Multilingual Linked Open DataBest Practices for Multilingual Linked Open Data
Best Practices for Multilingual Linked Open Data
 
Patterns of Value
Patterns of ValuePatterns of Value
Patterns of Value
 
Basic techniques in nlp
Basic techniques in nlpBasic techniques in nlp
Basic techniques in nlp
 
Multi-word items
Multi-word itemsMulti-word items
Multi-word items
 
PhiloWeb panel. "Philosophy" of the Web
PhiloWeb panel. "Philosophy" of the WebPhiloWeb panel. "Philosophy" of the Web
PhiloWeb panel. "Philosophy" of the Web
 
American Language Institute
American Language InstituteAmerican Language Institute
American Language Institute
 
LectureNotes-01-DSA
LectureNotes-01-DSALectureNotes-01-DSA
LectureNotes-01-DSA
 
Textmining
TextminingTextmining
Textmining
 
Continuous bag of words cbow word2vec word embedding work .pdf
Continuous bag of words cbow word2vec word embedding work .pdfContinuous bag of words cbow word2vec word embedding work .pdf
Continuous bag of words cbow word2vec word embedding work .pdf
 
2010-04-29-swnj-pcls-presentation
2010-04-29-swnj-pcls-presentation2010-04-29-swnj-pcls-presentation
2010-04-29-swnj-pcls-presentation
 
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text EditorDynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
 
ISO 25964: Thesauri and Interoperability with Other Vocabularies
ISO 25964: Thesauri and Interoperability with Other VocabulariesISO 25964: Thesauri and Interoperability with Other Vocabularies
ISO 25964: Thesauri and Interoperability with Other Vocabularies
 
Tutorial 1-Ontologies
Tutorial 1-OntologiesTutorial 1-Ontologies
Tutorial 1-Ontologies
 
Lecture4202011 110420175305-phpapp01
Lecture4202011 110420175305-phpapp01Lecture4202011 110420175305-phpapp01
Lecture4202011 110420175305-phpapp01
 
SKOS, RDFa, Microformats, Microdata
SKOS, RDFa, Microformats, MicrodataSKOS, RDFa, Microformats, Microdata
SKOS, RDFa, Microformats, Microdata
 
Fao Semantics Related Projects
Fao Semantics Related ProjectsFao Semantics Related Projects
Fao Semantics Related Projects
 
Parafraseo-Chenggang.pdf
Parafraseo-Chenggang.pdfParafraseo-Chenggang.pdf
Parafraseo-Chenggang.pdf
 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
 
5810 day 3 sept 20 2014
5810 day 3 sept 20 2014 5810 day 3 sept 20 2014
5810 day 3 sept 20 2014
 

Mehr von Jose Emilio Labra Gayo

Introducción a la investigación/doctorado
Introducción a la investigación/doctoradoIntroducción a la investigación/doctorado
Introducción a la investigación/doctoradoJose Emilio Labra Gayo
 
Challenges and applications of RDF shapes
Challenges and applications of RDF shapesChallenges and applications of RDF shapes
Challenges and applications of RDF shapesJose Emilio Labra Gayo
 
Legislative data portals and linked data quality
Legislative data portals and linked data qualityLegislative data portals and linked data quality
Legislative data portals and linked data qualityJose Emilio Labra Gayo
 
Validating RDF data: Challenges and perspectives
Validating RDF data: Challenges and perspectivesValidating RDF data: Challenges and perspectives
Validating RDF data: Challenges and perspectivesJose Emilio Labra Gayo
 
Legislative document content extraction based on Semantic Web technologies
Legislative document content extraction based on Semantic Web technologiesLegislative document content extraction based on Semantic Web technologies
Legislative document content extraction based on Semantic Web technologiesJose Emilio Labra Gayo
 
Como publicar datos: hacia los datos abiertos enlazados
Como publicar datos: hacia los datos abiertos enlazadosComo publicar datos: hacia los datos abiertos enlazados
Como publicar datos: hacia los datos abiertos enlazadosJose Emilio Labra Gayo
 
Arquitectura de la Web y Computación en el Servidor
Arquitectura de la Web y Computación en el ServidorArquitectura de la Web y Computación en el Servidor
Arquitectura de la Web y Computación en el ServidorJose Emilio Labra Gayo
 

Mehr von Jose Emilio Labra Gayo (20)

Publicaciones de investigación
Publicaciones de investigaciónPublicaciones de investigación
Publicaciones de investigación
 
Introducción a la investigación/doctorado
Introducción a la investigación/doctoradoIntroducción a la investigación/doctorado
Introducción a la investigación/doctorado
 
Challenges and applications of RDF shapes
Challenges and applications of RDF shapesChallenges and applications of RDF shapes
Challenges and applications of RDF shapes
 
Legislative data portals and linked data quality
Legislative data portals and linked data qualityLegislative data portals and linked data quality
Legislative data portals and linked data quality
 
Validating RDF data: Challenges and perspectives
Validating RDF data: Challenges and perspectivesValidating RDF data: Challenges and perspectives
Validating RDF data: Challenges and perspectives
 
Wikidata
WikidataWikidata
Wikidata
 
Legislative document content extraction based on Semantic Web technologies
Legislative document content extraction based on Semantic Web technologiesLegislative document content extraction based on Semantic Web technologies
Legislative document content extraction based on Semantic Web technologies
 
ShEx by Example
ShEx by ExampleShEx by Example
ShEx by Example
 
Introduction to SPARQL
Introduction to SPARQLIntroduction to SPARQL
Introduction to SPARQL
 
Introducción a la Web Semántica
Introducción a la Web SemánticaIntroducción a la Web Semántica
Introducción a la Web Semántica
 
RDF Data Model
RDF Data ModelRDF Data Model
RDF Data Model
 
2017 Tendencias en informática
2017 Tendencias en informática2017 Tendencias en informática
2017 Tendencias en informática
 
RDF, linked data and semantic web
RDF, linked data and semantic webRDF, linked data and semantic web
RDF, linked data and semantic web
 
Introduction to SPARQL
Introduction to SPARQLIntroduction to SPARQL
Introduction to SPARQL
 
19 javascript servidor
19 javascript servidor19 javascript servidor
19 javascript servidor
 
Como publicar datos: hacia los datos abiertos enlazados
Como publicar datos: hacia los datos abiertos enlazadosComo publicar datos: hacia los datos abiertos enlazados
Como publicar datos: hacia los datos abiertos enlazados
 
16 Alternativas XML
16 Alternativas XML16 Alternativas XML
16 Alternativas XML
 
XSLT
XSLTXSLT
XSLT
 
XPath
XPathXPath
XPath
 
Arquitectura de la Web y Computación en el Servidor
Arquitectura de la Web y Computación en el ServidorArquitectura de la Web y Computación en el Servidor
Arquitectura de la Web y Computación en el Servidor
 

Kürzlich hochgeladen

GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfhans926745
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 

Kürzlich hochgeladen (20)

GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 

Multlingual Linked Data Patterns

  • 1. Multilingual Linked Open Data Patterns Jose Emilio Labra Gayo University of Oviedo, Spain Joint work with: Dimitris Kontokostas Sören Auer Universität Leipzig, Germany More info: http://www.weso.es/MLODPatterns Jose Emilio Labra Gayo
  • 2. MLOD Patterns From best practices (Dublin) to patterns (Rome) We propose a catalog of 20 MLOD patterns Pattern = generic solution to a problem in a context Common vocabulary Patterns can be related to each other Each pattern contains Name Some patterns can contradict other patterns Description There are already Linked data patterns Context We focus on multilingual linked data patterns Example Discussion See also Based on DBPedia I18n experience More info: http://www.weso.es/MLODPatterns Jose Emilio Labra Gayo
  • 3. MLOD Patterns Patterns are classified by activity: Naming: URI design, URIs, IRIs, etc. Dereference: How is the content that we return affected by multilingualism Labeling: Handling multilingual labels Longer descriptions: Longer textual descriptions Linking Links between concepts in different languages? Reuse Vocabularies and multilingualism More info: http://www.weso.es/MLODPatterns Jose Emilio Labra Gayo
  • 4. General overview Goal Pattern Description Descriptive URIs Use descriptive URIs with ASCII characters, % encoding extended characters Opaque URIs Use non human-readable URIs Naming Full IRIs Use IRIs with unicode characters Internationalized local names Use Unicode characters only for local names Language in URIs Include language information in the URI Return Language independent data Return the same triples independently of the language Dereference Language content negotiation Return different triples depending on user agent preferences Label everything Define labels for all the resources Labeling Multilingual labels Add language tags to labels Labels without language tag Add labels without language tags in a default language Divide longer descriptions Replace long descriptions by more resources with labels Longer descriptions Add lexical information Add lexical information to long descriptions Structured literals Use HTML/XML literals for longer descriptions Identity links Use owl:sameAs and similar predicates Linking Soft links Use predicates with soft semantics Linguistic metadata Add linguistic metadata about the dataset terms Monolingual vocabularies Attach labels to vocabularies in a single language Multilingual vocabularies Prefer multilingual vocabularies Reuse Localize existing vocabularies Translate labels of existing vocabularies More info: http://www.weso.es/MLODPatterns ones Create new localized vocabulariesCreate custom vocabularies and link to existing Jose Emilio Labra Gayo
  • 5. Motivating example Juan is an armenian professor at the University of León. birthPlace Armenia Professor position Juan worksAt University of León More info: http://www.weso.es/MLODPatterns Jose Emilio Labra Gayo
  • 6. http://օրինակ.օրգ#Հայաստան http://օրինակ.օրգ#Հայաստան Naming Selecting a URI scheme for Armenia Descriptive URIs http://example.org/Armenia Human-readable Only ASCII, %-encode non-ASCII characters Opaquesupport Good tool URIs http://example.org/I23AX45 http://example.org/Universidad%20de%Le%C3%B3n Independence between concept and Non Human-readable http://օրինակ.օրգ#Հայաստան Full IRIs natural language representation Difficult to handle by developers Internationalized local names More natural for non-Latin based http://example.org#Հայաստան Subject to visual spoofing attacks languages Not so good tool support Avoids domain name spoofing http://hy.example.org#Հայաստան Visual spoofing attacks can still possible Include language in URIs More human-friendly identifiers http://en.example.org#Armenia Adding language info to the URI can become unwieldy Independent development of datasets Example: languages & sublanguages by language hy-Latin-IT-arevela Where should we put the language tag? More info: http://www.weso.es/MLODPatterns Jose Emilio Labra Gayo
  • 7. Dereference Language based content negotiation? No language content negotiation http://example.org/Armenia Always returns the same data Easy to develop Clients have to filter triples in other languages Language content negotiation Consistency of data http://example.org/Armenia Computation & network overhead Returns different data depending on Accept-language Accept-language:en Accept-language:hy :Armenia rdfs:label "Armenia"@en . :Armenia rdfs:label "Հայաստան "@hy . Improves clients performance Difficult to implement Less network overhead Semantic equivalence between data More info: http://www.weso.es/MLODPatterns Jose Emilio Labra Gayo
  • 8. Labeling Label everything :Armenia rdfs:label "Armenia" . :juan rdfs:label "Juan López" . :position rdfs:label "Job position" . User agents can show labels instead or URIs Multilingual Labels Not always feasible, which labels? Labels can be used for searching Labels are for humans :juan :position "Professor"@en . Avoid machine-oriented notations :juan :position "Catedrático"@es . Labels without are part of RDF Model Multilingual labels language tag SPARQL can be more difficult SELECT * WHERE { :juan :position "Professor"@en ?x :position "Professor" . . :juan :position "Catedrático"@es . } :juan :position "Professor" . SPARQLing is easier Choosing a default language is controversial More info: http://www.weso.es/MLODPatterns Jose Emilio Labra Gayo
  • 9. Longer descriptions Divide long descriptions :juan :jobtitle "Professor at the University of León"@en .  :juan :position :professor .  :juan :workPlace :uniLeón . :professor rdfs:label "Professor"@en . :uniLeón rdfs:label "University of León"@en . Fine-grained data is more amenable to semantic web apps More complexity of the model Apps can generate more readable information Not always possible More info: http://www.weso.es/MLODPatterns Jose Emilio Labra Gayo
  • 10. Longer descriptions Provide lexical information :uniLeón a lemon:LexicalEntry ; lemon:decomposition ( [ lemon:element :University ] More value to dataset [ lemon:element :Of ] Automatic manipulation [ lemon:element :León ] Can improve message generation ); rdfs:label "University of León"@en . Complexity overhead :University a lemon:LexicalEntry ; Feasibility lexinfo:partOfSpeech lexinfo:commonNoun ; rdfs:label "University"@en ; rdfs:label "Universidad"@es . :Of a lemon:LexicalEntry ; lexinfo:partOfSpeech lexinfo:preposition ; rdfs:label "of"@en ; rdfs:label "de"@es . :León a lemon:LexicalEntry ; lexinfo:partOfSpeech lexinfo:properNoun ; rdfs:label "León" Example provided by J. McCrae More info: http://www.weso.es/MLODPatterns Jose Emilio Labra Gayo
  • 11. Longer descriptions Structured literals :uniLeón :desc "<p>University of <span translate="no">León</span>, Spain. </p>"^^rdf:XMLLiteral . Leverage existing I18N techniques Interaction between 2 abstracion models Bidi, Ruby, Localization notes, ... RDF vs XML/HTML Large portions of structured literals can hinder LD More info: http://www.weso.es/MLODPatterns Jose Emilio Labra Gayo
  • 12. Linking Inter-language identity links <http://hy.example.org#Հայաստան> owl:sameAs <http://en.example.org#Armenia> . Soft Inter-language links owl:sameAs is a well known property Too strong semantics of owl:sameAs Already supported by linked data applications Concepts may not be the same <http://hy.example.org#Հայաստան> Contradictions rdfs:seeAlso <http://en.example.org#Armenia> . Link linguistic meta-data Several predicates No standard property rdfs:seeAlso No support for inference :Catedrático skos:related dbo:wikiPageLanguageLink lexvo:means wordnet:Professor ; lexvo:language <http://lexvo.org/id/iso639-3/spa> . Links between multilingual labels No standard practice Can declare language of a dataset Semantic equivalence between concepts More info: http://www.weso.es/MLODPatterns Jose Emilio Labra Gayo
  • 13. Reuse Monolingual vocabularies FOAF, Dublin Core, OWL, RDF Schema, ... are only in English Multilingual vocabularies Easier to control vocabulary evolution Monolingual vocabularies in multilingual applications require :position a owl:DatatypeProperty ; Avoid bad translations, ambiguities a translation layer rdfs:domain :UniversityStaff ; rdfs:label "Position"@en ; rdfs:label "Puesto"@es . :UniversityStaff a owl:Class ; rdfs:label "University staff"@en ; rdfs:label "Trabajador universitario"@es . Elegant solution in multilingual contexts Some common vocabularies are monolingual More control over translations Maintenance is more difficult More info: http://www.weso.es/MLODPatterns Jose Emilio Labra Gayo
  • 14. Reuse Localize existing vocabularies dc:contributor rdfs:label "Colaborador"@es . Transparently select label in preferred language Polluting well known vocabularies = controversial Principle AAA Anyone can say Anything about Any topic Create new localized vocabularies dc:contributor owl:equivalentProperty :colaborador . :colaborador rdfs:label "Colaborador"@es . Freedom to taylor vocabulary to specific needs More difficult to humans/agents to recognize new properties/classes More info: http://www.weso.es/MLODPatterns Jose Emilio Labra Gayo
  • 15. Future work The catalog is not closed Other issues & patterns Microdata & RDFa Other I18n topics Handle big datasets Localization workflows Feedback from the community Best practices, Patterns, Anti-patterns? More info: http://www.weso.es/MLODPatterns Jose Emilio Labra Gayo
  • 16. End of presentation More info: http://www.weso.es/MLODPatterns Jose Emilio Labra Gayo
  • 17. DBPedia I18n From Descriptive URIs to Internationalized local names Soft & Identity inter-language links Label everything More info: http://www.weso.es/MLODPatterns Jose Emilio Labra Gayo