ws.nju.edu.cn




                      BipRank: Ranking and Summarizing
                         RDF Vocabulary Descriptions




          Gong Cheng¹, Feng Ji², Shengmei Luo², Weiyi Ge¹, Yuzhong Qu¹

¹ State Key Laboratory for Novel Software Technology, Nanjing University, China
² Communication Services R&D Institute, ZTE Corporation, China



                             Presented at JIST2011
Outline

        Introduction
        Salience measurement
        Vocabulary summarization
        Conclusions




Gong Cheng (程龚) gcheng@nju.edu.cn   2 of 25
Vocabularies and Linked Data


        [Figure labels: Vocabularies, Your own vocabulary, Reuse, Linked Data]




Vocabulary search engines




Vocabularies




                                      Scale




Vocabulary snippets --- state of the art




Vocabulary snippets --- our approach




Vocabulary summarization




           Vocabulary summarization = ranking and selecting RDF sentences



Outline

        Introduction
        Salience measurement
        Vocabulary summarization
        Conclusions




A bipartite view of vocabulary description




Surfer behavior --- type A




Surfer behavior --- type B




BipRank




        [Figure: transition diagram; labels: Current step, Next step, Uniform, type-A behavior, type-B behavior]
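A minimal Python sketch of a random surfer on a sentence-term bipartite graph may help make the idea concrete. The slide text does not preserve how type-A and type-B behavior are defined, so the sketch does not model them separately. It assumes a single kind of move (from an RDF sentence to one of its terms, then on to a sentence that mentions that term) mixed with a uniform jump. The function name, the damping value 0.85 and the toy graph are hypothetical, and this is not the authors' exact formulation.

```python
from collections import defaultdict


def sentence_term_walk(sentence_terms, damping=0.85, iterations=50):
    """Power iteration for a two-step walk: sentence -> term -> sentence.

    sentence_terms maps a sentence id to a list of distinct terms it mentions.
    Returns a dict of sentence id -> stationary score (scores sum to 1).
    """
    sentences = list(sentence_terms)
    # Invert the mapping to get the term side of the bipartite graph.
    term_sentences = defaultdict(list)
    for s, terms in sentence_terms.items():
        for t in terms:
            term_sentences[t].append(s)

    n = len(sentences)
    score = {s: 1.0 / n for s in sentences}
    for _ in range(iterations):
        # Uniform jump: every sentence receives an equal share.
        nxt = {s: (1.0 - damping) / n for s in sentences}
        for s in sentences:
            terms = sentence_terms[s]
            if not terms:
                # A sentence without terms spreads its mass uniformly.
                for s2 in sentences:
                    nxt[s2] += damping * score[s] / n
                continue
            for t in terms:
                share = damping * score[s] / len(terms)
                targets = term_sentences[t]
                for s2 in targets:
                    nxt[s2] += share / len(targets)
        score = nxt
    return score


# Toy graph (hypothetical): s1 shares a term with both s2 and s3.
toy = {"s1": ["A", "B"], "s2": ["A"], "s3": ["B"]}
scores = sentence_term_walk(toy)
print(sorted(scores, key=scores.get, reverse=True))
```

In the toy graph the sentence that touches the most terms ends up with the highest score, which is the intuition behind ranking sentences by a walk over shared terms.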




Pattern of RDF sentence




p(s|u)

        Frequency of Pattern(s)
             Number of RDF sentences in the vocabulary that have the same pattern
        Popularity of Pattern(s)
             Number of vocabularies in the repository that contain the same pattern
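The two counts above can be sketched in Python as follows. The sketch assumes each RDF sentence has already been reduced to a hashable pattern label; the pattern definition itself is not recoverable from the slide text, so the string labels, the function names and the toy repository are hypothetical. How the counts are turned into p(s|u) is likewise not shown here.

```python
from collections import Counter


def pattern_frequency(vocabulary_patterns):
    """Count, per pattern, the RDF sentences of ONE vocabulary that have it."""
    return Counter(vocabulary_patterns)


def pattern_popularity(repository):
    """Count, per pattern, the vocabularies that contain it at least once."""
    popularity = Counter()
    for patterns in repository.values():
        # set() so that a vocabulary is counted once per pattern.
        popularity.update(set(patterns))
    return popularity


# Toy repository (hypothetical): vocabulary id -> pattern of each sentence.
repo = {
    "v1": ["subClassOf", "subClassOf", "domain"],
    "v2": ["subClassOf"],
    "v3": ["label"],
}
freq_v1 = pattern_frequency(repo["v1"])
pop = pattern_popularity(repo)
print(freq_v1["subClassOf"], pop["subClassOf"])  # 2 2
```

Frequency is local to one vocabulary, while popularity is computed over the whole repository, so the two can disagree for the same pattern.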




Evaluation setting

        Test cases
            9 moderate-sized vocabularies randomly selected from Falcons
        Gold standard
            Salience given by 6 human experts
        Competitors
            Cp: Zhang et al. (WWW2007)
            Our approach
                 BipRank-U: pattern-unaware
                 BipRank-F: using pattern frequency
                 BipRank-P: using pattern popularity
        Metric
            Pearson product-moment correlation coefficient
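For reference, the metric can be computed as below. This is a plain Python sketch of the standard Pearson product-moment formula; the function name and the sample score lists are illustrative and are not data from the evaluation.

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Illustrative only: computed salience vs. expert-assigned salience.
computed = [0.9, 0.4, 0.1]
expert = [3.0, 2.0, 1.0]
r = pearson(computed, expert)
```

A value near 1 means the computed salience orders and spaces the RDF sentences much like the experts did; a value near 0 means no linear relationship.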




Evaluation results




Outline

        Introduction
        Salience measurement
        Vocabulary summarization
        Conclusions




Goodness of a summary

        Salience




        Query relevance
            Textual similarity between query and summary




        Cohesion
            Term overlap between RDF sentences
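The slide text names the criteria but not the exact measures. As a sketch under stated assumptions, query relevance is approximated below by cosine similarity over bag-of-words vectors, and cohesion by the average pairwise Jaccard overlap of the sentences' term sets. Both choices, and the function names, are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def query_relevance(query, summary_text):
    """Cosine similarity between word-count vectors of query and summary."""
    q = Counter(query.lower().split())
    s = Counter(summary_text.lower().split())
    dot = sum(q[w] * s[w] for w in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in s.values())
    )
    return dot / norm if norm else 0.0


def cohesion(sentence_term_sets):
    """Average Jaccard overlap over all pairs of sentence term sets."""
    pairs = list(combinations(sentence_term_sets, 2))
    if not pairs:
        return 0.0
    total = 0.0
    for a, b in pairs:
        union = a | b
        total += len(a & b) / len(union) if union else 0.0
    return total / len(pairs)


print(query_relevance("person name", "name of a person"))
print(cohesion([{"Person", "name"}, {"Person", "knows"}]))
```

Both functions return values between 0 and 1, which makes them easy to combine with a salience score in a single weighted objective.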




Looking for the best summary

        Multi-objective optimization
        Single aggregate objective function




        Solution: a greedy strategy
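A greedy strategy of this kind can be sketched as follows: start from an empty summary and repeatedly add the RDF sentence that gives the highest value of the aggregate objective. The objective used in the example (sum of salience plus a bonus for each pair of sentences sharing a term) and all names and numbers are hypothetical; the slides do not give the actual function or weights.

```python
def greedy_summary(candidates, objective, size):
    """Pick up to `size` candidates, each time maximising objective(summary)."""
    summary = []
    remaining = list(candidates)
    while remaining and len(summary) < size:
        # Evaluate the objective with each remaining candidate added.
        best = max(remaining, key=lambda c: objective(summary + [c]))
        summary.append(best)
        remaining.remove(best)
    return summary


# Hypothetical inputs for illustration.
salience = {"s1": 0.5, "s2": 0.3, "s3": 0.2}
terms = {"s1": {"Person"}, "s2": {"Document"}, "s3": {"Person"}}


def example_objective(summary, cohesion_weight=0.5):
    """Sum of salience plus a bonus per pair of sentences sharing a term."""
    total = sum(salience[s] for s in summary)
    for i, a in enumerate(summary):
        for b in summary[i + 1:]:
            if terms[a] & terms[b]:
                total += cohesion_weight
    return total


print(greedy_summary(["s1", "s2", "s3"], example_objective, 2))  # ['s1', 's3']
```

Here s3 is chosen over the more salient s2 because it shares a term with s1, which shows how a cohesion term in the objective can change the selection. Greedy selection does not guarantee the optimal summary, but it avoids enumerating every subset.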




Evaluation setting

        Judges
            18 human experts
        Test cases
            190 searches over 2,012 vocabularies crawled by Falcons
        Competitors
            Generic: Zhang et al. (WWW2007)
            Our approach
                 QR: query relevance
                 QR+S: query relevance + salience
                 QR+C: query relevance + cohesion
        Metric
            Rating on a 10-point scale




Evaluation results




Performance testing


        [Figure: runtime plot; axis labels: Size of summary, Runtime, Size of vocabulary]




Outline

        Introduction
        Salience measurement
        Vocabulary summarization
        Conclusions




Conclusions

        Salience measurement
            Sentence-term graph
            BipRank
            Pattern of RDF sentence
        Vocabulary summarization
            Salience
            Query relevance
            Cohesion


        Implemented in Falcons Ontology Search
            http://ws.nju.edu.cn/falcons/ontologysearch/



