Generating Researcher Networks
                    with Identified Persons
               on a Semantic Service Platform



                        15 Sep. 2009

                         Hanmin Jung
                           KISTI


BlogTalk2009                  1          Copyright © 2004-2009, KISTI
Agenda
     Researcher networks are useful for finding
           Collaborators
           Speakers (Key persons of a researcher group)
     Issues
           Getting sources
           Resolving identities
           Finding experts
           Generating networks




Getting sources …




Sources
     Identified Entities
           Papers: 453,124
               Elsevier international journal papers with full text and metadata

           Persons: 1,352,220
           Topics: 339,947
           Institutions: 91,514
           Locations: 409,575 (with GPS coordinates)
           RDF Triples: 283,087,518 (2008.11)




Resolving identities …   How to resolve identities?
                          How to merge different identifiers into one?




OntoFrame
     [Figure: OntoFrame 2008 Service architecture. Labeled components: Legacy DB Tables, Field Information, Listener, Triple Generator, OntoURI®, Ontologies, Ontology Schemata, Ontology Instances, DB Tables, RDF Triple Store, Expanded Triples, Search Engine, OntoReasoner®, Answers. Labeled interfaces: WS API, WS API/SPARQL, XML, SQL.]
Ontology
     Reference and Academic Knowledge Ontologies




OntoFrame
     Syntactic-to-Semantic Process
           Modeling-Time Process
               Design Ontology Model
               Select Database & Ontology
               Edit Mapping Rules
               Edit URI Generation Rules
               Edit Identity Resolution Rules
               Test Mapping Process

           Indexing-Time Process
               Normalize Field Values
               Apply Identity Resolution Rules
               Resolve Identities
               Assign URIs
               Apply URI Generation Rules
               Crawl Database
               Refer Authority Data
               Extract Topics
               Apply Mapping Rules
               Generate RDF Triples

           Run-Time Process
               Apply sameAs Relations
Identity Resolution



     case 1: "Barry G.T. Lowden" and "Barry Lowden"
     case 2: "Christian Becker" and "Christian Becker"
Identity Resolution
     Rules for Resolving Personal Identities




          Class     Resource         Kind      Match    Relation      Source      Weight
         Person                    Order                                             1

         Person   Name             Pivot       Exact    Single     OntoURI

         Person   hasInstitution   Feature     Exact    Single     OntoURI           2

         Person   Email            Feature     Number   Single                       4

         Person   hasCoauthor      Feature     Number   Multiple   OntoReasoner      1

         Person   hasTopic         threshold                                        0.8
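
One way to read the rule table is as a weighted vote: the name is the pivot that selects candidate pairs, each matching feature contributes its weight, and a pair is merged when the normalized score reaches a cutoff. The sketch below follows that reading only. The `Person` record, the scoring formula, and the use of 0.8 as a cutoff on the normalized score are assumptions for illustration; the slides do not state how OntoFrame combines the weights.

```python
from dataclasses import dataclass, field

# Feature weights copied from the rule table; how they combine is an assumption.
WEIGHTS = {"institution": 2, "email": 4, "coauthors": 1}
THRESHOLD = 0.8  # assumed here to be a cutoff on the normalized score


@dataclass
class Person:  # hypothetical record type, not part of OntoFrame
    name: str
    institution: str = ""
    email: str = ""
    coauthors: set = field(default_factory=set)


def match_score(a: Person, b: Person) -> float:
    """Return a score in [0, 1]; 0.0 when the pivot (name) differs."""
    if a.name != b.name:  # pivot: exact match required
        return 0.0
    earned = 0
    if a.institution and a.institution == b.institution:
        earned += WEIGHTS["institution"]
    if a.email and a.email == b.email:
        earned += WEIGHTS["email"]
    if a.coauthors & b.coauthors:  # at least one shared co-author
        earned += WEIGHTS["coauthors"]
    return earned / sum(WEIGHTS.values())


def same_person(a: Person, b: Person) -> bool:
    """Decide whether two records should be merged under the assumed cutoff."""
    return match_score(a, b) >= THRESHOLD
```

Under these assumptions, two records with the same name, institution, and email score 6/7 and are merged, while a shared name alone scores 0 and is not.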




Identity Resolution
     Authority Data




          Normalized Form   Variant Form                                  Kind           Class
          IBM               International Business Machines Corporation   Abbreviation   Institution
          Microsoft         MS                                            Abbreviation   Institution
          Microsoft         마이크로소프트                                       Korean         Institution
          London            런던                                            Korean         Location
          Academic Inc.     Academic Press Inc, LTD                       Alternative    Publication
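
The authority table above can be read as a lookup from a variant form to its normalized form, scoped by class. A minimal sketch of that lookup follows; the `normalize` function and the dictionary layout are hypothetical, and only the five rows shown on the slide are included.

```python
# Rows from the authority table: (variant form, class) -> normalized form.
AUTHORITY = {
    ("International Business Machines Corporation", "Institution"): "IBM",
    ("MS", "Institution"): "Microsoft",
    ("마이크로소프트", "Institution"): "Microsoft",
    ("런던", "Location"): "London",
    ("Academic Press Inc, LTD", "Publication"): "Academic Inc.",
}


def normalize(value: str, entity_class: str) -> str:
    """Return the normalized form, or the stripped input if no variant is known."""
    cleaned = value.strip()
    return AUTHORITY.get((cleaned, entity_class), cleaned)
```

Keying on the class as well as the string keeps a variant such as "MS" from being rewritten when it appears in a field of a different class.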




Identity Resolution
      sameAs
               Authorization




                                    ∅

Identity Resolution
     sameAs
           Candidates




ReSIST (2006 ~ 2008)
     Resilience Knowledge Base




                                  "Deliverable D31: Final Workshop report" by ReSIST

LOD Project
     Linking Open Data Community Project
           Available in RDF and SVG (Scalable Vector Graphics) versions




                                                                        KISTI




                                                                 http://richard.cyganiak.de/2007/10/lod/

Finding experts …   How to extract topics?
                         How to determine topics of a researcher?




Topic Extraction
     System Architecture




Topic Propagation
      Propagating Topics of Entities



               Article                 Person




Expert Finding
     Process
           Knowledge expansion
               Adding direct relations to shorten access paths

           Expert retrieval
               Querying with SPARQL for a given topic
               Converting SPARQL to SQL
               Using a backward-chaining path

           Post-processing
               Grouping and counting the retrieved authors
               Ranking by name or by number of achievements
               Returning the result as an XML document
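
The post-processing step above can be sketched as follows, assuming the retrieval step yields (author, article) rows for one topic. The function names and the XML element and attribute names (`experts`, `expert`, `achievements`) are hypothetical; the slides do not show the actual result schema.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def rank_experts(rows, by="count"):
    """rows: iterable of (author, article) pairs retrieved for one topic."""
    # Group by author and count distinct articles (duplicate rows are ignored).
    counts = Counter(author for author, _ in set(rows))
    if by == "name":
        return sorted(counts.items())
    # Descending by count; ties are broken by name for a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def to_xml(ranked):
    """Serialize ranked (name, count) pairs as an XML string."""
    root = ET.Element("experts")
    for name, n in ranked:
        ET.SubElement(root, "expert", name=name, achievements=str(n))
    return ET.tostring(root, encoding="unicode")
```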




Knowledge Expansion
     Inference Rule
           @prefix isrl: <http://www.kisti.re.kr/isrl/ResearchRefOntology#>
           (?x isrl:hasCreatorInfo ?y) (?y isrl:hasCreator ?z) ->
               (?x isrl:createdByPerson ?z)




           [Figure: Article --hasCreatorInfo--> CreatorInfo --hasCreator--> Person; the inferred relation Article --createdByPerson--> Person links the two directly.]
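
To illustrate what the inference rule produces, the sketch below applies it once, in the forward direction, to a set of (subject, predicate, object) tuples. This is only an illustration of the rule's effect; it is not how OntoReasoner is implemented, and the `expand` function and the tuple representation are assumptions.

```python
ISRL = "http://www.kisti.re.kr/isrl/ResearchRefOntology#"
HAS_CREATOR_INFO = ISRL + "hasCreatorInfo"
HAS_CREATOR = ISRL + "hasCreator"
CREATED_BY_PERSON = ISRL + "createdByPerson"


def expand(triples):
    """Add (x, createdByPerson, z) for every x -hasCreatorInfo-> y -hasCreator-> z."""
    triples = set(triples)
    # Index CreatorInfo nodes by the persons they point to.
    creators = {}
    for s, p, o in triples:
        if p == HAS_CREATOR:
            creators.setdefault(s, set()).add(o)
    # Join hasCreatorInfo edges with the index to build the direct relation.
    inferred = {
        (s, CREATED_BY_PERSON, person)
        for s, p, o in triples
        if p == HAS_CREATOR_INFO
        for person in creators.get(o, ())
    }
    return triples | inferred
```

Materializing the direct relation is what lets the later queries reach a person from an article in one step.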




Expert Retrieval
     Backward Chaining Path




Generating networks …        How to find a researcher group?
                             How about similar researchers?




OntoFrame 2008




Researcher Networks (T, P)
     Process
           Getting co-author pairs for a target topic (T)
               SELECT DISTINCT ?person1 ?person2
               WHERE {
               ?article aca:yearOfAccomplishment ?year .
               FILTER(?year>=startYear && ?year<=endYear) .

               ?article aca:hasTopicOfArticle <topURI> .
               ?article aca:createdByPerson ?person1 .
               ?article aca:createdByPerson ?person2 .
               FILTER(?person1 < ?person2) .
               }

           Selecting a target researcher (P) from the pairs
           Tracing the group members connected to that researcher (the seed)
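
The tracing step can be sketched as a breadth-first search over the co-author pairs returned by the query, starting from the seed. The slides do not say how the tracing is done or whether it is limited in depth, so the unbounded traversal and the `trace_group` name are assumptions.

```python
from collections import defaultdict, deque


def trace_group(pairs, seed):
    """Return every researcher reachable from `seed` through co-author links."""
    # Build an undirected co-authorship graph from the (person1, person2) pairs.
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    seen = {seed}
    queue = deque([seed])
    while queue:
        current = queue.popleft()
        for neighbour in graph[current] - seen:
            seen.add(neighbour)
            queue.append(neighbour)
    return seen
```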




Researcher Networks (P)
     Process
           Getting co-author pairs including a target researcher (P)
               SELECT ?per1 ?per2
               WHERE {
               ?article aca:yearOfAccomplishment ?year .
               FILTER(?year>=startYear && ?year<=endYear) .

               ?article aca:createdByPerson ?per1 .
               ?article aca:createdByPerson ?per2 .
               FILTER(?per1 < ?per2) .
               FILTER(?per1=<perURI> || ?per2=<perURI>) .
               }

           Ranking the co-authors by frequency of co-authorship
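
Because the query above has no DISTINCT, it returns one row per co-authored article, so the ranking reduces to counting how often each partner appears next to the target. A minimal sketch, with the `rank_coauthors` name and the row format as assumptions:

```python
from collections import Counter


def rank_coauthors(pairs, target):
    """pairs: one (per1, per2) row per co-authored article involving `target`."""
    counts = Counter()
    for a, b in pairs:
        if a == target:
            counts[b] += 1
        elif b == target:
            counts[a] += 1
    # Most frequent co-authors first.
    return counts.most_common()
```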




Similar Researchers




Similar Researchers (P)
     Process (1/2)
           Getting topics of a target researcher (P)
               SELECT ?per1 ?topic
               WHERE {
               ?article aca:createdByPerson ?per1 .
               ?article aca:hasTopicArea ?topicArea .
               ?topicArea aca:hasTopicOfTopicArea ?topic .

               FILTER(?per1=<perURI>) .
               }

           Ranking the topics and selecting the top n for that researcher




Similar Researchers
     Process (2/2)
           Getting researchers who share most of the target researcher's top topics
               SELECT DISTINCT ?per2
               WHERE {
               ?per2 aca:hasTopicOfPerson ?topic1 .
               ?per2 aca:hasTopicOfPerson ?topic2 .
               ?per2 aca:hasTopicOfPerson ?topic3 .
               ?per2 aca:hasTopicOfPerson ?topic4 .

               FILTER(?per2!=<perURI>) .
               FILTER(?topic1 < ?topic2 && ?topic2 < ?topic3 && ?topic3 < ?topic4) .

               FILTER(?topic1=<topic[0]> || ?topic1=<topic[1]> || ?topic1=<topic[2]>
                 || ?topic1=<topic[3]> || ?topic1=<topic[4]>) .
               FILTER(?topic2=<topic[0]> || ?topic2=<topic[1]> || ?topic2=<topic[2]>
                 || ?topic2=<topic[3]> || ?topic2=<topic[4]>) .
               FILTER(?topic3=<topic[0]> || ?topic3=<topic[1]> || ?topic3=<topic[2]>
                 || ?topic3=<topic[3]> || ?topic3=<topic[4]>) .
               FILTER(?topic4=<topic[0]> || ?topic4=<topic[1]> || ?topic4=<topic[2]>
                 || ?topic4=<topic[3]> || ?topic4=<topic[4]>) .
               }
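
The query asks for researchers who have four distinct topics drawn from the target's five top topics. The same condition can be written as a set intersection, which may be easier to check by hand. The sketch assumes topics have already been fetched into a dictionary per researcher; `similar_researchers` and its parameters are hypothetical names.

```python
def similar_researchers(topics_by_person, target, top_topics, min_shared=4):
    """Return researchers sharing at least `min_shared` of the target's top topics."""
    wanted = set(top_topics)
    return sorted(
        person
        for person, topics in topics_by_person.items()
        if person != target and len(wanted & set(topics)) >= min_shared
    )
```

With `min_shared=4` and five entries in `top_topics`, this mirrors the four-of-five condition expressed by the FILTER clauses.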
Conclusions
     Processes to Generate Researcher Networks
           Getting sources: Papers
           Resolving identities: Rules, Authority data, sameAs
           Finding experts: Topics, Reasoning
           Generating networks: Topic- and person-constrained
     Next Research Topic
           Service mashup to get researcher networks directly




“A lot of times, people don’t know what they want until you show it to them.”
                                                                     by Steve Jobs




                           Thank you
                           jhm@kisti.re.kr




Weitere ähnliche Inhalte

Ähnlich wie Generating Researcher Networks with Identified Persons on a Semantic Service Platform

Scaling search to a million pages with Solr, Python, and Django
Scaling search to a million pages with Solr, Python, and DjangoScaling search to a million pages with Solr, Python, and Django
Scaling search to a million pages with Solr, Python, and Djangotow21
 
Architecting Smarter Apps with Entity Framework
Architecting Smarter Apps with Entity FrameworkArchitecting Smarter Apps with Entity Framework
Architecting Smarter Apps with Entity FrameworkSaltmarch Media
 
20120419 linkedopendataandteamsciencemcguinnesschicago
20120419 linkedopendataandteamsciencemcguinnesschicago20120419 linkedopendataandteamsciencemcguinnesschicago
20120419 linkedopendataandteamsciencemcguinnesschicagoDeborah McGuinness
 
Introduction to the Semantic Web
Introduction to the Semantic WebIntroduction to the Semantic Web
Introduction to the Semantic WebMarin Dimitrov
 
Building OBO Foundry ontology using semantic web tools
Building OBO Foundry ontology using semantic web toolsBuilding OBO Foundry ontology using semantic web tools
Building OBO Foundry ontology using semantic web toolsMelanie Courtot
 
Tagging Up - MMS and Taxonomy In SharePoint 2010
Tagging Up - MMS and Taxonomy In SharePoint 2010Tagging Up - MMS and Taxonomy In SharePoint 2010
Tagging Up - MMS and Taxonomy In SharePoint 2010Chris McNulty
 
Kuali OLE: A Look at our Software Deliverables Roadmap One Year On
Kuali OLE: A Look at our Software Deliverables Roadmap One Year OnKuali OLE: A Look at our Software Deliverables Roadmap One Year On
Kuali OLE: A Look at our Software Deliverables Roadmap One Year OnRobert H. McDonald
 
From WWW to GGG Ignite Athens 2012
From WWW to GGG Ignite Athens 2012From WWW to GGG Ignite Athens 2012
From WWW to GGG Ignite Athens 2012healis
 
Www 2 ggg Athanassios Hatzis
Www 2 ggg Athanassios HatzisWww 2 ggg Athanassios Hatzis
Www 2 ggg Athanassios HatzisIgnite_Athens
 
Sharded By Business Line: Migrating to a Core Database using MongoDB and Solr
Sharded By Business Line: Migrating to a Core Database using MongoDB and SolrSharded By Business Line: Migrating to a Core Database using MongoDB and Solr
Sharded By Business Line: Migrating to a Core Database using MongoDB and SolrMongoDB
 
Mongo la search platform - january 2013
Mongo la   search platform - january 2013Mongo la   search platform - january 2013
Mongo la search platform - january 2013MongoDB
 
The Role of Kerberos in Identity Mgmt
The Role of Kerberos in Identity MgmtThe Role of Kerberos in Identity Mgmt
The Role of Kerberos in Identity MgmtISACA New England
 

Ähnlich wie Generating Researcher Networks with Identified Persons on a Semantic Service Platform (20)

Scaling search to a million pages with Solr, Python, and Django
Scaling search to a million pages with Solr, Python, and DjangoScaling search to a million pages with Solr, Python, and Django
Scaling search to a million pages with Solr, Python, and Django
 
Role of Semantic Web in Health Informatics
Role of Semantic Web in Health InformaticsRole of Semantic Web in Health Informatics
Role of Semantic Web in Health Informatics
 
Provenance and Trust
Provenance and TrustProvenance and Trust
Provenance and Trust
 
Enterprise Search @EPAM
Enterprise Search @EPAMEnterprise Search @EPAM
Enterprise Search @EPAM
 
Architecting Smarter Apps with Entity Framework
Architecting Smarter Apps with Entity FrameworkArchitecting Smarter Apps with Entity Framework
Architecting Smarter Apps with Entity Framework
 
20120419 linkedopendataandteamsciencemcguinnesschicago
20120419 linkedopendataandteamsciencemcguinnesschicago20120419 linkedopendataandteamsciencemcguinnesschicago
20120419 linkedopendataandteamsciencemcguinnesschicago
 
Introduction to the Semantic Web
Introduction to the Semantic WebIntroduction to the Semantic Web
Introduction to the Semantic Web
 
SIL rapid capture
SIL rapid captureSIL rapid capture
SIL rapid capture
 
MongoDB for Genealogy
MongoDB for GenealogyMongoDB for Genealogy
MongoDB for Genealogy
 
Building OBO Foundry ontology using semantic web tools
Building OBO Foundry ontology using semantic web toolsBuilding OBO Foundry ontology using semantic web tools
Building OBO Foundry ontology using semantic web tools
 
Tagging Up - MMS and Taxonomy In SharePoint 2010
Tagging Up - MMS and Taxonomy In SharePoint 2010Tagging Up - MMS and Taxonomy In SharePoint 2010
Tagging Up - MMS and Taxonomy In SharePoint 2010
 
Semtech2006
Semtech2006Semtech2006
Semtech2006
 
Kuali OLE: A Look at our Software Deliverables Roadmap One Year On
Kuali OLE: A Look at our Software Deliverables Roadmap One Year OnKuali OLE: A Look at our Software Deliverables Roadmap One Year On
Kuali OLE: A Look at our Software Deliverables Roadmap One Year On
 
From WWW to GGG Ignite Athens 2012
From WWW to GGG Ignite Athens 2012From WWW to GGG Ignite Athens 2012
From WWW to GGG Ignite Athens 2012
 
Www 2 ggg Athanassios Hatzis
Www 2 ggg Athanassios HatzisWww 2 ggg Athanassios Hatzis
Www 2 ggg Athanassios Hatzis
 
Sharded By Business Line: Migrating to a Core Database using MongoDB and Solr
Sharded By Business Line: Migrating to a Core Database using MongoDB and SolrSharded By Business Line: Migrating to a Core Database using MongoDB and Solr
Sharded By Business Line: Migrating to a Core Database using MongoDB and Solr
 
Mongo la search platform - january 2013
Mongo la   search platform - january 2013Mongo la   search platform - january 2013
Mongo la search platform - january 2013
 
3 022
3 0223 022
3 022
 
The Role of Kerberos in Identity Mgmt
The Role of Kerberos in Identity MgmtThe Role of Kerberos in Identity Mgmt
The Role of Kerberos in Identity Mgmt
 
Fqas09
Fqas09Fqas09
Fqas09
 

Mehr von Korea Institute of Science and Technology Information (9)

Recent Internet and Communications Technologies and Business Mind (4/4)
Recent Internet and Communications Technologies and Business Mind (4/4)Recent Internet and Communications Technologies and Business Mind (4/4)
Recent Internet and Communications Technologies and Business Mind (4/4)
 
Recent Internet and Communications Technologies and Business Mind (3/4)
Recent Internet and Communications Technologies and Business Mind (3/4)Recent Internet and Communications Technologies and Business Mind (3/4)
Recent Internet and Communications Technologies and Business Mind (3/4)
 
Recent Internet and Communications Technologies and Business Mind (2/4)
Recent Internet and Communications Technologies and Business Mind (2/4)Recent Internet and Communications Technologies and Business Mind (2/4)
Recent Internet and Communications Technologies and Business Mind (2/4)
 
Recent Internet and Communications Technologies and Business Mind (1/4)
Recent Internet and Communications Technologies and Business Mind (1/4)Recent Internet and Communications Technologies and Business Mind (1/4)
Recent Internet and Communications Technologies and Business Mind (1/4)
 
우리 앞에 다가오는 미래 세상
우리 앞에 다가오는 미래 세상우리 앞에 다가오는 미래 세상
우리 앞에 다가오는 미래 세상
 
프레젠테이션 훈련
프레젠테이션 훈련프레젠테이션 훈련
프레젠테이션 훈련
 
InSciTe Project
InSciTe ProjectInSciTe Project
InSciTe Project
 
미래 세상은 어떨까
미래 세상은 어떨까미래 세상은 어떨까
미래 세상은 어떨까
 
Big Data Curation And Its Application
Big Data Curation And Its ApplicationBig Data Curation And Its Application
Big Data Curation And Its Application
 

Kürzlich hochgeladen

A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesThousandEyes
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditSkynet Technologies
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 

Kürzlich hochgeladen (20)

A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 

Generating Researcher Networks with Identified Persons on a Semantic Service Platform

  • 1. Generating Researcher Networks with Identified Persons on a Semantic Service Platform 15 Sep. 2009 Hanmin Jung KISTI BlogTalk2009 1 Copyright © 2004-2009, KISTI
  • 2. Agenda: Research networks would be useful for finding collaborators and speakers (key persons of a researcher group). Issues: getting sources, resolving identities, finding experts, generating networks.
  • 3. Getting sources …
  • 4. Sources: Identified Entities. Papers: 453,124 (Elsevier international journal papers with full texts and metadata); Persons: 1,352,220; Topics: 339,947; Institutions: 91,514; Locations: 409,575 (with GPS coordinates); RDF triples: 283,087,518 (2008.11).
  • 5. Resolving identities … How to resolve identities? How to merge different identifiers as one?
  • 6. OntoFrame [Architecture diagram of the OntoFrame 2008 Service. Labeled components include Ontologies, Search Engine, OntoReasoner®, OntoURI®, Listener, Legacy DB Tables, and RDF Triple Store; labeled interfaces include WS API, SPARQL, SQL, and XML.]
  • 7. Ontology: Reference and Academic Knowledge Ontologies
  • 8. OntoFrame Syntactic-to-Semantic Process
      Modeling-Time Process: Design Ontology Model; Edit URI Generation Rules; Select Database & Ontology; Edit Identity Resolution Rules; Edit Mapping Rules; Test Mapping Process
      Indexing-Time Process: Normalize Field Values; Crawl Database; Apply Identity Resolution Rules; Refer Authority Data; Resolve Identities; Extract Topics; Assign URIs; Apply Mapping Rules; Apply URI Generation Rules; Generate RDF Triples
      Run-Time Process: Apply sameAs Relations
  • 9. Identity Resolution [Figure: case 1 compares "Barry G.T. Lowden" with "Barry Lowden"; case 2 compares two records that are both named "Christian Becker".]
  • 10. Identity Resolution: Rules for Resolving Personal Identities
      Columns: Class | Resource | Kind | Match | Relation | Source | Weight
      Person, Order, 1; Person, Name, Pivot, Exact, Single, OntoURI; Person, hasInstitution, Feature, Exact, Single, OntoURI, 2; Person, Email, Feature, Number, Single, 4; Person, hasCoauthor, Feature, Number, Multiple, OntoReasoner, 1; Person, hasTopic; threshold 0.8
  • 11. Identity Resolution: Authority Data
      Normalized Form | Variant Form | Kind | Class
      IBM | International Business Machines Corporation | Abbreviation | Institution
      Microsoft | MS | Abbreviation | Institution
      Microsoft | 마이크로소프트 | Korean | Institution
      London | 런던 | Korean | Location
      Academic Inc. | Academic Press Inc, LTD | Alternative | Publication
  • 12. Identity Resolution: sameAs Authorization
  • 13. Identity Resolution: sameAs Candidates
  • 14. ReSIST (2006 ~ 2008)
  • 15. ReSIST (2006 ~ 2008): Resilience Knowledge Base, from "Deliverable D31: Final Workshop report" by ReSIST
  • 16. LOD Project: Linking Open Data Community Project. Available in RDF and SVG (Scalable Vector Graphics) versions. KISTI. http://richard.cyganiak.de/2007/10/lod/
  • 17. Finding experts … How to extract topics? How to determine topics of a researcher?
  • 18. Topic Extraction: System Architecture
  • 19. Topic Propagation: Propagating Topics of Entities (Article, Person)
  • 20. Experts Finding Process
      Knowledge expansion: making direct relations for a shorter access path
      Experts retrieval: querying with SPARQL for a given topic; converting SPARQL to SQL; using a backward chaining path
      Post-processing: grouping and counting retrieved authors; ranking by names or the number of achievements; making an XML document as the result
  • 21. Knowledge Expansion: Inference Rule
      @prefix isrl: <http://www.kisti.re.kr/isrl/ResearchRefOntology#>
      (?x isrl:hasCreatorInfo ?y) (?y isrl:hasCreator ?z) -> (?x isrl:createdByPerson ?z)
      [Diagram: Article -hasCreatorInfo-> CreatorInfo -hasCreator-> Person; the inferred direct relation is Article -createdByPerson-> Person ……]
  • 22. Experts Retrieval: Backward Chaining Path
  • 23. Generating networks … How to find a researcher group? How about similar researchers?
  • 24. OntoFrame 2008
  • 25. Researcher Networks (T, P)
  • 26. Researcher Networks (T, P) Process
      Step 1: Getting co-author pairs for a target topic (T)
        SELECT DISTINCT ?person1 ?person2
        WHERE {
          ?article aca:yearOfAccomplishment ?year .
          FILTER(?year>=startYear && ?year<=endYear) .
          ?article aca:hasTopicOfArticle <topURI> .
          ?article aca:createdByPerson ?person1 .
          ?article aca:createdByPerson ?person2 .
          FILTER(?person1 < ?person2) .
        }
      Step 2: Selecting a target researcher (P) in the pairs
      Step 3: Tracing group members connected with that researcher (the seed)
  • 27. Researcher Networks (P)
  • 28. Researcher Networks (P) Process
      Step 1: Getting co-author pairs including a target researcher (P)
        SELECT ?per1 ?per2
        WHERE {
          ?article aca:yearOfAccomplishment ?year .
          FILTER(?year>=startYear && ?year<=endYear) .
          ?article aca:createdByPerson ?per1 .
          ?article aca:createdByPerson ?per2 .
          FILTER(?per1 < ?per2) .
          FILTER(?per1=<perURI> || ?per2=<perURI>) .
        }
      Step 2: Ranking them by the frequency of co-authorship
  • 29. Similar Researchers
  • 30. Similar Researchers (P) Process (1/2)
      Step 1: Getting topics of a target researcher (P)
        SELECT ?per1 ?topic
        WHERE {
          ?article aca:createdByPerson ?per1 .
          ?article aca:hasTopicArea ?topicArea .
          ?topicArea aca:hasTopicOfTopicArea ?topic .
          FILTER(?per1=<perURI>) .
        }
      Step 2: Ranking and selecting the top n topics for that researcher
  • 31. Similar Researchers Process (2/2)
      Getting researchers who largely share topics with the target researcher
        SELECT DISTINCT ?per2
        WHERE {
          ?per2 aca:hasTopicOfPerson ?topic1 .
          ?per2 aca:hasTopicOfPerson ?topic2 .
          ?per2 aca:hasTopicOfPerson ?topic3 .
          ?per2 aca:hasTopicOfPerson ?topic4 .
          FILTER(?per2!=<perURI>) .
          FILTER(?topic1 < ?topic2 && ?topic2 < ?topic3 && ?topic3 < ?topic4) .
          FILTER(?topic1=<topic[0]> || ?topic1=<topic[1]> || ?topic1=<topic[2]>
                 || ?topic1=<topic[3]> || ?topic1=<topic[4]>) .
          FILTER(?topic2=<topic[0]> || ?topic2=<topic[1]> || ?topic2=<topic[2]>
                 || ?topic2=<topic[3]> || ?topic2=<topic[4]>) .
          FILTER(?topic3=<topic[0]> || ?topic3=<topic[1]> || ?topic3=<topic[2]>
                 || ?topic3=<topic[3]> || ?topic3=<topic[4]>) .
          FILTER(?topic4=<topic[0]> || ?topic4=<topic[1]> || ?topic4=<topic[2]>
                 || ?topic4=<topic[3]> || ?topic4=<topic[4]>) .
        }
  • 32. Conclusions
      Processes to Generate Researcher Networks. Getting sources: papers. Resolving identities: rules, authority data, sameAs. Finding experts: topics, reasoning. Generating networks: topic-constrained and person-constrained.
      Next Research Topic: service mashup to get researcher networks directly
  • 33. “A lot of times, people don’t know what they want until you show it to them.” by Steve Jobs. Thank you. jhm@kisti.re.kr
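
The knowledge-expansion rule on slide 21 can be pictured as a join over triples. The following is a minimal sketch in plain Python, not the OntoReasoner implementation: the in-memory set of tuples, the function name `expand_created_by_person`, and the sample identifiers are all assumptions made for illustration. Only the three property names come from the slide.

```python
# Illustrative sketch only: the data model and sample identifiers are made up.
HAS_CREATOR_INFO = "isrl:hasCreatorInfo"
HAS_CREATOR = "isrl:hasCreator"
CREATED_BY_PERSON = "isrl:createdByPerson"


def expand_created_by_person(triples):
    """Apply the slide-21 rule once; return the input triples plus inferred ones."""
    # Index CreatorInfo -> persons so the join on ?y is a dictionary lookup.
    creators = {}
    for subject, predicate, obj in triples:
        if predicate == HAS_CREATOR:
            creators.setdefault(subject, set()).add(obj)

    inferred = set()
    for subject, predicate, obj in triples:
        if predicate == HAS_CREATOR_INFO:
            # (?x hasCreatorInfo ?y) (?y hasCreator ?z) -> (?x createdByPerson ?z)
            for person in creators.get(obj, ()):
                inferred.add((subject, CREATED_BY_PERSON, person))
    return set(triples) | inferred


# Hypothetical sample data: one article with two creator-info nodes.
triples = {
    ("article:1", HAS_CREATOR_INFO, "creatorInfo:1a"),
    ("article:1", HAS_CREATOR_INFO, "creatorInfo:1b"),
    ("creatorInfo:1a", HAS_CREATOR, "person:A"),
    ("creatorInfo:1b", HAS_CREATOR, "person:B"),
}
expanded = expand_created_by_person(triples)
```

Materializing the direct `createdByPerson` relation ahead of time is what lets later queries skip the intermediate CreatorInfo node, which is the "shorter access path" mentioned on slide 20.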
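
The group-tracing step on slide 26 (following co-author pairs outward from a seed researcher) can be sketched as a breadth-first search. This is an assumption about what "tracing" means: the slides do not say whether only direct co-authors or all transitively connected researchers are included, and this sketch takes the transitive reading. The function name `trace_group` and the sample pairs are hypothetical; the pairs stand in for rows returned by the co-author query.

```python
from collections import defaultdict, deque


def trace_group(pairs, seed):
    """Return every researcher reachable from `seed` through co-author pairs."""
    # Build an undirected adjacency map from the (person1, person2) rows.
    neighbours = defaultdict(set)
    for first, second in pairs:
        neighbours[first].add(second)
        neighbours[second].add(first)

    group = {seed}
    queue = deque([seed])
    while queue:
        current = queue.popleft()
        for other in neighbours[current]:
            if other not in group:
                group.add(other)
                queue.append(other)
    return group


# Hypothetical co-author pairs for one topic and year range.
pairs = [
    ("person:A", "person:B"),
    ("person:B", "person:C"),
    ("person:D", "person:E"),
]
group = trace_group(pairs, "person:A")
```

With the sample pairs, the group seeded at `person:A` contains A, B, and C, while D and E form a separate group that is never visited.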
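
The two-step similar-researcher process on slides 30 and 31 can also be sketched without a triple store. The sketch below assumes topics are ranked by how often they occur for the target researcher (the slides do not state the ranking criterion) and mirrors the query on slide 31 by requiring at least four shared topics out of the top five. The function names, the `min_shared` parameter, and the sample data are hypothetical.

```python
from collections import Counter


def top_topics(topic_rows, n=5):
    """Rank one researcher's topics by occurrence count and keep the top n."""
    counts = Counter(topic_rows)
    # Sort by descending count, then by name so ties resolve deterministically.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [topic for topic, _ in ranked[:n]]


def similar_researchers(target, person_topics, target_top, min_shared=4):
    """Return other researchers sharing at least `min_shared` of the top topics."""
    wanted = set(target_top)
    return sorted(
        person
        for person, topics in person_topics.items()
        if person != target and len(wanted & set(topics)) >= min_shared
    )


# Hypothetical topic rows for the target researcher, one row per (article, topic).
rows = ["rdf", "rdf", "sparql", "ontology", "rdf", "sparql",
        "reasoning", "identity", "identity", "search"]
target_top = top_topics(rows)

# Hypothetical topic sets for every researcher, as hasTopicOfPerson would give.
person_topics = {
    "person:P": set(rows),
    "person:Q": {"rdf", "sparql", "ontology", "identity"},
    "person:R": {"rdf", "search"},
    "person:S": {"rdf", "identity", "sparql", "ontology", "reasoning", "search"},
}
similar = similar_researchers("person:P", person_topics, target_top)
```

A set intersection replaces the four ordered `?topic` variables of the SPARQL version; both express "any four distinct topics among the five", and the threshold can be changed without rewriting the filter clauses.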