Chances and Challenges in Comparing
  Cross-Language Retrieval Tools

             Giovanna Roda
              Vienna, Austria



    IRF Symposium 2010 / June 3, 2010
CLEF-IP: the Intellectual Property track at CLEF




  CLEF-IP is an evaluation track within the Cross-Language
  Evaluation Forum (CLEF). 1

         organized by the IRF
         first track ran in 2009
         running this year for the second time

    1 http://www.clef-campaign.org
What is an evaluation track?

  An evaluation track in Information Retrieval is a cooperative action
  aimed at comparing different techniques on a common retrieval
  task.
       produces experimental data that can be analyzed and used to
       improve existing systems
       fosters exchange of ideas and cooperation
       produces a reusable test collection, sets milestones



  Test collection
  A test collection traditionally consists of target data, a set of
  queries, and relevance assessments for each query.
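
  As a minimal sketch of the three parts named above, a test collection
  and two common measures over one ranked result list might look like this
  in Python. The class name, the document and query identifiers, and the
  choice of recall and average precision are assumptions made for
  illustration; none of them come from the CLEF-IP data or tooling.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class IRTestCollection:
    """Hypothetical container for the three parts of a test collection."""
    documents: dict[str, str]   # target data: document id -> text
    queries: dict[str, str]     # query id -> query text
    relevant: dict[str, set[str]] = field(default_factory=dict)  # query id -> relevant ids


def recall_at_k(ranking: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant:
        return 0.0
    return len(set(ranking[:k]) & relevant) / len(relevant)


def average_precision(ranking: list[str], relevant: set[str]) -> float:
    """Sum of precision at the rank of each relevant hit, divided by all relevant docs."""
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


# Toy data, invented for illustration only.
collection = IRTestCollection(
    documents={"D1": "text 1", "D2": "text 2", "D3": "text 3", "D4": "text 4"},
    queries={"Q1": "query text"},
    relevant={"Q1": {"D2", "D4"}},
)
run = ["D2", "D1", "D3", "D4"]  # ranked output of some retrieval system for Q1

print(recall_at_k(run, collection.relevant["Q1"], k=2))
print(average_precision(run, collection.relevant["Q1"]))
```

  Keeping the relevance assessments separate from the runs is what makes
  the collection reusable: any new system can be scored against the same
  judgments.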
CLEF-IP 2009: the task


  The main task in the CLEF-IP track was to find prior art for a
  given patent.



  Prior art search
  Prior art search consists of identifying all information (including
  non-patent literature) that might be relevant to a patent’s claim of
  novelty.
Participants - 2009 track

    1 Tech. Univ. Darmstadt, Dept. of CS,
      Ubiquitous Knowledge Processing Lab (DE)
    2 Univ. Neuchatel - Computer Science (CH)
    3 Santiago de Compostela Univ. - Dept.
      Electronica y Computacion (ES)
    4 University of Tampere - Info Studies (FI)
    5 Interactive Media and Swedish Institute of
      Computer Science (SE)
    6 Geneva Univ. - Centre Universitaire
      d’Informatique (CH)
    7 Glasgow Univ. - IR Group Keith (UK)
    8 Centrum Wiskunde & Informatica - Interactive
      Information Access (NL)
Participants - 2009 track

    9 Geneva Univ. Hospitals - Service of Medical
      Informatics (CH)
   10 Humboldt Univ. - Dept. of German Language
      and Linguistics (DE)
   11 Dublin City Univ. - School of Computing (IE)
   12 Radboud Univ. Nijmegen - Centre for Language
      Studies & Speech Technologies (NL)
   13 Hildesheim Univ. - Information Systems &
      Machine Learning Lab (DE)
   14 Technical Univ. Valencia - Natural Language
      Engineering (ES)
   15 Al. I. Cuza University of Iasi - Natural Language
      Processing (RO)
Participants - 2009 track




    15 participants
    48 experiments submitted for the main task
    10 experiments submitted for the language tasks
2009-2010: participants
2009-2010: evolution of the CLEF-IP track

   2009                           2010

   1 task: prior art search       prior art candidate search
                                  and classification task
   targeting granted patents      patent applications
   15 participants                20 participants
   all from academia              4 industrial participants
   families and citations         include forward citations
   manual assessments             expanded lists of relevant docs
   standard evaluation measures   new measure: PRES, more
                                  recall-oriented
What are relevance assessments?



  A test collection (also known as gold standard) consists of a target
  dataset, a set of queries, and relevance assessments corresponding
  to each query.

  The CLEF-IP test collection:

      target data: 2 million EP patents
      queries: full-text patents (without images)
      relevance assessments: extended citations
Relevance assessments


  We used patents cited as prior art as relevance assessments.


  Sources of citations:
    1   applicant’s disclosure: the USPTO requires applicants to
        disclose all known relevant publications
    2   patent office search report: each patent office will do a search
        for prior art to judge the novelty of a patent
    3   opposition procedures: patents cited to prove that a granted
        patent is not novel
Extended citations as relevance assessments

  direct citations and their families
  direct citations of family members ...
  ... and their families
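
  The three steps above can be sketched as a small set computation. This
  is an illustration only, not the track's actual tooling: the function
  name, the two lookup maps and the patent identifiers are hypothetical,
  and removing the topic's own family from the result is an added
  assumption.

```python
def extended_citations(topic, cites, family_of):
    """Collect extended citations for `topic`.

    cites:     patent id -> set of patent ids it cites directly
    family_of: patent id -> set of its family members (including itself)
    """
    def with_families(patents):
        expanded = set()
        for patent in patents:
            # A patent with no known family counts as its own family.
            expanded |= family_of.get(patent, {patent})
        return expanded

    # Step 1: direct citations of the topic patent and their families.
    result = with_families(cites.get(topic, set()))

    # Steps 2 and 3: direct citations of the topic's family members,
    # and the families of those citations.
    topic_family = family_of.get(topic, {topic})
    for member in topic_family:
        result |= with_families(cites.get(member, set()))

    # Assumption: the topic and its own family are not counted as prior art.
    return result - topic_family


# Toy data, invented for illustration only.
cites = {"EP-A": {"EP-X"}, "US-A": {"US-Y"}}
family_of = {
    "EP-A": {"EP-A", "US-A"},
    "US-A": {"EP-A", "US-A"},
    "EP-X": {"EP-X", "JP-X"},
    "JP-X": {"EP-X", "JP-X"},
}
print(sorted(extended_citations("EP-A", cites, family_of)))
```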
Patent families




  A patent family consists of patents granted by different patent
  authorities but related to the same invention.
  simple family: all family members share the same priority number
  extended family: there are several definitions; in the INPADOC
              database, all documents which are directly or
              indirectly linked via a priority number belong to the
              same family
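
  The difference between the two definitions can be made concrete with a
  short sketch. It assumes a hypothetical map from patent id to its set of
  priority numbers; "share the same priority number" is read here as
  "have exactly the same set of priorities", and the extended family is
  computed as a connected component over shared priorities, following the
  INPADOC description above. The identifiers are invented.

```python
from collections import defaultdict


def simple_families(priorities):
    """Group patents whose sets of priority numbers are identical."""
    groups = defaultdict(set)
    for patent, prios in priorities.items():
        groups[frozenset(prios)].add(patent)
    return sorted(groups.values(), key=sorted)


def extended_families(priorities):
    """Group patents linked directly or indirectly via any shared priority."""
    by_priority = defaultdict(set)
    for patent, prios in priorities.items():
        for prio in prios:
            by_priority[prio].add(patent)

    families, seen = [], set()
    for start in priorities:
        if start in seen:
            continue
        # Depth-first walk over patents reachable through shared priorities.
        family, stack = set(), [start]
        while stack:
            patent = stack.pop()
            if patent in family:
                continue
            family.add(patent)
            for prio in priorities[patent]:
                stack.extend(by_priority[prio] - family)
        seen |= family
        families.append(family)
    return sorted(families, key=sorted)


# Toy data, invented for illustration only.
priorities = {
    "EP-1": {"P1"},
    "US-1": {"P1"},
    "US-2": {"P1", "P2"},
    "JP-2": {"P2"},
    "EP-9": {"P9"},
}
print(simple_families(priorities))
print(extended_families(priorities))
```

  In the toy data, US-2 claims two priorities and therefore bridges EP-1,
  US-1 and JP-2 into one extended family, while the simple families keep
  them apart.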
Patent families

  Patent documents are linked by priorities.
  [Figure: INPADOC family.]
  CLEF-IP uses simple families.
Relevance assessments 2010




  Expanding the 2009 extended citations:
    1   include citations of forward citations ...
    2   ... and their families

  This is apparently a well-known method among patent searchers.
  Zig-zag search?
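
  A rough sketch of the expansion described above, assuming that "forward
  citations" of a patent means the later patents that cite it. The
  function names, lookup maps and identifiers are hypothetical, and
  dropping the topic's own family from the result is an added assumption.

```python
from collections import defaultdict


def forward_citations(cites):
    """Invert a 'cites' map: patent id -> set of patents that cite it."""
    cited_by = defaultdict(set)
    for citing, cited_set in cites.items():
        for cited in cited_set:
            cited_by[cited].add(citing)
    return cited_by


def forward_expansion(topic, cites, family_of):
    """Citations of the patents that cite `topic`, expanded to their families."""
    cited_by = forward_citations(cites)
    candidates = set()
    for citing in cited_by.get(topic, set()):
        for cited in cites.get(citing, set()):
            # A patent with no known family counts as its own family.
            candidates |= family_of.get(cited, {cited})
    # Assumption: the topic and its own family are excluded.
    return candidates - family_of.get(topic, {topic})


# Toy data, invented for illustration only.
cites_2010 = {"EP-N1": {"EP-T", "EP-K"}, "EP-N2": {"EP-T", "US-M"}}
families_2010 = {"EP-K": {"EP-K", "WO-K"}}
print(sorted(forward_expansion("EP-T", cites_2010, families_2010)))
```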
How good are the CLEF-IP relevance assessments?

  CLEF-IP uses families + citations:

    how complete are extended citations as relevance assessments?
    will every prior art patent be included in this set?
    and if not, what percentage of prior art items are captured
    by extended citations?
    when considering forward citations, how good are extended
    citations as a prior art candidate set?
Feedback from patent experts needed

  Quality of prior art candidate sets has to be assessed
  Know-how of patent search experts is needed

     at CLEF-IP 2009, 7 patent search professionals assessed 12
     search results
     the task was not well defined and there were
     misunderstandings about the concept of relevance
     the amount of data was not sufficient to draw conclusions
Some initiatives associated with CLEF-IP

  The results of evaluation tracks are mostly useful for the research
  community.

  This community often produces prototypes that are of little
  interest to the end-user.

  Next I’d like to present two concrete outcomes, not of CLEF-IP
  directly but arising from work in patent retrieval evaluation.
Soire




        developed at Matrixware
        service-oriented architecture - available as a Web service
        allows IR experiments based on the classical evaluation
        model to be replicated
        tested on the CLEF-IP data
        customized for the evaluation of machine translation
Spinque




     a spin-off (2010) from CWI (the Dutch National Research
     Center in Computer Science and Mathematics)
     introduces search-by-strategy
     provides optimized strategies for patent search - tested on
     CLEF-IP data
     transparency: understand your search results to improve
     strategy
CLEF-IP 2009 learnings



  The Humboldt University implemented a model for patent search
  that produced the best results.

  The model combined several strategies:
      using metadata (IPC, ECLA)
      indexes built at lemma level
      an additional phrase index for English
      crosslingual concept index (multilingual terminological
      database)
Some additional investigations

  Some citations were hard to find

      % runs          class
      ≤ 5             hard
      5 < x ≤ 10      very difficult
      10 < x ≤ 50     difficult
      50 < x ≤ 75     medium
      75 < x ≤ 100    easy
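
  The table can be read as a simple binning rule. The sketch below assumes
  that "% runs" means the percentage of submitted runs that retrieved a
  given citation; that reading and the function names are assumptions, and
  the class labels are copied from the table as given.

```python
def difficulty_class(pct_runs: float) -> str:
    """Map the percentage of runs that found a citation to a class label."""
    if not 0 <= pct_runs <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if pct_runs <= 5:
        return "hard"
    if pct_runs <= 10:
        return "very difficult"
    if pct_runs <= 50:
        return "difficult"
    if pct_runs <= 75:
        return "medium"
    return "easy"


def pct_of_runs(found_in: int, total_runs: int) -> float:
    """Percentage of runs in which the citation was retrieved."""
    return 100.0 * found_in / total_runs


# Hypothetical example: a citation retrieved by 6 of 48 runs.
print(difficulty_class(pct_of_runs(6, 48)))
```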
Some additional investigations

      We looked at the content of citations and citing patents.
      Ongoing investigations.
Thank you for your attention.

 
Exlporing New challenges in TELL: Language Learning MOOCs
Exlporing New challenges in TELL: Language Learning MOOCsExlporing New challenges in TELL: Language Learning MOOCs
Exlporing New challenges in TELL: Language Learning MOOCsMaria Perifanou
 
Workflow Development for OCR (and beyond)
Workflow Development for OCR (and beyond)Workflow Development for OCR (and beyond)
Workflow Development for OCR (and beyond)cneudecker
 
cv_romain_gehrig_2015-10-08
cv_romain_gehrig_2015-10-08cv_romain_gehrig_2015-10-08
cv_romain_gehrig_2015-10-08Romain Gehrig
 
EMMA presentation - Alfons Juan - Language technologies for Education: recent...
EMMA presentation - Alfons Juan - Language technologies for Education: recent...EMMA presentation - Alfons Juan - Language technologies for Education: recent...
EMMA presentation - Alfons Juan - Language technologies for Education: recent...EUmoocs
 
The Legacy and the Future of Research Networks in Technology-Enhanced Learning
The Legacy and the Future of Research Networks in Technology-Enhanced LearningThe Legacy and the Future of Research Networks in Technology-Enhanced Learning
The Legacy and the Future of Research Networks in Technology-Enhanced LearningRalf Klamma
 
SoundSoftware: Software Sustainability for audio and Music Researchers
SoundSoftware: Software Sustainability for audio and Music Researchers SoundSoftware: Software Sustainability for audio and Music Researchers
SoundSoftware: Software Sustainability for audio and Music Researchers SoundSoftware ac.uk
 
Epic2011 Assessement Portfolios
Epic2011 Assessement PortfoliosEpic2011 Assessement Portfolios
Epic2011 Assessement PortfoliosRaynauld Jacques
 
Open Accessibility EverywhereGroundwork, Infrastructure, Standards
Open Accessibility EverywhereGroundwork, Infrastructure, StandardsOpen Accessibility EverywhereGroundwork, Infrastructure, Standards
Open Accessibility EverywhereGroundwork, Infrastructure, StandardsAEGIS-ACCESSIBLE Projects
 
Session 50 - High Performance Computing Ecosystem in Europe
Session 50 - High Performance Computing Ecosystem in EuropeSession 50 - High Performance Computing Ecosystem in Europe
Session 50 - High Performance Computing Ecosystem in EuropeISSGC Summer School
 

Ähnlich wie Comparing Cross-Language Retrieval Tools at CLEF-IP (20)

1. EXPERT Winter School Partner Introductions
1. EXPERT Winter School Partner Introductions1. EXPERT Winter School Partner Introductions
1. EXPERT Winter School Partner Introductions
 
LOD2: Guest presentation: French datalift project
LOD2: Guest presentation: French datalift projectLOD2: Guest presentation: French datalift project
LOD2: Guest presentation: French datalift project
 
Datalift lod2-paris-24032011
Datalift lod2-paris-24032011Datalift lod2-paris-24032011
Datalift lod2-paris-24032011
 
New trends in ontological engineering, practices and tools
New trends in ontological engineering, practices and toolsNew trends in ontological engineering, practices and tools
New trends in ontological engineering, practices and tools
 
Gli standard per l’interoperabilità dei sistemi studenti in ambito Europeo
Gli standard per l’interoperabilità dei sistemi studenti in ambito EuropeoGli standard per l’interoperabilità dei sistemi studenti in ambito Europeo
Gli standard per l’interoperabilità dei sistemi studenti in ambito Europeo
 
The Exploitation of OpenAPI Documents for the Generation of Web Frontends
The Exploitation of OpenAPI Documents for the Generation of Web FrontendsThe Exploitation of OpenAPI Documents for the Generation of Web Frontends
The Exploitation of OpenAPI Documents for the Generation of Web Frontends
 
Summer school bz_fp7research_20100708
Summer school bz_fp7research_20100708Summer school bz_fp7research_20100708
Summer school bz_fp7research_20100708
 
OpenAIRE presentation for ICT - Brussels 27-29 Sept, 2010
OpenAIRE presentation for  ICT - Brussels 27-29 Sept, 2010OpenAIRE presentation for  ICT - Brussels 27-29 Sept, 2010
OpenAIRE presentation for ICT - Brussels 27-29 Sept, 2010
 
Overview of the MediaEval 2012 Tagging Task
Overview of the MediaEval 2012 Tagging TaskOverview of the MediaEval 2012 Tagging Task
Overview of the MediaEval 2012 Tagging Task
 
Exlporing New challenges in TELL: Language Learning MOOCs
Exlporing New challenges in TELL: Language Learning MOOCsExlporing New challenges in TELL: Language Learning MOOCs
Exlporing New challenges in TELL: Language Learning MOOCs
 
Workflow Development for OCR (and beyond)
Workflow Development for OCR (and beyond)Workflow Development for OCR (and beyond)
Workflow Development for OCR (and beyond)
 
cv_romain_gehrig_2015-10-08
cv_romain_gehrig_2015-10-08cv_romain_gehrig_2015-10-08
cv_romain_gehrig_2015-10-08
 
EMMA presentation - Alfons Juan - Language technologies for Education: recent...
EMMA presentation - Alfons Juan - Language technologies for Education: recent...EMMA presentation - Alfons Juan - Language technologies for Education: recent...
EMMA presentation - Alfons Juan - Language technologies for Education: recent...
 
The Legacy and the Future of Research Networks in Technology-Enhanced Learning
The Legacy and the Future of Research Networks in Technology-Enhanced LearningThe Legacy and the Future of Research Networks in Technology-Enhanced Learning
The Legacy and the Future of Research Networks in Technology-Enhanced Learning
 
Wimmer Egov
Wimmer EgovWimmer Egov
Wimmer Egov
 
SoundSoftware: Software Sustainability for audio and Music Researchers
SoundSoftware: Software Sustainability for audio and Music Researchers SoundSoftware: Software Sustainability for audio and Music Researchers
SoundSoftware: Software Sustainability for audio and Music Researchers
 
Epic2011 Assessement Portfolios
Epic2011 Assessement PortfoliosEpic2011 Assessement Portfolios
Epic2011 Assessement Portfolios
 
Open Accessibility EverywhereGroundwork, Infrastructure, Standards
Open Accessibility EverywhereGroundwork, Infrastructure, StandardsOpen Accessibility EverywhereGroundwork, Infrastructure, Standards
Open Accessibility EverywhereGroundwork, Infrastructure, Standards
 
Session 50 - High Performance Computing Ecosystem in Europe
Session 50 - High Performance Computing Ecosystem in EuropeSession 50 - High Performance Computing Ecosystem in Europe
Session 50 - High Performance Computing Ecosystem in Europe
 
Grial introduction for eLearning Training Days
Grial introduction for eLearning Training DaysGrial introduction for eLearning Training Days
Grial introduction for eLearning Training Days
 

Mehr von Giovanna Roda

Distributed Computing for Everyone
Distributed Computing for EveryoneDistributed Computing for Everyone
Distributed Computing for EveryoneGiovanna Roda
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to HadoopGiovanna Roda
 
Introduction to Hadoop part 2
Introduction to Hadoop part 2Introduction to Hadoop part 2
Introduction to Hadoop part 2Giovanna Roda
 
Introduction to Hadoop part1
Introduction to Hadoop part1Introduction to Hadoop part1
Introduction to Hadoop part1Giovanna Roda
 
The need for new paradigms in IT services provisioning
The need for new paradigms in IT services provisioningThe need for new paradigms in IT services provisioning
The need for new paradigms in IT services provisioningGiovanna Roda
 
Apache Spark™ is here to stay
Apache Spark™ is here to stayApache Spark™ is here to stay
Apache Spark™ is here to stayGiovanna Roda
 
CLEF-IP 2009: retrieval experiments in the Intellectual Property domain
CLEF-IP 2009: retrieval experiments in the Intellectual Property domainCLEF-IP 2009: retrieval experiments in the Intellectual Property domain
CLEF-IP 2009: retrieval experiments in the Intellectual Property domainGiovanna Roda
 
Patent Search: An important new test bed for IR
Patent Search: An important new test bed for IRPatent Search: An important new test bed for IR
Patent Search: An important new test bed for IRGiovanna Roda
 

Mehr von Giovanna Roda (8)

Distributed Computing for Everyone
Distributed Computing for EveryoneDistributed Computing for Everyone
Distributed Computing for Everyone
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
 
Introduction to Hadoop part 2
Introduction to Hadoop part 2Introduction to Hadoop part 2
Introduction to Hadoop part 2
 
Introduction to Hadoop part1
Introduction to Hadoop part1Introduction to Hadoop part1
Introduction to Hadoop part1
 
The need for new paradigms in IT services provisioning
The need for new paradigms in IT services provisioningThe need for new paradigms in IT services provisioning
The need for new paradigms in IT services provisioning
 
Apache Spark™ is here to stay
Apache Spark™ is here to stayApache Spark™ is here to stay
Apache Spark™ is here to stay
 
CLEF-IP 2009: retrieval experiments in the Intellectual Property domain
CLEF-IP 2009: retrieval experiments in the Intellectual Property domainCLEF-IP 2009: retrieval experiments in the Intellectual Property domain
CLEF-IP 2009: retrieval experiments in the Intellectual Property domain
 
Patent Search: An important new test bed for IR
Patent Search: An important new test bed for IRPatent Search: An important new test bed for IR
Patent Search: An important new test bed for IR
 

Kürzlich hochgeladen

Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 

Kürzlich hochgeladen (20)

Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 

Comparing Cross-Language Retrieval Tools at CLEF-IP

  • 1. Chances and Challenges in Comparing Cross-Language Retrieval Tools. Giovanna Roda, Vienna, Austria. IRF Symposium 2010, June 3, 2010
  • 2-6. CLEF-IP: the Intellectual Property track at CLEF. CLEF-IP is an evaluation track within the Cross Language Evaluation Forum (CLEF, http://www.clef-campaign.org). It is organized by the IRF; the first track ran in 2009, and it is running this year for the second time.
  • 7-11. What is an evaluation track? An evaluation track in Information Retrieval is a cooperative action aimed at comparing different techniques on a common retrieval task. It produces experimental data that can be analyzed and used to improve existing systems; fosters exchange of ideas and cooperation; produces a reusable test collection and sets milestones. Test collection: a test collection traditionally consists of target data, a set of queries, and relevance assessments for each query.
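
The three parts of a test collection named on the slide can be pictured as a small data structure. The following is only a minimal sketch: the class name `RetrievalTestCollection`, its field names, and the sample identifiers are hypothetical and are not taken from the CLEF-IP distribution.

```python
from dataclasses import dataclass, field


@dataclass
class RetrievalTestCollection:
    """Hypothetical container for the three parts of a test collection."""

    # Target data: document id -> document text.
    documents: dict[str, str]
    # Queries: query id -> query text.
    queries: dict[str, str]
    # Relevance assessments: query id -> ids of documents judged relevant.
    qrels: dict[str, set[str]] = field(default_factory=dict)

    def relevant(self, query_id: str) -> set[str]:
        """Return the relevant document ids for a query (empty if unjudged)."""
        return self.qrels.get(query_id, set())

    def is_relevant(self, query_id: str, doc_id: str) -> bool:
        """Check a single (query, document) pair against the assessments."""
        return doc_id in self.relevant(query_id)


# Illustrative data only; the ids and texts are made up.
collection = RetrievalTestCollection(
    documents={"D1": "a patent about widgets", "D2": "a patent about gears"},
    queries={"Q1": "widget with improved hinge"},
    qrels={"Q1": {"D1"}},
)
```

Keeping the assessments separate from the documents and queries is what makes a collection reusable: new systems can be scored against the same judgments without re-running the assessment step.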
  • 12-13. CLEF-IP 2009: the task. The main task in the CLEF-IP track was to find prior art for a given patent. Prior art search: prior art search consists in identifying all information (including non-patent literature) that might be relevant to a patent’s claim of novelty.
  • 14-28. Participants - 2009 track
      1. Tech. Univ. Darmstadt, Dept. of CS, Ubiquitous Knowledge Processing Lab (DE)
      2. Univ. Neuchatel - Computer Science (CH)
      3. Santiago de Compostela Univ. - Dept. Electronica y Computacion (ES)
      4. University of Tampere - Info Studies (FI)
      5. Interactive Media and Swedish Institute of Computer Science (SE)
      6. Geneva Univ. - Centre Universitaire d’Informatique (CH)
      7. Glasgow Univ. - IR Group Keith (UK)
      8. Centrum Wiskunde & Informatica - Interactive Information Access (NL)
      9. Geneva Univ. Hospitals - Service of Medical Informatics (CH)
      10. Humboldt Univ. - Dept. of German Language and Linguistics (DE)
      11. Dublin City Univ. - School of Computing (IE)
      12. Radboud Univ. Nijmegen - Centre for Language Studies & Speech Technologies (NL)
      13. Hildesheim Univ. - Information Systems & Machine Learning Lab (DE)
      14. Technical Univ. Valencia - Natural Language Engineering (ES)
      15. Al. I. Cuza University of Iasi - Natural Language Processing (RO)
  • 30-32. Participants - 2009 track: 15 participants; 48 experiments submitted for the main task; 10 experiments submitted for the language tasks.
  • 34. 2009-2010: evolution of the CLEF-IP track 2009 1 task: prior art search targeting granted patents 15 participants all from academia families and citations manual assessments standard evaluation mea- sures
  • 35. 2009-2010: evolution of the CLEF-IP track 2009 1 task: prior art search targeting granted patents 15 participants all from academia families and citations manual assessments standard evaluation mea- sures
  • 36. 2009-2010: evolution of the CLEF-IP track 2009 2010 1 task: prior art search targeting granted patents 15 participants all from academia families and citations manual assessments standard evaluation mea- sures
  • 37. 2009-2010: evolution of the CLEF-IP track 2009 2010 1 task: prior art search prior art candidate search and classification task targeting granted patents 15 participants all from academia families and citations manual assessments standard evaluation mea- sures
  • 38. 2009-2010: evolution of the CLEF-IP track 2009 2010 1 task: prior art search prior art candidate search and classification task targeting granted patents patent applications 15 participants all from academia families and citations manual assessments standard evaluation mea- sures
  • 39. 2009-2010: evolution of the CLEF-IP track 2009 2010 1 task: prior art search prior art candidate search and classification task targeting granted patents patent applications 15 participants 20 participants all from academia families and citations manual assessments standard evaluation mea- sures
  • 40. 2009-2010: evolution of the CLEF-IP track 2009 2010 1 task: prior art search prior art candidate search and classification task targeting granted patents patent applications 15 participants 20 participants all from academia 4 industrial participants families and citations manual assessments standard evaluation mea- sures
  • 41. 2009-2010: evolution of the CLEF-IP track 2009 2010 1 task: prior art search prior art candidate search and classification task targeting granted patents patent applications 15 participants 20 participants all from academia 4 industrial participants families and citations include forward citations manual assessments standard evaluation mea- sures
  • 42. 2009-2010: evolution of the CLEF-IP track 2009 2010 1 task: prior art search prior art candidate search and classification task targeting granted patents patent applications 15 participants 20 participants all from academia 4 industrial participants families and citations include forward citations manual assessments expanded lists of relevant docs standard evaluation mea- sures
  • 43. 2009-2010: evolution of the CLEF-IP track
      2009 | 2010
      1 task: prior art search | prior art candidate search and classification task
      targeting granted patents | patent applications
      15 participants | 20 participants
      all from academia | 4 industrial participants
      families and citations | include forward citations
      manual assessments | expanded lists of relevant docs
      standard evaluation measures | new measure: PRES, more recall-oriented
  • 48. What are relevance assessments? A test collection (also known as a gold standard) consists of a target dataset, a set of queries, and relevance assessments corresponding to each query. The CLEF-IP test collection consists of target data (2 million EP patents), queries (full-text patents, without images), and relevance assessments (extended citations).
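The three parts of a test collection can be sketched as a small data structure. This is only an illustration, not the CLEF-IP tooling: the class name `RetrievalCollection`, the function `precision_recall`, and all document ids are hypothetical, and precision/recall stand in for whichever standard measures an evaluation actually uses.

```python
from dataclasses import dataclass, field


@dataclass
class RetrievalCollection:
    """Toy test collection: target documents, queries, relevance assessments."""
    documents: set[str]                 # ids of the target dataset
    queries: dict[str, str]             # query id -> query text
    qrels: dict[str, set[str]] = field(default_factory=dict)  # query id -> relevant ids


def precision_recall(collection, query_id, ranked_results, cutoff):
    """Precision and recall of the top `cutoff` results for one query."""
    relevant = collection.qrels.get(query_id, set())
    retrieved = ranked_results[:cutoff]
    hits = sum(1 for doc_id in retrieved if doc_id in relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: four target documents, one topic, two relevant documents.
collection = RetrievalCollection(
    documents={"EP-1", "EP-2", "EP-3", "EP-4"},
    queries={"topic-1": "full text of the query patent ..."},
    qrels={"topic-1": {"EP-2", "EP-4"}},
)
run = ["EP-2", "EP-1", "EP-3", "EP-4"]  # a system's ranked answer for topic-1
top2 = precision_recall(collection, "topic-1", run, cutoff=2)
top4 = precision_recall(collection, "topic-1", run, cutoff=4)
```

Because every participant is scored against the same `qrels`, results from different systems become directly comparable.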
  • 53. Relevance assessments: We used patents cited as prior art as relevance assessments. Sources of citations: (1) applicant's disclosure: the USPTO requires applicants to disclose all known relevant publications; (2) patent office search report: each patent office searches for prior art to judge the novelty of a patent; (3) opposition procedures: patents cited to prove that a granted patent is not novel.
  • 54. Extended citations as relevance assessments direct citations and their families
  • 55. Extended citations as relevance assessments direct citations of family members ...
  • 56. Extended citations as relevance assessments ... and their families
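The construction of extended citations on the three slides above can be sketched as follows. This is a minimal illustration under stated assumptions, not the CLEF-IP implementation: the function name `extended_citations`, the dictionary layout, and the patent ids are hypothetical, and removing the topic's own family from the result is an assumption made here.

```python
def extended_citations(topic, cites, family_of):
    """Return the extended citation set of `topic`.

    cites:     dict mapping a patent id to the set of patents it cites
    family_of: dict mapping a patent id to its family members
               (assumed to include the patent itself)
    """
    def with_families(patents):
        expanded = set()
        for patent in patents:
            # a patent without a family entry counts as a family of one
            expanded |= family_of.get(patent, {patent})
        return expanded

    topic_family = family_of.get(topic, {topic})

    # direct citations of the topic, and their families
    result = with_families(cites.get(topic, set()))

    # direct citations of the topic's family members, and their families
    for member in topic_family:
        result |= with_families(cites.get(member, set()))

    # assumption: the topic's own family is not prior art for itself
    return result - topic_family


# Hypothetical data: EP-A and US-A belong to one family.
cites = {"EP-A": {"EP-X"}, "US-A": {"US-Y"}}
family_of = {
    "EP-A": {"EP-A", "US-A"},
    "US-A": {"EP-A", "US-A"},
    "EP-X": {"EP-X", "JP-X"},
    "JP-X": {"EP-X", "JP-X"},
}
relevant = extended_citations("EP-A", cites, family_of)
```

Here `EP-X` is a direct citation, `JP-X` enters through the family of `EP-X`, and `US-Y` enters because it is cited by the family member `US-A`.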
  • 59. Patent families: A patent family consists of patents granted by different patent authorities but related to the same invention. Simple family: all family members share the same priority number. Extended family: there are several definitions; in the INPADOC database, all documents that are directly or indirectly linked via a priority number belong to the same family.
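The difference between the two family definitions can be made concrete with a small sketch. The function names and the sample data are hypothetical. "Simple family" is interpreted here as "exactly the same set of priority numbers", which is an assumption; "extended family" is computed as the connected components of the "shares a priority" relation.

```python
from collections import defaultdict


def simple_families(priorities):
    """Group documents whose priority numbers are exactly the same."""
    groups = defaultdict(set)
    for doc, prios in priorities.items():
        groups[frozenset(prios)].add(doc)
    return list(groups.values())


def extended_families(priorities):
    """Group documents linked directly or indirectly via a shared priority."""
    parent = {doc: doc for doc in priorities}

    def find(doc):
        # union-find lookup with path halving
        while parent[doc] != doc:
            parent[doc] = parent[parent[doc]]
            doc = parent[doc]
        return doc

    first_doc_with = {}
    for doc, prios in priorities.items():
        for prio in prios:
            if prio in first_doc_with:
                # merge this document with the first one claiming the priority
                parent[find(doc)] = find(first_doc_with[prio])
            else:
                first_doc_with[prio] = doc

    groups = defaultdict(set)
    for doc in priorities:
        groups[find(doc)].add(doc)
    return list(groups.values())


# Hypothetical documents and the priority numbers they claim.
priorities = {
    "D1": {"P1"},
    "D2": {"P1"},
    "D3": {"P1", "P2"},
    "D4": {"P2"},
}
simple = sorted(sorted(f) for f in simple_families(priorities))
extended = sorted(sorted(f) for f in extended_families(priorities))
```

In this example `D1` and `D4` share no priority, yet both are linked through `D3`, so the extended definition places all four documents in one family while the simple definition yields three.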
  • 60. Patent families Patent documents are linked by priorities
  • 61. Patent families Patent documents are linked by priorities. INPADOC family.
  • 62. Patent families Patent documents are linked by priorities. CLEF-IP uses simple families.
  • 67. Relevance assessments 2010: Expanding the 2009 extended citations: (1) include citations of forward citations ... (2) ... and their families. This is apparently a well-known method among patent searchers. Zig-zag search?
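The 2010 expansion step can be sketched like this. It is a rough illustration only: the function names, the dictionary layout, and the patent ids are hypothetical, and a forward citation is taken to mean a patent that cites the topic.

```python
def forward_citations(topic, cites):
    """Patents that cite `topic`; `cites` maps a patent to the patents it cites."""
    return {patent for patent, cited in cites.items() if topic in cited}


def forward_expansion(topic, cites, family_of):
    """Citations of the topic's forward citations, plus their families."""
    candidates = set()
    for citing in forward_citations(topic, cites):
        for cited in cites.get(citing, set()):
            # a patent without a family entry counts as a family of one
            candidates |= family_of.get(cited, {cited})
    candidates.discard(topic)  # the topic itself is not a candidate
    return candidates


# Hypothetical data: EP-F and EP-G both cite the topic EP-T.
cites = {
    "EP-F": {"EP-T", "EP-K"},
    "EP-G": {"EP-T"},
    "EP-T": {"EP-X"},
}
family_of = {"EP-K": {"EP-K", "US-K"}}
extra_candidates = forward_expansion("EP-T", cites, family_of)
```

The search goes forward from the topic to `EP-F` and then backward to what `EP-F` cites, which is the zig-zag pattern the slide alludes to.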
  • 68. How good are the CLEF-IP relevance assessments? CLEF-IP uses families + citations:
  • 72. How good are the CLEF-IP relevance assessments? How complete are extended citations as relevance assessments? Will every prior art patent be included in this set? And if not, what percentage of prior art items are captured by extended citations? When considering forward citations, how good are extended citations as a prior art candidate set?
  • 73. Feedback from patent experts needed Quality of prior art candidate sets has to be assessed
  • 74. Feedback from patent experts needed Know-how of patent search experts is needed
  • 77. Feedback from patent experts needed: At CLEF-IP 2009, 7 patent search professionals assessed 12 search results. The task was not well defined and there were misunderstandings about the concept of relevance. The amount of data was not sufficient to draw conclusions.
  • 78. Feedback from patent experts needed
  • 81. Some initiatives associated with CLEF-IP: The results of evaluation tracks are mostly useful for the research community. This community often produces prototypes that are of little interest to the end user. Next I'd like to present two concrete outcomes, not of CLEF-IP directly but arising from work in patent retrieval evaluation.
  • 87. Soire: developed at Matrixware; service-oriented architecture, available as a Web service; allows replicating IR experiments based on the classical evaluation model; tested on the CLEF-IP data; customized for the evaluation of machine translation.
  • 92. Spinque: a spin-off (2010) from CWI (the Dutch National Research Center in Computer Science and Mathematics); introduces search-by-strategy; provides optimized strategies for patent search, tested on CLEF-IP data; transparency: understand your search results to improve strategy.
  • 98. CLEF-IP 2009 learnings: The Humboldt University implemented a model for patent search that produced the best results. The model combined several strategies: using metadata (IPC, ECLA); indexes built at lemma level; an additional phrase index for English; a crosslingual concept index (multilingual terminological database).
  • 100. Some additional investigations: Some citations were hard to find.
      % runs | class
      ≤ 5 | hard
      5 < x ≤ 10 | very difficult
      10 < x ≤ 50 | difficult
      50 < x ≤ 75 | medium
      75 < x ≤ 100 | easy
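The binning in the table above can be sketched as follows. The function names and the sample runs are hypothetical, and reading "% runs" as the percentage of submitted runs that retrieved a given citation is an assumption. The class labels and thresholds are copied from the slide as they stand.

```python
def percent_of_runs_finding(citation, runs):
    """Percentage of runs whose retrieved set contains `citation`."""
    if not runs:
        raise ValueError("at least one run is required")
    found = sum(1 for run in runs if citation in run)
    return 100.0 * found / len(runs)


def difficulty_class(percent_runs):
    """Map a percentage of runs to the class label used on the slide."""
    if not 0 <= percent_runs <= 100:
        raise ValueError("percentage must lie between 0 and 100")
    if percent_runs <= 5:
        return "hard"
    if percent_runs <= 10:
        return "very difficult"
    if percent_runs <= 50:
        return "difficult"
    if percent_runs <= 75:
        return "medium"
    return "easy"


# Hypothetical runs: each run is the set of documents it retrieved for a topic.
runs = [{"A", "B"}, {"A"}, {"A"}, {"A", "C"}]
labels = {
    citation: difficulty_class(percent_of_runs_finding(citation, runs))
    for citation in ["A", "B", "Z"]
}
```

With these four runs, `A` is found by all of them, `B` by one in four, and `Z` by none.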
  • 101. Some additional investigations We looked at the content of citations and citing patents.
  • 102. Some additional investigations Ongoing investigations.
  • 103. Thank you for your attention.