Improving Text Mining with Controlled Natural Language:
A Case Study for Protein Interactions
Tobias Kuhn (speaker)
Loïc Royer
Norbert E. Fuchs
Michael Schroeder
DILS'06, Hinxton (UK)
21 July 2006
Tobias Kuhn, DILS'06, Hinxton (UK), 21 July 2006 2
Cooperation of
University of Zurich (Norbert E. Fuchs, Tobias Kuhn)
and
TU Dresden (Loïc Royer, Michael Schroeder)
Introduction
• Biomedical literature is growing at a tremendous pace
• PubMed contains 16 million articles and grows by over 600,000 articles per year
• Computational support is needed!
Today's Solution
NLP, manual annotation
Our Approach
• Let the researchers express their own results in a formal language
• Perfect processing of scientific results by computers
• This formal language has to be ...
    • easy to learn and understand
    • expressive enough to express even complicated scientific results
Knowledge Representation Languages
• OWL with RDF/XML
• Description Logics
• first-order logic
• UML
• ACE
Attempto Controlled English (ACE)
• Formal language that looks like natural English
• Unambiguously translatable into first-order logic
• Restricted grammar
• Unlimited vocabulary
• www.ifi.unizh.ch/attempto
Formal Summaries
ACE text:
BubR1 interacts-with a trunk-domain of Beta2-Adaptin.

Logical representation (DRS):
[A, B, C, D]
named(A, BubR1)-1
object(A, atomic, named_entity, object, cardinality, count_unit, eq, 1)-1
named(B, Beta2-Adaptin)-1
object(B, atomic, named_entity, object, cardinality, count_unit, eq, 1)-1
object(C, atomic, trunk-domain, unspecified, cardinality, count_unit, eq, 1)-1
relation(C, trunk-domain, of, B)-1
predicate(D, unspecified, interact_with, A, C)-1
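
To show how such a representation can be processed, the DRS conditions for the BubR1 sentence can be treated as plain data. This is a minimal sketch, not the Attempto tooling: the tuple encoding and the helper names `describe` and `interactions` are assumptions made for illustration, and the `object` conditions are reduced to referent and noun. Only the condition names and arguments come from the slide.

```python
# Illustrative only: DRS conditions from the slide, hand-copied as tuples.
# The tuple layout and helper names are assumptions, not Attempto's API.
DRS = [
    ("named", "A", "BubR1"),
    ("named", "B", "Beta2-Adaptin"),
    ("object", "C", "trunk-domain"),
    ("relation", "C", "trunk-domain", "of", "B"),
    ("predicate", "D", "interact_with", "A", "C"),
]


def describe(ref, drs):
    """Turn a discourse referent (A, B, C, ...) into a readable phrase."""
    for cond in drs:
        if cond[0] == "named" and cond[1] == ref:
            return cond[2]
    for cond in drs:
        if cond[0] == "relation" and cond[1] == ref:
            # e.g. ("relation", "C", "trunk-domain", "of", "B")
            return f"{cond[2]} {cond[3]} {describe(cond[4], drs)}"
    for cond in drs:
        if cond[0] == "object" and cond[1] == ref:
            return cond[2]
    return ref


def interactions(drs):
    """List (subject, object) phrases for every interact_with predicate."""
    return [
        (describe(c[3], drs), describe(c[4], drs))
        for c in drs
        if c[0] == "predicate" and c[2] == "interact_with"
    ]


print(interactions(DRS))
```

Because every referent is resolved through explicit conditions, no linguistic guessing is involved; this is the kind of processing the formal summary is meant to enable.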
Ontology for Protein Interactions
Empirical Study
• "How suitable is ACE, together with our ontology, for expressing scientific results about protein interactions?"
• Manual translation of 273 facts about protein interactions
• These facts are subheadings of the "Results" sections of 89 articles (Elsevier journals)
Empirical Study
[Pie chart "Total" (273 facts): 154 matched perfectly, 57 matched partially, 62 unmatched]
[Pie chart "Non-perfect" (119 facts): not covered by the model, relations of relations, fuzzy, not understood; slice values 21, 56, 11, 31]
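
The counts on this slide can be turned into shares with a few lines of arithmetic. The sketch below assumes that the three "Total" counts pair with the three labels in the order in which they appear; the four "Non-perfect" values are only checked for consistency, since their pairing with the labels is not assumed here.

```python
# Counts from the "Total" chart. Pairing labels and counts in slide order
# is an assumption of this sketch.
counts = {
    "matched perfectly": 154,
    "matched partially": 57,
    "unmatched": 62,
}

# Values from the "Non-perfect" chart, without label assignment.
non_perfect_values = [21, 56, 11, 31]


def shares(counts):
    """Return each category's share of the total in percent (one decimal)."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


total = sum(counts.values())  # should equal the 273 translated facts
non_perfect = counts["matched partially"] + counts["unmatched"]

for label, pct in shares(counts).items():
    print(f"{label}: {pct}%")
print("non-perfect facts:", non_perfect, "== sum of slices:", sum(non_perfect_values))
```

Under that pairing, slightly more than half of the subheadings could be expressed in ACE without loss.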
Authoring tool
• Helps with writing ACE sentences
• Shows the possible continuations of the sentence step by step
• New words can be created on the fly
• Awareness of the underlying ontology
• The users do not need to know the details of the ACE syntax or of the underlying ontology
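
The idea of showing possible continuations can be sketched with a small state machine. Everything in this example is hypothetical: the word categories, the transitions, and the function names are invented for illustration and cover only sentences shaped like "BubR1 interacts-with a trunk-domain of Beta2-Adaptin ."; the real ACE grammar is far larger.

```python
# Toy lexicon and grammar; all states and word lists are hypothetical.
LEXICON = {
    "protein": {"BubR1", "Beta2-Adaptin", "Act1", "TRAF6"},
    "verb": {"interacts-with"},
    "det": {"a"},
    "noun": {"trunk-domain"},
    "of": {"of"},
    "end": {"."},
}

# state -> list of (word category, next state)
TRANSITIONS = {
    "start": [("protein", "subject")],
    "subject": [("verb", "verb")],
    "verb": [("protein", "object"), ("det", "det")],
    "det": [("noun", "noun")],
    "noun": [("of", "of")],
    "of": [("protein", "object")],
    "object": [("end", "done")],
    "done": [],
}


def continuations(words):
    """Return the word categories that may follow the partial sentence."""
    state = "start"
    for word in words:
        for category, nxt in TRANSITIONS[state]:
            if word in LEXICON[category]:
                state = nxt
                break
        else:
            raise ValueError(f"{word!r} is not allowed in state {state!r}")
    return sorted(category for category, _ in TRANSITIONS[state])


def add_word(category, word):
    """Create a new word on the fly by adding it to the lexicon."""
    LEXICON[category].add(word)


print(continuations(["BubR1", "interacts-with"]))
print(continuations(["BubR1", "interacts-with", "a", "trunk-domain"]))
```

Since the editor only ever offers words that the grammar accepts next, the user cannot produce a sentence outside the language and does not have to learn the grammar rules first.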
Authoring tool: Prototype demo
http://gopubmed.biotec.tu-dresden.de/AceWiki/
Benefits of our Approach
• Consistency / redundancy checks
    • "Is there a paper that contradicts my results?"
    • "Is there a paper that comes to the same or similar results?"
• Answer extraction
    • "Which proteins interact with a certain domain of protein X?"
• Automatically updated knowledge bases
    • "Give me an overview of the relations of a protein X to other proteins!"
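
Two of these benefits, answer extraction and redundancy checks, can be illustrated over a small fact base. This is a sketch under stated assumptions: the triple encoding, the paper identifiers `paper-1` to `paper-3`, and both function names are hypothetical, and the facts reuse example sentences from the slides rather than real extracted data. Contradiction checks are left out because they need negation and logical inference.

```python
# Hypothetical fact base: (paper id, (subject, relation, object)).
# A domain of a protein is encoded as the tuple (domain, protein).
FACTS = [
    ("paper-1", ("BubR1", "interacts-with", ("trunk-domain", "Beta2-Adaptin"))),
    ("paper-2", ("Act1", "interacts-with", "TRAF6")),
    ("paper-3", ("BubR1", "interacts-with", ("trunk-domain", "Beta2-Adaptin"))),
]


def interactors_with_domain(domain, protein, facts):
    """Answer extraction: which proteins interact with `domain` of `protein`?"""
    return sorted({
        subj
        for _, (subj, rel, obj) in facts
        if rel == "interacts-with" and obj == (domain, protein)
    })


def same_result(facts):
    """Redundancy check: map each fact stated more than once to its papers."""
    by_fact = {}
    for paper, fact in facts:
        by_fact.setdefault(fact, []).append(paper)
    return {fact: papers for fact, papers in by_fact.items() if len(papers) > 1}


print(interactors_with_domain("trunk-domain", "Beta2-Adaptin", FACTS))
print(same_result(FACTS))
```

Both queries reduce to exact matching on structured facts, which is only possible because the summaries are formal rather than free text.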
Conclusions
• Formal summaries for scientific articles can make text mining easier and more powerful
• ACE combines the power of ontologies with the convenience of natural language
• Let the researchers formalize their own results!
Thank you for your attention!
Questions & Discussion
Subheadings: Example
Degree of Matching: Examples
• Matched perfectly:
    • Interaction of Act1 with TRAF6
      → Act1 interacts-with TRAF6.
• Matched partially:
    • The mtFabD protein is part of the core of the FAS-II complex
      → MtFabD is a subunit of FAS-II.
• Unmatched:
    • Cav1 interacts differentially with distinct Dyn2 forms
Reasons for Non-perfect Matching: Examples
• Not covered by the model:
    • Daxx Potentiates Fas-Mediated Apoptosis
• Relations of relations:
    • Kal-GEF1 activation of Pak does not require GEF activity
• Fuzzy:
    • ANKRD1 contains potential CASQ2 binding sequences located in both its NT- and CT-regions
• Not understood:
    • hSrb7 does not interact with other nuclear receptors

Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 

Improving Text Mining with Controlled Natural Language: A Case Study for Protein Interactions

  • 1. Improving Text Mining with Controlled Natural Language: A Case Study for Protein Interactions. Tobias Kuhn (speaker), Loïc Royer, Norbert E. Fuchs, Michael Schroeder. DILS'06, Hinxton (UK), 21 July 2006
  • 2. Cooperation of University of Zurich (Norbert E. Fuchs, Tobias Kuhn) and TU Dresden (Loïc Royer, Michael Schroeder)
  • 3. Introduction
      • Biomedical literature is growing at a tremendous pace
      • PubMed contains 16 million articles and grows by over 600,000 articles per year
      • Computational support is needed!
  • 4. Today's Solution: NLP, manual annotation
  • 5. Our Approach
      • Let the researchers express their own results in a formal language
      • Perfect processing of scientific results by computers
      • This formal language has to be ...
          • easy to learn and understand
          • expressive enough to express even complicated scientific results
  • 6. Knowledge Representation Languages (figure: OWL with RDF/XML, Description Logics, first-order logic, ACE, UML)
  • 7. Attempto Controlled English (ACE)
      • Formal language that looks like natural English
      • Unambiguously translatable into first-order logic
      • Restricted grammar
      • Unlimited vocabulary
      • www.ifi.unizh.ch/attempto
  • 8. Formal Summaries
  • 9. Formal Summaries
      ACE text:
          BubR1 interacts-with a trunk-domain of Beta2-Adaptin.
      Logical representation (DRS):
          [A, B, C, D]
          named(A, BubR1)-1
          object(A, atomic, named_entity, object, cardinality, count_unit, eq, 1)-1
          named(B, Beta2-Adaptin)-1
          object(B, atomic, named_entity, object, cardinality, count_unit, eq, 1)-1
          object(C, atomic, trunk-domain, unspecified, cardinality, count_unit, eq, 1)-1
          relation(C, trunk-domain, of, B)-1
          predicate(D, unspecified, interact_with, A, C)-1
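
To make the structure of slide 9's DRS concrete, here is a minimal Python sketch that stores the conditions as plain tuples and reads the interaction back out of them. The `Condition` type and the helpers `describe` and `interactions` are hypothetical names introduced only for illustration; they are not part of the Attempto tools, and the lookup logic is a simplification that handles just the condition shapes shown on the slide.

```python
from typing import NamedTuple


class Condition(NamedTuple):
    """One DRS condition: a functor name plus its arguments (hypothetical type)."""
    functor: str
    args: tuple


# Discourse referents and conditions copied from the slide (sentence index "-1" omitted).
referents = ["A", "B", "C", "D"]
conditions = [
    Condition("named", ("A", "BubR1")),
    Condition("object", ("A", "atomic", "named_entity", "object",
                         "cardinality", "count_unit", "eq", 1)),
    Condition("named", ("B", "Beta2-Adaptin")),
    Condition("object", ("B", "atomic", "named_entity", "object",
                         "cardinality", "count_unit", "eq", 1)),
    Condition("object", ("C", "atomic", "trunk-domain", "unspecified",
                         "cardinality", "count_unit", "eq", 1)),
    Condition("relation", ("C", "trunk-domain", "of", "B")),
    Condition("predicate", ("D", "unspecified", "interact_with", "A", "C")),
]


def describe(ref, conds):
    """Return a readable label for a discourse referent."""
    # Named entities are labeled by their name.
    for c in conds:
        if c.functor == "named" and c.args[0] == ref:
            return c.args[1]
    # Otherwise use the noun of the object condition, plus an "of" relation if present.
    noun = next(c.args[2] for c in conds
                if c.functor == "object" and c.args[0] == ref)
    for c in conds:
        if c.functor == "relation" and c.args[0] == ref and c.args[2] == "of":
            return f"{noun} of {describe(c.args[3], conds)}"
    return noun


def interactions(conds):
    """List (subject, verb, object) triples for every predicate condition."""
    return [(describe(c.args[3], conds), c.args[2], describe(c.args[4], conds))
            for c in conds if c.functor == "predicate"]


print(interactions(conditions))
```

The point of the sketch is that every word of the ACE sentence ends up as an explicit, machine-readable condition, so a program can recover who interacts with what without any natural language processing.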
  • 10. Ontology for Protein Interactions
  • 11. Empirical Study
      • “How suitable is ACE together with our ontology to express scientific results of protein interactions?”
      • Manual translation of 273 facts about protein interactions
      • These facts are subheadings of the “Results” sections of 89 articles (journals by Elsevier)
  • 12. Empirical Study (chart)
      • Total: 154 matched perfectly, 57 matched partially, 62 unmatched
      • Non-perfect: categories are not covered by the model, relations of relations, fuzzy, not understood (chart values as extracted, pairing with the categories not recoverable: 21, 56, 11, 31)
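
The figures on slide 12 can be cross-checked with a little arithmetic. The sketch below assumes the chart labels appear in the same order as the first three numbers (154 perfectly matched, 57 partially matched, 62 unmatched); that ordering is an assumption based on the extracted slide text. The four reason counts are deliberately left unlabeled because the extraction does not show which count belongs to which reason.

```python
# Counts from the slide; the label order is an assumption (see lead-in).
counts = {
    "matched perfectly": 154,
    "matched partially": 57,
    "unmatched": 62,
}

total = sum(counts.values())  # should equal the 273 translated facts

# Share of each category in percent, rounded to one decimal place.
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}

# Non-perfect cases are the partially matched plus the unmatched facts.
non_perfect = counts["matched partially"] + counts["unmatched"]

# Reason counts as extracted from the chart, intentionally unlabeled.
reason_counts = [21, 56, 11, 31]

print(total, shares, non_perfect, sum(reason_counts))
```

Under that reading the three categories add up to the 273 facts of the study, and the four reason counts add up to the number of non-perfect cases, which suggests each non-perfect fact was assigned exactly one reason.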
  • 13. Authoring tool
      • Helps with writing ACE sentences
      • Shows step by step the possible continuations of the sentence
      • New words can be created on the fly
      • Awareness of the underlying ontology
      • The users do not need to know the details of the ACE syntax or of the underlying ontology
  • 14. Authoring tool: Prototype demo (http://gopubmed.biotec.tu-dresden.de/AceWiki/)
  • 15. Benefits of our Approach
      • Consistency / redundancy checks
          • “Is there a paper that contradicts my results?”
          • “Is there a paper that comes to the same or similar results?”
      • Answer extraction
          • “Which proteins interact with a certain domain of protein X?”
      • Automatically updated knowledge bases
          • “Give me an overview of the relations of a protein X to other proteins!”
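
The questions on slide 15 can be illustrated with a toy fact base. This is only a sketch under strong assumptions: the `Fact` record, its `negated` and `source` fields, the paper identifiers, and the helper functions are all hypothetical, and matching is done by exact comparison rather than by the first-order reasoning that formal summaries would allow. The two stored facts reuse example sentences from the slides; the new negated claim is invented purely to demonstrate the check.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    """A formalized claim from a paper (hypothetical record layout)."""
    subject: str
    relation: str
    obj: str
    negated: bool
    source: str  # hypothetical paper identifier


knowledge_base = [
    Fact("Act1", "interacts-with", "TRAF6", False, "paper-1"),
    Fact("BubR1", "interacts-with", "trunk-domain of Beta2-Adaptin", False, "paper-2"),
]


def same_claim(a, b):
    """Two facts talk about the same claim if subject, relation and object agree."""
    return (a.subject, a.relation, a.obj) == (b.subject, b.relation, b.obj)


def contradicting(new, kb):
    """Consistency check: stored facts that assert the opposite of `new`."""
    return [f for f in kb if same_claim(f, new) and f.negated != new.negated]


def redundant(new, kb):
    """Redundancy check: stored facts that state the same as `new`."""
    return [f for f in kb if same_claim(f, new) and f.negated == new.negated]


def interactors_of(target, kb):
    """Answer extraction: which subjects interact with `target`?"""
    return sorted({f.subject for f in kb
                   if f.relation == "interacts-with"
                   and not f.negated and f.obj == target})


# Invented example claim, used only to exercise the checks.
new_claim = Fact("Act1", "interacts-with", "TRAF6", True, "paper-3")
print(contradicting(new_claim, knowledge_base))
print(interactors_of("trunk-domain of Beta2-Adaptin", knowledge_base))
```

Exact matching misses "similar" results and symmetric relations; handling those is where the ontology and logical inference described in the talk would come in.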
  • 16. Conclusions
      • Formal summaries for scientific articles can make text mining easier and more powerful
      • ACE combines the power of ontologies with the convenience of natural language
      • Let the researchers formalize their own results!
  • 17. Thank you for your attention! Questions & Discussion
  • 18. Subheadings: Example
  • 19. Degree of Matching: Examples
      • Matched perfectly:
          • Interaction of Act1 with TRAF6
          • → Act1 interacts-with TRAF6.
      • Matched partially:
          • The mtFabD protein is part of the core of the FAS-II complex
          • → MtFabD is a subunit of FAS-II.
      • Unmatched:
          • Cav1 interacts differentially with distinct Dyn2 forms
  • 20. Reasons for Non-perfect Matching: Examples
      • Not covered by the model:
          • Daxx Potentiates Fas-Mediated Apoptosis
      • Relations of relations:
          • Kal-GEF1 activation of Pak does not require GEF activity
      • Fuzzy:
          • ANKRD1 contains potential CASQ2 binding sequences located in both its NT- and CT-regions
      • Not understood:
          • hSrb7 does not interact with other nuclear receptors