A Literature-based framework for semantic descriptions of e-Science resources
Who am I
e-Science Perspective
e-Science Resources
Semantic Web
Semantic Web
Bioinformatics e-Resources
Bioinformatics e-Resources
Semantic Descriptions of Bioinformatics e-Resources
BioCatalogue: beta version at http://beta.biocatalogue.org/, launch in June 2009 at ISMB
Semantic Descriptions in Bioinformatics Domain
Our approach – Mine the literature
Literature: still the largest and most popular source of knowledge.
Hypothesis: the semantic profiles of entities and events can be extracted from the domain literature.
Example: Semantically Annotated Web Service
Annotations combine textual descriptions and ontological mappings.
Detailed approach
The rest of the talk
1st Module: Building Controlled Vocabulary from Literature
 
Terminology Building
Controlled Vocabulary Building – a challenging task
Controlled Vocabulary Building – Solution
Building controlled vocabulary from literature
Term classification driven approach: learn bioinformatics terms from literature
1) get a corpus
2) get all terms
3) get seed examples
4) find relevant ones using term profiling and comparison to seed examples
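The four steps above can be sketched in a few lines. This is only an illustration under stated assumptions: the slide does not say how profiles are compared, so the Jaccard overlap, the threshold of 0.3, the function names and the toy profiles are all hypothetical stand-ins.

```python
def jaccard(a, b):
    """Jaccard overlap between two feature sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_relevant_terms(profiles, seeds, threshold=0.3):
    """Keep candidate terms whose profile resembles at least one seed profile.

    profiles: dict mapping each candidate term to a set of profile features.
    seeds: terms already known to be relevant (must be keys of profiles).
    threshold: assumed cut-off; not a value taken from the talk.
    """
    relevant = []
    for term, profile in profiles.items():
        if term in seeds:
            continue  # seeds are known examples, not candidates
        best = max((jaccard(profile, profiles[s]) for s in seeds), default=0.0)
        if best >= threshold:
            relevant.append(term)
    return sorted(relevant)


# Toy, hand-made profiles (not data from the talk).
profiles = {
    "alignment tool": {"align", "sequence", "produce"},
    "blast program": {"align", "sequence", "search"},
    "coffee break": {"drink", "morning"},
}
print(select_relevant_terms(profiles, seeds={"alignment tool"}))
```

In this toy run only "blast program" shares enough features with the seed to be kept.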
Bioinformatics terminology
Bioinformatics terminology
Lexical Profile
Term (t): protein
Lexical profile LP(t): (1) protein
Term (t): protein sequence
Lexical profile LP(t): (1) protein (2) sequence (3) protein sequence
Term (t): protein sequence alignment
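A lexical profile of this kind can be computed as follows. The sketch assumes that LP(t) consists of every contiguous word sequence inside the term, which matches the "protein sequence" example on the slide; the function name and the lower-casing are illustrative choices, not the talk's implementation.

```python
def lexical_profile(term):
    """Return all contiguous word sequences of a term, shortest first."""
    words = term.lower().split()
    profile = []
    for n in range(1, len(words) + 1):          # length of the word sequence
        for i in range(len(words) - n + 1):     # start position
            profile.append(" ".join(words[i:i + n]))
    return profile


print(lexical_profile("protein"))
print(lexical_profile("protein sequence"))
```

Two terms can then be compared by the overlap of their profiles, so that "protein sequence" and "protein sequence alignment" count as lexically similar.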
Contextual Profile
Sentence: "Genscan program node can produce a list of nucleotide FASTAs of predicted transcripts"
Verb profile: produce
Noun profile: genscan, program, list, transcript
Left pattern (LP), class-level (LP1): <Term>, produce, <NP>, of
Right pattern (RP), class-level (RP1): of, <NP>
Profile Comparisons
Bioinformatics terminology
Statistics about the textual corpus (full-text articles)
Number of documents: 2,691
Number of distinct candidate terms: 113,280
Number of candidate term occurrences: 533,418
Number of distinct sentences: 294,614
Number of distinct context noun stems: ~79,000
Number of distinct context verb stems: ~2,500
The Bioinformatics Controlled Vocabulary (number of terms)
ATR (C-Value), total number of candidate terms: 113,280
Terms with lexical similarity to resource terms: 95,437
Terms with context noun similarity to resource terms: 103,104
Terms with context verb similarity to resource terms: 73,478
Terms with context pattern similarity to resource terms: 21,182
Terms with combined contextual similarity (Nouns ∪ Verbs ∪ Patterns): 98,307
2nd Module: Mining Semantic Descriptions from Literature
 
Mining service descriptions
 
Semantic classes – myGrid Ontology
Semantic classes – myGrid Ontology
Semantic classes identification
Semantic class: typical terminological heads
Application: application, tool, service, software, system, program
Algorithm: algorithm, method, approach, procedure, analysis, alignment
Data: data, record, report, sequence, structure
Data Resource: resource, database, dataset, repository
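The head-based assignment can be illustrated with a small lookup. The head lists come from the slide; treating the last word of a term as its head and stripping a trailing "s" for plurals are simplifying assumptions made here, and the function name is hypothetical.

```python
# Terminological heads per semantic class, as listed on the slide.
HEADS = {
    "Application": ["application", "tool", "service", "software", "system", "program"],
    "Algorithm": ["algorithm", "method", "approach", "procedure", "analysis", "alignment"],
    "Data": ["data", "record", "report", "sequence", "structure"],
    "Data Resource": ["resource", "database", "dataset", "repository"],
}
HEAD_TO_CLASS = {head: cls for cls, heads in HEADS.items() for head in heads}


def semantic_class(term):
    """Guess the semantic class of a term from its head (assumed: last word)."""
    words = term.lower().split()
    if not words:
        return None
    head = words[-1]
    # Naive plural handling: only strip "s" when the word itself is not a head.
    if head not in HEAD_TO_CLASS and head.endswith("s"):
        head = head[:-1]
    return HEAD_TO_CLASS.get(head)


for term in ["ClustalW program", "protein sequences", "COG database", "phylogenetic tree"]:
    print(term, "->", semantic_class(term))
```

A term such as "Matrix Global Alignment Tool MatGAT" ends in a name rather than a head, so a last-word rule alone would miss it; a fuller approach would need to look inside the term.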
Resource mentions
Semantic classes and instances
Semantic classes and instances
 
Extraction/functional rules
"Matrix Global Alignment Tool MatGAT generates similarity/identity matrices for DNA or protein sequences"
"Term_App generates similarity/identity matrices for DNA or protein sequences"
Extraction/functional rules
Extraction/functional rules
Function: associated verbs
Generic functionality / task specification: applied, access, achieve, align, allow, based, developed, implemented, present, provide, used, is a, called
Inputs, outputs: accept, applied, create, provide, query, retrieve, starts with, take, used, generate
Comparison: outperform, perform, compare
Implementation technique, programming language: implement(ed)
Composition, subtasks: contain(ed), construct(ed), generate(d)
Availability: available
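The verb table can be read as a lookup from a verb to the kinds of functional information it may signal. The sketch below encodes the slide's table as a dictionary; expanding "implement(ed)" and similar entries into two forms, and the function name, are assumptions made for illustration.

```python
# Function -> associated verbs, transcribed from the slide.
FUNCTION_VERBS = {
    "Generic functionality / task specification": [
        "applied", "access", "achieve", "align", "allow", "based", "developed",
        "implemented", "present", "provide", "used", "is a", "called",
    ],
    "Inputs, outputs": [
        "accept", "applied", "create", "provide", "query", "retrieve",
        "starts with", "take", "used", "generate",
    ],
    "Comparison": ["outperform", "perform", "compare"],
    "Implementation technique, programming language": ["implement", "implemented"],
    "Composition, subtasks": [
        "contain", "contained", "construct", "constructed", "generate", "generated",
    ],
    "Availability": ["available"],
}


def functions_for(verb):
    """Return every function whose verb list contains the given verb."""
    verb = verb.lower()
    return [name for name, verbs in FUNCTION_VERBS.items() if verb in verbs]


print(functions_for("generate"))
print(functions_for("available"))
```

Several verbs appear under more than one function, so the lookup returns a list and leaves disambiguation to later rules.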
Information Extraction
Input sentence: "Matrix Global Alignment Tool MatGAT generates similarity/identity matrices for DNA or protein sequences"
SC instance (resource): Matrix Global Alignment Tool MatGAT
SC: Application
Task: Generate
Predicted input: DNA or protein sequences
Predicted output: similarity/identity matrices
Descriptors: similarity/identity matrices, DNA or protein sequences
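A single extraction rule for this sentence shape could look like the sketch below. It is a minimal illustration, assuming one hand-written regular expression of the form "<resource> generates <output> for <input>"; the pattern, the field names and the function name are hypothetical, and the semantic class is passed in rather than detected.

```python
import re

# One illustrative rule: "<resource> generates <output> for <input>".
RULE = re.compile(
    r"^(?P<resource>.+?) generates (?P<output>.+?) for (?P<input>.+?)\.?$"
)


def extract(sentence, semantic_class):
    """Fill a description frame from a sentence, or return None if no match."""
    match = RULE.match(sentence.strip())
    if match is None:
        return None
    return {
        "instance": match.group("resource"),
        "semantic_class": semantic_class,
        "task": "generate",
        "predicted_input": match.group("input"),
        "predicted_output": match.group("output"),
        "descriptors": [match.group("output"), match.group("input")],
    }


sentence = (
    "Matrix Global Alignment Tool MatGAT generates similarity/identity "
    "matrices for DNA or protein sequences"
)
frame = extract(sentence, "Application")
for key, value in frame.items():
    print(key, ":", value)
```

The resulting frame mirrors the fields on the slide: resource instance, semantic class, task, predicted input, predicted output and descriptors.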
 
Experiments
Semantic class: total number of instances
Algorithm: 5,722
Application: 2,076
Data: 2,662
Data Resource: 1,992
Total: 12,452
Example – GeneClass
Descriptors (frequency of co-occurrence):
motif data: 4
differential gene expression: 3
reliable predictive model: 2
genome-wide protein-DNA binding data: 2
transcriptional gene regulation: 2
gene expression data: 1
2) MyGrid terms: BIND
3) Related resources: Robust GeneClass Algorithm
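Descriptor frequencies of this kind can be obtained by counting, for one resource, in how many of its sentences each descriptor appears. The sketch below assumes the descriptors have already been extracted per sentence; the input data is a toy example and the function name is hypothetical.

```python
from collections import Counter


def descriptor_frequencies(sentence_descriptors):
    """Count in how many sentences each descriptor co-occurs with the resource.

    sentence_descriptors: one list of descriptors per sentence that mentions
    the resource. A descriptor is counted once per sentence.
    """
    counts = Counter()
    for descriptors in sentence_descriptors:
        counts.update(set(descriptors))
    # Most frequent first; ties broken alphabetically for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Toy input (not the GeneClass data from the talk).
toy = [
    ["motif data", "gene expression data"],
    ["motif data"],
    ["motif data", "motif data"],
]
print(descriptor_frequencies(toy))
```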
Example – GeneClass: functional content
Predicate (task): predict
Subject: GeneClass Algorithm
Functional description: predicting differential gene expression
Input/Output: starts with a candidate set of motifs μ
Example – GeneClass
Evaluation of semantic profiles
Sentences were evaluated for their capability to be used for the semantic description of a given bioinformatics resource: (0) irrelevant, (1) partially useful, (2) useful.
HeatMapper: "The HeatMapper tool has already proven to be very useful in several studies"
Kalign: "To compare Kalign to other MSA programs, the following test sets were used."
Cognitor: "To add a new species to the COG system, the annotated protein sequences from the respective genome were compared to the proteins in the COG database by using the BLAST program and assigned to pre-existing COGs by using the COGNITOR program"
Evaluation of semantic profiles
Quality comparison of various components of resource description profiles from the two experiments
3rd Module: Mining Semantic Networks from Literature
 
What next?
What next? (Proposed in BioHackathon2010)
"Phylogenetic trees are then generated by the ClustalW program by the neighbour-joining method" [PMC1973088].
"We also used the CLUSTALW program for multialignment as a control process" [PMC434493].
[Diagram: Resource1, Resource2 and Resource3 connected to an RDF store; node types # Data and # Task; relations: Phylogenetic Tree, generated by, ClustalW Program; ClustalW Program, is used for, Multialignment]
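The two relations on this slide can be written down as subject, predicate, object triples. The sketch below uses plain Python and prints N-Triples-style lines; the `http://example.org/` namespace, the underscore naming and the helper names are placeholders, not identifiers from the talk or from any real vocabulary.

```python
BASE = "http://example.org/"  # hypothetical namespace, for illustration only


def uri(label):
    """Turn a human-readable label into a placeholder URI reference."""
    return "<" + BASE + label.replace(" ", "_") + ">"


# Relations read off the slide's diagram.
triples = [
    ("Phylogenetic Tree", "generated by", "ClustalW Program"),
    ("ClustalW Program", "is used for", "Multialignment"),
]


def to_ntriples(triples):
    """Serialise (subject, predicate, object) labels as N-Triples-style lines."""
    return [f"{uri(s)} {uri(p)} {uri(o)} ." for s, p, o in triples]


for line in to_ntriples(triples):
    print(line)
```

Because both statements share the ClustalW node, loading them into one store links evidence from the two source articles into a single small network.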
Conclusion
Related Selected Publications
Thanks

Editor's notes

  1. A brief introduction of my recent affiliations
  2. This slide can be replaced with James’
  3. Mention that this example is taken from myGrid project.
  4. The volume of knowledge being generated in different research domains is increasing, with new concepts and terms being added continuously. Automated methods are therefore required to distil information, extract facts, discover implicit links and generate hypotheses relevant to users' needs. Automatic acquisition of knowledge from unstructured text typically starts with the identification of terminology relevant to a specific domain, topic or task. Terms provide a means of communication, and it is the terms and their relationships that convey knowledge across scientific articles in particular (Krauthammer and Nenadic 2004). Terms are usually structurally organised, not only to help information retrieval and extraction, but also to facilitate the smooth expansion of terminology, where newly discovered terms and concepts are integrated into an existing taxonomy.