Next-generation text mining applied to toxicogenomics data analysis

Kristina Hettne
PhD thesis defense
20 December 2012
Toxicogenomics: the study of whether a chemical causes damage to genes

Text mining: teaching a computer to “read” articles and extract explicit information

Next-generation text mining: teaching a computer to find implicit information in articles
Drug safety is essential!
But… how to minimize animal testing?
Image source: The Independent, July 12, 2012
Toxicogenomics data -> interpretation using knowledge from manually curated databases
Manually curated databases: not sufficient in coverage

We hypothesize that next-generation text mining can increase the information coverage
Image sources: Verhallen and Piersma, 2011; de Jong et al., 2011; http://www.flickr.com/photos/jseita/3764113525/
Next-generation text mining = concept profile matching
[Figure: the information cloud for a gene concept and the information cloud for a chemical concept, linked by their shared concepts]
Image source: Herman van Haagen
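
The matching idea on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the method from the thesis: it assumes each information cloud is stored as a mapping from concept name to association weight, and it assumes an inner product over the shared concepts as the similarity measure. The profiles, weights, and function names are invented for the example.

```python
from typing import Dict, Set

# A concept profile ("information cloud"): concept name -> association weight.
# Representation and weights are assumptions made for this sketch.
Profile = Dict[str, float]


def shared_concepts(a: Profile, b: Profile) -> Set[str]:
    """Return the concepts that occur in both profiles."""
    return set(a) & set(b)


def match_score(a: Profile, b: Profile) -> float:
    """Inner product over shared concepts (assumed similarity measure)."""
    return sum(a[c] * b[c] for c in shared_concepts(a, b))


# Hypothetical profiles for one gene concept and one chemical concept.
gene_profile: Profile = {"oxidative stress": 0.75, "liver": 0.5, "apoptosis": 0.25}
chemical_profile: Profile = {"liver": 0.5, "oxidative stress": 0.5, "kidney": 0.25}

print(sorted(shared_concepts(gene_profile, chemical_profile)))
# ['liver', 'oxidative stress']
print(match_score(gene_profile, chemical_profile))
# 0.625
```

A gene and a chemical that never co-occur in an article can still get a non-zero score when their clouds share concepts, which is how implicit information is picked up.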

Concepts come from a thesaurus and are identified in text with concept identification software

A good thesaurus = the basis for good concept identification
Image source: Herman van Haagen
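
As a rough illustration of thesaurus-based concept identification, the sketch below looks up every thesaurus term in a sentence and reports the matching concept identifiers. It is a simplified stand-in under stated assumptions, not the concept identification software used in the thesis: the thesaurus entries, the identifiers, and the example sentence are all hypothetical, and matching is plain case-insensitive whole-word lookup.

```python
import re
from typing import Dict, List, Set

# Hypothetical mini-thesaurus: concept ID -> synonymous terms.
# IDs and terms are invented for this sketch.
THESAURUS: Dict[str, List[str]] = {
    "CHEM:0001": ["ethanol"],
    "CHEM:0002": ["triazole", "triazoles"],
    "GENE:0001": ["gene1", "gene 1"],
}


def identify_concepts(text: str, thesaurus: Dict[str, List[str]]) -> Set[str]:
    """Return the IDs of all concepts with at least one term found in text."""
    found: Set[str] = set()
    for concept_id, terms in thesaurus.items():
        for term in terms:
            # Whole-word, case-insensitive match of the escaped term.
            pattern = r"\b" + re.escape(term) + r"\b"
            if re.search(pattern, text, flags=re.IGNORECASE):
                found.add(concept_id)
                break  # one matching term is enough for this concept
    return found


# Made-up example sentence.
sentence = "Exposure to triazoles changed GENE1 expression."
print(sorted(identify_concepts(sentence, THESAURUS)))
# ['CHEM:0002', 'GENE:0001']
```

The sketch also shows why thesaurus quality matters: without the plural "triazoles" among the terms, the whole-word lookup would miss the chemical concept in this sentence.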
Research objectives:
• Investigate information coverage in public biomedical and chemical thesauri and databases
• Provide methods to improve the quality and coverage
• Give recommendations for use
• Investigate added value of next-generation text mining when interpreting toxicogenomics data
Results
A thesaurus of chemical concepts (1) and methods (1, 2, 3) to prepare a thesaurus to be used with concept identification software

http://www.biosemantics.org/casper
http://www.biosemantics.org/jochem

1. Hettne et al. Bioinformatics, 2009
2. Hettne et al. Journal of Biomedical Semantics, 2010
3. Hettne et al. Journal of Cheminformatics, 2010
A next-generation text mining-based method for interpreting biological data
[Figure: Biological data; Statistical test; Next-generation text mining]

This method gives more, and more specific, results (1) than other available tools
http://www.biosemantics.org/weightedglobaltest

1. Jelier R, Goeman JJ, Hettne KM, Schuemie MJ, den Dunnen JT, 't Hoen PA. Briefings in Bioinformatics, 2011
Application to toxicogenomics
Hettne et al. (submitted)
http://www.biosemantics.org/index.php?page=chemicalresponse-specific-gene-sets
See developmental defects in stem cells instead of in animal embryos
[Figure: 1. Embryonic structure. 2. Posterior neuropore open: A) control group rat embryo, B) triazole-exposed rat embryo]
Image sources: 1. Verhallen and Piersma, 2011; 2. De Jong et al., 2012
Toxicity class prediction (case study: triazoles)
The text-mined chemical-gene matrix is 25 times larger than the one obtained from manual work (Comparative Toxicogenomics Database)
[Figure 1, label: Chemical]
Image source 1: Verhallen and Piersma, 2011
Conclusions
Next-generation text mining combined with statistical tests complements, and is sometimes superior to, manually curated databases in:
- Relating chemical information to gene expression data
- Identifying toxic effects already at the gene expression stage
- Discriminating between different classes of chemicals
Future
1. Make the method easier to use (currently being worked on)
2. Apply the method to new drugs with unknown toxicity

Early prediction of toxicity -> less animal testing and safer drugs
Thank you to all who made this possible!
