Biomedical Annotation

Kevin Livingston, Ph.D.
Postdoctoral Fellow
Pharmacology Department, School of Medicine
University of Colorado Anschutz Medical Campus




                                        Kevin.Livingston@ucdenver.edu
                      http://compbio.ucdenver.edu/Hunter_lab/Livingston
Biomedical researchers are interested in
understanding their data in the context of
   all known background knowledge:
     curated databases & literature.




PubMed Growth Rate
[Chart: new PubMed entries per year (thousands, left axis) and total PubMed entries (millions, right axis), 1987 to 2011. Two exponential trend lines are labeled: y = ~e^(0.0405x), R² = 0.99, and y = ~e^(0.0402x), R² = 0.94.]
973,499 PubMed entries in 2011 (>2,600 per day): roughly 2 journal articles per minute!
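The per-day and per-minute figures on this slide can be checked with a few lines of arithmetic. The sketch below also derives a doubling time from the fitted exponents, under the assumption (not stated on the slide) that x in the trend lines is measured in years.

```python
import math

# Figures quoted on the slide.
entries_2011 = 973_499                # new PubMed entries in 2011
growth_exponents = (0.0405, 0.0402)   # exponents of the two fitted curves

per_day = entries_2011 / 365
per_minute = per_day / (24 * 60)
print(f"{per_day:.0f} entries per day, {per_minute:.2f} per minute")

# Assumption: x in the fitted curves is measured in years, so an
# exponent k corresponds to a doubling time of ln(2) / k years.
for k in growth_exponents:
    doubling_years = math.log(2) / k
    print(f"exponent {k}: doubling time of about {doubling_years:.1f} years")
```

Under that assumption, both fits correspond to a doubling time of roughly 17 years.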
Biomedical Data Sources
• 1,380 databases in 2012
• Total GO annotations: 132,425,702
• Total manual GO annotations: 1,116,848
• PubMed articles referenced: 94,518
Annotation Consumers?
• The linguistic community typically uses
  annotation as training data or for specific tasks
  – An abundance of tools that can produce annotations
    in the specific format of those resources
  – Tools for computational linguistics
• Biomedical annotation typically used for
  curating, indexing, or enrichment analysis



• But what about re-using annotations and tools in
  other contexts and for other purposes?
Vision
[Diagram: DBs, ontologies, and texts (via text mining) feed a knowledge base, which in turn supports intelligent applications.]
Applications: Gene Centric




Applications: Document Centric




Annotation for Computation
• Computer understandable
• Composable
• Provenance of compositions traceable




CRAFT: Colorado Richly Annotated Full Text corpus
http://bionlp-corpora.sourceforge.net/CRAFT/

•   67 full text articles (+30 more reserved for future testing)
•   >560,000 Tokens
•   >21,000 Sentences

•   ~100,000 concept annotations to
    7 different biomedical ontologies/terminologies
•   Penn Treebank markup for each sentence

•   Multiple output formats available
CRAFT Annotation

Example sentence: "Hematopoiesis is precisely orchestrated by lineage-specific DNA-binding proteins that regulate transcription in concert with coactivators and corepressors."

[Diagram: the sentence annotated with concepts (hemopoiesis, binding, DNA, protein, transcription, biological regulation, transcription coactivator activity, transcription corepressor activity) connected by relations (has agent, results in regulation by, results in interaction of, entity that has function, regulates). Legend: CHEBI, GO BP, GO MF, SO, relation.]
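One way to picture concept annotation of the example sentence is as stand-off annotations: character spans paired with concept labels. The sketch below is illustrative only. The `ConceptAnnotation` class, the `annotate` helper, and the pairing of mentions to concept labels are assumptions read off the slide, not the CRAFT data format, and no ontology identifiers are asserted.

```python
from dataclasses import dataclass

SENTENCE = (
    "Hematopoiesis is precisely orchestrated by lineage-specific "
    "DNA-binding proteins that regulate transcription "
    "in concert with coactivators and corepressors."
)


@dataclass(frozen=True)
class ConceptAnnotation:
    """Hypothetical stand-off annotation: a character span plus a concept label."""
    start: int
    end: int
    concept: str

    def covered_text(self, text: str) -> str:
        return text[self.start:self.end]


def annotate(text: str, mention: str, concept: str) -> ConceptAnnotation:
    """Annotate the first occurrence of `mention` in `text` with `concept`."""
    start = text.find(mention)
    if start < 0:
        raise ValueError(f"mention not found: {mention!r}")
    return ConceptAnnotation(start, start + len(mention), concept)


# Assumed pairing of mentions with the concept labels shown in the diagram.
MENTION_TO_CONCEPT = [
    ("Hematopoiesis", "hemopoiesis"),
    ("DNA", "DNA"),
    ("binding", "binding"),
    ("proteins", "protein"),
    ("transcription", "transcription"),
    ("coactivators", "transcription coactivator activity"),
    ("corepressors", "transcription corepressor activity"),
]

annotations = [annotate(SENTENCE, m, c) for m, c in MENTION_TO_CONCEPT]
for a in annotations:
    print(f"[{a.start}:{a.end}] {a.covered_text(SENTENCE)!r} -> {a.concept}")
```

Keeping the spans separate from the text leaves the sentence untouched and lets several annotation layers point at the same characters.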
Applications: Annotating




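Compositional Annotation & Knowledge
[Diagram: text annotation 1 and text annotation 2 each have a hasTarget link to the CRAFT document PMID:14737183; their hasBody links point to TAXON:7742 (Vertebrata) and GO:0043474 (pigmentation). Text annotation 3 is linked by basedOn to text annotations 1 and 2 and denotes "vertebrate pigmentation", which is linked by subClassOf to GO:0043474 and by occurs_in to TAXON:7742.]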
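The composition in this diagram can be sketched as a small set of triples, with provenance recovered by following basedOn links. This is a minimal illustration, not the author's implementation: the short names (`ann1`, `ann2`, `ann3`, `vertebrate_pigmentation`) are hypothetical stand-ins for URIs, the edges reflect one reading of the diagram, and the document identifier is read as PMID:14737183.

```python
# Triples read off the diagram; subjects and objects are shortened labels,
# not real URIs (hypothetical names for illustration).
TRIPLES = {
    ("ann1", "hasTarget", "PMID:14737183"),
    ("ann1", "hasBody", "TAXON:7742"),       # Vertebrata
    ("ann2", "hasTarget", "PMID:14737183"),
    ("ann2", "hasBody", "GO:0043474"),       # pigmentation
    ("ann3", "basedOn", "ann1"),
    ("ann3", "basedOn", "ann2"),
    ("ann3", "denotes", "vertebrate_pigmentation"),
    ("vertebrate_pigmentation", "subClassOf", "GO:0043474"),
    ("vertebrate_pigmentation", "occurs_in", "TAXON:7742"),
}


def objects(subject: str, predicate: str) -> set:
    """All objects o such that (subject, predicate, o) is a triple."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def provenance(annotation: str) -> set:
    """All annotations reachable from `annotation` by following basedOn."""
    seen = set()
    stack = [annotation]
    while stack:
        current = stack.pop()
        for parent in objects(current, "basedOn"):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def source_documents(annotation: str) -> set:
    """Documents targeted by the annotation or by anything it is based on."""
    docs = set(objects(annotation, "hasTarget"))
    for ancestor in provenance(annotation):
        docs |= objects(ancestor, "hasTarget")
    return docs


print(sorted(provenance("ann3")))
print(sorted(source_documents("ann3")))
```

Here the composed annotation has no target of its own, yet its supporting text can still be traced back through the annotations it is based on.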
Summary
• Model that covers syntactic and semantic
  annotation
  – Linguistic annotation
  – Semantic annotation
  – Entity-based annotation
• Capture complex content that is not necessarily
  best represented via a single URI
  – Created a GraphAnnotation
    that denotes an RDF named graph
• Add kiao:basedOn to enable annotation
  compositions and provenance tracking
  – Annotation-level
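As a rough sketch of the GraphAnnotation idea: when the content of an annotation is several statements rather than one URI, the statements can be grouped under a graph name, and the annotation can point at that name. The quads, the `ex:` identifiers, and the helper functions below are hypothetical; only the names GraphAnnotation, denotes, and kiao:basedOn come from the slides.

```python
# Quads of the form (graph, subject, predicate, object).
# "graph1" is a named graph holding the complex content; "default" holds
# the annotation that points at it. Identifiers are illustrative only.
QUADS = [
    ("graph1", "ex:vertebrate_pigmentation", "subClassOf", "GO:0043474"),
    ("graph1", "ex:vertebrate_pigmentation", "occurs_in", "TAXON:7742"),
    ("default", "ex:graph_annotation_1", "type", "GraphAnnotation"),
    ("default", "ex:graph_annotation_1", "denotes", "graph1"),
    ("default", "ex:graph_annotation_1", "kiao:basedOn", "ex:text_annotation_1"),
    ("default", "ex:graph_annotation_1", "kiao:basedOn", "ex:text_annotation_2"),
]


def named_graph(name: str) -> list:
    """All triples stored in the graph called `name`."""
    return [(s, p, o) for g, s, p, o in QUADS if g == name]


def denoted_content(annotation: str) -> list:
    """Triples of every named graph that `annotation` denotes."""
    content = []
    for g, s, p, o in QUADS:
        if s == annotation and p == "denotes":
            content.extend(named_graph(o))
    return content


for triple in denoted_content("ex:graph_annotation_1"):
    print(triple)
```

The annotation itself stays a single node, so annotation-level statements such as kiao:basedOn attach to it rather than to each triple inside the graph.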
Acknowledgements
University of Colorado:
• Hunter Lab
   – Larry Hunter
   – Mike Bada
   – Bill Baumgartner
   – Chris Roeder
   – Kevin Cohen
   – Carsten Goerg
• National ICT Australia
   – Karin Verspoor
• Funding:
   – NIH/NLM training grant
   – Andrew W. Mellon Foundation

Weitere ähnliche Inhalte

Was ist angesagt?

Rural Road Safety
Rural Road SafetyRural Road Safety
Rural Road SafetyRPO America
 
Russian insurance market growth perspectives and main directions of investmen...
Russian insurance market growth perspectives and main directions of investmen...Russian insurance market growth perspectives and main directions of investmen...
Russian insurance market growth perspectives and main directions of investmen...РОСГОССТРАХ
 
Canopy management tree training & crop loading – opportunities to learn fro...
Canopy management   tree training & crop loading – opportunities to learn fro...Canopy management   tree training & crop loading – opportunities to learn fro...
Canopy management tree training & crop loading – opportunities to learn fro...MacadamiaSociety
 
Sustainable growth in a sustained crisis - the business model as a tool to in...
Sustainable growth in a sustained crisis - the business model as a tool to in...Sustainable growth in a sustained crisis - the business model as a tool to in...
Sustainable growth in a sustained crisis - the business model as a tool to in...Kasper Roldsgaard
 
Databse & Technology 2 _ Paul Guerin _ The biggest looser database - a boot c...
Databse & Technology 2 _ Paul Guerin _ The biggest looser database - a boot c...Databse & Technology 2 _ Paul Guerin _ The biggest looser database - a boot c...
Databse & Technology 2 _ Paul Guerin _ The biggest looser database - a boot c...InSync2011
 
Kevin Ms Web Platform
Kevin Ms Web PlatformKevin Ms Web Platform
Kevin Ms Web Platformrsnarayanan
 
2007* Airline Marketing Embraer Day 2007
2007* Airline Marketing Embraer Day 20072007* Airline Marketing Embraer Day 2007
2007* Airline Marketing Embraer Day 2007Embraer RI
 
Determinants of Cattle Prices in Ethiopia
Determinants of Cattle Prices in EthiopiaDeterminants of Cattle Prices in Ethiopia
Determinants of Cattle Prices in Ethiopiaessp2
 
Status of shs in bangladesh spva presentation
Status of shs in bangladesh   spva presentationStatus of shs in bangladesh   spva presentation
Status of shs in bangladesh spva presentationTuong Do
 
Sl&Et Automation Eng
Sl&Et Automation EngSl&Et Automation Eng
Sl&Et Automation Engtseener
 
SADTU - Institutional Management and Governance
SADTU - Institutional Management and GovernanceSADTU - Institutional Management and Governance
SADTU - Institutional Management and GovernanceEducation Moving Up Cc.
 
Measuring maize (Zea mays L.) cultivar coefficients for modeling water-limit...
Measuring maize (Zea mays L.) cultivar  coefficients for modeling water-limit...Measuring maize (Zea mays L.) cultivar  coefficients for modeling water-limit...
Measuring maize (Zea mays L.) cultivar coefficients for modeling water-limit...RUFORUM
 
Corporate presentation november_2011_4
Corporate presentation november_2011_4Corporate presentation november_2011_4
Corporate presentation november_2011_4NALenergy
 
[Challenge:Future] Disrupt your world: The Future of Work
[Challenge:Future] Disrupt your world: The Future of Work[Challenge:Future] Disrupt your world: The Future of Work
[Challenge:Future] Disrupt your world: The Future of WorkChallenge:Future
 
WCED Institutional Management and Governance
WCED Institutional Management and GovernanceWCED Institutional Management and Governance
WCED Institutional Management and GovernanceEducation Moving Up Cc.
 
Corporate presentation december
Corporate presentation   decemberCorporate presentation   december
Corporate presentation decemberMPX_RI
 
Enterprise and Government Mobility Solutions - Market Update & Outlook 2010
Enterprise and Government Mobility Solutions - Market Update & Outlook 2010Enterprise and Government Mobility Solutions - Market Update & Outlook 2010
Enterprise and Government Mobility Solutions - Market Update & Outlook 2010VDC Research Group
 

Was ist angesagt? (20)

Road, Safety, and Health - Is There a Disconnect?
Road, Safety, and Health - Is There a Disconnect?Road, Safety, and Health - Is There a Disconnect?
Road, Safety, and Health - Is There a Disconnect?
 
Rural Road Safety
Rural Road SafetyRural Road Safety
Rural Road Safety
 
Russian insurance market growth perspectives and main directions of investmen...
Russian insurance market growth perspectives and main directions of investmen...Russian insurance market growth perspectives and main directions of investmen...
Russian insurance market growth perspectives and main directions of investmen...
 
Canopy management tree training & crop loading – opportunities to learn fro...
Canopy management   tree training & crop loading – opportunities to learn fro...Canopy management   tree training & crop loading – opportunities to learn fro...
Canopy management tree training & crop loading – opportunities to learn fro...
 
Sustainable growth in a sustained crisis - the business model as a tool to in...
Sustainable growth in a sustained crisis - the business model as a tool to in...Sustainable growth in a sustained crisis - the business model as a tool to in...
Sustainable growth in a sustained crisis - the business model as a tool to in...
 
Databse & Technology 2 _ Paul Guerin _ The biggest looser database - a boot c...
Databse & Technology 2 _ Paul Guerin _ The biggest looser database - a boot c...Databse & Technology 2 _ Paul Guerin _ The biggest looser database - a boot c...
Databse & Technology 2 _ Paul Guerin _ The biggest looser database - a boot c...
 
Summer Institute 2012: Nebraska Office of Highway Safety
Summer Institute 2012: Nebraska Office of Highway SafetySummer Institute 2012: Nebraska Office of Highway Safety
Summer Institute 2012: Nebraska Office of Highway Safety
 
Kevin Ms Web Platform
Kevin Ms Web PlatformKevin Ms Web Platform
Kevin Ms Web Platform
 
2007* Airline Marketing Embraer Day 2007
2007* Airline Marketing Embraer Day 20072007* Airline Marketing Embraer Day 2007
2007* Airline Marketing Embraer Day 2007
 
Determinants of Cattle Prices in Ethiopia
Determinants of Cattle Prices in EthiopiaDeterminants of Cattle Prices in Ethiopia
Determinants of Cattle Prices in Ethiopia
 
Status of shs in bangladesh spva presentation
Status of shs in bangladesh   spva presentationStatus of shs in bangladesh   spva presentation
Status of shs in bangladesh spva presentation
 
Sl&Et Automation Eng
Sl&Et Automation EngSl&Et Automation Eng
Sl&Et Automation Eng
 
SADTU - Institutional Management and Governance
SADTU - Institutional Management and GovernanceSADTU - Institutional Management and Governance
SADTU - Institutional Management and Governance
 
Measuring maize (Zea mays L.) cultivar coefficients for modeling water-limit...
Measuring maize (Zea mays L.) cultivar  coefficients for modeling water-limit...Measuring maize (Zea mays L.) cultivar  coefficients for modeling water-limit...
Measuring maize (Zea mays L.) cultivar coefficients for modeling water-limit...
 
Corporate presentation november_2011_4
Corporate presentation november_2011_4Corporate presentation november_2011_4
Corporate presentation november_2011_4
 
[Challenge:Future] Disrupt your world: The Future of Work
[Challenge:Future] Disrupt your world: The Future of Work[Challenge:Future] Disrupt your world: The Future of Work
[Challenge:Future] Disrupt your world: The Future of Work
 
WCED Institutional Management and Governance
WCED Institutional Management and GovernanceWCED Institutional Management and Governance
WCED Institutional Management and Governance
 
Corporate presentation december
Corporate presentation   decemberCorporate presentation   december
Corporate presentation december
 
Uars status
Uars statusUars status
Uars status
 
Enterprise and Government Mobility Solutions - Market Update & Outlook 2010
Enterprise and Government Mobility Solutions - Market Update & Outlook 2010Enterprise and Government Mobility Solutions - Market Update & Outlook 2010
Enterprise and Government Mobility Solutions - Market Update & Outlook 2010
 

Andere mochten auch

An Annotation Framework for Fedora
An Annotation Framework for FedoraAn Annotation Framework for Fedora
An Annotation Framework for Fedoraandyashton
 
Cole using oa-intro-dlf2012
Cole using oa-intro-dlf2012Cole using oa-intro-dlf2012
Cole using oa-intro-dlf2012Timothy Cole
 
Emblematica overview dlf
Emblematica overview dlfEmblematica overview dlf
Emblematica overview dlfjjett2
 
Table mining and data curation from biomedical literature
Table mining and data curation from biomedical literatureTable mining and data curation from biomedical literature
Table mining and data curation from biomedical literatureNikola Milosevic
 
Open Annotation Core Data Model (tutorial)
Open Annotation Core Data Model (tutorial)Open Annotation Core Data Model (tutorial)
Open Annotation Core Data Model (tutorial)Robert Sanderson
 
II-SDV 2016 Srinivasan Parthiban - KOL Analytics from Biomedical Literature
II-SDV 2016 Srinivasan Parthiban - KOL Analytics from Biomedical LiteratureII-SDV 2016 Srinivasan Parthiban - KOL Analytics from Biomedical Literature
II-SDV 2016 Srinivasan Parthiban - KOL Analytics from Biomedical LiteratureDr. Haxel Consult
 

Andere mochten auch (6)

An Annotation Framework for Fedora
An Annotation Framework for FedoraAn Annotation Framework for Fedora
An Annotation Framework for Fedora
 
Cole using oa-intro-dlf2012
Cole using oa-intro-dlf2012Cole using oa-intro-dlf2012
Cole using oa-intro-dlf2012
 
Emblematica overview dlf
Emblematica overview dlfEmblematica overview dlf
Emblematica overview dlf
 
Table mining and data curation from biomedical literature
Table mining and data curation from biomedical literatureTable mining and data curation from biomedical literature
Table mining and data curation from biomedical literature
 
Open Annotation Core Data Model (tutorial)
Open Annotation Core Data Model (tutorial)Open Annotation Core Data Model (tutorial)
Open Annotation Core Data Model (tutorial)
 
II-SDV 2016 Srinivasan Parthiban - KOL Analytics from Biomedical Literature
II-SDV 2016 Srinivasan Parthiban - KOL Analytics from Biomedical LiteratureII-SDV 2016 Srinivasan Parthiban - KOL Analytics from Biomedical Literature
II-SDV 2016 Srinivasan Parthiban - KOL Analytics from Biomedical Literature
 

Ähnlich wie Biomedical Annotation - Kevin Livingston

The New Service Economy: Innovation in Services
The New Service Economy: Innovation in ServicesThe New Service Economy: Innovation in Services
The New Service Economy: Innovation in ServicesIan Miles
 
Shou qing wang
Shou qing wangShou qing wang
Shou qing wangjenidoyle
 
UBS Global Basic Materials Conference
UBS Global Basic Materials Conference UBS Global Basic Materials Conference
UBS Global Basic Materials Conference finance15
 
Poster presentation
Poster presentationPoster presentation
Poster presentationredsys
 
The crisis in Ireland in graphs and maps
The crisis in Ireland in graphs and mapsThe crisis in Ireland in graphs and maps
The crisis in Ireland in graphs and mapsrobkitchin
 
The CEO’s Dilemma - How to drive efficient innovation in the organization
The CEO’s Dilemma - How to drive efficient innovation in the organizationThe CEO’s Dilemma - How to drive efficient innovation in the organization
The CEO’s Dilemma - How to drive efficient innovation in the organizationJoeBarkai
 
Energy Statistic : Global and Thailand Outlook
Energy Statistic : Global and Thailand OutlookEnergy Statistic : Global and Thailand Outlook
Energy Statistic : Global and Thailand OutlookDenpong Soodphakdee
 
Abiec 2012 resultados
Abiec 2012 resultadosAbiec 2012 resultados
Abiec 2012 resultadosAgroTalento
 
Walking through a library remotely. Digital Humanities seminar April 12, 2013...
Walking through a library remotely. Digital Humanities seminar April 12, 2013...Walking through a library remotely. Digital Humanities seminar April 12, 2013...
Walking through a library remotely. Digital Humanities seminar April 12, 2013...Andrea Scharnhorst
 
Objective Capital Precious Metals, Diamonds and Gemstones Investment Summit: ...
Objective Capital Precious Metals, Diamonds and Gemstones Investment Summit: ...Objective Capital Precious Metals, Diamonds and Gemstones Investment Summit: ...
Objective Capital Precious Metals, Diamonds and Gemstones Investment Summit: ...Objective Capital Conferences
 
Graeme marshall tauranga transport and logistics forum 16 nov2012
Graeme marshall tauranga transport and logistics forum 16 nov2012Graeme marshall tauranga transport and logistics forum 16 nov2012
Graeme marshall tauranga transport and logistics forum 16 nov2012Greg Bold
 
Objective Capital Precious Metals, Diamonds and Gemstones Investment Summit: ...
Objective Capital Precious Metals, Diamonds and Gemstones Investment Summit: ...Objective Capital Precious Metals, Diamonds and Gemstones Investment Summit: ...
Objective Capital Precious Metals, Diamonds and Gemstones Investment Summit: ...Objective Capital Conferences
 
Effect of Primary Fuels on the Availability and Cost of Power in India
Effect of Primary Fuels on the Availability and Cost of Power in IndiaEffect of Primary Fuels on the Availability and Cost of Power in India
Effect of Primary Fuels on the Availability and Cost of Power in IndiaIPPAI
 
Automatic extraction and manual validation of a hierarchical English-Swedish ...
Automatic extraction and manual validation of a hierarchical English-Swedish ...Automatic extraction and manual validation of a hierarchical English-Swedish ...
Automatic extraction and manual validation of a hierarchical English-Swedish ...Jody Foo
 
Use of Simulation Modeling to Address Impact of Recurring Bad Weather Events,...
Use of Simulation Modeling to Address Impact of Recurring Bad Weather Events,...Use of Simulation Modeling to Address Impact of Recurring Bad Weather Events,...
Use of Simulation Modeling to Address Impact of Recurring Bad Weather Events,...Vijay Agrawal
 
Medicon Valley and Life science cluster in Denmark
Medicon Valley and Life science cluster in DenmarkMedicon Valley and Life science cluster in Denmark
Medicon Valley and Life science cluster in DenmarkPramila Das
 

Ähnlich wie Biomedical Annotation - Kevin Livingston (20)

The New Service Economy: Innovation in Services
The New Service Economy: Innovation in ServicesThe New Service Economy: Innovation in Services
The New Service Economy: Innovation in Services
 
Shou qing wang
Shou qing wangShou qing wang
Shou qing wang
 
UBS Global Basic Materials Conference
UBS Global Basic Materials Conference UBS Global Basic Materials Conference
UBS Global Basic Materials Conference
 
Poster presentation
Poster presentationPoster presentation
Poster presentation
 
The crisis in Ireland in graphs and maps
The crisis in Ireland in graphs and mapsThe crisis in Ireland in graphs and maps
The crisis in Ireland in graphs and maps
 
The CEO’s Dilemma - How to drive efficient innovation in the organization
The CEO’s Dilemma - How to drive efficient innovation in the organizationThe CEO’s Dilemma - How to drive efficient innovation in the organization
The CEO’s Dilemma - How to drive efficient innovation in the organization
 
Energy Statistic : Global and Thailand Outlook
Energy Statistic : Global and Thailand OutlookEnergy Statistic : Global and Thailand Outlook
Energy Statistic : Global and Thailand Outlook
 
Abiec 2012 resultados
Abiec 2012 resultadosAbiec 2012 resultados
Abiec 2012 resultados
 
Walking through a library remotely. Digital Humanities seminar April 12, 2013...
Walking through a library remotely. Digital Humanities seminar April 12, 2013...Walking through a library remotely. Digital Humanities seminar April 12, 2013...
Walking through a library remotely. Digital Humanities seminar April 12, 2013...
 
Metro's Natural Area Program - Soll
Metro's Natural Area Program - SollMetro's Natural Area Program - Soll
Metro's Natural Area Program - Soll
 
Aeronautic Sector In Andalusia
Aeronautic Sector In AndalusiaAeronautic Sector In Andalusia
Aeronautic Sector In Andalusia
 
Objective Capital Precious Metals, Diamonds and Gemstones Investment Summit: ...
Objective Capital Precious Metals, Diamonds and Gemstones Investment Summit: ...Objective Capital Precious Metals, Diamonds and Gemstones Investment Summit: ...
Objective Capital Precious Metals, Diamonds and Gemstones Investment Summit: ...
 
Graeme marshall tauranga transport and logistics forum 16 nov2012
Graeme marshall tauranga transport and logistics forum 16 nov2012Graeme marshall tauranga transport and logistics forum 16 nov2012
Graeme marshall tauranga transport and logistics forum 16 nov2012
 
Objective Capital Precious Metals, Diamonds and Gemstones Investment Summit: ...
Objective Capital Precious Metals, Diamonds and Gemstones Investment Summit: ...Objective Capital Precious Metals, Diamonds and Gemstones Investment Summit: ...
Objective Capital Precious Metals, Diamonds and Gemstones Investment Summit: ...
 
Effect of Primary Fuels on the Availability and Cost of Power in India
Effect of Primary Fuels on the Availability and Cost of Power in IndiaEffect of Primary Fuels on the Availability and Cost of Power in India
Effect of Primary Fuels on the Availability and Cost of Power in India
 
Automatic extraction and manual validation of a hierarchical English-Swedish ...
Automatic extraction and manual validation of a hierarchical English-Swedish ...Automatic extraction and manual validation of a hierarchical English-Swedish ...
Automatic extraction and manual validation of a hierarchical English-Swedish ...
 
Changing Donor Priorities and Strategies for Agricultural R&D in Developing C...
Changing Donor Priorities and Strategies for Agricultural R&D in Developing C...Changing Donor Priorities and Strategies for Agricultural R&D in Developing C...
Changing Donor Priorities and Strategies for Agricultural R&D in Developing C...
 
Use of Simulation Modeling to Address Impact of Recurring Bad Weather Events,...
Use of Simulation Modeling to Address Impact of Recurring Bad Weather Events,...Use of Simulation Modeling to Address Impact of Recurring Bad Weather Events,...
Use of Simulation Modeling to Address Impact of Recurring Bad Weather Events,...
 
Access to open data through open access articles in the life sciences
Access to open data through open access articles in the life sciencesAccess to open data through open access articles in the life sciences
Access to open data through open access articles in the life sciences
 
Medicon Valley and Life science cluster in Denmark
Medicon Valley and Life science cluster in DenmarkMedicon Valley and Life science cluster in Denmark
Medicon Valley and Life science cluster in Denmark
 

Mehr von DLFCLIR

Managing the Digitization of Large Press Archives
Managing the Digitization of Large Press ArchivesManaging the Digitization of Large Press Archives
Managing the Digitization of Large Press ArchivesDLFCLIR
 
Dlf bonnie tijerina keynote
Dlf  bonnie tijerina keynoteDlf  bonnie tijerina keynote
Dlf bonnie tijerina keynoteDLFCLIR
 
Participatory Digital Library
Participatory Digital LibraryParticipatory Digital Library
Participatory Digital LibraryDLFCLIR
 
Public Knowledge Project
Public Knowledge ProjectPublic Knowledge Project
Public Knowledge ProjectDLFCLIR
 
Introducing NYU to Digital Scholarship: A faculty-library partnership
Introducing NYU to Digital Scholarship: A faculty-library partnershipIntroducing NYU to Digital Scholarship: A faculty-library partnership
Introducing NYU to Digital Scholarship: A faculty-library partnershipDLFCLIR
 
Collaborative Service Models: Building Support for Digital Scholarship
Collaborative Service Models: Building Support for Digital ScholarshipCollaborative Service Models: Building Support for Digital Scholarship
Collaborative Service Models: Building Support for Digital ScholarshipDLFCLIR
 
Sustaining ArchivesSpace
Sustaining ArchivesSpaceSustaining ArchivesSpace
Sustaining ArchivesSpaceDLFCLIR
 
From Projects to... Services
From Projects to... ServicesFrom Projects to... Services
From Projects to... ServicesDLFCLIR
 
An Introduction to Linked Data and Microdata
An Introduction to Linked Data and MicrodataAn Introduction to Linked Data and Microdata
An Introduction to Linked Data and MicrodataDLFCLIR
 
Dlf 2011UDFR-a-semantic-registry-for-format-representation-information-v1
Dlf 2011UDFR-a-semantic-registry-for-format-representation-information-v1Dlf 2011UDFR-a-semantic-registry-for-format-representation-information-v1
Dlf 2011UDFR-a-semantic-registry-for-format-representation-information-v1DLFCLIR
 
Katherine Kott Slides for DLF PM Group 2011
Katherine Kott Slides for DLF PM Group 2011Katherine Kott Slides for DLF PM Group 2011
Katherine Kott Slides for DLF PM Group 2011DLFCLIR
 
Charter Nonstarter by Eric Stedfeld, NYU
Charter Nonstarter by Eric Stedfeld, NYUCharter Nonstarter by Eric Stedfeld, NYU
Charter Nonstarter by Eric Stedfeld, NYUDLFCLIR
 
Hypatia for dlf 2011
Hypatia for dlf 2011Hypatia for dlf 2011
Hypatia for dlf 2011DLFCLIR
 

Mehr von DLFCLIR (13)


Biomedical Annotation - Kevin Livingston

  • 1. Biomedical Annotation Kevin Livingston, Ph.D. Postdoctoral Fellow Pharmacology Department, School of Medicine University of Colorado Anschutz Medical Campus Kevin.Livingston@ucdenver.edu http://compbio.ucdenver.edu/Hunter_lab/Livingston
  • 2. Biomedical researchers are interested in understanding their data in the context of all known background knowledge: curated databases & literature.
  • 3. PubMed Growth Rate
      [Chart, 1987–2011: new entries per year (thousands, left axis) and total entries (millions, right axis), with exponential fits y = ~e^(0.0405x), R² = 0.99 and y = ~e^(0.0402x), R² = 0.94.]
      973,499 PubMed entries in 2011 (>2,600 per day): 2 journal articles per minute!
  • 4. Biomedical Data Sources
      • Total Manual GO Annotations: 1,116,848
      • Total GO Database Annotations: 132,425,702
      • PubMed Articles Referenced: 94,518
      • 1,380 […]s in 2012
  • 5. Annotation Consumers?
      • The linguistic community typically uses annotation as training data or for specific tasks
          – An abundance of tools that can produce annotations in the specific format of those resources
          – Tools for computational linguistics
      • Biomedical annotation is typically used for curating, indexing, or enrichment analysis
      • But what about re-using annotations and tools in other contexts and for other purposes?
  • 7. Vision
      [Diagram with the labeled components: Intelligent Applications, DBs, Ontologies, Knowledge Base, Text Mining, Texts.]
  • 10. Annotation for Computation
      • Computer understandable
      • Composable
      • Provenance of compositions traceable
  • 11. CRAFT: Colorado Richly Annotated Full Text corpus
      http://bionlp-corpora.sourceforge.net/CRAFT/
      • 67 full text articles (+30 more reserved for future testing)
      • >560,000 tokens
      • >21,000 sentences
      • ~100,000 concept annotations to 7 different biomedical ontologies/terminologies
      • Penn Treebank markup for each sentence
      • Multiple output formats available
  • 12. CRAFT Annotation
      Example sentence: "Hematopoiesis is precisely orchestrated by lineage-specific DNA-binding proteins that regulate transcription in concert with coactivators and corepressors."
      [Figure: the sentence annotated with concepts and relations. Concept labels visible in the figure include hemopoiesis, transcription, DNA, protein, binding, coactivator activity and corepressor activity; relation labels include has agent, results in, regulates and entity that has function. Legend: GO BP, GO MF, CHEBI, SO, relation.]
  • 14. Compositional Annotation & Knowledge
      [Diagram. Nodes: text annotation 1, text annotation 2, text annotation 3; TAXON:7742 (Vertebrata); GO:0043474 (pigmentation); vertebrate pigmentation; CRAFT document PMID:14737183. Edge labels: hasBody, hasTarget, basedOn, denotes, subClassOf, occurs_in.]
  • 15. Summary
      • Model that covers syntactic and semantic annotation
          – Linguistic annotation
          – Semantic annotation
          – Entity-based annotation
      • Capture complex content that is not necessarily best represented via a single URI
          – Created a GraphAnnotation that denotes an RDF named graph
      • Add kiao:basedOn to enable annotation compositions and provenance tracking
          – Annotation-level
  • 16. Acknowledgements
      • University of Colorado: Hunter Lab
          – Larry Hunter
          – Mike Bada
          – Bill Baumgartner
          – Chris Roeder
          – Kevin Cohen
          – Carsten Goerg
      • National ICT Australia
          – Karin Verspoor
      • Funding:
          – NIH/NLM training grant
          – Andrew W. Mellon Foundation
  • 17. Biomedical Annotation Kevin Livingston, Ph.D. Postdoctoral Fellow Pharmacology Department, School of Medicine University of Colorado Anschutz Medical Campus Kevin.Livingston@ucdenver.edu http://compbio.ucdenver.edu/Hunter_lab/Livingston
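
The compositional idea from slides 14 and 15 (annotations that record which other annotations they are based on, so that provenance can be traced) can be sketched roughly as follows. This is only an illustration, not the model's actual implementation: the `Annotation` class, its field names, the `provenance` helper, and the span strings are all hypothetical, and plain Python objects stand in for the RDF the slides describe.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Annotation:
    """Hypothetical record; field names mirror the edge labels on slide 14."""
    ident: str
    body: str                                 # hasBody: concept (or graph) the annotation denotes
    target: Optional[str] = None              # hasTarget: the annotated text span, if any
    based_on: Tuple["Annotation", ...] = ()   # kiao:basedOn: annotations this one was composed from


def provenance(annotation: Annotation) -> List[str]:
    """Return the identifiers of every annotation reachable via based_on, breadth-first."""
    found: List[str] = []
    queue = list(annotation.based_on)
    while queue:
        current = queue.pop(0)
        if current.ident not in found:
            found.append(current.ident)
            queue.extend(current.based_on)
    return found


# Illustrative values only; the span strings are placeholders, not CRAFT offsets.
a1 = Annotation("text annotation 1", body="TAXON:7742", target="span:vertebrate")
a2 = Annotation("text annotation 2", body="GO:0043474", target="span:pigmentation")
a3 = Annotation(
    "text annotation 3",
    body="graph: GO:0043474 occurs_in TAXON:7742",  # assumed rendering of the composed content
    based_on=(a1, a2),
)

print(provenance(a3))
```

Because each annotation keeps direct references to its sources, walking `based_on` recovers the full chain of annotations behind a composed statement, which is the annotation-level provenance the summary slide mentions.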

Editor's Notes

  1. Entity Centric
  2. Document Centric
  3. Rectangles are concepts we create, rounded rectangles are current ontological concepts. Orange objects are information content entities, blue objects are biomedical concepts.