ChemSpider as a Chemical Term Resolver

Antony Williams and Valery Tkachenko
ACS San Diego, March 2012
The Web of Chemistry – VERY BIG!
Online Databases are “Linking”
It is so difficult to navigate…
• IP?
• What’s the structure?
• Are they in our file?
• What’s similar?
• What’s the target?
• Pharmacology data?
• Known pathways?
• Competitors?
• Working on now?
• Connections to disease?
• Expressed in right cell type?
Open PHACTS Project
• Develop a set of robust standards…
• Implement the standards in a semantic integration hub
• Deliver services to support drug discovery programs in pharma and public domain
• 22 partners, 8 pharmaceutical companies, 3 biotechs
• 36-month project

Guiding principle is open access, open usage, open source
- Key to standards adoption -
What is the Structure of Vitamin K?
MeSH
• A lipid cofactor that is required for normal blood clotting.
• Several forms of vitamin K have been identified:
    • VITAMIN K 1 (phytomenadione) derived from plants,
    • VITAMIN K 2 (menaquinone) from bacteria, and
    • synthetic naphthoquinone provitamins, VITAMIN K 3 (menadione).
What is the Structure of Vitamin K1?
Create an Online “Resolver” as a path to chemistry
• Search all forms of structure IDs
    • Systematic name(s)
    • Trivial name(s)
    • SMILES
    • InChI strings
    • InChIKeys
    • Database IDs
    • Registry Number
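
A resolver that accepts all of these forms first has to guess which kind of identifier it was given. The sketch below is one possible approach, not ChemSpider's implementation: it recognizes InChI strings by their "InChI=" prefix, InChIKeys by their 14-10-1 uppercase block pattern, and CAS-style registry numbers by their check digit. The function names are hypothetical, and anything else (names, SMILES, database IDs) falls through to a catch-all, because telling those apart reliably needs a cheminformatics toolkit.

```python
import re

# InChIKey: 14 uppercase letters, 10 uppercase letters, 1 uppercase letter.
INCHIKEY_RE = re.compile(r"^[A-Z]{14}-[A-Z]{10}-[A-Z]$")
# CAS-style registry number: 2-7 digits, 2 digits, 1 check digit.
CAS_RE = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def cas_checksum_ok(text):
    """Return True if text looks like a registry number with a valid check digit."""
    match = CAS_RE.match(text)
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    # Weight digits 1, 2, 3, ... starting from the rightmost one.
    total = sum(int(d) * w for w, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(match.group(3))


def classify_identifier(text):
    """Guess the identifier type (hypothetical helper for illustration)."""
    text = text.strip()
    if text.startswith("InChI="):
        return "inchi"
    if INCHIKEY_RE.match(text):
        return "inchikey"
    if cas_checksum_ok(text):
        return "registry_number"
    # Names, SMILES and database IDs are not distinguished in this sketch.
    return "name_or_other"


print(classify_identifier("50-78-2"))       # registry_number
print(classify_identifier("Vincristine"))   # name_or_other
```

The check-digit test rejects many mistyped registry numbers before any database lookup is attempted.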
ChemSpider
Available Information…
• Linked to vendors, safety data, toxicity, metabolism
Available Information….
Vitamin K1 Names
Vitamin K1 on ChemSpider CORRECT
Resolving Names for QUALITY
• Searching chemical identifiers should resolve to the correct chemical as much as possible
Validated Name-Structure Dictionaries

• Chemical name dictionaries are used for:
    • Text-mining (publications, patents)
        • Used to index PubMed and link to Google Patents
    • Linking to other databases – think Biology!
        • When structures are not available, drug names link
    • Searching the web
        • Names link to structures link to InChIs
I want to know about “Vincristine”
Vincristine: Identifiers
Vincristine: Patents
Linked by Name
Many Names, One Structure
Top 200 Drugs on Wikipedia
http://en.wikipedia.org/wiki/List_of_bestselling_drugs
The Project Challenge PART ONE
• Agree on the set of chemical names to work with
• Independently create an SDF file in each “lab”
• Compare differences and agree on final structures
• Issue “Gold Standard” SDF file to team
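
The "compare differences" step can be sketched as follows, assuming each lab's SDF file has already been reduced to a mapping from drug name to a structure key such as an InChIKey. The function name, lab names and keys are hypothetical placeholders; this illustrates the comparison, not the process the project actually used.

```python
from collections import defaultdict


def find_disagreements(lab_results):
    """Return names for which the labs did not all report the same structure key.

    lab_results maps lab -> {name: structure_key}.
    The result maps name -> {structure_key: [labs that reported it]}.
    """
    by_name = defaultdict(lambda: defaultdict(list))
    for lab, structures in lab_results.items():
        for name, key in structures.items():
            by_name[name][key].append(lab)
    return {
        name: {key: sorted(labs) for key, labs in keys.items()}
        for name, keys in by_name.items()
        if len(keys) > 1  # more than one distinct structure for this name
    }


# Placeholder data: KEY-1, KEY-2 and KEY-3 stand in for real structure keys.
labs = {
    "lab_a": {"drug-x": "KEY-1", "drug-y": "KEY-2"},
    "lab_b": {"drug-x": "KEY-1", "drug-y": "KEY-3"},
}
print(find_disagreements(labs))
```

Only the names that come back from this comparison need manual review before the agreed file is issued.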
RSC Process
Relative accuracy of groups against
final master list
The Project Challenge PART TWO
• Use Gold Standard SDF file to investigate data quality on these compounds in Internet databases
• Two checks
    • Search chemical name – does it return the correct compound? If not, how is it different?
    • Search “structure” – SMILES, Molfile, InChI string or InChIKey
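
One way to answer "how is it different?" is to compare the InChIKey of the returned compound with the gold-standard InChIKey block by block. A standard InChIKey has three hyphen-separated blocks, and the first block is a hash of the connectivity. A match on the first block only therefore suggests the same skeleton but a difference in another layer such as stereochemistry. The function name and the keys in the example are hypothetical placeholders.

```python
def compare_inchikeys(expected, found):
    """Classify how a returned InChIKey differs from the expected one."""
    exp_blocks = expected.split("-")
    got_blocks = found.split("-")
    if len(exp_blocks) != 3 or len(got_blocks) != 3:
        raise ValueError("an InChIKey must have three hyphen-separated blocks")
    if exp_blocks == got_blocks:
        return "match"
    if exp_blocks[0] == got_blocks[0]:
        # Same connectivity hash; stereo, isotope or protonation differs.
        return "same_skeleton"
    return "different"


# Placeholder keys, not real compounds.
gold = "AAAAAAAAAAAAAA-BBBBBBBBSA-N"
print(compare_inchikeys(gold, "AAAAAAAAAAAAAA-CCCCCCCCSA-N"))  # same_skeleton
```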
“The First 10”
Performance on 150 Drug Names
NPC Browser Set
Standardize
• Use the SRS as a guidance document for standardization
• Adjust as necessary to our needs
Nitro groups
Salt and Ionic Bonds
One dictionary look up is never enough…
• ChemSpider does not contain all chemistry
• We are not the only ones curating data
• New chemistry expands daily and goes online
One dictionary look up is never enough…
• Federation is key…
    • Check ChemSpider first, if not found then
    • Check PubChem
    • Check NCI resolver
    • Check ChEBI
    • Check … the “network” of open interfaces
• Each resolver will have its own “quantitative confidence”.
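
The lookup order above can be sketched as a simple fallback chain. This is a minimal illustration, not the resolver described in the talk: the in-memory dictionaries are hypothetical stand-ins for the real services, and no network calls are made.

```python
def resolve_federated(term, resolvers):
    """Try each (source, lookup) pair in order and return the first hit."""
    for source, lookup in resolvers:
        structure = lookup(term)
        if structure is not None:
            return {"source": source, "structure": structure}
    return None  # no resolver in the chain knew the term


# Hypothetical stand-ins for the real services, with placeholder data.
chemspider_stub = {"term-a": "structure-a"}
pubchem_stub = {"term-a": "structure-a-alt", "term-b": "structure-b"}

resolvers = [
    ("ChemSpider", chemspider_stub.get),
    ("PubChem", pubchem_stub.get),
]

print(resolve_federated("term-b", resolvers))
```

Recording the source alongside each result lets a per-resolver confidence be attached later.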
Chemical Identifier Resolver (CIR)
• Converts a given structure identifier into another representation or structure identifier.
• Resolve names, identifiers etc.

http://cactus.nci.nih.gov/chemical/structure
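
CIR requests are formed by appending the identifier and the desired representation to the base URL above. The sketch below only builds such a URL and does not contact the service. The helper name is hypothetical, and percent-encoding every reserved character in the identifier is an assumption about what the service accepts.

```python
from urllib.parse import quote

CIR_BASE = "http://cactus.nci.nih.gov/chemical/structure"


def cir_url(identifier, representation):
    """Build a CIR request URL of the form base/identifier/representation."""
    # safe="" also encodes "/" so the identifier stays one path segment
    # (an assumption; check the service's behaviour for InChI strings).
    return f"{CIR_BASE}/{quote(identifier, safe='')}/{representation}"


print(cir_url("vincristine", "smiles"))
print(cir_url("vitamin K1", "stdinchikey"))
```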
What can become a resolver?
We are building….
• A central federated resolver utilizing available services
• Dictionary lookups, systematic name conversions (multiple tools – ACD/Labs, Lexichem, OPSIN)
• “Consensus” decisions and guidance BUT
• Chemicals have timelines!!!
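
A "consensus" decision across several name-to-structure tools could be as simple as a majority vote with an agreement score. The sketch below assumes each tool's output has been reduced to a comparable structure key, or None when the tool failed. The function name, tool names and keys are hypothetical, and this is not necessarily how the federated resolver weighs its sources.

```python
from collections import Counter


def consensus(candidates):
    """Return (winning structure key, fraction of tools that agreed).

    candidates maps tool name -> structure key, or None if the tool failed.
    """
    votes = Counter(key for key in candidates.values() if key is not None)
    if not votes:
        return None, 0.0
    winner, count = votes.most_common(1)[0]
    return winner, count / len(candidates)


# Placeholder tool names and keys.
print(consensus({"tool_a": "KEY-1", "tool_b": "KEY-1", "tool_c": "KEY-2"}))
```

A low agreement fraction is a signal to send the name for manual curation rather than accept the winner automatically.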
ORIGINAL   FINAL
Thank you

Email: williamsa@rsc.org
Twitter: ChemConnector
Personal Blog: www.chemconnector.com
SLIDES: www.slideshare.net/AntonyWilliams
