ChemSpider as a Chemical Term Resolver

Antony Williams, Valery Tkachenko, Sean Ekins and Andy Fant
ACS San Diego March 2012
The Web of Chemistry – VERY BIG!
Online Databases are “Linking”
It is so difficult to navigate…
• IP?
• What’s the structure?
• Are they in our file?
• What’s similar?
• What’s the target?
• Pharmacology data?
• Known Pathways?
• Competitors?
• Working On Now?
• Connections to disease?
• Expressed in right cell type?
Open PHACTS Project
• Develop a set of robust standards…
• Implement the standards in a semantic integration hub
• Deliver services to support drug discovery programs in pharma and the public domain
• 22 partners, 8 pharmaceutical companies, 3 biotechs
• 36-month project

Guiding principle: open access, open usage, open source (key to standards adoption)
What is the Structure of Vitamin K?
MeSH
• A lipid cofactor that is required for normal blood clotting.
• Several forms of vitamin K have been identified:
   • VITAMIN K 1 (phytomenadione), derived from plants
   • VITAMIN K 2 (menaquinone), from bacteria
   • synthetic naphthoquinone provitamins, VITAMIN K 3 (menadione)
What is the Structure of Vitamin K1?
Create an Online “Resolver” as a path to chemistry
• Search all forms of structure IDs
   • Systematic name(s)
   • Trivial name(s)
   • SMILES
   • InChI strings
   • InChIKeys
   • Database IDs
   • Registry Number
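A resolver that accepts all of these identifier types has to guess what kind of string it was handed. The following is a minimal sketch of that first step, not ChemSpider's actual logic: the function names are hypothetical, and it relies only on published format rules (the `InChI=` prefix, the 14-10-1 letter layout of an InChIKey, and the CAS Registry Number check digit). Names, SMILES and database IDs cannot be told apart by pattern alone, so they fall through to a single catch-all category.

```python
import re

# Format rules only; these patterns say nothing about whether a compound exists.
INCHIKEY_RE = re.compile(r"^[A-Z]{14}-[A-Z]{10}-[A-Z]$")
CAS_RE = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def cas_checksum_ok(value: str) -> bool:
    """Validate a CAS Registry Number using its final check digit."""
    match = CAS_RE.match(value)
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    # Weight the digits 1, 2, 3, ... starting from the rightmost one.
    total = sum(int(d) * w for w, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(match.group(3))


def classify_identifier(value: str) -> str:
    """Return a coarse identifier type (hypothetical labels) for a query string."""
    text = value.strip()
    if text.startswith("InChI="):
        return "inchi"
    if INCHIKEY_RE.match(text):
        return "inchikey"
    if cas_checksum_ok(text):
        return "registry_number"
    # Telling a SMILES from a trivial name or database ID needs a real parser
    # or a dictionary lookup, so those are left undecided here.
    return "name_or_smiles"
```

A query such as `50-78-2` passes the check digit test and is routed as a Registry Number, while `50-78-3` fails it and is treated like any other text string.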
ChemSpider
Available Information…
• Linked to vendors, safety data, toxicity, metabolism
Available Information…
Vitamin K1 Names
Vitamin K1 on ChemSpider CORRECT
Resolving Names for QUALITY
• A search on a chemical identifier should resolve to the correct chemical as often as possible
Validated Name-Structure Dictionaries
• Chemical name dictionaries are used for:
   • Text-mining (publications, patents)
      • Used to index PubMed and link to Google Patents
   • Linking to other databases – think Biology!
      • When structures are not available, drug names provide the link
   • Searching the web
      • Names link to structures, which link to InChIs
I want to know about “Vincristine”
Vincristine: Identifiers
Vincristine: Patents
Linked by Name
Many Names, One Structure
Top 200 Drugs on Wikipedia
http://en.wikipedia.org/wiki/List_of_bestselling_drugs
The Project Challenge PART ONE
• Agree on the set of chemical names to work with
• Independently create an SDF file in each “lab”
• Compare differences and agree on final structures
• Issue “Gold Standard” SDF file to team
RSC Process
Relative accuracy of groups against final master list
The Project Challenge PART TWO
• Use the Gold Standard SDF file to investigate data quality for these compounds in Internet databases
• Two checks
   • Search by chemical name: does it return the correct compound? If not, how does it differ?
   • Search by “structure”: SMILES, Molfile, InChI string or InChIKey
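The name check above can be sketched as a comparison of InChIKeys, assuming each gold standard structure and each database hit has already been converted to a standard InChIKey. This is an illustration rather than the procedure used in the study. The function names and report categories are hypothetical, and the keys below are placeholders, not real compounds. The only chemistry it relies on is that the first block of an InChIKey hashes the connectivity, so a hit with the same first block but a different key usually differs in stereochemistry, isotopes or protonation.

```python
from typing import Dict, List, Optional


def skeleton(inchikey: str) -> str:
    """First InChIKey block: a hash of the connectivity layer."""
    return inchikey.split("-")[0]


def score_against_gold(gold: Dict[str, str],
                       returned: Dict[str, Optional[str]]) -> Dict[str, List[str]]:
    """Sort each drug name into a bucket describing how the database hit compares."""
    report: Dict[str, List[str]] = {
        "correct": [], "same_skeleton": [], "different": [], "not_found": []
    }
    for name, gold_key in gold.items():
        hit = returned.get(name)
        if hit is None:
            report["not_found"].append(name)
        elif hit == gold_key:
            report["correct"].append(name)
        elif skeleton(hit) == skeleton(gold_key):
            # Same connectivity, different stereo/isotope/charge layers.
            report["same_skeleton"].append(name)
        else:
            report["different"].append(name)
    return report


# Placeholder names and keys, for illustration only.
gold = {
    "drug_a": "AAAAAAAAAAAAAA-BBBBBBBBSA-N",
    "drug_b": "CCCCCCCCCCCCCC-DDDDDDDDSA-N",
    "drug_c": "EEEEEEEEEEEEEE-FFFFFFFFSA-N",
    "drug_d": "GGGGGGGGGGGGGG-HHHHHHHHSA-N",
}
returned = {
    "drug_a": "AAAAAAAAAAAAAA-BBBBBBBBSA-N",
    "drug_b": "CCCCCCCCCCCCCC-UHFFFAOYSA-N",
    "drug_c": "ZZZZZZZZZZZZZZ-FFFFFFFFSA-N",
}
report = score_against_gold(gold, returned)
```

The `same_skeleton` bucket answers the slide's second question ("how does it differ?") only coarsely; a layer-by-layer comparison of the full InChI strings would be needed to say which feature is wrong.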
“The First 10”
Performance on 150 Drug Names
NPC Browser Set
Standardize
• Use the SRS as a guidance document for standardization
• Adjust as necessary to our needs
Nitro groups
Salt and Ionic Bonds
One dictionary lookup is never enough…
• ChemSpider does not contain all chemistry
• We are not the only ones curating data
• New chemistry expands daily and goes online
One dictionary lookup is never enough…
• Federation is key…
   • Check ChemSpider first; if not found, then
   • Check PubChem
   • Check NCI resolver
   • Check ChEBI
   • Check… the “network” of open interfaces
• Each resolver will have its own “quantitative confidence”.
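The fallback order above can be sketched as a chain of lookup functions tried in turn. This is a sketch under assumptions, not the implementation described in the talk: each resolver is represented by a hypothetical callable that returns a structure string or `None`, the in-memory dictionaries stand in for real web services, and the confidence numbers are invented placeholders for the per-resolver “quantitative confidence”.

```python
from typing import Callable, Optional, Sequence, Tuple

# A resolver takes a query string and returns a structure string or None.
Resolver = Callable[[str], Optional[str]]


def resolve_federated(
    query: str,
    resolvers: Sequence[Tuple[str, Resolver, float]],
) -> Optional[Tuple[str, str, float]]:
    """Try each (source, lookup, confidence) in order; return the first hit."""
    for source, lookup, confidence in resolvers:
        try:
            structure = lookup(query)
        except Exception:
            # A failing service should not stop the chain.
            continue
        if structure is not None:
            return source, structure, confidence
    return None


# Stand-in dictionaries; a real chain would call each service's interface.
first_source = {"compound one": "STRUCTURE-1"}
second_source = {"compound two": "STRUCTURE-2"}

chain = [
    ("first_source", first_source.get, 0.9),    # confidence values are made up
    ("second_source", second_source.get, 0.7),
]

hit = resolve_federated("compound two", chain)
miss = resolve_federated("compound three", chain)
```

Returning the source and its confidence alongside the structure lets the caller decide whether a hit from a lower-ranked resolver is good enough or needs review.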
Chemical Identifier Resolver (CIR)
• Converts a given structure identifier into another representation or structure identifier.
• Resolves names, identifiers, etc.

http://cactus.nci.nih.gov/chemical/structure
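As a rough illustration of how such a conversion request looks, the sketch below builds a CIR URL of the form `<base>/<identifier>/<representation>`, which is the pattern the service documents. The helper names are hypothetical, the representation keywords shown (`smiles`, `stdinchikey`) should be checked against the current CIR documentation, and the fetch function is included only for completeness and is not exercised here.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base URL as given on the slide.
CIR_BASE = "http://cactus.nci.nih.gov/chemical/structure"


def cir_url(identifier: str, representation: str) -> str:
    """Build a CIR request URL, percent-encoding the identifier."""
    return f"{CIR_BASE}/{quote(identifier, safe='')}/{representation}"


def cir_fetch(identifier: str, representation: str, timeout: float = 10.0) -> str:
    """Fetch the plain-text response (requires network access)."""
    with urlopen(cir_url(identifier, representation), timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


example = cir_url("vitamin K1", "smiles")
```

Percent-encoding matters because names contain spaces and SMILES strings contain characters such as `#` and `/` that would otherwise change the meaning of the URL.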
What can become a resolver?
We are building…
• A central federated resolver utilizing available services
• Dictionary lookups, systematic name conversions (multiple tools: ACD/Labs, Lexichem, OPSIN)
• “Consensus” decisions and guidance BUT
• Chemicals have timelines!!!
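One simple way to picture a “consensus” decision is a majority vote over the structures returned by several name-to-structure tools. The sketch below is an assumption about how that could work, not the federated resolver itself: it does not call ACD/Labs, Lexichem or OPSIN, the tool labels and keys are placeholders, and the `min_agree` threshold is a hypothetical parameter.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple


def consensus(candidates: Dict[str, Optional[str]],
              min_agree: int = 2) -> Tuple[Optional[str], List[str]]:
    """Return (agreed structure or None, tools whose answers need review)."""
    votes = Counter(s for s in candidates.values() if s is not None)
    ranked = votes.most_common()
    if not ranked:
        return None, []
    top, top_count = ranked[0]
    tied = len(ranked) > 1 and ranked[1][1] == top_count
    if top_count < min_agree or tied:
        # No clear winner: flag every tool that answered.
        return None, sorted(t for t, s in candidates.items() if s is not None)
    dissent = sorted(t for t, s in candidates.items()
                     if s is not None and s != top)
    return top, dissent


# Placeholder tool names and structure keys.
agreed = consensus({"tool_a": "KEY-1", "tool_b": "KEY-1", "tool_c": "KEY-2"})
unclear = consensus({"tool_a": "KEY-1", "tool_b": "KEY-2", "tool_c": None})
```

A vote like this only reports agreement at one moment; it does not capture the slide's caveat that the accepted structure of a chemical can change over time.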
ORIGINAL   FINAL
Thank you

Email: williamsa@rsc.org
Twitter: ChemConnector
Personal Blog: www.chemconnector.com
SLIDES: www.slideshare.net/AntonyWilliams
