Online Chemical Database with Modelling Environment
1. “Online chemical database with modeling environment”
a summer school course
Sergii Novotarskyi
Iurii Sushko
2. Chemoinformatics – overview of online resources
Chemical databases
1. PubChem — a database that provides information on the biological
activities of small molecules
2. ChemSpider — a free access service providing a structure centric
community for chemists
3. ChemIDplus — a tool that provides chemical structure, property, and
toxicity searching
4. ChemBank — a database of chemical structures and assays
5. ChemDB — a set of chemoinformatics tools
3. Chemoinformatics – overview of online resources
Literature databases
6. PubMed — a service that includes over 19 million citations from
MEDLINE and other life science journals for biomedical articles back to
1948
7. Toxicology Literature Online (TOXLINE) — references from toxicology
literature
8. ScienceDirect — a full-text scientific database offering articles/chapters
from more than 2,500 peer-reviewed journals and more than 10,000
books
9. ACS Publications — a worldwide scientific community with a collection
of the most cited peer-reviewed journals in the chemical and related
sciences.
4. Chemoinformatics – overview of online resources
PubChem – start page
URL: http://pubchem.ncbi.nlm.nih.gov/ or search Google for «PubChem»
16. Chemoinformatics – overview of online resources
PubMed – main page
URL: http://www.ncbi.nlm.nih.gov/pubmed/ or search Google for «PubMed»
17. Online chemical database with modeling environment
The subject of development
The web-based service
The database of physical, chemical and biological properties
Accumulating experimentally verified data
Providing user-friendly web-based access to this data
The QSPR modeling environment
Providing web-based tools for QSPR modeling
Storing and “publishing” created models
18. Online chemical database with modeling environment
Motivation
Our motivation
The importance of QSPR modeling
The importance of web-based tools for QSPR modeling
The importance of building one more service in this field
19. Online chemical database with modeling environment
Motivation - QSPR
Structure-property relationship hypothesis:
“Similar structures - similar properties”
[Figure: two pairs of similar structures. Top pair: log(IC50) = 1.87 log(µM) for both molecules. Bottom pair: log(IC50) = 0.64 log(µM) for one molecule and log(IC50) = ? for the other.]
QSPR modeling:
Predicting properties based on available data for structurally similar molecules.
Structures are represented by a set of descriptors (e.g. atom count, molecular weight).
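The QSPR idea above can be illustrated with a minimal nearest-neighbour sketch in Python. This is not the modeling method used by the service. The descriptor values, the `TRAINING_SET` name and the `predict` helper are assumptions invented for illustration; only the property values 1.87 and 0.64 echo the slide.

```python
import math

# Hypothetical training data: (atom count, molecular weight) -> log(IC50).
# The descriptor values are made up; only 1.87 and 0.64 come from the slide.
TRAINING_SET = [
    ((21.0, 290.3), 1.87),
    ((22.0, 304.3), 1.87),
    ((35.0, 480.6), 0.64),
]


def distance(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict(descriptors, training_set, k=1):
    """Average the property over the k nearest training molecules."""
    nearest = sorted(
        training_set, key=lambda item: distance(item[0], descriptors)
    )[:k]
    return sum(value for _, value in nearest) / len(nearest)


# A query molecule whose descriptors are close to the third training molecule.
query = (34.0, 466.6)
print(predict(query, TRAINING_SET))
```

In practice descriptors are usually scaled before distances are computed, since otherwise a large-valued descriptor such as molecular weight dominates the distance.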
20. Online chemical database with modeling environment
QSPR – Similarity in descriptor space
Number of specific fragments in a molecule
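When the descriptors are fragment counts, similarity can be sketched as a min/max (Tanimoto-style) ratio over the counts. The fragment names and counts below are invented for illustration, and `count_similarity` is a hypothetical helper, not a function of the service.

```python
def count_similarity(a, b):
    """Min/max similarity between two fragment-count dictionaries.

    Returns 1.0 for identical counts and 0.0 when no fragment is shared.
    """
    keys = set(a) | set(b)
    shared = sum(min(a.get(k, 0), b.get(k, 0)) for k in keys)
    total = sum(max(a.get(k, 0), b.get(k, 0)) for k in keys)
    return shared / total if total else 1.0


# Hypothetical fragment counts for three molecules.
mol_a = {"aromatic ring": 1, "hydroxyl": 1, "methyl": 2}
mol_b = {"aromatic ring": 1, "hydroxyl": 1, "methyl": 1}
mol_c = {"carboxyl": 1, "methyl": 1}

print(count_similarity(mol_a, mol_b))  # differ by one methyl group
print(count_similarity(mol_a, mol_c))  # share only one methyl group
```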
21. Online chemical database with modeling environment
Motivation - web-based tools for modeling
Main benefits of web-based tools:
Availability and accessibility
only a computer with Internet access and a modern web-browser required
to start working; possibility to share work materials among several
locations; works with any platform (Linux, Win, Mac)
Communication and collaboration
possibility to work on common topics, publish own results and use new
results of other people
22. Online chemical database with modeling environment
Motivation - one more web-based tool
Reasons to build one more service:
Different approach to data modification
a completely open database, any user can add, delete and edit data (only
constrained by a set of simple rules)
Different approach to data organization
data in the database is organized in a way suitable for QSPR modeling
Integration of a database with modeling tools
data from the database can be used for model creation and property
prediction
23. Online chemical database with modeling environment
Distinctive features
The features that make our service different:
“Wiki” approach to data handling
users can add, modify and delete data
Mandatory reference to an article
every record in the database must contain a reference to the article where
the data was published
Storing additional information
we store measurement conditions to increase data quality
Several tools to support decision making
integration with other web-services (validation of molecule names against
PubChem database, automatic fetching of article information from
PubMed), duplicate records management
Aimed at model building
convenient to build training sets from data - filter by property, article and
export data either to internal modeling tools or download as Excel file
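The features listed above (mandatory article reference, duplicate management, filtering by property and article) can be sketched in a few lines of Python. The field names, the duplicate rule and the sample records are all assumptions made for illustration; the service's real rules are not given in the slides.

```python
from collections import defaultdict


def validate(record):
    """Reject a record that lacks the mandatory article reference."""
    if not record.get("article"):
        raise ValueError("every record needs a reference to an article")
    return record


def duplicate_key(record):
    # Assumed rule: same molecule, property, conditions and article.
    return (
        record["smiles"],
        record["property"],
        tuple(sorted(record.get("conditions", {}).items())),
        record["article"],
    )


def find_duplicates(records):
    """Return groups of record indexes that share the same duplicate key."""
    groups = defaultdict(list)
    for index, record in enumerate(records):
        groups[duplicate_key(record)].append(index)
    return [indexes for indexes in groups.values() if len(indexes) > 1]


def training_set(records, prop, article=None):
    """Select (structure, value) pairs for one property, optionally one article."""
    return [
        (r["smiles"], r["value"])
        for r in records
        if r["property"] == prop and (article is None or r["article"] == article)
    ]


# Made-up records; "P1", "P2", "A1" and "A2" are placeholder identifiers.
records = [
    {"smiles": "CCO", "property": "P1", "value": 1.0, "article": "A1",
     "conditions": {"temperature": 25}},
    {"smiles": "CCO", "property": "P1", "value": 1.0, "article": "A1",
     "conditions": {"temperature": 25}},
    {"smiles": "c1ccccc1", "property": "P1", "value": 2.0, "article": "A2"},
    {"smiles": "CCO", "property": "P2", "value": 3.0, "article": "A2"},
]

print(find_duplicates(records))
print(training_set(records, "P1", article="A2"))
```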
25. Online chemical database with modeling environment
Simplified data structure
[Diagram of linked entities: Records, Properties, Conditions, Molecules, Users, Units, Articles, Journals]
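One possible reading of this data structure is sketched below as Python dataclasses. The entity names come from the slide, but every field and every link between entities is an assumption, since the slide text does not describe the relations. The example record reuses a row from the practical course (Mexiletine, PubMedID 9690950); the user login is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Journal:
    title: str


@dataclass
class Article:
    title: str
    pubmed_id: Optional[int] = None
    journal: Optional[Journal] = None


@dataclass
class Unit:
    name: str


@dataclass
class Property:
    name: str


@dataclass
class Molecule:
    name: str
    smiles: str = ""


@dataclass
class User:
    login: str


@dataclass
class Record:
    # Assumed links: a record ties a molecule and a property value
    # to the article it was published in and the user who entered it.
    molecule: Molecule
    prop: Property
    value: str
    article: Article
    introducer: User
    unit: Optional[Unit] = None
    conditions: Dict[str, str] = field(default_factory=dict)


record = Record(
    molecule=Molecule(name="Mexiletine"),
    prop=Property(name="CYP450 Modulation"),
    value="Inhibitor",
    article=Article(
        title="Involvement of CYP1A2 in mexiletine metabolism",
        pubmed_id=9690950,
    ),
    introducer=User(login="student"),  # hypothetical user
    conditions={"CYP450 Type": "CYP1A2"},
)
print(record.molecule.name, record.value, record.conditions)
```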
26. Online chemical database with modeling environment
User interface agreements
Browser-based interface
28. Online chemical database with modeling environment
User interface agreements
Icons
Edit current record (item, article, unit, etc.)
Delete current record
Open a record-specific submenu (in most places) or view a profile (in some places)
Open a wiki page with additional explanations
Send a message to the user
Download data in XLS format
Select item
29. Online chemical database with modeling environment
Summary
The database currently contains:
More than 50000 records
Around 285 properties
More than 2700 articles
31. Online chemical database with modeling environment
Practical course - outline
• Collection of data from original literature
• Use of publicly available tools for literature and chemical structure
lookup
• Introduction of data to OCHEM — single record
• Collection of data from benchmark literature
• Introduction of data to OCHEM — batch upload
32. Online chemical database with modeling environment
Practical course – collection of data – before we start
# | Article name | PubMedID | Compound name | Value
1 | | | |
2 | | | |
3 | | | |
4 | | | |
5 | | | |
34. Online chemical database with modeling environment
Practical course – data collection
# | Article name | PubMedID | Compound name | CYP Modulation
1 | Chemical genomics of cancer chemopreventive dithiolethiones | 19126641 | 3H-1,2-dithiole-3-thione | Inhibitor
1 | (same article) | 19126641 | 4-methyl-5-pyrazinyl-3H-1,2-dithiole-3-thione | Noninhibitor
1 | (same article) | 19126641 | 5-tert-butyl-3H-1,2-dithiole-3-thione | Noninhibitor
2 | Comprehensive in vitro analysis of voriconazole inhibition of eight cytochrome P450 (CYP) enzymes: major effect on CYPs 2B6, 2C9, 2C19, and 3A | 19029318 | Voriconazole | Noninhibitor
3 | Involvement of CYP1A2 in mexiletine metabolism | 9690950 | Mexiletine | Inhibitor
4 | Differential inhibition of cytochrome P450 isoforms by the protease inhibitors, ritonavir, saquinavir and indinavir | 9278209 | Indinavir | Noninhibitor
5 | An evaluation of potential mechanism-based inactivation of human drug metabolizing cytochromes P450 by monoamine oxidase inhibitors, including isoniazid | 16669850 | Clorgyline | Inhibitor
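Article details for the PubMedIDs collected above can also be looked up programmatically. The sketch below builds a request URL for the NCBI E-utilities ESummary endpoint; the helper names `esummary_url` and `fetch_summaries` are hypothetical, and the fetch function is shown but not called, since it needs network access.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

ESUMMARY = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi"

# PubMedIDs from the data collection table.
PUBMED_IDS = [19126641, 19029318, 9690950, 9278209, 16669850]


def esummary_url(pubmed_ids):
    """Build an ESummary URL that requests JSON summaries for the given IDs."""
    query = urlencode({
        "db": "pubmed",
        "id": ",".join(str(i) for i in pubmed_ids),
        "retmode": "json",
    })
    return f"{ESUMMARY}?{query}"


def fetch_summaries(pubmed_ids):
    """Download and parse the summaries (requires Internet access)."""
    with urlopen(esummary_url(pubmed_ids)) as response:
        return json.load(response)


print(esummary_url(PUBMED_IDS))
```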
35. Online chemical database with modeling environment
Practical course – data introduction – cheat sheet
Good chemistry lookup engine: PubChem (find URL in Google.com)
We search by name, and want to get structure
Convenient structure representation - SMILES
Property: CYP450 Modulation
Condition: CYP450 Type = CYP1A2
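The name-to-structure lookup on this cheat sheet can also be done through PubChem's PUG REST interface instead of the web page. The sketch below only builds the request URL and assembles one record; `smiles_lookup_url` and the record layout are hypothetical, and the property name `CanonicalSMILES` is an assumption worth checking against the current PUG REST documentation.

```python
from urllib.parse import quote

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def smiles_lookup_url(name, prop="CanonicalSMILES"):
    """Build a PUG REST URL that returns one property of a compound as text."""
    # quote() escapes commas and spaces that occur in chemical names.
    return f"{PUG_REST}/compound/name/{quote(name, safe='')}/property/{prop}/TXT"


# One record as it would be entered by hand; the SMILES field is left
# empty here and would be filled from the lookup result.
record = {
    "name": "Mexiletine",
    "smiles": "",
    "property": "CYP450 Modulation",
    "value": "Inhibitor",
    "conditions": {"CYP450 Type": "CYP1A2"},
    "pubmed_id": 9690950,
}

print(smiles_lookup_url(record["name"]))
```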
36. Online chemical database with modeling environment
Practical course – batch data introduction – template
• CASRN — CAS Registry Number
• SMILES — SMILES string
• NAME — molecule name
• ARTICLEID — article identifier (PubMed or OCHEM)
• PAGE — article page
• TABLE — article table
• LINE — article line
• COMMENT — text comment
• REFERENCE — record reference
• CYP450 Modulation — value of the property
• Unit — measurement unit of the property
• Accuracy — measurement accuracy
• Interval — measurement interval
• CYP450 Type — record condition
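A sheet with these columns can be generated from collected data with a short script. The sketch below writes CSV text, which is an assumption: the actual template is a downloadable file whose format is not described here. The column names come from the list above; `render_template` is a hypothetical helper, and writing the PubMed ID as a bare number in ARTICLEID is also an assumption.

```python
import csv
import io

COLUMNS = [
    "CASRN", "SMILES", "NAME", "ARTICLEID", "PAGE", "TABLE", "LINE",
    "COMMENT", "REFERENCE", "CYP450 Modulation", "Unit", "Accuracy",
    "Interval", "CYP450 Type",
]


def render_template(rows):
    """Return CSV text with the template header and one line per row.

    Columns missing from a row are written as empty cells.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# One row taken from the data collection exercise.
rows = [
    {
        "NAME": "Mexiletine",
        "ARTICLEID": "9690950",  # assumed way of writing a PubMed ID
        "CYP450 Modulation": "Inhibitor",
        "CYP450 Type": "CYP1A2",
    },
]

print(render_template(rows))
```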
37. Online chemical database with modeling environment
Practical course – batch data introduction – cheat sheet
• Article URL: http://tinyurl.com/rendic
• Article title: «Summary of information on human CYP enzymes:
human P450 metabolism data»
• Good chemistry lookup engine: PubChem (find URL in Google.com)
• We search by name, and want to get structure
• Convenient structure representation - SMILES
• Property: CYP450 Modulation
• Condition: CYP450 Type = CYP1A2
• Reference = 1
• ArticleID = Q1592
• Batch upload template URL: http://tinyurl.com/bu-template