A report presented in my BNF 216 (Database Design and Modeling for Bioinformatics) class regarding principles and tips to follow in designing biological databases.
1. How do you solve a problem like a
biological database?
(BNF 216 - Database Modeling and Design for Bioinformatics)
Arjei Balandra
Software Developer
National Telehealth Center
University of the Philippines – Manila
http://bumblebest.net
2. Database
• A database is a set of data that has a regular
structure and that is organized in such a way
that a computer can easily find the
desired information.
– The Linux Information Project
(http://www.linfo.org/database.html)
3. Biological Database
• Biological databases are libraries of life
sciences information collected from scientific
experiments, published literature, high-
throughput experiment technology, and
computational analyses.
– Wikipedia (en.wikipedia.org/wiki/Biological_database)
7. Why Database?
• Data-intensive techniques such as high-
throughput screening and gene expression
experiments demand methods to correlate
large and diverse datasets.
• Databases integrate information from a
variety of sources, allowing faster and more
powerful searches.
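A minimal sketch of the integration point above, assuming two hypothetical sources (an annotation table and an expression table) loaded into an in-memory SQLite database. All table names, column names, and values are made-up placeholders for illustration and are not taken from the report.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database holding rows from two hypothetical sources."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE gene_annotation (   -- hypothetical source 1: annotations
            gene_id TEXT PRIMARY KEY,
            symbol  TEXT NOT NULL,
            pathway TEXT
        );
        CREATE TABLE expression (        -- hypothetical source 2: expression results
            gene_id     TEXT REFERENCES gene_annotation(gene_id),
            sample      TEXT NOT NULL,
            fold_change REAL NOT NULL
        );
        -- The index keeps lookups by gene fast as the table grows.
        CREATE INDEX idx_expression_gene ON expression(gene_id);
    """)
    # Placeholder rows, not real measurements.
    conn.executemany("INSERT INTO gene_annotation VALUES (?, ?, ?)", [
        ("G1", "geneA", "pathway_X"),
        ("G2", "geneB", "pathway_X"),
        ("G3", "geneC", "pathway_Y"),
    ])
    conn.executemany("INSERT INTO expression VALUES (?, ?, ?)", [
        ("G1", "sample_1", 2.5),
        ("G2", "sample_1", 0.4),
        ("G3", "sample_1", 3.1),
    ])
    conn.commit()
    return conn

def upregulated_genes(conn, threshold=2.0):
    """Join both sources and return (symbol, pathway, fold_change) above threshold."""
    rows = conn.execute("""
        SELECT a.symbol, a.pathway, e.fold_change
        FROM expression AS e
        JOIN gene_annotation AS a ON a.gene_id = e.gene_id
        WHERE e.fold_change >= ?
        ORDER BY e.fold_change DESC
    """, (threshold,))
    return rows.fetchall()

conn = build_demo_db()
print(upregulated_genes(conn))
```

Once both sources share a key (here the assumed `gene_id`), one query can answer a question that would otherwise require cross-referencing two separate files by hand.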
9. Good Database Design
• Provides easy access to previous results.
• Supports both expert- and machine-guided
searches for novel correlations in data.
10. Bad Database Design
• Obfuscates the correlations for which the user
is searching.
• Makes it difficult for biologists to fit their data
into the database or to find previously stored
data, resulting in user contempt.
• Is ‘brittle’.
25. Keep the database scope manageable
• In biology, one size does not fit all
• Focus on a subset of biology (e.g., genes,
proteins)
• Within a large subset, work on one part at a time
• Inclusive
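A minimal sketch of a deliberately narrow scope, assuming the database covers only genes and their protein products. The schema, names, and sample rows are hypothetical and only illustrate the idea of starting with one subset of biology.

```python
import sqlite3

# Hypothetical schema limited to two entity types: genes and proteins.
SCHEMA = """
CREATE TABLE gene (
    gene_id INTEGER PRIMARY KEY,
    symbol  TEXT NOT NULL UNIQUE
);
CREATE TABLE protein (
    protein_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    gene_id    INTEGER NOT NULL REFERENCES gene(gene_id)
);
"""

def create_scoped_db():
    """Create an in-memory database containing only the gene/protein subset."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # make SQLite enforce the gene reference
    conn.executescript(SCHEMA)
    return conn

def table_names(conn):
    """List the user tables, i.e. the current scope of the database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]

conn = create_scoped_db()
# Placeholder rows, not real biological entities.
conn.execute("INSERT INTO gene (gene_id, symbol) VALUES (1, 'geneA')")
conn.execute("INSERT INTO protein (name, gene_id) VALUES ('proteinA', 1)")
print(table_names(conn))
```

Under this assumption, further entity types would be added later as separate tables, one at a time, instead of being modeled all at once.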
26. Tip #7: LISTEN TO THE PEOPLE WHO HAVE TO
WRITE AND USE THE INTERFACE
27. • Databases are successful only when people
use them
Users know what they want and need
+ Developers know what they can do
+ Designers know what must be done
---------------------------------------------------------
= Collaborative approach to develop a
successful database
32. References
• The Linux Information Project
(http://www.linfo.org/database.html)
• Nelson, M. R., Reisinger, S. J., Henry, S. (2003). Designing
databases to store biological information.
BIOSILICO, 1(4).
• Wikipedia (en.wikipedia.org/wiki/Biological_database)
• Lemer, C., Antezana, E., Couche, F., Fays, F., Santolaria,
X., Janky, R., … Wodak, S. J. (2004). The aMAZE
LightBench: a web interface to a relational database
of cellular processes. Nucleic Acids
Research, 32(Database issue), D443–D448.
doi:10.1093/nar/gkh139