2. • Introduction
• Supported and funded by
• History
• PDB Holdings list
• Member organizations
• Task forces
• PDB ID
• PDB File format
• Browse to WWW.RCSB.ORG/PDB/
3. “The repository (data bank) that stores the authenticated structures
of proteins and nucleic acids”
A single worldwide archive; hundreds of secondary databases categorize its
data differently.
A key resource in structural biology that stores 3D structural data of
large biological molecules such as proteins and nucleic acids.
Data are submitted by biologists and biochemists from all around the world, are
freely accessible on the Internet via the member organizations’ websites, and are
updated weekly.
The mission is to maintain a single Protein Data Bank archive of
macromolecular structural data.
4. The Protein Data Bank (PDB) is operated by:
Rutgers, The State University of New Jersey.
The San Diego Supercomputer Center at the University of California, San
Diego.
RCSB, the Research Collaboratory for Structural Bioinformatics.
The PDB is supported by funds from the National Science Foundation, the
Department of Energy, and the National Institutes of Health.
5. Two forces converged to initiate the PDB:
A small but growing collection of sets of protein structural data determined
by X-ray diffraction.
The molecular graphics display that became available in 1968 (the Brookhaven
Raster Display, BRAD), used to visualize protein structures in 3D.
In 1969, Dr Edgar Meyer began to write software to store atomic
coordinate files in a common format to make them available for geometric
and graphical evaluation.
In 1971, one of Dr Meyer’s programs, SEARCH, enabled networking, i.e., it
enabled researchers to access information from the database to study protein
structures offline.
6. In 1973, upon Walter Hamilton’s death, Dr Tom Koetzle took over direction of
the PDB for the next 20 years.
In the 1980s, IUCr guidelines were established, the number of deposited
structures increased, and independent biological databases were established,
e.g., the NDB.
In the 1990s, the mmCIF project was completed and structural genomics began.
In October 1998, the PDB was transferred to the Research Collaboratory for
Structural Bioinformatics (RCSB); the transfer was completed in 1999. Dr Helen M.
Berman of Rutgers University became the new director.
In 2003, with the formation of the wwPDB, the PDB became an international
organization with three member organizations.
In 2006, the BMRB joined the wwPDB.
8. The member organizations act as data deposition, data processing and
distribution centers for PDB data.
The three founding member organizations are:
PDBe: Protein Data Bank in Europe.
PDBj: Protein Data Bank in Japan.
RCSB: Research Collaboratory for Structural Bioinformatics.
The Biological Magnetic Resonance Data Bank (BMRB) joined later, in 2006.
The Worldwide Protein Data Bank (wwPDB) oversees the PDB.
wwPDB staff review and annotate each submitted entry, which is then automatically
checked for plausibility (the source code for the validation software is
available).
14. X-ray diffraction: the spacing of atoms is determined from the locations and
intensities of the spots that diffracted X-rays leave on a photographic plate,
e.g., lysozyme.
Limited to crystal structures.
NMR (about 15%, e.g., hemoglobin): estimates the distances between
pairs of atoms of the protein. The final conformation is obtained by solving
a distance geometry problem.
Also illuminates the dynamic side: conformational changes and protein folding.
17. Each structure published in the PDB receives a four-character alphanumeric
identifier (accession number), e.g., 1ANG or 4hhb.
However, a PDB ID cannot be used as an identifier for a biomolecule, because
several structures of the same molecule (in different environments or
conformations) are contained in the PDB under different PDB IDs.
Example: haemoglobin (2DN2)
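The point about IDs can be sketched in a few lines of Python. This is only an illustration: the pattern below assumes the classic form of a PDB ID (a digit from 1 to 9 followed by three letters or digits), and the function name and the small lookup table are hypothetical, not part of any PDB tool.

```python
import re

# Assumption: a classic PDB ID is a digit 1-9 followed by three alphanumerics.
# Newer, extended ID forms are deliberately not covered by this sketch.
CLASSIC_PDB_ID = re.compile(r"[1-9][A-Za-z0-9]{3}")


def normalize_pdb_id(text):
    """Return the ID in upper case, or raise ValueError if it looks malformed."""
    candidate = text.strip()
    if CLASSIC_PDB_ID.fullmatch(candidate) is None:
        raise ValueError(f"not a classic 4-character PDB ID: {text!r}")
    return candidate.upper()


# Hypothetical lookup table: two different entries describe the same molecule,
# so the molecule name cannot be recovered from (or replaced by) a single ID.
ENTRY_TO_MOLECULE = {"4HHB": "haemoglobin", "2DN2": "haemoglobin"}

if __name__ == "__main__":
    for raw in ("1ANG", "4hhb", "2dn2"):
        pdb_id = normalize_pdb_id(raw)
        print(pdb_id, ENTRY_TO_MOLECULE.get(pdb_id, "(not in this toy table)"))
```

Treating IDs case-insensitively, as done here, is a design choice of the sketch so that 4hhb and 4HHB refer to the same entry.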
18. Standard data representation: encoded in a data
dictionary. The metadata model supporting this
representation is used by all PDB data processing and
database software tools.
1. The PDB file format was initially restricted to 80 characters
per line.
2. In 1996, the macromolecular Crystallographic Information
File (mmCIF) format was introduced.
3. In 2005, an XML version, called PDBML, was
described.
19. The Protein Data Bank (PDB) file format is a textual
file format
describing the three-dimensional structures of
molecules held in the Protein Data Bank.
It provides description and annotation of structures, including:
atomic coordinates,
side chains,
secondary structure,
atomic connectivity,
water, ions, nucleic acids, ligands…
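Because the PDB format is column-based, reading atomic coordinates amounts to slicing each line at fixed positions. The sketch below assumes the commonly documented column layout of an ATOM record (for example, x, y and z in columns 31-38, 39-46 and 47-54); the sample record is built from made-up values purely for illustration, and the function name is hypothetical.

```python
# Build one 80-column ATOM record from illustrative (made-up) values, so that
# every field sits in the column range assumed by the parser below.
SAMPLE_LINE = (
    f"{'ATOM':<6}{1:>5} {' N':<4}{'':1}{'MET':>3} {'A':1}{1:>4}{'':1}   "
    f"{27.340:8.3f}{24.430:8.3f}{2.614:8.3f}{1.00:6.2f}{9.67:6.2f}"
    f"{'':10}{'N':>2}{'':2}"
)


def parse_atom_record(line):
    """Slice one ATOM/HETATM line into a dict using fixed column positions."""
    line = line.ljust(80)  # tolerate lines whose trailing blanks were stripped
    return {
        "record": line[0:6].strip(),      # columns 1-6
        "serial": int(line[6:11]),        # columns 7-11
        "atom_name": line[12:16].strip(), # columns 13-16
        "res_name": line[17:20].strip(),  # columns 18-20
        "chain_id": line[21],             # column 22
        "res_seq": int(line[22:26]),      # columns 23-26
        "x": float(line[30:38]),          # columns 31-38
        "y": float(line[38:46]),          # columns 39-46
        "z": float(line[46:54]),          # columns 47-54
        "occupancy": float(line[54:60]),  # columns 55-60
        "temp_factor": float(line[60:66]),  # columns 61-66
        "element": line[76:78].strip(),   # columns 77-78
    }


if __name__ == "__main__":
    print(SAMPLE_LINE)
    print(parse_atom_record(SAMPLE_LINE))
```

Splitting on whitespace instead of slicing would break as soon as two adjacent fields touch (for example, large coordinates), which is why the sketch relies on column positions.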
21. mmCIF is the acronym for the macromolecular Crystallographic
Information File.
mmCIF is based on a subset of the syntax rules for the Self-defining
Text Archive and Retrieval (STAR) file.
A Dictionary Description Language (DDL) defines the structure of
mmCIF dictionaries. Dictionaries provide the metadata which define
the content of mmCIF data files.
mmCIF data files, dictionaries and DDLs are all expressed in a
common syntax.
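To give a feel for that syntax, here is a minimal Python sketch that reads only the simplest mmCIF construct: a `data_` block header followed by `_category.attribute value` pairs. It is not a full mmCIF parser (no `loop_` tables, no multi-line text fields), the use of `shlex` for quoting is a simplification, and the sample block uses made-up values.

```python
import shlex

# Hypothetical data block with made-up values, for illustration only.
SAMPLE_BLOCK = """\
data_DEMO
# comment lines start with a hash
_entry.id        DEMO
_struct.title    'An illustrative title'
_cell.length_a   61.2
"""


def parse_simple_items(text):
    """Return (block_name, {category: {attribute: value}}) for key-value items."""
    block_name = None
    items = {}
    for raw_line in text.splitlines():
        # shlex handles quoted values and drops '#' comments (a simplification
        # of the real STAR/CIF quoting rules).
        tokens = shlex.split(raw_line, comments=True)
        if not tokens:
            continue
        head = tokens[0]
        if head.lower().startswith("data_"):
            block_name = head[len("data_"):]
        elif head.startswith("_") and len(tokens) == 2:
            category, _, attribute = head[1:].partition(".")
            items.setdefault(category, {})[attribute] = tokens[1]
    return block_name, items


if __name__ == "__main__":
    name, parsed = parse_simple_items(SAMPLE_BLOCK)
    print(name)
    print(parsed)
```

Values are kept as strings here; in a real workflow the dictionary would say which items are numbers, which is the role of the metadata described above.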
23. Basic information and more detailed
descriptions of the PDB, PDBML and mmCIF file formats
can be found on the Protein Data Bank web sites.
It is highly recommended to become familiar with all the rules of
the PDB format (such as the gaps between columns),
because…
25. Enter either a search term (for example, a protein name) or a PDB
ID
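Once a PDB ID is known, the corresponding coordinate file can usually be addressed directly. The sketch below only builds a download address and does not contact any server; the `files.rcsb.org/download/` pattern is an assumption that is not stated in these slides and should be checked against the RCSB site, and the function name is hypothetical.

```python
# Assumption: RCSB serves entry files under this base address, with ".pdb"
# for the legacy format and ".cif" for mmCIF. Verify before relying on it.
ASSUMED_BASE_URL = "https://files.rcsb.org/download"


def build_download_url(pdb_id, file_format="pdb"):
    """Return the assumed download address for one entry (no network access)."""
    if file_format not in ("pdb", "cif"):
        raise ValueError(f"unsupported format: {file_format!r}")
    return f"{ASSUMED_BASE_URL}/{pdb_id.strip().upper()}.{file_format}"


if __name__ == "__main__":
    print(build_download_url("4hhb"))
    print(build_download_url("2DN2", file_format="cif"))
```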
38. If the contents of the PDB are thought of as primary
data, then hundreds of derived (i.e., secondary) databases
categorize the data differently.
For example:
SCOP and CATH categorize structures according to type of structure and
assumed evolutionary relations;
GO categorizes structures based on genes.
40. The Structural Classification of Proteins (SCOP) database is a
largely manual classification of protein structural domains based
on similarities of their structures and amino acid sequences.
41. The four levels of the CATH hierarchy:
Class: the overall secondary-structure content of the domain.
Architecture: high structural similarity but no evidence of homology.
Topology: a large-scale grouping of topologies which share particular
structural features.
Homologous superfamily: indicative of a demonstrable evolutionary
relationship.
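The four levels above (their initials spell CATH) are commonly written as a dotted code with one number per level. The following Python sketch splits such a code into named fields; the dotted notation is an assumption not shown in these slides, the example code is used only to illustrate the format, and the type and function names are hypothetical.

```python
from typing import NamedTuple


class CathCode(NamedTuple):
    """One number per hierarchy level, in C.A.T.H order."""
    class_: int
    architecture: int
    topology: int
    homologous_superfamily: int


def parse_cath_code(code):
    """Split a dotted code such as '1.10.490.10' into its four levels."""
    parts = code.strip().split(".")
    if len(parts) != 4 or not all(part.isdigit() for part in parts):
        raise ValueError(f"expected four dot-separated numbers, got {code!r}")
    return CathCode(*(int(part) for part in parts))


if __name__ == "__main__":
    node = parse_cath_code("1.10.490.10")
    print(node)
    # Two domains share a topology when the first three numbers agree.
    other = parse_cath_code("1.10.490.20")
    print(node[:3] == other[:3])
```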
42. Pfam is a database of protein families that includes their
annotations and multiple sequence alignments generated using hidden
Markov models.
59. The text file can be viewed or modified in a text editor.
Structure files may be viewed using various free and commercial
visualization programs and web browser plug-ins, such as:
Open-source PDB software
Jmol
Molekel
MeshLab (able to import PDB data sets and build up surfaces from them)
QuteMol
Avogadro
Open but not free
PyMOL, RasMol, VisProt3DS and StarBiochem
60. The RCSB PDB website contains
an extensive list of both free and
commercial molecule visualization
programs and web browser plug-ins.
63. The PDB is the central archive of experimentally solved biomolecular
structures.
But it
only allows data retrieval
and does not provide collaboration or user feedback.
In contrast, PDBWiki allows for sharing expert knowledge
about structures deposited in the PDB.
It provides tools for discussing and annotating proteins in a
collaborative way.
64. The Protein Data Bank (PDB) is the central archive of
experimentally solved biomolecular structures. However, the
PDB only allows data retrieval and does not provide
functionality for collaboration or user feedback.
In contrast, PDBWiki allows for sharing expert knowledge
about structures deposited in the PDB. It provides tools for
discussing and annotating proteins in a collaborative way. The
goal is to create a central and freely accessible repository of
user-contributed information that will be useful for anyone
working with PDB structures. As such, PDBWiki can be
considered a part of a wider effort in community-based
biological database curation.