2. INTRODUCTION
DEFINITION OF BIOINFORMATICS
HISTORY
OBJECTIVES OF BIOINFORMATICS
TOOLS OF BIOINFORMATICS
o BIOLOGICAL DATABASES
o HOMOLOGY AND SIMILARITY TOOLS (SEQUENCE ALIGNMENT)
o PROTEIN FUNCTION ANALYSIS TOOLS
o STRUCTURAL ANALYSIS TOOLS
o SEQUENCE MANIPULATION TOOLS
o SEQUENCE ANALYSIS TOOLS
APPLICATION
CONCLUSION
REFERENCES
11-May-20
3. Bioinformatics is a recently emerged scientific discipline
concerned with the computational analysis and storage of biological
data. The word bioinformatics is derived from
two words:
Bio means biology.
Informatics (from the French ‘informatique’) means ‘data processing’.
Bioinformatics is the field in which biology, computer
science and information technology merge into a single
discipline for managing and analyzing biological data
using advanced computing techniques.
4. Taking all of this into account, bioinformatics can be
defined as the storage, analysis, and searching/retrieval
of data (e.g. nucleic acid sequences of genes and
RNAs, and amino acid sequences and structural information of
proteins).
Fredj Tekaia at the Institut Pasteur, Paris (France)
defined bioinformatics more precisely as the
mathematical, statistical and computing methods that
aim to solve biological problems using DNA and amino
acid sequences, and related information.
5. YEAR | SCIENTIST OR SUBJECT | HISTORICAL EVENT
1958 | Jack Kilby | The first integrated circuit (IC) was constructed.
1971 | Ray Tomlinson | The e-mail program was invented.
1974 | Vint Cerf & Robert Kahn | The concept of connecting networks of computers into an “internet” was developed, along with the Transmission Control Protocol (TCP).
1981 | PC | IBM introduced its Personal Computer to the market.
1984 | - | The Macintosh was announced by Apple Computer.
1986 | SWISS-PROT | The SWISS-PROT database was created by the Department of Medical Biotechnology of the University of Geneva and the European Molecular Biology Laboratory (EMBL).
1987 | HGI (Human Genome Initiative) | NIH NIGMS began funding genome projects.
1990 | BLAST | The BLAST program was implemented.
1991 | Birth of the term “Bioinformatics” | The term bioinformatics appeared in the scientific literature for the first time.
6. At its simplest level, the first objective of bioinformatics is to
organize data in a way that allows researchers to access existing
information and to submit new entries as they are produced, e.g. the
Protein Data Bank for 3D macromolecular structures.
The second key objective is to develop tools and resources that aid in
the analysis of data. For example, having sequenced a particular
protein, it is of interest to compare it with previously characterized
sequences.
The third objective is to use these tools to analyze the data and
interpret the results in a biologically meaningful manner. Traditionally,
biological studies examined individual systems in detail, and frequently
compared them with a few that are related. Bioinformatics extends this
kind of comparison to much larger collections of data.
7. Bioinformatics tools are software programs designed to
extract meaningful information from the mass of
molecular biology data held in biological databases and to carry out
sequence and structural analysis.
Once the databases had been established, tools became
available to search sequence databases.
The bioinformatics tools can be grouped into the
following categories:
a) Biological databases
b) Homology and similarity tools (sequence alignment tools)
c) Protein function analysis tools
d) Structural analysis tools
e) Sequence manipulation tools
f) Sequence analysis tools
8. Biological databases usually contain genomic, proteomic and
metabolic data. The data include nucleotide sequences of genes or
amino acid sequences of proteins.
Some of the major categories of biological databases are:
a) Major nucleotide sequence databases
b) Major mutation databases
c) Major gene expression databases
d) Major microbial genomic databases
e) Major organism-specific genome databases
f) Major protein databases
Examples include:
EMBL (European Molecular Biology Laboratory nucleotide sequence database at EBI, Hinxton, UK)
NDB (Nucleic Acid structure Database at Rutgers University, USA)
Entrez/Genome (NCBI, USA)
9. Homologous sequences are sequences that are
related by divergence from a common ancestor.
The degree of similarity between two
sequences can be measured, whereas homology
is either true or false.
This set of tools can be used to identify
similarities between novel query sequences of
unknown structure and function and database
sequences whose structure and function have
been elucidated.
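As a rough illustration of what "measuring similarity" can mean, the sketch below computes percent identity between two sequences that are assumed to be already aligned to the same length (with "-" marking gaps). The function name `percent_identity` is hypothetical and is not taken from any tool named in these slides; real similarity tools use scoring matrices and gap penalties rather than a plain match count.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned columns in which both sequences carry the same residue.

    Assumes the two sequences were aligned beforehand and have equal length.
    Columns where the residues match but are both gaps ("-") are not counted.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not seq_a:
        return 0.0
    pairs = zip(seq_a.upper(), seq_b.upper())
    matches = sum(1 for a, b in pairs if a == b and a != "-")
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    # 5 of the 7 columns match in this toy pair.
    print(round(percent_identity("GATTACA", "GACTATA"), 1))
```

A high percent identity is evidence for homology but does not prove it; the number only describes similarity.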
10. o BLAST (Basic Local Alignment Search Tool) is a program for
sequence similarity searching developed at the NCBI.
o It helps to identify genes and genetic features.
o It executes sequence searches against the entire DNA
database in less than 15 seconds.
o A BLAST search enables a researcher to compare a
query sequence with a database of sequences and
identify database sequences that resemble the query
sequence.
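The idea of comparing a query against a database of sequences can be sketched with a simple word (k-mer) lookup. This is only a toy illustration of the seeding idea behind fast similarity searches, not the BLAST algorithm itself, which also extends the seeds into local alignments and attaches statistical scores. The function names, the toy database and the word size `k` are all assumptions made for this example.

```python
from collections import defaultdict


def build_kmer_index(database: dict, k: int) -> dict:
    """Map every word of length k to the names of the database sequences containing it."""
    index = defaultdict(set)
    for name, seq in database.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return dict(index)


def seed_hits(query: str, index: dict, k: int) -> dict:
    """Count, per database sequence, how many words of the query it shares."""
    counts = defaultdict(int)
    for i in range(len(query) - k + 1):
        for name in index.get(query[i:i + k], ()):
            counts[name] += 1
    return dict(counts)


if __name__ == "__main__":
    # Hypothetical two-entry database, for illustration only.
    toy_db = {"seq1": "ACGTACGTGA", "seq2": "TTTTGGGGCC"}
    index = build_kmer_index(toy_db, k=4)
    print(seed_hits("CGTACG", index, k=4))
```

Database sequences that share many words with the query are the candidates worth aligning in detail, which is what makes this kind of search fast.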
11. FASTA is a DNA and protein sequence alignment software
package.
It is used for fast protein or nucleotide sequence comparison.
The program achieves a high level of sensitivity for
similarity searching at high speed.
12. This group of programs compares a protein
sequence against secondary protein databases that contain
information on motifs, signatures and protein domains.
InterProScan
Searches protein sequences against protein signature databases.
PPSearch
Searches protein sequences for motifs.
Radar
Detects protein repeats.
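A motif search of the kind described above can be sketched with a regular expression. The example uses the PROSITE N-glycosylation site pattern N-{P}-[ST]-{P} (asparagine, any residue except proline, serine or threonine, any residue except proline), hand-translated into Python regex syntax. The function name `find_motif` and the test peptide are made up for illustration; this is not how InterProScan or PPSearch are implemented.

```python
import re

# PROSITE pattern N-{P}-[ST]-{P}, written as a regular expression.
# The lookahead (?=...) lets overlapping occurrences be reported as well.
N_GLYCOSYLATION = re.compile(r"(?=(N[^P][ST][^P]))")


def find_motif(protein: str, pattern=N_GLYCOSYLATION) -> list:
    """Return (1-based start position, matched residues) for every motif occurrence."""
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(protein.upper())]


if __name__ == "__main__":
    # Hypothetical peptide: the N at position 8 is followed by P, so it is skipped.
    print(find_motif("MKNGSAPNPTQ"))
```

A short pattern like this matches by chance fairly often, so a hit marks a candidate site rather than a confirmed one.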
13. This set of tools compares structures against databases of
known structures. Determining a protein’s 2D/3D
structure is crucial to the study of its functions.
RasMol
It is a powerful research tool for displaying the structure of biological
macromolecules such as DNA and proteins, as well as smaller molecules.
PROSPECT (PROtein Structure Prediction and Evaluation
Computer Toolkit)
It is a protein structure prediction system that employs a
computational technique called protein threading to construct a
protein 3-D model.
COPIA (Consensus Pattern Identification and Analysis)
It is a protein structure analysis tool for discovering motifs in a
family of protein sequences. Such motifs can then be used to
determine whether new protein sequences belong to the family and to
predict the secondary and tertiary structures and functions of proteins.
14. These are software programs for analyzing and formatting
DNA and protein sequences.
RepeatMasker
It is a program that screens DNA sequences for interspersed
repeats.
Webcut
It is an online tool for restriction analysis, silent
mutation analysis, and SNP analysis.
Translate
It is a tool that translates a nucleotide
sequence into a protein sequence.
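Translating a nucleotide sequence into a protein sequence can be sketched with the standard genetic code. The example below is a minimal illustration and not the implementation of the Translate tool: it reads only the first reading frame, and the function name `translate` and its `to_stop` parameter are assumptions made for this example.

```python
from itertools import product

# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides: str, to_stop: bool = False) -> str:
    """Translate reading frame 1 of a DNA or RNA sequence into one-letter amino acids.

    Stop codons are written as "*"; codons with unknown bases become "X".
    Trailing bases that do not fill a complete codon are ignored.
    """
    seq = nucleotides.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*" and to_stop:
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATGGCCTAA"))                 # methionine, alanine, stop
    print(translate("ATGGCCTAAGGG", to_stop=True))
```

A full translation tool would normally report all six reading frames (three per strand); this sketch covers only one.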
15. This set of tools allows further, more detailed
analysis of a query sequence, including evolutionary
analysis and identification of mutations.
Align
This tool is used to compare two sequences.
DNA Scanner
It is a tool that scans DNA for a number of different
properties, such as biophysical properties and the potential for
protein interaction.
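Comparing two sequences is usually done by aligning them. The slides do not say which algorithm the Align tool uses, so the sketch below substitutes a textbook technique, Needleman-Wunsch global alignment, and returns only the optimal score. The scoring values (match +1, mismatch -1, gap -2) and the function name are assumptions chosen for illustration.

```python
def global_alignment_score(a: str, b: str,
                           match: int = 1, mismatch: int = -1, gap: int = -2) -> int:
    """Optimal Needleman-Wunsch global alignment score of sequences a and b.

    Uses a linear gap penalty and keeps only two rows of the dynamic
    programming table, so memory use grows with len(b) only.
    """
    # Row 0: aligning the empty prefix of a against each prefix of b costs only gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of a against the empty prefix of b
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            gap_in_b = prev[j] + gap       # a[i-1] aligned to a gap
            gap_in_a = curr[j - 1] + gap   # b[j-1] aligned to a gap
            curr.append(max(diagonal, gap_in_b, gap_in_a))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("GATTACA", "GATTACA"))
    print(global_alignment_score("ACGT", "ACT"))
```

Recovering the alignment itself, and not just its score, would require keeping the full table and tracing back from the last cell.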
16. Some of the applications related to biological
information analysis are:
Bioinformatics is used in primer design.
Bioinformatics is used to attempt to predict the function
of actual gene products.
Molecular modeling/structural biology is a growing field
which can be considered part of bioinformatics.
There are other fields, for example medical imaging and
image analysis, that might be considered part of
bioinformatics. There is also a whole other discipline of
biologically inspired computation: genetic algorithms, AI,
neural networks, etc.
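For the primer design application, two quantities that are commonly checked are GC content and melting temperature. The sketch below uses the Wallace rule, Tm = 2(A+T) + 4(G+C) in degrees Celsius, which is a rough estimate intended for short oligonucleotides only. The function names and the example primer are hypothetical; dedicated primer design software relies on more accurate thermodynamic models.

```python
def gc_content(primer: str) -> float:
    """Percentage of G and C bases in the primer."""
    seq = primer.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer: str) -> int:
    """Approximate melting temperature in degrees Celsius by the Wallace rule."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    primer = "ATGCATGCATGCATGC"  # made-up 16-mer, for illustration only
    print(gc_content(primer), wallace_tm(primer))
```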
17. Bioinformatics builds on the recognition of the
importance of information transmission, accumulation
and processing in biological systems.
Software tools for bioinformatics range from simple
command-line tools to more complex graphical programs
and standalone web services available from various
bioinformatics companies or public institutions.