2. BIOMART
• Developed jointly by EBI & CSHL
• BioMart is a query tool that retrieves
several kinds of data at once and
returns them as a table.
• For example: human gene IDs,
chromosome and base-pair position
• No programming required!
3. BIOMART
• A wide variety of analyses and
tasks:
  • SNP (single nucleotide polymorphism) selection for candidate gene screening
  • microarray annotation
  • recovery of disease links, sequence variations and expression patterns
4. General or Specific Data-Tables
• All the genes for one species
• Or… only genes on one specific
region of a chromosome
• Or… genes on one region of a
chromosome associated with a
disease
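The narrowing described above can be sketched on a toy in-memory table. The gene records below are invented placeholders (TOY1 to TOY4), not real annotations, and `select` is a hypothetical helper, not a BioMart function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    symbol: str            # placeholder symbol, not a real gene
    chromosome: str
    band: str
    has_disease_link: bool


# Invented rows, for illustration only.
GENES = [
    Gene("TOY1", "X", "q28", True),
    Gene("TOY2", "X", "q28", False),
    Gene("TOY3", "X", "p11", True),
    Gene("TOY4", "7", "q31", True),
]


def select(genes, chromosome=None, band=None, disease_only=False):
    """Keep only the genes that satisfy every restriction that was given."""
    result = []
    for gene in genes:
        if chromosome is not None and gene.chromosome != chromosome:
            continue
        if band is not None and gene.band != band:
            continue
        if disease_only and not gene.has_disease_link:
            continue
        result.append(gene)
    return result


all_genes = select(GENES)                                  # general table
region = select(GENES, chromosome="X", band="q28")         # one region
region_disease = select(GENES, chromosome="X", band="q28",
                        disease_only=True)                 # region + disease
print(len(all_genes), len(region), len(region_disease))
```

Each extra restriction can only shrink the table, which is why the more specific tables are subsets of the general one.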
5. BioMart Data Sets
• Ensembl genes
• Vega genes
• SNPs
• Markers
• Phenotypes
• Gene expression information
• Gene ontology
• Homology predictions
• Protein annotation
10. BioMart Walkthrough
• The human glucose-6-phosphate
dehydrogenase (G6PD) gene is located on
chromosome X, in cytogenetic band q28.
• Which other genes associated with
human diseases are located in the same
band?
• What are their Ensembl Gene IDs and
Entrez Gene IDs?
• What are their cDNA sequences?
11. Information Flow
• Choose the species of interest
(Dataset)
• Decide what you would like to know
about the genes (Attributes)
(sequences, IDs, description…)
• Narrow down to a smaller gene set
using Filters.
(enter IDs, choose a region …)
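Although the web interface needs no programming, the same three choices (Dataset, Attributes, Filters) map onto BioMart's XML query format. The sketch below only builds the query text and does not contact any server. `build_query` is a hypothetical helper, and the dataset, filter and attribute names used in the example are assumptions that may differ between marts and releases.

```python
import xml.etree.ElementTree as ET


def build_query(dataset, attributes, filters):
    """Build a BioMart-style XML query string from the three choices."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",   # tab-separated result table
        "header": "0",
        "uniqueRows": "1",
    })
    # Dataset: the species of interest
    ds = ET.SubElement(query, "Dataset",
                       {"name": dataset, "interface": "default"})
    # Filters: what we already know (restrict the gene set)
    for name, value in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    # Attributes: what we want to know (output columns, in this order)
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


# Assumed names: check the mart's own filter and attribute lists.
example = build_query(
    dataset="hsapiens_gene_ensembl",
    attributes=["ensembl_gene_id", "entrezgene_id"],
    filters={"chromosome_name": "X", "band_start": "q28", "band_end": "q28"},
)
print(example)
```

A string like this is what would typically be submitted as the `query` parameter to a mart's web service; the submission step is left out here on purpose.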
14. Filters: what we know
On the left, narrow the gene set by clicking “Filters”.
Next to “REGION”, click the “+” to expand the choices.
17. Limit to genes “with MIM disease ID”.
These associations are taken from OMIM
(Online Mendelian Inheritance in Man),
which assigns the MIM IDs.
18. The filters have determined our gene set.
Click “Count” to see how many genes pass these filters.
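For scripted use, a count can be requested by setting `count="1"` on the `Query` element of a BioMart XML query, in which case the service is expected to reply with a single number instead of a table. The sketch below works offline. `as_count_query` and `parse_count` are hypothetical helpers, the filter and attribute names are assumptions, and the reply format should be checked against the mart you use.

```python
import xml.etree.ElementTree as ET

# A small query, written out by hand; names are assumed, not verified.
QUERY_XML = """<Query virtualSchemaName="default" formatter="TSV" header="0" uniqueRows="1" count="">
  <Dataset name="hsapiens_gene_ensembl" interface="default">
    <Filter name="chromosome_name" value="X"/>
    <Attribute name="ensembl_gene_id"/>
  </Dataset>
</Query>"""


def as_count_query(query_xml):
    """Return a copy of the query that asks for a count, not for rows."""
    root = ET.fromstring(query_xml)
    root.set("count", "1")
    return ET.tostring(root, encoding="unicode")


def parse_count(response_text):
    """Interpret a count reply, assumed to be one integer on a line."""
    return int(response_text.strip())


count_query = as_count_query(QUERY_XML)
print(count_query)
```

Counting first is a cheap way to confirm that the filters are neither too loose nor too strict before downloading the full table.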
19. The “Count” results show that 26 of
56478 human genes passed the filters.
Click on “Attributes” to select output options
(i.e. what we would like to know about our gene set).
21. Select, along with the default options,
“Associated Gene name” (this shows the gene symbol from HGNC).
Note the summary of selected options. The order of attributes
determines the order of columns in the result table.
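Because column order follows attribute order, an exported tab-separated table without a header row can be labelled by pairing each column with the attribute list used for the query. This is a minimal sketch: `rows_as_dicts` is a hypothetical helper, the attribute names are assumptions, and the row values are placeholders rather than real identifiers.

```python
import csv
import io


def rows_as_dicts(tsv_text, attributes):
    """Label each column of a header-less TSV with its attribute name.

    Relies on the columns arriving in the same order as `attributes`.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    return [dict(zip(attributes, row)) for row in reader if row]


# Attribute order as it would be chosen in the query (assumed names).
attributes = ["ensembl_gene_id", "external_gene_name"]

# Placeholder result text; a real export would contain real IDs and symbols.
tsv_text = "ID_PLACEHOLDER_1\tSYMBOL_1\nID_PLACEHOLDER_2\tSYMBOL_2\n"

for record in rows_as_dicts(tsv_text, attributes):
    print(record["external_gene_name"], record["ensembl_gene_id"])
```

If the attributes are reordered in the query, the list passed to `rows_as_dicts` has to be reordered the same way, otherwise the labels end up on the wrong columns.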
32. Many BioMarts have now been
installed by external groups, in
large part because of BioMart's
automated deployment tools and
compatibility across different
platforms. These groups include
model organism databases such as
Gramene, dictyBase and WormBase,
as well as the HapMap variation
resource.