Hertweck AB3ACBS presentation
1. Plant systematics to cancer biology: Transferrable skills and evolutionary thinking in bioinformatics
Kate L Hertweck
The University of Texas at Tyler
Department of Biology
Twitter @k8hert
2. I am...
...an educator and researcher.
...an evolutionary biologist.
...a data-driven bioinformaticist.
...committed to reproducible science.
4. Outline:
1. Evolution in monocots
2. Population genomics in Drosophila
3. Biomarkers in cancer
Objectives:
1. Identify associations between genomic and organismal variation
2. Consider opportunities for transferring bioinformatic skills among model systems
Image: Biodiversity Heritage Library
5. Can we use genomic data to determine relationships among species and identify patterns of genomic evolution across deep time?
6. Monocots are a delicious and diverse model system
● ca. 60,000 species, many edible and ornamental
● Variation in traits
  ● life history: growth habit, habitat
  ● genome: size, chromosome number, ploidy
● Few genomic resources except in grasses
Darlington 1963
Asparagus from user Evan-Amos
Allium from user Ram-Man
Iris from user Bob Gutowski
Allium, Bozzini 1964
7. Monocots exhibit varying rates of evolution and shifts in diversification rates
Hertweck et al., 2015 Bot J Linn Soc
● Data: Eight loci from three genomic partitions (mt, cp, nuclear; including one low-copy nuclear gene)
● Analysis: tree-building with RAxML, divergence time analysis with r8s and multidivtime, diversification with MEDUSA and apTreeShape
[Figure legend: fossil calibration; species-rich lineage (MEDUSA); species-poor lineage (MEDUSA); A: apTreeShape]
8. Steele, Hertweck, Mayfield, McKain, Leebens-Mack, and Pires, 2012 AJB
● Data: Genomic survey sequences (GSS; anonymous, low-coverage NGS data)
● Analysis: plastome and mt/nrDNA assembly, tree building with PAUP and Garli
● Used less than 10% of the data collected!
Plastid genomes resolve relationships in Asparagales
[Figure: Asparagales phylogeny. Tips: Doryanthaceae, Iridaceae, Xeronemataceae, Hemerocallidoideae, Xanthorrhoeoideae, Asphodeloideae, Agapanthoideae, Allioideae, Amaryllidoideae, Aphyllanthoideae, Lomandroideae, Asparagoideae, Nolinoideae, Agavoideae, Scilloideae, Brodiaeoideae. Family labels: Xanthorrhoeaceae, Agapanthaceae, Asparagaceae. Legend entries: "* increase in bootstrap support", "* problematic"]
9. Transposable elements are an underappreciated source of genomic variation
● Transposable elements (TEs): mobile genetic elements or jumping genes
  ● Independently replicating
  ● Similar to or derived from viruses
  ● Occur in multiple copies throughout the genome
● TEs are an important driver in genomic evolution
  ● Interactions with genes
  ● Genome-wide modifications
  ● Source of mutation on which natural selection can act
Approach: assembly of TEs from GSS
● contigs are consensus of most abundant TEs in the genome
● TEs must exist in high copy to have sufficient reads for detection (assembly)
● the older a TE insertion, the more likely it has accumulated mutations which will inhibit detection
● data presented as percentage of TE type in nuclear genome (relative abundance)
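The "relative abundance" measure in the last bullet can be sketched as follows. This is a minimal illustration, not the pipeline used in the talk: it assumes each nuclear read has already been given a TE label (or `None` when unannotated), and the function name, labels, and toy input are hypothetical.

```python
from collections import Counter

def relative_abundance(read_annotations):
    """Return {TE type: percentage of nuclear reads assigned to that type}.

    read_annotations: one entry per nuclear read, either a TE type label
    (e.g. "copia") or None when the read has no TE annotation.
    """
    total = len(read_annotations)
    if total == 0:
        raise ValueError("no nuclear reads supplied")
    # Count only annotated reads; the denominator stays all nuclear reads.
    counts = Counter(label for label in read_annotations if label is not None)
    return {te_type: 100.0 * n / total for te_type, n in counts.items()}

# Hypothetical toy input (10 reads), NOT data from the study.
toy_reads = ["copia", "gypsy", None, "gypsy", None,
             None, None, None, None, None]
print(relative_abundance(toy_reads))
```

Because the denominator is all nuclear reads, the percentages describe share of sequenced reads, which is why low-copy or highly diverged TEs are underrepresented, as the bullets note.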
Heslop-Harrison et al, 1997
10. Hertweck, 2013, Genome
TE content does not vary with genome size
[Figure: taxa shown are Aphyllanthes, Lomandra, Sansevieria, Asparagus, Ledebouria, Dichelostemma, Agapanthus, Allium, Haworthia, Hosta, Scadoxus. Axis: percentage of sequence reads from nuclear genome (0% to 70%); second scale 0 to 25,000]
One of largest genomes in dataset, but very small proportion of repeats!
● Data: Previously published GSS data
● Analysis: assembly with MaSuRCA, BLAST to remove organellar sequences, annotate with RepeatMasker
● Inconsistent with hypothesis that TE proliferation is related to an increase in genome size
[Figure axis: Genome size (Mb/1C)]
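One way to ask whether TE content tracks genome size is a rank correlation across taxa. The slide does not say which statistic, if any, was used, so the sketch below is only an assumption: a plain-Python Spearman coefficient with average ranks for ties. The input values in the usage line are placeholders, not measurements from the study.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / (var_x ** 0.5 * var_y ** 0.5)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two equal-length sequences of at least 3 values")
    return pearson(average_ranks(x), average_ranks(y))

# Placeholder values only (genome size in Mb/1C, TE percentage).
genome_size = [1000, 5000, 12000, 20000]
te_percent = [40.0, 15.0, 55.0, 10.0]
print(round(spearman(genome_size, te_percent), 3))
```

A coefficient near zero would be in line with the slide's conclusion; a value near +1 would instead support TE proliferation scaling with genome size.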
11. Agavoideae TEs are difficult to annotate but appear to vary with ploidy
[Figure annotation: tetraploid, largest (known) genome in dataset]
● Data: GSS from Agavoideae (tequila)
● Analysis: additional annotation methods with CDD
● Agavoideae TEs are particularly difficult to sequence
● CDD more than doubles identifiable sequence!
Agave tequilana from user Stan Shebs
12. Allium has much lower proportions of copia LTR retrotransposons than closely related genera
[Figure legend: copia, gypsy; groups: Allium, other Allioideae]
● Data: GSS from Allioideae (onion, garlic, leek)
● Allium has 800+ species, related genera have relatively few
● Low proportion of copia counter to expectations of diversification from TE expansion
Allium senescens from user Adamantios
13. Conclusions: Evolution in monocots
Can we use genomic data to determine relationships among species and identify patterns of genomic evolution across deep time?
● Monocot phylogenetics
  ● Unlinked loci from across the genome provide the framework for diversification analyses
  ● Complete plastomes resolve Asparagales relationships
● Asparagales TEs
  ● GSS can suggest what parts of the genome may be interesting for further investigation
14. 1. Evolution in monocots
2. Population genomics in Drosophila
3. Biomarkers in cancer
Collaborators:
Michael R. Rose (UC Irvine)
Joseph L. Graves (NC A&T, UNCG)
D. melanogaster male from user Aka
16. Experimental evolution in Drosophila results in parallel responses to selection for time to development
Long term experimental evolution system (established 1980) with the following treatments:
A short life cycle (9 days)
B baseline life cycle (14 days)
C long life cycle (28 days)
● Data: Whole-genome pooled population resequencing, three selection types, six treatments, five populations each
● Analysis: phenotypes, SNPs, structural variants, TEs
[Figure labels: NCO, BO, AO, CO, ACO, B]
17. Populations with accelerated development have higher TE load
[Figure: treatments B, C, A]
● Analysis: Identification of per-population TE load using PoPoolation TE
● Within-treatment TE load is not significantly different (p>0.05)
● Between-treatment TE load does differ
● Consistent with expectation that TEs are more tightly controlled in populations with longer life spans
18. Heterozygosity of TE insertions is higher in populations with accelerated development
[Figure: treatments B, C, A]
● Analysis: T-lex to identify insertion frequencies for TEs compared to reference genome
● Within-treatment TE load is not significantly different (p>0.05)
● Between-treatment TE load does differ
● Consistent with expectation that A-type selection is more intense
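The slide does not state how heterozygosity was derived from the T-lex insertion frequencies. A common choice for a presence/absence polymorphism is the expected heterozygosity 2p(1 − p), where p is the population frequency of the insertion; the sketch below assumes that definition, and the function names are hypothetical.

```python
def expected_heterozygosity(freq):
    """Expected heterozygosity 2p(1 - p) for one TE insertion site.

    freq: estimated population frequency of the insertion, in [0, 1].
    """
    if not 0.0 <= freq <= 1.0:
        raise ValueError("frequency must lie between 0 and 1")
    return 2.0 * freq * (1.0 - freq)

def mean_heterozygosity(freqs):
    """Average expected heterozygosity across insertion sites."""
    freqs = list(freqs)
    if not freqs:
        raise ValueError("no insertion frequencies supplied")
    return sum(expected_heterozygosity(p) for p in freqs) / len(freqs)

# Fixed (p = 1) or absent (p = 0) insertions contribute nothing;
# intermediate frequencies contribute the most (maximum 0.5 at p = 0.5).
print(mean_heterozygosity([0.5, 0.0, 1.0, 0.5]))
```

Under this definition, a population with many insertions segregating at intermediate frequency scores higher than one where insertions are mostly fixed or absent.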
19. Between-treatment comparisons have more significantly differentiated TEs
● Analysis: T-lex to identify insertion frequencies for TEs compared to reference genome followed by CMH test
● 177 insertions vary in frequency between two or more populations
● 91 insertions were significantly differentiated among at least one treatment comparison
● Within-treatment comparisons have few to no significantly differentiated TEs
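For readers unfamiliar with the CMH (Cochran-Mantel-Haenszel) test named above, here is a minimal plain-Python sketch for stratified 2×2 tables, without continuity correction. How the study built its tables is not given on the slide; the layout assumed here (rows = two treatments, columns = evidence for insertion present vs. absent, one table per replicate population pair) is an illustration only.

```python
import math

def cmh_test(tables):
    """Cochran-Mantel-Haenszel test for K stratified 2x2 tables.

    tables: iterable of (a, b, c, d), one per stratum, laid out as
            [[a, b],
             [c, d]]
    Returns (statistic, p_value); the statistic is compared with a
    chi-square distribution with 1 degree of freedom.
    """
    sum_a = sum_expected = sum_variance = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n < 2:
            continue  # stratum carries no information
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        sum_a += a
        sum_expected += row1 * col1 / n
        sum_variance += row1 * row2 * col1 * col2 / (n * n * (n - 1))
    if sum_variance == 0:
        raise ValueError("no informative strata")
    statistic = (sum_a - sum_expected) ** 2 / sum_variance
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

# Hypothetical counts for one insertion in two replicate pairs.
stat, p = cmh_test([(10, 20, 20, 10), (10, 20, 20, 10)])
print(round(stat, 3), p < 0.001)
```

Stratifying by replicate means a consistent frequency difference across replicates adds up, while differences that point in opposite directions cancel, which fits the within- versus between-treatment contrast on this slide.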
20. Conclusions: Population genomics of Drosophila
Do populations experimentally selected for specific phenotypes yield similar genomic patterns?
● Yes, with evidence from across the genome
● Many types of TEs are responding to selective pressures
● Comparisons of treatment types show parallel response to selection
● These data are a powerful tool for continuing to assess TE responses to selection at a genomic level
21. 1. Evolution in monocots
2. Population genomics in Drosophila
3. Biomarkers in cancer
Collaborator:
Santanu Dasgupta (UT Health Northeast)
Philley et al, 2015, J Cell Phys
22. Can we integrate genomic data with experimental studies to identify biomarkers and cancer pathways?
23. Background
● Both detection and treatment of cancer remain problematic because of complex and heterogeneous genetics
● Integration of NGS analysis with traditional wet lab work can inform the relevance of particular genetic variants and be used for biomarker development
Philley et al, 2015, J Cell Phys
24. Haplotype phylogeny identifies variants potentially linked to cancer
Philley et al, 2015, J Cell Phys
[Figure legend: turquoise = heteroplasmy; @ = reversion]
● Data: mitochondrial genome sequencing from prostate cancer patients
● Analysis: Variant calling, haplotypes assigned with HaploGrep and PhyloTree
● Differentiates variants due to common ancestry from variants possibly related to cancer
25. Somatic mutations inform analyses in genes of interest for HNSCC
● Data: whole-genome NGS data from paired tumor/non-tumor tonsil tissue (HPV-induced head/neck squamous cell carcinoma)
● Analysis: Variant calling, filter for only somatic variants, mine genes of interest
● Provides the genetic context to match with protein expression studies
● Opportunities for data re-use to examine evolutionary questions
Kannan, Hertweck et al., in review
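The "filter for only somatic variants, mine genes of interest" step can be sketched as a set difference followed by a coordinate lookup. This is an assumption about the general idea, not the study's actual workflow: real pipelines also account for read depth, allele fraction, and caller-specific filters. The `Variant` type, function names, chromosome label, gene name, and coordinates below are all hypothetical.

```python
from typing import NamedTuple

class Variant(NamedTuple):
    chrom: str
    pos: int   # 1-based position
    ref: str
    alt: str

def somatic_variants(tumor, normal):
    """Keep tumor variants that are absent from the paired normal sample."""
    normal_set = set(normal)
    return [v for v in tumor if v not in normal_set]

def in_genes_of_interest(variants, gene_regions):
    """Group variants by gene.

    gene_regions: {gene name: (chrom, start, end)}, inclusive 1-based bounds.
    Genes without any overlapping variant are omitted from the result.
    """
    hits = {}
    for gene, (chrom, start, end) in gene_regions.items():
        found = [v for v in variants
                 if v.chrom == chrom and start <= v.pos <= end]
        if found:
            hits[gene] = found
    return hits

# Hypothetical toy calls and a made-up gene region, NOT study data.
tumor = [Variant("chrA", 100, "A", "G"), Variant("chrA", 250, "C", "T")]
normal = [Variant("chrA", 100, "A", "G")]  # germline: also in normal tissue
somatic = somatic_variants(tumor, normal)
print(in_genes_of_interest(somatic, {"GENE_X": ("chrA", 200, 300)}))
```

The paired design is what makes the set difference meaningful: any variant shared with the patient's own non-tumor tissue is treated as inherited rather than tumor-specific.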
26. Conclusions: Biomarkers in cancer
Can we integrate genomic data with experimental studies to identify biomarkers and cancer pathways?
● Paired tumor/normal samples are a powerful tool for identifying variants related to multiple types of cancer
● The integration of genomic data with wet-lab work contributes to both biomarker development and elucidation of cancer pathways
● Evolutionary thinking is valuable for interpreting integrative studies
27. General conclusions
● You can answer really interesting questions about evolutionary biology by combining NGS data with other types of biological information
● Skills to assess variation in large datasets are very transferrable and offer great opportunity for novel research approaches
Goal: Relate genomic variation to organismal function and evolution to understand complex traits.
1. Evolution in monocots
2. Population genomics in Drosophila
3. Biomarkers in cancer
28. Considerations for diversifying your research
● Learning reproducible science skills is well worth your time!
● Find a community.
● Be prepared to spend lots of time managing and organizing data.
● Choose collaborations carefully, but don't be afraid to branch out.
Image by Sugar Research Australia
Hibiscus dasycalyx by user Sesamehoneytart
Clostridium acetobutylicum by user Geoman3