Transcription factors bind in a combinatorial fashion to specify the on-and-off states of genes; the ensemble of
these binding events forms a regulatory network, constituting the wiring diagram for a cell. To examine the
principles of the human transcriptional regulatory network, we determined the genomic binding information of
119 transcription-related factors in over 450 distinct experiments. We found the combinatorial co-association of
transcription factors to be highly context specific: distinct combinations of factors bind at specific genomic locations.
In particular, there are significant differences in the binding proximal and distal to genes. We organized all the
transcription factor binding into a hierarchy and integrated it with other genomic information (for example,
microRNA regulation), forming a dense meta-network. Factors at different levels have different properties; for
instance, top-level transcription factors more strongly influence expression and middle-level ones co-regulate
targets to mitigate information-flow bottlenecks. Moreover, these co-regulations give rise to many enriched
network motifs (for example, noise-buffering feed-forward loops). Finally, more connected network components
are under stronger selection and exhibit a greater degree of allele-specific activity (that is, differential binding to the
two parental alleles). The regulatory information obtained in this study will be crucial for interpreting personal genome
sequences and understanding basic principles of human biology and disease.
Architecture of the human regulatory network derived from ENCODE data
1. ARCHITECTURE OF THE HUMAN REGULATORY NETWORK DERIVED FROM ENCODE DATA
Gerstein MB, Kundaje A, Hariharan M, Landt SG, Yan KK, Cheng C, Mu XJ, Khurana E, Rozowsky J,
Alexander R, Min R, Alves P, Abyzov A, Addleman N, Bhardwaj N, Boyle AP, Cayting P, Charos A,
Chen DZ, Cheng Y, Clarke D, Eastman C, Euskirchen G, Frietze S, Fu Y, Gertz J, Grubert F,
Harmanci A, Jain P, Kasowski M, Lacroute P, Leng J, Lian J, Monahan H, O'Geen H, Ouyang Z,
Partridge EC, Patacsil D, Pauli F, Raha D, Ramirez L, Reddy TE, Reed B, Shi M, Slifer T, Wang J,
Wu L, Yang X, Yip KY, Zilberman-Schapira G, Batzoglou S, Sidow A, Farnham PJ, Myers RM,
Weissman SM, Snyder M.
Paper Presentation | Physiology | M.Sc., ITMB UoA
Anaxagoras Fotopoulos – Thanos Papathanasiou | 2014
Nature, 489(7414):91-100, 2012
2. INTRODUCTION
▪ System-wide analyses of transcription-factor-binding patterns have been performed in
unicellular model organisms, such as Escherichia coli.
▪ For humans, systems-level analyses have been a challenge due to the size of the
transcription factor repertoire and genome.
▪ Large-scale data from the ENCODE project are beginning to enable such analyses.
▪ The study analyses the genome-wide binding profiles of 119 transcription-related factors,
derived from over 450 distinct experiments, to find correlations and
multi-transcription-factor motifs.
▪ The results are integrated with other genomic information to form a multi-level meta-
network in which different levels have distinct properties.
▪ Information obtained in this study will be crucial to interpreting variants in the many
personal genome sequences expected in the future and understanding basic
principles of human biology and disease.
3. ENCODE
Data & Methods Analysis
▪ Input: ENCODE ChIP-seq data sets for 119 TFs over five main cell lines, followed by peak detection.
▪ Co-binding map (positive set): for every peak of a focus TF (e.g. GATA1), find the intensities of
overlapping peaks of all other factors.
▪ Negative set: a randomized co-binding map, created by independently shuffling the peak intensity
values in each row of the co-binding map.
▪ RuleFit algorithm: combinations of factors in the co-binding map are compared with the randomized
co-binding map, yielding Relative Importance and Co-association scores.
▪ Aggregating across all focus-factor contexts gives an importance correlation matrix and a
maximal co-association matrix.
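The first two steps of this pipeline (building a co-binding map and its shuffled negative set) can be sketched as follows. This is a minimal illustration, not the authors' code: the peak tuple layout, the function names `cobinding_map` and `shuffled_negative`, and the choice to keep the strongest overlapping intensity per partner factor are all assumptions. The shuffle follows the slide's wording (within each row), and the RuleFit step itself is not shown.

```python
import numpy as np

def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals overlap."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def cobinding_map(focus_peaks, partner_peaks):
    """Rows are focus-factor peaks, columns are partner factors.

    focus_peaks:   list of (chrom, start, end)
    partner_peaks: dict factor -> list of (chrom, start, end, intensity)
    Each entry holds the strongest intensity among the partner's peaks
    that overlap the focus peak, or 0 if none overlap (an assumption).
    """
    factors = sorted(partner_peaks)
    m = np.zeros((len(focus_peaks), len(factors)))
    for i, fp in enumerate(focus_peaks):
        for j, factor in enumerate(factors):
            hits = [inten for (chrom, s, e, inten) in partner_peaks[factor]
                    if overlaps(fp, (chrom, s, e))]
            if hits:
                m[i, j] = max(hits)
    return factors, m

def shuffled_negative(m, rng):
    """Negative set: shuffle intensities independently within each row."""
    return np.array([rng.permutation(row) for row in m])

focus = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
partners = {
    "TAL1": [("chr1", 150, 250, 5.0), ("chr2", 300, 400, 2.0)],
    "GATA2": [("chr1", 550, 580, 3.0), ("chr1", 190, 210, 1.5)],
}
factors, positive = cobinding_map(focus, partners)
negative = shuffled_negative(positive, np.random.default_rng(0))
```

A classifier such as RuleFit would then be trained to separate rows of `positive` from rows of `negative`; the factor combinations it relies on are the candidate co-associations.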
4. EXAMPLE OF GATA1 RELATIVE IMPORTANCE & CO-ASSOCIATION SCORES
[Figure: Relative Importance and co-association scores in the GATA1 context; labelled factors include POL2-(4H8), TAL1 and GATA2]
Relative Importance (RI) gives the overall importance of each
transcription factor in the model. It reflects the ‘size’ of the
biclusters to which a particular transcription factor belongs,
and it is related to the number of co-binding factors and the
fraction of peak locations involved.
For the GATA1 context, the primary partners POL2,
TAL1 and GATA2, as well as the local partners
MAX and JUN, have high RI scores.
Co-association scores measure the impact of the co-dependency
implicit in a particular pair on the model as a
whole, and they more directly probe the co-occupancy of
transcription factors in the focus-factor context than does
the RI score.
Expected pairings: MYC–MAX–E2F6
Novel pair: CCNT2–HMGN3
Many genes that lie near clusters of co-associated
factors are enriched for specific biological functions.
For example:
• Bicluster {E2F6–GATA1–GATA2–TAL1} was enriched
for genes related to myeloid differentiation.
• Bicluster {E2F6–SP1–SP2–FOS–IRF1} was enriched
for genes involved in the DNA damage response.
Distinct combinations of factors regulate
specific types of genes.
5. CORRELATIONS OF TRANSCRIPTION FACTORS {1/7}
WITH DISTAL EDGES
[Figure: hierarchy with top, middle and bottom levels, showing downward- and upward-pointing edges]
Distal edges have a different degree distribution than
proximal ones.
Some transcription factors have low in-degree values in the proximal
network but high in-degree values in the distal one, indicating
that they are heavily regulated through enhancers.
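The comparison of proximal and distal in-degree can be made concrete with a small sketch. Edges are assumed to be (regulator, target) pairs; the function name `enhancer_regulated` and the `low` and `high` cut-offs are hypothetical and chosen only for illustration, since the slides give no thresholds.

```python
from collections import Counter

def in_degree(edges):
    """Count incoming edges per target; edges are (regulator, target) pairs."""
    return Counter(target for _, target in edges)

def enhancer_regulated(proximal, distal, low=1, high=3):
    """Factors with low proximal in-degree but high distal in-degree.

    The low/high cut-offs are illustrative assumptions, not values
    taken from the study.
    """
    p, d = in_degree(proximal), in_degree(distal)
    # Counter returns 0 for factors that have no proximal regulators.
    return sorted(tf for tf in d if p[tf] <= low and d[tf] >= high)

proximal_edges = [("A", "X"), ("B", "Y"), ("C", "Y"), ("A", "Y")]
distal_edges = [("A", "X"), ("B", "X"), ("C", "X"), ("A", "Y")]
candidates = enhancer_regulated(proximal_edges, distal_edges)
```

In this toy network, X has one proximal regulator but three distal ones, so it is the only factor flagged as mainly enhancer-regulated.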
6. CORRELATIONS OF TRANSCRIPTION FACTORS {2/7}
WITHIN THE PROXIMAL NETWORK
• Upper-level transcription factors tend to
have more targets than lower-level ones
(the less shaded TFs in the figure).
• Middle-level TFs concentrate much of the
in-degree and out-degree connectivity, acting as
information-flow bottlenecks between the top and
bottom levels.
[Figure: proximal-network hierarchy with top, middle and bottom levels, showing downward- and upward-pointing edges]
7. CORRELATIONS OF TRANSCRIPTION FACTORS {3/7}
WITH PROTEIN INTERACTIONS AND THE PHOSPHORYLOME
Top-level transcription factors
tend to have more partners in
the protein–interaction network
than do lower-level ones.
Kinases at the bottom tend not to
phosphorylate transcription factors.
Kinases at the bottom tend to be
regulated by transcription factors.
The phosphorylome is the entire set
of phosphoproteins in the proteome.
8. CORRELATIONS OF TRANSCRIPTION FACTORS {4/7}
WITH ncRNAs
Highly connected
transcription factors
tend to regulate more
miRNAs and to be more
regulated by them.
Top-level and middle-level
transcription factors have the highest
total number of ncRNA targets.
[Figure legend: enriched for miRNA → TF edges; balanced number of edges; enriched for TF → miRNA edges]
9. CORRELATIONS OF TRANSCRIPTION FACTORS {5/7}
WITH FAMILIES AND FUNCTIONAL CATEGORIES
• Transcription factors at the top of the hierarchy tend to have more general
functions, and those at the bottom tend to have more specific functions.
• Sequence-specific transcription factors (TFSSs) show a greater degree of tissue
specificity and are more highly regulated by miRNAs than the general and
chromatin-related factors.
• Chromatin-related factors are enriched at the top of the hierarchy.
• TFSSs are enriched in the middle.
10. CORRELATIONS OF TRANSCRIPTION FACTORS {6/7}
WITH GENE EXPRESSION
• Highly connected factors tend to be highly expressed.
• Top and middle levels show a greater correlation.
• More ‘influential’ transcription factors tend to be better
connected and higher in the hierarchy.
• A model integrating the binding–expression relationships of
the highly connected transcription factors predicts expression
about as well as one built from the less connected factors, whose
individual binding–expression relationships are weak.
• Top-Middle and Middle-Middle
transcription factor pairs influence
gene expression cooperatively.
11. CORRELATIONS OF TRANSCRIPTION FACTORS {7/7}
WITH NETWORK DYNAMICS
Transcription factors change their binding patterns
among different cell types.
Targets of lower-level transcription factors tend to
change more between cell types, consistent with their
role in more specialized processes.
‘Rewiring score’ is negatively correlated
with hierarchy level.
The rewiring score quantifies
the difference between two
sets of binding targets of a
TF in two cell lines
(GM12878 and K562).
[Figure: binding set 1 and binding set 2, with their common binding sites]
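The slide does not give a formula for the rewiring score. One simple way to quantify the difference between two target sets, assumed here purely for illustration, is a Jaccard-style distance: one minus the shared fraction of all targets. The function name `rewiring_score` and the gene identifiers are hypothetical.

```python
def rewiring_score(targets_a, targets_b):
    """Jaccard-style distance between two target sets (an assumed definition).

    Returns 0.0 when the TF binds the same targets in both cell lines
    and 1.0 when the two target sets are disjoint.
    """
    a, b = set(targets_a), set(targets_b)
    union = a | b
    if not union:
        return 0.0  # no targets in either cell line: nothing was rewired
    return 1.0 - len(a & b) / len(union)

# Hypothetical targets of one TF in two cell lines.
gm12878_targets = {"g1", "g2", "g3"}
k562_targets = {"g2", "g3", "g4"}
score = rewiring_score(gm12878_targets, k562_targets)
```

Here two of the four distinct targets are shared, so the score is 0.5.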
12. ENRICHED NETWORK MOTIFS {1/4}
AUTO-REGULATOR MOTIFS
Network motifs are small connectivity patterns that carry out canonical functions.
Motifs, defined by broad template patterns, can be over- or under-represented relative to a random control.
•Human Regulatory Network is
enriched with auto-regulators
•Auto-regulators tend to be
repressors, representing a well
known design principle for
maintaining steady state.
•Auto-regulators have more ncRNAs
as their targets
The auto-regulator is a simple but important motif that is
commonly found in networks exhibiting multistability.
28 TFs are auto-regulators; 90 TFs are non-auto-regulators.
13. ENRICHED NETWORK MOTIFS {2/4}
THREE TRANSCRIPTION FACTOR MOTIFS
• The most enriched three-transcription-factor motif
in the proximal network is the feed-forward loop (FFL).
• Across many tissues, the expression levels of the genes
within many FFLs were positively correlated.
• Other enriched three-transcription-factor motifs contain an
additional regulation on top of that in an FFL. This creates a
mutual regulation between a pair of transcription factors,
instantiating a toggle switch, which plays an essential role in
cell-fate determination.
[Figure legend: enrichment; depletion]
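Counting auto-regulators and feed-forward loops in a directed network can be sketched as below. This is an illustrative enumeration only: edges are assumed to be (regulator, target) pairs, the function names are hypothetical, and the comparison against randomized networks that would establish enrichment or depletion is not shown.

```python
def auto_regulators(edges):
    """Factors that regulate themselves (self-loops)."""
    return sorted({u for u, v in edges if u == v})

def feed_forward_loops(edges):
    """All (x, y, z) with x -> y, y -> z and x -> z, for distinct x, y, z."""
    succ = {}
    for u, v in edges:
        if u != v:  # self-loops are handled by auto_regulators
            succ.setdefault(u, set()).add(v)
    ffls = []
    for x in succ:
        for y in succ[x]:
            for z in succ.get(y, ()):
                if z != x and z in succ[x]:
                    ffls.append((x, y, z))
    return sorted(ffls)

# Toy network: A and B regulate each other, both regulate C,
# and C regulates itself.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "C"), ("B", "A")]
self_loops = auto_regulators(edges)
ffls = feed_forward_loops(edges)
```

Because A and B regulate each other, the toy network contains two FFLs on the same three factors, which is the FFL-plus-mutual-regulation pattern described above.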
14. ENRICHED NETWORK MOTIFS {3/4}
PPI-MIMs MOTIFS
• Co-regulating transcription factors are likely to interact
physically, indicating that they work together as a complex.
• The motif ranking second in enrichment consists of a distal
regulatory relationship, a promoter regulatory relationship, and
a protein–protein interaction. This is consistent with a DNA loop
in which an interacting complex of transcription factors binds to the
promoter and enhancer simultaneously.
[Figure: possible multiple-input modules involving promoter and distal regulation and a protein–protein interaction (PPI-MIMs)]
15. ENRICHED NETWORK MOTIFS {4/4}
miRNA REGULATION MOTIFS
• miRNAs are more likely to regulate a pair of physically
interacting factors.
• To avoid unwanted cross-talk, a miRNA tends to
shut down an entire functional unit (a transcription factor
complex) rather than just a single component.
• miRNAs tend to target a pair of transcription factors
binding both proximally and distally. This suggests that
a miRNA represses the expression of both promoter and
distal regulators to shut down a target completely.
16. ALLELIC BEHAVIOR IN A NETWORK FRAMEWORK
Examining relationships between sequence variation and transcription factor regulation
• The degree of allele-specific behaviour of each transcription factor can be
quantified by a statistic that we call ‘allelicity’.
• Of the 4,798 allele-specific binding cases (paternal or maternal targets) of a single
transcription factor, 57% showed coordinated allelic binding and expression.
• As the degree of combinatorial regulation increases, the relationship between
expressed and regulated alleles becomes progressively stronger.
• Small insertions and deletions in TF sequences cause more allelic behaviour than
SNPs.
[Figure: TFs linked to paternal (Pat) or maternal (Mat) targets; every line denotes allele-specific binding]
17. CONCLUSIONS
• Human transcription factors co-associate in a combinatorial
and context-specific fashion.
• Different combinations of factors bind near different
targets, and the binding of one factor often affects the
preferred binding partners of others.
• Transcription factors often show different co-association
patterns in gene-proximal and distal regions.
• Different parts of the hierarchical transcription factor
network exhibit distinct properties.
• The network contains a number of motifs in which two genes
co-regulated by a factor are bridged by a protein–protein
interaction or a regulating miRNA.
• Highly connected network elements (both transcription factors
and targets) are under strong evolutionary selection and exhibit
stronger allele-specific activity; however, elements with allelic
activity are under weaker selection than non-allelic ones.
18. Thank you
National & Kapodistrian University of Athens, Department of Informatics
Technological Education Institute of Athens, Department of Biomedical Engineering
Biomedical Research Foundation, Academy of Athens
Demokritos, National Center for Scientific Research
Physiology | Information Technologies in Medicine and Biology