Protein Functional Site Prediction Using Shortest-Path Graph Kernel

PROTEIN FUNCTIONAL SITE PREDICTION USING THE SHORTEST-PATH GRAPH KERNEL METHOD
Presented by: Malinda Sanjaka
Major Advisor: Dr. Changhui Yan
Graduate Committee Members:
Dr. Juan (Jen) Li
Dr. Jun Kong
Dr. Nan Yu
Date: 04/22/2013

Outline
Problem Statement
Introduction
Materials and Methods
Results and Discussion
Conclusion
Future Work

Problem Statement
 Problem: prediction of functional sites on protein structures
 What are functional sites?
 Functional sites are the small portions of a protein where substrate molecules bind and undergo a chemical reaction.
 Example: a phosphorylation site
[Figure: protein 3D structure showing a phosphorylation site]

Problem Statement(2)
Importance of Functional Site Prediction
 To understand protein function
 To support structure-based drug design
 To design new proteins

Introduction
[Figure: the 20 amino acids and a protein built from them]

Introduction(2)
Protein Functional Sites
 Functional sites are the small portions of a protein where substrate molecules bind and undergo a chemical reaction.
 Catalytic active site (Catalytic Site Atlas)
 Phosphorylation site: addition of a phosphate to an amino acid
 DNA-binding site
 Zinc-binding site

Introduction(3)
Laboratory Methods for Functional Site Determination
 X-ray crystallography
 Nuclear magnetic resonance (NMR)
 Challenges
 Time-consuming
 High cost
 Not applicable to some proteins
 Require skilled professionals

Introduction(4)
The Need for Computational Methods
Structural genomics (SG) projects produce large numbers of protein structures, but the functions of many of these proteins remain poorly understood.
 Advantages
 Low cost
 Short execution time
 Less environmental impact
 Results can be refined by repeated runs
 Reusable
 Can be run as simulations
 Fewer human mistakes
 Disadvantage
 Accuracy is lower than that of laboratory experiments
 Computational methods provide a helpful guideline for experimental approaches

Introduction(5)
Computational Methods for Functional Site Prediction
 Template-based
 Identify a structurally similar template
 Align the target with the template
 Predict functional groups
 Microenvironment-based
 Focus on a single residue or position
 Use structural and physicochemical properties
 Supervised machine learning approaches
 Macroenvironment-based
 A local structural region is involved
 Protein-protein interaction
 Structure-based drug design
 DNA-binding sites and ligand-binding sites

Introduction(6)
Overview of Our Approach
We used graphs to represent each residue together with its contacting neighbors in a protein structure.
[Figure: two example graphs, each with a central residue and its contacting residues. One central residue is positive (+, functional) and the other is negative (-, non-functional). Each residue consists of a number of atoms.]

Introduction(7)
Overview of Our Approach – Prediction
[Figure: prediction workflow. Database knowledge (experimentally verified) provides positive (functional/active) and negative (non-functional/non-active) graphs. The shortest-path graph kernel computes the similarity between these graphs and a target graph (functional or non-functional), and the nearest neighbor method makes the prediction.]

Materials and Methods
Datasets
 How to get the protein structures
 Download: [http://ftp.wwpdb.org/pub/pdb/data/biounit/coordinates/all/]
 How to get the protein sequences
 PDB database: [ftp://ftp.wwpdb.org/pub/pdb/derived_data/pdb_seqres.txt]
 PDB ID and chain ID: 101m_A
 FASTA format:
>101m_A mol:protein length:154 MYOGLOBIN
MVLSEGEWQLVLHVWAKVEADVAGHGQDILIRLFKSHPETLEKFDRVKHLKTEAEMKASEDLKKH

Materials and Methods(2)
Catalytic Binding Site (CSA: Catalytic Site Atlas)
[http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/CSA/CSA_Show_EC_List.pl]
 73 protein chains
 201 active catalytic sites
 20398 non-active residues
 Balanced dataset
 201 active catalytic sites and 201 randomly selected non-active residues
Phosphorylation Site
 Section 3.3.4 of this paper
[http://www.informatics.indiana.edu/predrag/publications.htm].
 679 protein chains
 2062 active phosphorylation site residues
 139795 non-active residues
 Balanced dataset
 2062 active phosphorylation site residues and 2062 randomly selected non-active residues

Graph Representation
 Definition
 A graph G = (V, E)
 V is the set of vertices (nodes) and E is the set of edges (arcs)
 A path in G is a sequence of vertices <v0, v1, v2, ..., vn>
 Directed graph
 Undirected graph
 Adjacency matrix
[Figure: example graph with labeled nodes and weighted edges]

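Graph Representation Contd.
 Node
 A contacting residue, labeled with its PSSM (position-specific scoring matrix), which reflects the biological conservation of the amino acid
 PSSMs are computed from the protein sequences with blast-2.2.25+ against the NR database
 Edge (arc)
 Each edge has weight 1
 Distance calculation
 Distance d1 between two atoms, with coordinates <x, y, z> taken from the PDB file:
$d_1 = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2}$
 R1 and R2 are the VDW radii of the two atoms (van der Waals, VDW.radii file)
 Contact condition: $d_1 \le R_1 + R_2 + 0.5$
[Figure: Residue1.Atom1 and Residue2.Atom1 with radii R1 and R2, separated by distance d1]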
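As a rough illustration of the contact rule on this slide, the following Python sketch computes the Euclidean distance between two atoms and applies the d1 <= R1 + R2 + 0.5 test. The function names and the example coordinates are assumptions made for illustration; only the distance formula and the threshold come from the slide, and the two radii are the N and CA values listed in the VDW.radii file.

```python
import math

def atom_distance(a, b):
    """Euclidean distance between two atoms given as (x, y, z) tuples."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def atoms_in_contact(a, b, r1, r2, margin=0.5):
    """Contact rule from the slide: d1 <= R1 + R2 + 0.5."""
    return atom_distance(a, b) <= r1 + r2 + margin

# Hypothetical coordinates; radii 1.65 (N) and 1.87 (CA) from the VDW.radii listing.
n_atom = (0.0, 0.0, 0.0)
ca_atom = (1.0, 1.0, 1.0)
print(atoms_in_contact(n_atom, ca_atom, 1.65, 1.87))
```

How atom-level contacts are combined into a residue-level contact is not stated on the slide, so the sketch stops at the atom pair.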

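Shortest-Path Graph Kernel
 What is a kernel?
 Put simply, a kernel yields a matrix
 For inputs v1, ..., vn, element (i, j) of the n x n matrix is the kernel value k(vi, vj)
 What is a graph kernel?
 A kernel that takes graphs instead of vectors as inputs
 What is the shortest-path graph kernel?
 It compares two graphs using the shortest paths between each pair of nodes
[Figure: a kernel matrix with rows and columns v1, v2, ..., vn for vectors, and the corresponding matrix with rows and columns g1, g2, ..., gn for graphs]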

Shortest-Path Graph Kernel Contd.
 The original graphs G1 and G2 are converted into shortest-path graphs S1 (V1, E1) and S2 (V2, E2)
 The Floyd-Warshall algorithm
 The kernel function is used to calculate the similarity between G1 and G2 by comparing all pairs of edges between S1 and S2.
 Calculation
$K(G_1, G_2) = \sum_{e_1 \in E_1} \sum_{e_2 \in E_2} k_{edge}(e_1, e_2)$
where $k_{edge}(\cdot)$ is a kernel function for comparing two edges.
[Figure: edge e1 between nodes v1 and w1, and edge e2 between nodes v2 and w2; in the example the four nodes carry the labels <Pssm1> to <Pssm4> and e1 = e2 = 1]

Shortest-Path Graph Kernel Contd.
Let e1 be the edge between nodes v1 and w1, and e2 be the edge between nodes v2 and w2. Then,
$k_{edge}(e_1, e_2) = k_{node}(v_1, v_2) \cdot k_{weight}(e_1, e_2) \cdot k_{node}(w_1, w_2)$
where $k_{node}(\cdot)$ is a kernel function for comparing the labels of two nodes, and $k_{weight}(\cdot)$ is a kernel function for comparing the weights of two edges. These two functions are defined as in Borgwardt et al. (2005):
$k_{node}(v, w) = \exp\left(-\frac{\|labels(v) - labels(w)\|^2}{2\sigma^2}\right)$
where labels(v) returns the vector of attributes associated with node v. Note that $k_{node}(\cdot)$ is a Gaussian kernel function. $\frac{1}{2\sigma^2}$ was set to 72 by trying different values between 32 and 128 with increments of 2.
$k_{weight}(e_1, e_2) = \max(0, c - |weight(e_1) - weight(e_2)|)$
where weight(e) returns the weight of edge e. $k_{weight}(\cdot)$ is a Brownian bridge kernel that assigns the highest value to edges that are identical in length. Constant c was set to 2 as in Borgwardt et al. (2005).
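The kernel described on these two slides can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' program: the function names, the argument layout (a list of label vectors plus a matrix of shortest-path lengths per graph), and the parameter name gamma (standing for the Gaussian width term that the slide sets to 72) are all hypothetical. Like the implementation slide later in the deck, the sketch visits each undirected edge once and keeps the better of the two ways of pairing the edge endpoints.

```python
import math

def k_node(label_v, label_w, gamma=72.0):
    """Gaussian kernel on two node label vectors; gamma is the width term."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(label_v, label_w))
    return math.exp(-gamma * sq_dist)

def k_weight(w1, w2, c=2.0):
    """Brownian bridge kernel on two edge weights: max(0, c - |w1 - w2|)."""
    return max(0.0, c - abs(w1 - w2))

def sp_kernel(labels1, dist1, labels2, dist2, gamma=72.0, c=2.0):
    """Sum the edge kernel over all pairs of edges of two shortest-path graphs.

    labelsX: one label vector per node (assumed layout).
    distX:   matrix of shortest-path lengths, i.e. the edge weights of S1 and S2.
    """
    total = 0.0
    n1, n2 = len(labels1), len(labels2)
    for v1 in range(n1):
        for w1 in range(v1 + 1, n1):
            for v2 in range(n2):
                for w2 in range(v2 + 1, n2):
                    kw = k_weight(dist1[v1][w1], dist2[v2][w2], c)
                    # Two possible endpoint pairings of an undirected edge pair.
                    straight = (k_node(labels1[v1], labels2[v2], gamma)
                                * k_node(labels1[w1], labels2[w2], gamma))
                    crossed = (k_node(labels1[w1], labels2[v2], gamma)
                               * k_node(labels1[v1], labels2[w2], gamma))
                    total += max(straight, crossed) * kw
    return total

# Two tiny graphs with identical one-dimensional labels and one edge of length 1.
labels = [[0.0], [0.0]]
dist = [[0, 1], [1, 0]]
print(sp_kernel(labels, dist, labels, dist))  # 2.0
```

The sketch assumes every pair of nodes is connected, so that all shortest-path lengths are finite.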

Prediction Methods
 Nearest Neighbor Algorithm
 Classify a new example x by finding the training example (xi, yi) that is nearest to x. The classic formulation uses Euclidean distance; here, nearness is measured by the graph kernel similarity.
 NNM_Max
 NNM_Ave
 NNM_Top10Ave
[Figure: a test-set graph (?) is compared by similarity against the training set of experimentally verified positive (functional/active) and negative (non-functional/non-active) graphs]
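The slides name three nearest neighbor variants without defining them. The sketch below shows one plausible reading, stated here as an assumption: the target's kernel similarities to the positive and to the negative training graphs are each summarized by their maximum (Max), their mean (Ave), or the mean of the ten largest values (Top10Ave), and the class with the larger summary wins. All function and variable names are hypothetical.

```python
def nn_score(similarities, variant="max", top=10):
    """Summarize a target's similarities to one class of training graphs."""
    ordered = sorted(similarities, reverse=True)
    if variant == "max":
        return ordered[0]
    if variant == "ave":
        return sum(ordered) / len(ordered)
    if variant == "top10ave":
        best = ordered[:top]
        return sum(best) / len(best)
    raise ValueError(f"unknown variant: {variant}")

def predict(sim_to_positive, sim_to_negative, variant="max"):
    """Return '+' if the target looks more similar to the positive class (assumed rule)."""
    pos = nn_score(sim_to_positive, variant)
    neg = nn_score(sim_to_negative, variant)
    return "+" if pos > neg else "-"

# Made-up similarity values for one target graph.
sim_pos = [0.9, 0.1, 0.1]
sim_neg = [0.5, 0.5, 0.5]
print(predict(sim_pos, sim_neg, "max"))  # +
print(predict(sim_pos, sim_neg, "ave"))  # -
```

The example shows why the variants can disagree: a single very similar positive neighbor decides the Max variant, while the averages favor the class that is uniformly closer.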

Evaluation of Predictors
 K-fold cross-validation
 Leave-one-out cross-validation

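Measurements for Evaluation
True positives (TP), false negatives (FN), false positives (FP), true negatives (TN)
Sensitivity = TP / (TP + FN)
Specificity = TN / (TN + FP)
Accuracy = (TP + TN) / (TP + FN + FP + TN)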
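The three measurements follow the standard confusion-matrix definitions. The short Python sketch below applies them to the NNM_Ave counts reported for the enzyme catalytic site dataset in the results table (TP 155, FN 46, FP 46, TN 155); the function name is an assumption made for illustration.

```python
def evaluate(tp, fn, fp, tn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return sensitivity, specificity, accuracy

sens, spec, acc = evaluate(tp=155, fn=46, fp=46, tn=155)
print(f"{sens:.3f} {spec:.3f} {acc:.3f}")  # 0.771 0.771 0.771
```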

Results and Discussion
Enzyme Catalytic Site
Method        TP    TP%    FN    FN%    FP   FP%    TN    TN%    Contact  Not Contact  Accuracy  Sensitivity  Specificity
NNM_Max       150   74.5%  51    25.3%  64   31.8%  137   68.1%  5        59           71.3%     74.5%        68.1%
NNM_Ave       155   77.1%  46    22.8%  46   22.8%  155   77.1%  5        41           77.1%     77.1%        77.1%
NNM_Top10Ave  156   77.6%  45    22.3%  51   25.3%  150   74.6%  5        46           76.1%     77.6%        74.6%
Phosphorylation
Method        TP    TP%    FN    FN%    FP   FP%    TN    TN%    Contact  Not Contact  Accuracy  Sensitivity  Specificity
NNM_Max       1104  53.5%  958   46.4%  758  36.7%  1304  50.1%  73       685          58.3%     53.5%        50.1%
NNM_Ave       1054  51.1%  1008  48.8%  482  23.3%  1580  76.6%  54       428          63.8%     51.1%        76.6%
NNM_Top10Ave  1085  52.6%  977   47.3%  667  32.3%  1395  67.6%  60       607          60.1%     52.6%        67.6%

Results and Discussion(2)
Percentile Ranking
 Used the full dataset
 Ordered list
 Position ranking
 The majority of functional sites fall below the 10% percentile
 NNM_Max
 NNM_Ave
 NNM_Top10Ave

Percentile Result(CSA) Active(Functional)
[Figure: three histograms of the number of active residues vs. percentile, one each for Max, Ave, and Top 10 Ave. x-axis: percentile bins 0.0-0.1 through 0.9-1.0; y-axis: number of active residues, 0 to 80.]

Percentile Result(CSA) Non-Active(Non-Functional)
[Figure: three histograms of the number of non-active residues vs. percentile, one each for Max, Ave, and Top 10 Ave. x-axis: percentile bins 0.0-0.1 through 0.9-1.0; y-axis: number of non-active residues, 18.5 to 21.]

Conclusions
 We developed an innovative graph method to represent protein surfaces based on how amino acid residues contact each other.
 We implemented a shortest-path graph kernel method and used it to compute the similarity between graphs.
 We developed three nearest neighbor variants to make predictions on both datasets based on the similarity matrix produced by the graph kernel method.
 The predictors were able to predict catalytic sites with accuracy up to 77.1%.
 This work showed that the proposed methods were able to capture the similarity between enzyme catalytic sites and can provide a useful tool for catalytic site prediction.

Future Work
Add more parameters to the labels (graphs, nodes)
Offer the program as a web service
Work with other kernel methods, such as minimum spanning tree
Optimize the algorithm for large datasets

Acknowledgements
I would like to express my deep gratitude to my adviser, Dr. Changhui Yan, for his continuous encouragement, guidance, and support in completing this paper.
My sincere thanks also go to my committee members, Dr. Juan (Jen) Li, Dr. Jun Kong, and Dr. Nan Yu, for their willingness to serve on my committee.

Introduction …
[Figure: inputs used in this work: the .vdw radii file, .PDB structure files, and BLAST run against the NR database]

Protein
mRNA (transcribed from DNA): …-CUA-AAA-GAA-GGU-GUU-AGC-AAG
Protein sequence: …-L-K-E-G-V-S-K-D-…
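The correspondence between the codons and the one-letter amino acid codes on this slide can be checked with a small Python sketch. The table below is deliberately partial: it holds only the seven codons that appear in the example, and the function name is an assumption made for illustration.

```python
# Partial standard codon table: only the codons used in the slide's example.
CODON_TABLE = {
    "CUA": "L", "AAA": "K", "GAA": "E", "GGU": "G",
    "GUU": "V", "AGC": "S", "AAG": "K",
}

def translate(codons):
    """Translate a dash-separated string of RNA codons into one-letter amino acids."""
    return "".join(CODON_TABLE[codon] for codon in codons.split("-"))

print(translate("CUA-AAA-GAA-GGU-GUU-AGC-AAG"))  # LKEGVSK
```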

Importance of Functional Site Prediction
Understanding protein functionalities
Revealing protein structure
Drug design
Designing new proteins

Rationale for Understanding Protein Structure and Function
Protein sequence
- large numbers of sequences, including whole genomes
Protein structure
- three dimensional
- complicated
Protein function
- rational drug design and treatment of disease
- protein and genetic engineering
- build networks to model cellular pathways
- study organismal function and evolution
[Diagram labels linking sequence, structure, and function: structure determination, structure prediction, homology, rational mutagenesis, biochemical analysis, model studies, and a "?"]

Existing Applications for Protein
Active Sites Prediction

Our Approach
 Shortest-path distance theory
 Graph with adjacency matrix and graph kernel
 Nearest neighbor variants (Max, Ave, Top10 Ave)
 Leave-one-out cross-validation
 True positives and false positives
 Percentile increments

Literature Review
 Graph
 Adjacency Matrix
 Shortest Distance Path Algorithm
 Cross Validation
 True Positive vs. False Positive
 Percentile Ranking

Graph
 A graph G = (V, E)
 V is the set of vertices (nodes) and E is the set of edges (arcs)
 A path in G is a sequence of vertices <v0, v1, v2, ..., vn>
 Directed graph
 Undirected graph

Adjacency Matrix
 The adjacency matrix of a simple graph is a matrix with rows and columns labeled by the graph's vertices
1 = adjacent
0 = not adjacent
0s on the diagonal
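A minimal Python sketch of this definition follows. It builds the adjacency matrix of an undirected simple graph from an edge list; the function name and the three-vertex example are assumptions made for illustration.

```python
def adjacency_matrix(n, edges):
    """Build the n x n adjacency matrix of an undirected simple graph.

    Vertices are numbered 0..n-1; 1 means adjacent, 0 means not adjacent.
    """
    matrix = [[0] * n for _ in range(n)]
    for i, j in edges:
        matrix[i][j] = 1
        matrix[j][i] = 1  # undirected: the matrix is symmetric
    return matrix

# Path graph 0 - 1 - 2
for row in adjacency_matrix(3, [(0, 1), (1, 2)]):
    print(row)
# [0, 1, 0]
# [1, 0, 1]
# [0, 1, 0]
```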

Shortest Distance Path Algorithm
 Used in communications, transportation, electronics, and bioinformatics problems.
 The all-pairs shortest-path problem involves finding the shortest path between all pairs of vertices in a graph.
 The input graph is given as an adjacency matrix A: A[i][j] = 1 if there is an edge (Vi, Vj); otherwise, A[i][j] = 0

Percentile Ranking
 There is no single standard definition for percentile calculation
 Ordered list
 Position ranking
 Max, Ave, Top10
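Because the slide notes that percentile calculation has no single definition, the following Python sketch shows only one possible convention, chosen here as an assumption: residues are ordered by descending prediction score, and a residue's percentile is its position in that list divided by the list length, so the highest-scoring residues receive the smallest percentiles. The function name and the scores are hypothetical.

```python
def percentile_ranks(scores):
    """Map each name to (position in descending-score order) / (number of items)."""
    order = sorted(scores, key=scores.get, reverse=True)
    n = len(order)
    return {name: (pos + 1) / n for pos, name in enumerate(order)}

# Made-up prediction scores for four residues.
scores = {"res1": 0.2, "res2": 0.9, "res3": 0.5, "res4": 0.1}
ranks = percentile_ranks(scores)
print(ranks["res2"], ranks["res4"])  # 0.25 1.0
```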

Methods and Materials
 Data Gathering
 Identify the Active Residues
 Balance Dataset
 Generating a Map File
 Generate Set of Graphs
 Development of Graph Kernel

Data Gathering
http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/CSA/CSA_Show_EC_List.pl
 EC1, EC2, ..., EC6
 HTML
 Regular expression
 Finding a large single group
 Selected EC 3.4
 73 protein chains
 201 active catalytic sites
 20398 non-active residues

Data Gathering Contd.
 Section 3.3.4 of this paper
[http://www.informatics.indiana.edu/predrag/publications.htm].
 679 protein chains
 2062 active phosphorylation site residues
 139795 non-active residues

Identify the Active Residues
 CSA annotation database (CSA_2_2_12.dat)
[ http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/CSA/CSA_Download.pl]
 251777 records
 List of active residues (201)
[http://www.informatics.indiana.edu/predrag/publications.htm]
 List of active residues (2062)

Balance Dataset
Computation time
Leave-one-out cross-validation
Random selection
Catalytic Binding Site (CSA)
- Active: 201, non-active: 201
Phosphorylation Site
- Active: 2062, non-active: 2062
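The balancing step can be pictured as random undersampling of the non-active residues down to the number of active ones. The Python sketch below is one way to do that, not necessarily the procedure the authors used; the function name, the fixed seed, and the placeholder residue identifiers are assumptions, while the counts 201 and 20398 are the CSA figures from the slides.

```python
import random

def balance(active, non_active, seed=0):
    """Keep all active residues and draw an equally sized random sample of non-active ones."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    sampled = rng.sample(non_active, len(active))
    return active, sampled

# Placeholder identifiers standing in for real residues.
active = [f"a{i}" for i in range(201)]
non_active = [f"n{i}" for i in range(20398)]
pos, neg = balance(active, non_active)
print(len(pos), len(neg))  # 201 201
```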

Generating a Map File
 Map protein PDB IDs to protein sequences
 Atomic solvent accessible area calculations (RASA)
 Position-specific scoring matrix calculations (PSSM)
 Active residues

Map Protein PDB IDs to Protein Sequences
 PDB ID and chain ID
101m_A
 PDB database
[ftp://ftp.wwpdb.org/pub/pdb/derived_data/pdb_seqres.txt].
FASTA format
>101m_A mol:protein length:154 MYOGLOBIN
MVLSEGEWQLVLHVWAKVEADVAGHGQDILIRLFKSHPETLEKFDRVKHLKTEAEMKA
SEDLKKHGVTVLTALGAILKKKGHHEAELKPLAQSHATKHKIPIKYLEFISEAIIHVLHSRHP
GNFGADAQGAMNKALELFRKDIAAKYKELGYQG
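To connect a record of this kind to its PDB ID and chain ID, the header line can be split into fields. The Python sketch below assumes the header fields are separated by whitespace, as in the pdb_seqres.txt file; the function name and the returned dictionary keys are hypothetical.

```python
def parse_seqres_header(header):
    """Split a pdb_seqres.txt header into PDB ID, chain ID, molecule type, length, and name."""
    fields = header.lstrip(">").split(None, 3)
    pdb_id, chain_id = fields[0].split("_", 1)
    mol = fields[1].split(":", 1)[1]
    length = int(fields[2].split(":", 1)[1])
    name = fields[3] if len(fields) > 3 else ""
    return {"pdb_id": pdb_id, "chain_id": chain_id,
            "mol": mol, "length": length, "name": name}

record = parse_seqres_header(">101m_A mol:protein length:154  MYOGLOBIN")
print(record["pdb_id"], record["chain_id"], record["length"])  # 101m A 154
```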

Atomic Solvent Accessible Area Calculations (RASA)
 Calculate the solvent accessible area (RASA) of each protein
 Naccess V2.11 program
– Linux/Unix systems / Cygwin
– [http://www.bioinf.manchester.ac.uk/naccess/]
– ./naccess 1a91.pdb & ./naccess 1afo.pdb & ./naccess 1aig.pdb
 Protein Data Bank (PDB): PDB file
– [http://ftp.wwpdb.org/pub/pdb/data/biounit/coordinates/all/]
ncbi-blast-2.2.24+
RASA > 0 (surface residues, marked 1 in the SUR row of the mapping file)

Position-Specific Scoring Matrix Calculations (PSSM)
 Download PDB files
 blast-2.2.25+ program
– Microsoft Windows
 NR database (non-redundant protein sequences)
// Requires: using System.Diagnostics;
// Run PSI-BLAST against the NR database and write the PSSM as ASCII text.
Process p = new Process();
p.StartInfo.UseShellExecute = false;
p.StartInfo.RedirectStandardOutput = true;
p.StartInfo.FileName = @"C:\blast-2.2.25+\bin\psiblast.exe";
p.StartInfo.Arguments = "-query " + FileNameIN + @" -db C:\blast-2.2.25+\db\nr -num_iterations 2 -out_ascii_pssm " + FileNameOUT;
p.Start();
• Example: sample record of a .PSSM file
1 A 5 -2 -2 -2 -1 -1 -2 1 -2 -2 -3 -1 -2 -3 -2 2 -1 -3 -3 -1 77 0 0 0 0 0 0 10 0 0 0 0 0 0 0 13 0 0 0 0 0.59 1.#J

Sample Mapping File
>1neg_A
Seq :
KELVLALYDYQEKSPREVTMKKGDILTLLNSTNKDWWKVEVNDRQGFVPAAYVKKLAAAWSHPQF
SUR :
11101011111111111111111111111111111111011111111011110111111111111
Site :
00000000000000000000000000000000000000000000000000000000000010000
rASA
:115.47,81.22,64.82,.00,20.59,.00,41.60,111.13,56.32,14.17,124.18,35.41,127.39,43.03,111.84,1
60.37,10.00,.71,33.57,1.82,120.20,91.83,15.89,41.40,69.81,.77,20.31,2.22,49.44,65.40,30.56,97
.39,80.11,152.72,75.17,80.10,47.20,64.49,.00,57.09,16.33,101.38,111.31,104.16,71.57,2.73,60.8
4,.00,18.67,8.04,64.07,71.08,.00,125.10,66.68,24.97,32.49,79.86,65.19,179.94,87.62,51.01,109.
35,145.21,71.53,
entropy
:0.80,0.85,0.25,0.92,0.44,1.48,1.02,2.42,1.57,2.01,0.44,0.93,0.49,0.73,0.73,0.83,1.72,1.46,0.59,
2.15,0.72,0.98,1.99,1.65,0.60,1.20,0.35,0.94,0.66,0.65,0.51,0.23,1.04,0.45,1.09,4.74,3.91,0.67,1
.38,0.61,0.45,0.75,1.43,0.49,0.36,2.32,0.72,1.63,3.17,0.46,1.53,2.78,1.61,0.38,0.45,0.26,0.15,0.
51,0.17,0.38,0.47,0.46,0.93,2.04,1.73,
pdbindex
:6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,
38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68
,69,70,

Generate Set of Graphs
Shortest-path distance (Dijkstra's algorithm)
Adjacency matrix
Contacting neighbor residues
Labeled
Weighted
Various numbers of nodes and edges
Graph normalization
– Linear normalization: X1 = (X - Min) / (Max - Min)
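The linear normalization formula on this slide maps every value into the range 0 to 1. A minimal Python sketch follows; the function name is an assumption, and the handling of the case Max = Min (returning zeros to avoid division by zero) is a choice made here, not something the slide specifies.

```python
def min_max_normalize(values):
    """Linear normalization: (x - min) / (max - min) for each value."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumed convention: a constant list maps to all zeros.
        return [0.0 for _ in values]
    return [(x - lo) / (hi - lo) for x in values]

print(min_max_normalize([2, 4, 6]))  # [0.0, 0.5, 1.0]
```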

Calculate Distance between Atoms and Check for Contact
$d_1 = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2}$
 Atom coordinates come from the PDB file
 VDW radii R1 and R2 come from the van der Waals VDW.radii file
Contact condition: $d_1 \le R_1 + R_2 + 0.5$
Example of a contact residue
2 A _ 3 A! : 1.33441
Example of a non-contact residue
4 A _ 2 A : 4.14432

Development of Graph Kernel
The original graphs G1 and G2 are converted into shortest-path graphs S1 (V1, E1) and S2 (V2, E2)
The Floyd-Warshall algorithm
The kernel function is used to calculate the similarity between G1 and G2 by comparing all pairs of edges between S1 and S2.

The Floyd-Warshall Algorithm
for i = 1 to N
    for j = 1 to N
        if i = j
            dist[0][i][j] = 0
        else if there is an edge from i to j
            dist[0][i][j] = the length of the edge from i to j
        else
            dist[0][i][j] = INFINITY
for k = 1 to N
    for i = 1 to N
        for j = 1 to N
            dist[k][i][j] = min(dist[k-1][i][j], dist[k-1][i][k] + dist[k-1][k][j])
Finds the shortest paths between all pairs of vertices in a weighted graph G = (V, E).
D(k)[i][j] = the weight of the shortest path from vertex i to vertex j for which all intermediate vertices are in the set {1, 2, ..., k}
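The pseudocode on this slide translates directly into a short runnable Python sketch. Instead of keeping one matrix per value of k, the version below updates a single distance matrix in place, which is the usual space-saving form of the same algorithm. The function name and the three-vertex example graph are assumptions made for illustration.

```python
INF = float("inf")

def floyd_warshall(weights):
    """All-pairs shortest-path lengths for a graph given as a weight matrix.

    weights[i][j] is the edge length from i to j, or INF if there is no edge.
    """
    n = len(weights)
    dist = [row[:] for row in weights]  # copy so the input is not modified
    for i in range(n):
        dist[i][i] = 0
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

# Path graph 0 - 1 - 2 with unit edge weights, as used for the residue graphs.
graph = [
    [0, 1, INF],
    [1, 0, 1],
    [INF, 1, 0],
]
print(floyd_warshall(graph))  # [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
```

The resulting matrix is exactly the set of edge weights of the shortest-path graph: every pair of connected vertices gets an edge whose weight is the shortest-path length.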

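Implementation
#include <math.h>

/* Squared Euclidean distance between the 20-dimensional PSSM labels
   of residue ResidueA (graph a) and residue ResidueB (graph b). */
double Pssm(int ResidueA, int ResidueB)
{
    int i;
    double sum = 0;
    for (i = 0; i < 20; i++)
    {
        sum += pow((double)(seq_a_pssm[ResidueA][i] - seq_b_pssm[ResidueB][i]), 2);
    }
    return sum;
}

/* Excerpt: Gaussian node kernel for node pair (i, j);
   parm_gamma is the Gaussian width parameter. */
dis += Pssm(i, j);
attr_dis[i][j] = exp((-1) * parm_gamma * dis);

/* Excerpt: sum the edge kernel over all pairs of shortest-path
   edges (i, k) in graph a and (j, r) in graph b. */
sum = 0;
for (i = 0; i < seq_a_len; i++)
    for (j = 0; j < seq_b_len; j++)
        for (k = i + 1; k < seq_a_len; k++)
            for (r = j + 1; r < seq_b_len; r++)
            {
                /* Brownian bridge kernel on the two path lengths */
                xx1 = seq_a_dist[i][k] - seq_b_dist[j][r];
                Klen = MaxValue(0, CC - fabs(xx1));
                /* try both ways of pairing the edge endpoints, keep the better one */
                product1 = attr_dis[i][j] * attr_dis[k][r];
                product2 = attr_dis[k][j] * attr_dis[i][r];
                value = MaxValue(product1, product2);
                sum += value * Klen;
            }
return sum;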

Compare Similarity
Max
Ave
Top 10 Ave

Results and Discussion
Comparison of similarity (TP/FP)
– Max
– Ave
– Top 10 Ave
Percentile ranking calculation
 RASA value

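// Requires: using System.Collections.Generic; using System.Linq;
// Returns the strings ordered from longest to shortest.
static IEnumerable<string> SortByLength(IEnumerable<string> e)
{
    var sorted = from s in e
                 orderby s.Length descending
                 select s;
    return sorted;
}
Section 3.4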

List of Phosphorylation Sites

Catalytic Binding Site (CSA)-Active Residue

Phosphorylation Site-Active Residues

van der Waals-VDW.radii file
RESIDUE ATOM ALA 5
ATOM N 1.65 1
ATOM CA 1.87 0
ATOM C 1.76 0
ATOM O 1.40 1
ATOM CB 1.87 0
RESIDUE ATOM ARG 11
ATOM N 1.65 1
ATOM CA 1.87 0
ATOM C 1.76 0
ATOM O 1.40 1
ATOM CB 1.87 0
ATOM CG 1.87 0
ATOM CD 1.87 0
ATOM NE 1.65 1
ATOM CZ 1.76 0
ATOM NH1 1.65 1
ATOM NH2 1.65 1
RESIDUE ATOM ASP 8
ATOM N 1.65 1
ATOM CA 1.87 0
ATOM C 1.76 0
ATOM O 1.40 1
ATOM CB 1.87 0
ATOM CG 1.76 0
ATOM OD1 1.40 1
ATOM OD2 1.40 1
RESIDUE ATOM ASN 8
ATOM N 1.65 1
ATOM CA 1.87 0
ATOM C 1.76 0
ATOM O 1.40 1
ATOM CB 1.87 0
ATOM CG 1.76 0
ATOM OD1 1.40 1
ATOM ND2 1.65 1
RESIDUE ATOM CYS 6
ATOM N 1.65 1
ATOM CA 1.87 0
ATOM C 1.76 0
ATOM O 1.40 1
ATOM CB 1.87 0
ATOM SG 1.85 0
RESIDUE ATOM GLU 9
ATOM N 1.65 1
ATOM CA 1.87 0
ATOM C 1.76 0
ATOM O 1.40 1
ATOM CB 1.87 0
ATOM CG 1.87 0
ATOM CD 1.76 0
ATOM OE1 1.40 1
ATOM OE2 1.40 1
RESIDUE ATOM GLN 9
ATOM N 1.65 1
ATOM CA 1.87 0
ATOM C 1.76 0
ATOM O 1.40 1
ATOM CB 1.87 0
ATOM CG 1.87 0
ATOM CD 1.76 0
ATOM OE1 1.40 1
ATOM NE2 1.65 1
RESIDUE ATOM GLY 4
ATOM N 1.65 1
ATOM CA 1.87 0
ATOM C 1.76 0
ATOM O 1.40 1
RESIDUE ATOM HIS 10
ATOM N 1.65 1
ATOM CA 1.87 0
ATOM C 1.76 0
ATOM O 1.40 1
ATOM CB 1.87 0
ATOM CG 1.76 0
ATOM ND1 1.65 1
ATOM CD2 1.76 0
ATOM CE1 1.76 0
ATOM NE2 1.65 1
RESIDUE ATOM ILE 8
ATOM N 1.65 1
ATOM CA 1.87 0
ATOM C 1.76 0
ATOM O 1.40 1
ATOM CB 1.87 0
ATOM CG1 1.87 0
ATOM CG2 1.87 0
ATOM CD1 1.87 0
RESIDUE ATOM LEU 8
ATOM N 1.65 1
ATOM CA 1.87 0
ATOM C 1.76 0
ATOM O 1.40 1
ATOM CB 1.87 0
ATOM CG 1.87 0
ATOM CD1 1.87 0
ATOM CD2 1.87 0
RESIDUE ATOM LYS 9
ATOM N 1.65 1
ATOM CA 1.87 0
ATOM C 1.76 0
ATOM O 1.40 1
ATOM CB 1.87 0
ATOM CG 1.87 0
ATOM CD 1.87 0
ATOM CE 1.87 0
ATOM NZ 1.50 1
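A radii listing in this layout can be loaded into a lookup table with a few lines of Python. The sketch below is an illustration under assumptions: the function name is hypothetical, the layout is inferred only from the lines shown on this slide, and the trailing 0/1 column is ignored because the slide does not explain it.

```python
def parse_vdw_radii(text):
    """Parse 'RESIDUE ATOM <name> <count>' / 'ATOM <atom> <radius> <flag>' lines.

    Returns {residue_name: {atom_name: radius}}.
    """
    radii = {}
    residue = None
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "RESIDUE":
            residue = parts[2]
            radii[residue] = {}
        elif parts[0] == "ATOM" and residue is not None:
            radii[residue][parts[1]] = float(parts[2])
    return radii

# The GLY entry from the listing above.
sample = """RESIDUE ATOM GLY 4
ATOM N 1.65 1
ATOM CA 1.87 0
ATOM C 1.76 0
ATOM O 1.40 1"""
table = parse_vdw_radii(sample)
print(table["GLY"]["CA"])  # 1.87
```

The two radii needed for the contact test (R1 and R2) can then be looked up by residue name and atom name.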
