GCB 2012

Computation and visualization of protein topology graphs including ligands

Tim Schäfer, Patrick May, Ina Koch
Goethe-University Frankfurt am Main
Institute for Computer Science
Department for Molecular Bioinformatics
Contents
● Introduction
  ● Protein structure, ligands and graphs
● Methods
  ● Types of protein ligand graphs
  ● Computing protein ligand graphs from PDB and DSSP data
  ● Visualization of protein ligand graphs
● Results
  ● The VPLG software
  ● Case studies

Protein structure
● Primary structure
  ● String of 20 AAs
  ● 3D: peptide bonds, steric constraints, Ramachandran
● Secondary structure
  ● local, H-bonds, SSE types
● Super-secondary structure
  ● protein domains
● Tertiary structure
  ● atom coordinates
● Quaternary structure
  ● multi-chain proteins

Modeling protein structure as a graph
● Graph G = (V, E)
● Pervasive data structure in computer science and bioinformatics
● Usage of graphs to model proteins
  ● CATH [Orengo et al.]
  ● SCOP [Murzin et al.]
  ● TOPS+ [Gilbert, Veeramalai]
  ● PTGL [May, Koch]

Protein ligand interactions
● Proteins interact with their environment to fulfill their biological function: other proteins, ligands, ...
● Knowledge of protein ligand interactions (PLI) is important in many applications, e.g. drug design, molecular medicine
● > 10,000 different ligands occur in PDB files
[PDB Ligand Expo, 1AVD]

A graph model of proteins on the supersecondary structure level
● Secondary Structure Elements (SSEs)
  ● Few: 40,000 atoms => 400 residues => 40 SSEs
  ● Automated assignment from 3D data possible (e.g., DSSP)
● Protein model
  ● undirected, labeled graph for each chain of a protein
● Protein graph
  ● vertices: SSEs or ligands
  ● edges: contacts and spatial relations between them
  ● graph types: alpha-beta-graph, alpha-graph or beta-graph
[Koch 1997, May et al. 2004]

Protein ligand graphs
● Extension of protein graph model
● Protein ligand graph G = (V, E)
  ● Vertex: SSE or ligand
  ● Edge: spatial contact (parallel, antiparallel, mixed, ligand, backbone)

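
The definition above maps naturally onto a small data structure. The following is a minimal Python sketch, not VPLG's actual Java classes: the names `VertexType`, `Contact` and `ProteinLigandGraph` are hypothetical, and the single-letter enum values are arbitrary. Only the vertex kinds and the edge labels are taken from the slide.

```python
from dataclasses import dataclass, field
from enum import Enum


class VertexType(Enum):
    HELIX = "H"    # alpha helix SSE
    STRAND = "E"   # beta strand SSE
    LIGAND = "L"   # ligand vertex


class Contact(Enum):
    # Edge labels listed on the slide; the letter codes are arbitrary.
    PARALLEL = "p"
    ANTIPARALLEL = "a"
    MIXED = "m"
    LIGAND = "l"
    BACKBONE = "b"


@dataclass
class ProteinLigandGraph:
    """Undirected, labeled graph for one protein chain (hypothetical sketch)."""
    vertices: list = field(default_factory=list)  # VertexType, in sequence order
    edges: dict = field(default_factory=dict)     # frozenset({i, j}) -> Contact

    def add_vertex(self, vtype: VertexType) -> int:
        """Append a vertex and return its index."""
        self.vertices.append(vtype)
        return len(self.vertices) - 1

    def add_edge(self, i: int, j: int, label: Contact) -> None:
        """Add or relabel the undirected edge between vertices i and j."""
        if i == j:
            raise ValueError("self-loops are not part of the model")
        for k in (i, j):
            if not 0 <= k < len(self.vertices):
                raise IndexError(f"unknown vertex {k}")
        # frozenset keys make (i, j) and (j, i) the same undirected edge.
        self.edges[frozenset((i, j))] = label

    def neighbors(self, i: int) -> list:
        """Sorted indices of all vertices sharing an edge with vertex i."""
        return sorted(j for e in self.edges for j in e if i in e and j != i)
```

Storing edges under `frozenset` keys is one simple way to keep the graph undirected; an adjacency list would work equally well for graphs of this size (tens of SSEs per chain).
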
Computing protein ligand graphs
1. Obtain sequence from PDB file
2. Obtain secondary structure assignments from DSSP file
3. Add ligand information from PDB file
4. Build graph: add a vertex for each SSE
5. Compute spatial contacts and add edges between SSEs

[Figure: example sequence SKVVVPAQGKKITLQNGKLNVPENPIIPYIEGDGIGVDVTPAM, extended by J in step 3 and annotated with the DSSP segments H, E, H, E plus a ligand marker L; the resulting graph has the vertices H, E, H, E, L]

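
Steps 2 to 4 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not VPLG's Java code: the secondary structure is assumed to be available already as a string with one DSSP code per residue, only runs of `H` and `E` become vertices, and the function name `sse_vertices` and its parameters `n_ligands`, `sse_codes` and `min_len` are hypothetical.

```python
from itertools import groupby


def sse_vertices(ss_string, n_ligands=0, sse_codes=("H", "E"), min_len=1):
    """Turn a per-residue secondary structure string into graph vertices.

    Returns a list of (label, start, end) tuples with 0-based, inclusive
    residue indices. Ligand vertices are appended with label "L" and
    start/end set to None, since this sketch does not track their position.
    """
    vertices = []
    pos = 0
    # groupby yields maximal runs of identical characters, e.g. "HHH", "  ", "EE".
    for code, run in groupby(ss_string):
        length = sum(1 for _ in run)
        if code in sse_codes and length >= min_len:
            vertices.append((code, pos, pos + length - 1))
        pos += length
    vertices.extend(("L", None, None) for _ in range(n_ligands))
    return vertices
```

For a string such as `"  HHHH  EEE HH EEE "` with one ligand, the labels come out in sequence order as H, E, H, E, L, matching the vertex row in the figure. The `min_len` parameter is only a placeholder for whatever filtering of very short runs an implementation chooses to apply.
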
VPLG – an open source implementation in Java

Overview: Computing protein ligand graphs

Contact computation
● Parser for PDB and DSSP files
● Atom level contacts
  ● vdW radius overlap, radius 2 Å
  ● complexity for n atoms is O(n²)
● Residue level contacts
  ● collision spheres, CA is center
● Protein-ligand contacts
  ● collision spheres, min max center
● SSE level contacts
  ● rules depending on SSE type, difficult

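
The atom-level and residue-level checks above can be sketched like this. It is a minimal Python illustration, assuming a uniform radius of 2 Å for every atom and a collision sphere around each residue's CA atom that serves only as a cheap pre-filter before the quadratic atom comparison. The function names and the data layout (a residue as a list of `(x, y, z)` tuples, with the CA coordinate passed separately) are hypothetical and not taken from VPLG.

```python
import math

ATOM_RADIUS = 2.0  # Å, the value named on the slide; assumed uniform for all atoms


def atoms_in_contact(a, b, radius=ATOM_RADIUS):
    """Two atoms are in contact if their spheres overlap."""
    return math.dist(a, b) < 2 * radius


def sphere_radius(center, atoms):
    """Smallest radius around `center` that reaches every atom center."""
    return max(math.dist(center, atom) for atom in atoms)


def residues_may_touch(center_a, atoms_a, center_b, atoms_b, radius=ATOM_RADIUS):
    """Collision-sphere pre-filter.

    If the two spheres, each grown by the atom radius, do not overlap,
    no atom pair of the two residues can be in contact.
    """
    reach = (sphere_radius(center_a, atoms_a)
             + sphere_radius(center_b, atoms_b)
             + 2 * radius)
    return math.dist(center_a, center_b) < reach


def count_atom_contacts(atoms_a, atoms_b, radius=ATOM_RADIUS):
    """Brute-force comparison of all atom pairs (quadratic in the atom count)."""
    return sum(atoms_in_contact(a, b, radius) for a in atoms_a for b in atoms_b)


def minmax_center(atoms):
    """Center of the axis-aligned bounding box of the atoms.

    This is one possible reading of the slide's 'min max center' for
    ligands, which have no CA atom to use as sphere center.
    """
    return tuple((min(c) + max(c)) / 2 for c in zip(*atoms))
```

The pre-filter is conservative: by the triangle inequality it never rejects a residue pair that actually has an atom contact, so the expensive pairwise loop only needs to run for pairs where `residues_may_touch` returns `True`.
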
Types of protein graphs
● Alpha
● Beta
● Alpha-beta
● Ligands
● Not necessarily connected
  ● Connected components
● Backbone option

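
Since the graphs are not necessarily connected, a natural follow-up is to restrict a graph to one SSE type and split the result into connected components. The sketch below illustrates this in Python with a hypothetical data layout (vertex labels in a list, edges as index pairs); it is not VPLG's implementation.

```python
from collections import deque


def induced_subgraph(labels, edges, keep):
    """Keep only vertices whose label is in `keep`, and the edges between them.

    labels: list of vertex labels such as "H", "E", "L"
    edges:  iterable of (i, j) index pairs (undirected)
    """
    kept = {i for i, lab in enumerate(labels) if lab in keep}
    return kept, [(i, j) for i, j in edges if i in kept and j in kept]


def connected_components(vertices, edges):
    """Breadth-first search over an undirected graph; returns sorted vertex lists."""
    adjacency = {v: set() for v in vertices}
    for i, j in edges:
        adjacency[i].add(j)
        adjacency[j].add(i)
    seen, components = set(), []
    for start in sorted(vertices):
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), []
        while queue:
            v = queue.popleft()
            component.append(v)
            # Visit each unseen neighbor exactly once.
            for w in adjacency[v] - seen:
                seen.add(w)
                queue.append(w)
        components.append(sorted(component))
    return components
```

With this layout, an alpha graph corresponds to `keep={"H"}`, a beta graph to `keep={"E"}`, and an alpha-beta graph to `keep={"H", "E"}`; adding `"L"` to the set includes the ligand vertices.
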
The VPLG software
● Command line tool
● GUI in Java/Swing
● Screenshots
● Graph file formats
● Database support

Protein graphs for the whole PDB – Some statistics
● 62,364 proteins consisting of 151,025 chains
● 1.4 million helices, 1.2 million beta strands, 300k ligands
● Contacts
  ● Parallel: 450,000
  ● Antiparallel: 870,000
  ● Mixed: 710,000
  ● Total non-ligand: 2,000,000 => 0.76 contacts per SSE
  ● Ligand: 760,000 => 2.50 contacts per ligand
● High number of isolated ligands (~30 %) at first run
  ● Atom radius too small?
  ● Contact definition?
  ● Small ligands (few atoms?)

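
The two ratios on the slide follow from the counts. The quick Python check below uses the rounded figures quoted above and assumes that "contacts per SSE" divides the non-ligand contacts by the number of helices plus strands. Because the inputs are rounded, the results (about 0.77 and 2.53) only approximate the 0.76 and 2.50 given on the slide, which were presumably derived from the exact counts.

```python
# Rounded counts from the slide.
helices, strands, ligands = 1_400_000, 1_200_000, 300_000
non_ligand_contacts = 2_000_000
ligand_contacts = 760_000

# Assumption: every helix and strand counts as one SSE.
contacts_per_sse = non_ligand_contacts / (helices + strands)
contacts_per_ligand = ligand_contacts / ligands

print(f"contacts per SSE:    {contacts_per_sse:.2f}")
print(f"contacts per ligand: {contacts_per_ligand:.2f}")
```
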
Outlook and possible improvements
● VPLG software
  ● New spatial relation 'close to' for ligands
  ● Post-processing of DSSP output (short SSEs)
  ● RNA / DNA as ligands
● Backend
  ● Get Web server online
  ● Protein structure comparison (not necessarily graph-based)
  ● Use VPLG to get ligands into the PTGL

Summary
● Ligands are part of the protein graph
● Visualization based on primary structure ordering of SSEs
● VPLG software is an open source implementation
  ● Visualization of protein graphs from PDB and DSSP files
  ● http://www.bioinformatik.uni-frankfurt.de
  ● https://sourceforge.net/projects/vplg/
● Evaluation
  ● Statistics
  ● Hemoglobin as an example

Acknowledgments
● Supervisor Prof. Ina Koch
● Molecular Bioinformatics group at Goethe University Frankfurt am Main (MolBI)

Thanks for your attention!

Questions?

(Appendix slides follow)

