Data Mining Protein Structures' Topological Properties to Enhance Contact Map Predictions
Dr. Jaume Bacardit, School of Computer Science and School of Biosciences, University of Nottingham
Weizmann Institute of Science, May 27th, 2010
Preface
Roadmap: PSP, TP, CM, CASP, INS
PROTEIN STRUCTURE AND CONTACT MAP PREDICTION
Protein Structure Prediction [figure labels: Primary Sequence, 3D Structure]
Why PSP?
PSP: A family of problems
PSP sub-problems
TOPOLOGICAL PROPERTIES OF PROTEINS
Contact Map [figure labels: Contact, helices, sheets]
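
The slide's own definition of a contact did not survive extraction. As a rough illustration only, the sketch below builds a contact map under a commonly used convention: two residues are in contact when their representative atoms are closer than a distance cutoff. The 8.0 cutoff, the `min_separation` parameter, the function name and the toy coordinates are all assumptions, not values taken from the talk.

```python
import math

def contact_map(coords, threshold=8.0, min_separation=1):
    """Return a symmetric boolean matrix of residue contacts.

    coords: one (x, y, z) point per residue (e.g. a C-alpha or C-beta atom).
    threshold: distance cutoff (assumed value, same units as coords).
    min_separation: minimum sequence separation for a pair to be considered.
    """
    n = len(coords)
    cmap = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) < threshold:
                cmap[i][j] = cmap[j][i] = True
    return cmap

# Hypothetical toy coordinates, one point per residue.
toy = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (11.4, 0.0, 0.0),
    (3.8, 5.0, 0.0),
]
cmap = contact_map(toy)
for row in cmap:
    print("".join("#" if c else "." for c in row))
```

Raising `min_separation` drops trivially close sequence neighbours from the map, which is how contact predictors usually restrict attention to non-local pairs.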
Recursive Convex Hull
Recursive Convex Hull (continued)
Relation of RCH to other structural properties
Correlation between features
Proximity Graphs (PGs): DT ⊇ GG ⊇ RNG ⊇ MST [Poupon, 2004] [figure: Delaunay tessellation of a point set]
Proximity Graphs (PGs): DT ⊇ GG ⊇ RNG ⊇ MST (continued)
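
The inclusion chain DT ⊇ GG ⊇ RNG ⊇ MST relates the Delaunay tessellation, the Gabriel graph, the relative neighbourhood graph and the minimum spanning tree of the same point set. The sketch below is not the talk's implementation; it builds the three smallest graphs by brute force on a hypothetical set of 3D points so the last two inclusions can be checked directly. The Delaunay tessellation is left out because it needs a computational-geometry library.

```python
import math
from itertools import combinations

def distance_matrix(pts):
    """Symmetric matrix of pairwise Euclidean distances."""
    n = len(pts)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = math.dist(pts[i], pts[j])
    return d

def gabriel_graph(d):
    """Edge (i, j) if no third point lies strictly inside the ball with diameter ij."""
    n = len(d)
    return {
        (i, j)
        for i, j in combinations(range(n), 2)
        if all(d[i][k] ** 2 + d[j][k] ** 2 >= d[i][j] ** 2
               for k in range(n) if k != i and k != j)
    }

def relative_neighbourhood_graph(d):
    """Edge (i, j) if no third point is closer to both i and j than they are to each other."""
    n = len(d)
    return {
        (i, j)
        for i, j in combinations(range(n), 2)
        if all(max(d[i][k], d[j][k]) >= d[i][j]
               for k in range(n) if k != i and k != j)
    }

def minimum_spanning_tree(d):
    """Prim's algorithm on the complete graph; edges returned as (low, high) index pairs."""
    n = len(d)
    in_tree, edges = {0}, set()
    while len(in_tree) < n:
        a, b = min(
            ((a, b) for a in in_tree for b in range(n) if b not in in_tree),
            key=lambda e: d[e[0]][e[1]],
        )
        edges.add((min(a, b), max(a, b)))
        in_tree.add(b)
    return edges

# Hypothetical points standing in for residue coordinates.
points = [(0.0, 0.0, 0.0), (1.0, 0.2, 0.1), (2.1, 1.3, 0.4),
          (0.3, 1.7, 1.1), (1.6, 2.4, 2.2), (3.0, 0.5, 1.9)]
d = distance_matrix(points)
gg, rng, mst = gabriel_graph(d), relative_neighbourhood_graph(d), minimum_spanning_tree(d)
print(len(gg), len(rng), len(mst))
print(mst <= rng <= gg)
```

The brute-force tests are cubic in the number of points, which is fine for illustration but would be replaced by a Delaunay-based construction for whole proteins.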
Residue Packing Density [figure label: Contact Map]. Public calculation server: http://lobelia.cs.nott.ac.uk/psp/newInterface/
Predictability of RCH
Predictability of RCH (continued)
Is RCH more predictable than other features?
But is it useful?
OUR CONTACT MAP PREDICTION METHOD
Steps (using BioHEL [Bacardit et al., 09])
The BioHEL GBML System
Iterative Rule Learning
Characteristics of BioHEL
Prediction of RCH, SA and CN
How are these features predicted? [figure: residues R(i-5) ... R(i+5), each paired with a feature value SS(i-5) ... SS(i+5); a sliding window maps to the value of its central residue: R(i-1) R(i) R(i+1) → SS(i); R(i) R(i+1) R(i+2) → SS(i+1); R(i+1) R(i+2) R(i+3) → SS(i+2)]
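
The figure on this slide pairs a window of neighbouring residues with the feature value of the central residue. A minimal sketch of that encoding follows; the padding symbol `X`, the function names and the toy sequence and labels are assumptions made for illustration, and the window size used in the talk is not recoverable from the transcript.

```python
def sliding_windows(sequence, half_width, pad="X"):
    """Return one window of 2 * half_width + 1 residues per position.

    Positions beyond the chain ends are filled with `pad` (an assumed convention).
    """
    padded = pad * half_width + sequence + pad * half_width
    width = 2 * half_width + 1
    return [padded[i:i + width] for i in range(len(sequence))]

def training_pairs(sequence, labels, half_width):
    """Pair each window with the feature value of its central residue."""
    if len(sequence) != len(labels):
        raise ValueError("sequence and labels must have the same length")
    return list(zip(sliding_windows(sequence, half_width), labels))

# Hypothetical five-residue chain with a per-residue feature label.
pairs = training_pairs("ACDEF", "HHECC", half_width=1)
for window, label in pairs:
    print(window, "->", label)
```

With `half_width=1` this reproduces the three-residue windows drawn on the slide: each instance is R(i-1) R(i) R(i+1) and its target is the value at position i.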
Prediction of RCH, SA and CN (continued)
Characterisation of the contact map problem [figure labels: 1, 2, 3]
1. Three windows of residues
Description of connecting segment and the whole sequence
Contact Map dataset
Samples and ensembles [figure labels: Training set, x50, x25, Consensus, Predictions, Samples, Rule sets]
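
The figure labels suggest a pipeline in which the training set is split into samples, rule sets are learned from each sample, and the rule sets vote to produce a consensus prediction. The sketch below illustrates only that general idea. The stub learner is a hypothetical stand-in rather than BioHEL, the toy data are invented, and reading "x50" and "x25" as the number of samples and of rule sets per sample is an assumption.

```python
import random

def draw_samples(examples, n_samples, sample_size, seed=0):
    """Draw `n_samples` random subsets (without replacement) of the training set."""
    rng = random.Random(seed)
    return [rng.sample(examples, sample_size) for _ in range(n_samples)]

def train_stub(sample):
    """Hypothetical stand-in learner (not BioHEL).

    Predicts a contact when the sequence separation is at most the mean
    separation of the positive examples seen in the sample.
    """
    positives = [sep for sep, label in sample if label == 1]
    cutoff = sum(positives) / len(positives) if positives else 0.0
    return lambda sep: 1 if sep <= cutoff else 0

def consensus_confidence(models, instance):
    """Fraction of models voting 'contact'; usable as a ranking score."""
    votes = [model(instance) for model in models]
    return sum(votes) / len(votes)

# Invented toy data: (sequence separation, contact label).
examples = [(sep, 1 if sep < 12 else 0) for sep in range(6, 40)]
samples = draw_samples(examples, n_samples=5, sample_size=20)
models = [train_stub(s) for s in samples]
print(consensus_confidence(models, 8))
```

Returning the vote fraction rather than a hard label lets candidate residue pairs be ranked by confidence, which matters when only the top-scoring predictions are evaluated.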
CONTACT MAP PREDICTION AT CASP9
Contact Map prediction in CASP
Contact Map prediction in CASP (continued)
Accuracy Results [Ezkurdia et al. Proteins 2009; 77(Suppl 9):196-209]
Xd results [Ezkurdia et al. Proteins 2009; 77(Suppl 9):196-209]
[Ezkurdia et al. Proteins 2009; 77(Suppl 9):196-209]
WHAT INSIGHT CAN WE EXTRACT FROM THE METHOD?
Is all that information useful?
Rule generated by BioHEL
Understanding the rule sets
Distribution of frequency of use of attributes
Top 10 attributes
The four kinds of residue predictions (SS, CN, RCH, SA) are highly ranked.
Attribute      Frequency   Counts
PredSS_r1_1    1.48%       18141
PredCN_r1      1.66%       20336
propensity     1.74%       21288
PredSS_r2      1.75%       21350
PredSS_r1      1.82%       22205
PredRCH_r2     1.87%       22856
PredRCH_r1     2.04%       24961
PredSA_r2      2.12%       25891
PredSA_r1      2.39%       29246
separation     4.17%       50951
Beyond individual attributes…
Conclusions
CM prediction. Is it worth it?
Acknowledgements

