How To Recoord
Running RECOORD Scripts using CNS 1.1
Christiane Riedinger, Feb 2007
The RECOORD scripts are bash scripts that generate input scripts (*.inp) for CNS
tailor-made for NMR structure calculation. RECOORD comes with its own set of
forcefields and allows the user to carry out a structure calculation using a standardised
protocol.
Aart J. Nederveen, Jurgen F. Doreleijers, Wim Vranken, Zachary Miller, Chris
A.E.M. Spronk, Sander B. Nabuurs, Peter Guentert, Miron Livny, John L. Markley,
Michael Nilges, Eldon L. Ulrich, Robert Kaptein and Alexandre M.J.J. Bonvin
(2005). RECOORD: a REcalculated COORdinates Database of 500+ proteins from
the PDB using restraints from the BioMagResBank. Proteins 59, 662-672.
http://www.ebi.ac.uk/msd-srv/docs/NMR/recoord/main.html
1. Installing the Scripts
⢠Make a project directory for your structure calculation in your home directory,
e.g. <doc1>
⢠Put the RECOORD script directory into <doc1>
⢠Make all RECOORD scripts executable and add the directory to your path
# vi .cshrc
add: set path = (~/doc1/RECOORD)
⢠Edit and execute the script âchangeScriptsDir.shâ:
#!/bin/bash
#
# script to change scripts directory, to be run in scripts directory
# script will change itself as well
#
# fill in directory here
newDir=/users1/riedinger/doc1/RECOORD
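For users whose login shell is bash rather than csh, the installation steps above could be sketched as follows. The function name setup_recoord is hypothetical, and the assumption that the RECOORD scripts end in .sh is based only on the script names used in this document.

```shell
#!/bin/bash
# Sketch: make the RECOORD scripts executable and add their directory to PATH.
# setup_recoord is a hypothetical helper, not part of RECOORD.

setup_recoord() {
    local recoord_dir="$1"
    if [ ! -d "$recoord_dir" ]; then
        echo "no such directory: $recoord_dir" >&2
        return 1
    fi
    # Mark the shell scripts in the top level of the directory as executable.
    find "$recoord_dir" -maxdepth 1 -type f -name '*.sh' -exec chmod u+x {} +
    # Append the directory to PATH for this session, keeping existing entries.
    case ":$PATH:" in
        *":$recoord_dir:"*) ;;                 # already on PATH, nothing to do
        *) PATH="$PATH:$recoord_dir" ;;
    esac
    export PATH
}

# Example call (path taken from the example above):
#   setup_recoord "$HOME/doc1/RECOORD"
```

Appending to PATH, rather than overwriting it, keeps the system commands that the scripts themselves rely on.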
2. Creating the topology file (.mtf)
The topology file contains atom information, bond information (lengths and angles),
torsion angles, disulphide bonds, but it does not contain coordinates.
2.1. Generating the topology file from a primary sequence file
If you only have a primary sequence file available for your protein, you need to
generate the topology file using a script from the CNS webpage, called
generate_seq.inp:
⢠go to the CNS webpage and edit this script
⢠enter the name of your primary sequence file, the desired name of the output
fileâŚ.
⢠set hydrogens to true, select false for B-factor and occupancy (we are not
doing crystallography!)âŚ.
⢠RECOORD comes with its own set of parameter and topology files which
need to be specified in generate_seq.inp:
{================== protein topology and parameter files ===================}
{* protein topology file *}
{===>} prot_topology_infile="SCRIPTS:/toppar/topallhdg5.3.pro";
{* protein linkage file *}
{===>} prot_link_infile="SCRIPTS:/toppar/topallhdg5.3.pep";
{* protein parameter file *}
{===>} prot_parameter_infile="SCRIPTS:/toppar/parallhdg5.3.pro";
{================nucleic acid topology and parameter files =================}
{* nucleic acid topology file *}
{===>} nucl_topology_infile="SCRIPTS:/toppar/dna-rna-allatom.top";
{* nucleic acid linkage file *}
{===>} nucl_link_infile="SCRIPTS:/toppar/dna-rna.link";
{* nucleic acid parameter file *}
{===>} nucl_parameter_infile="SCRIPTS:/toppar/dna-rna-allatom.param";
{=================== water topology and parameter files ====================}
{* water topology file *}
{===>} water_topology_infile="SCRIPTS:/toppar/topallhdg5.3.sol";
{* water parameter file *}
{===>} water_parameter_infile="SCRIPTS:/toppar/parallhdg5.3.sol";
{================= carbohydrate topology and parameter files ===============}
{* carbohydrate topology file *}
{===>} carbo_topology_infile="SCRIPTS:/toppar/carbohydrate.top";
{* carbohydrate parameter file *}
{===>} carbo_parameter_infile="SCRIPTS:/toppar/carbohydrate.param";
{============= prosthetic group topology and parameter files ===============}
{* prosthetic group topology file *}
{===>} prost_topology_infile="";
{* prosthetic group parameter file *}
{===>} prost_parameter_infile="";
{===================== ion topology and parameter files ====================}
{* ion topology file *}
{===>} ion_topology_infile="SCRIPTS:/toppar/ion.top";
{* ion parameter file *}
{===>} ion_parameter_infile="SCRIPTS:/toppar/ion.param";
⢠Save the script in your project directory, also put your primary sequence file
there.
⢠Now you need to set an environment variable:
# setenv SCRIPTS ~/doc1/RECOORD
⢠Run the Script:
# cns < generate_seq.inp > generate_seq.out
⢠The file generated is doc1.mtf
⢠The output file will give you information in case things have gone wrong.
2.2. Generating the topology file from a pdb file
If you already have a pdb file of your protein, generate the topology file
with generate.sh (similar to generate_easy.inp from the CNS webpage).
3. Generate an extended Structure
An extended structure is the starting point for simulated annealing and is
generated using the script generate_extended.sh
• # generate_extended.sh <your mtf file>
• creates <your protein>_extended.pdb
4. Start the simulated annealing
Place your restraint files in your project directory. The restraint files need to be named
as follows:
• unambig.tbl
• ambig.tbl
• dihedrals.tbl
• hbonds.tbl
• methyls.tbl
• ...
An example of an unambiguous restraint file (an exclamation mark starts a comment that runs to the end of the line):
!Q 60
assign (resid 60 and name hn) (resid 60 and name hb1) 1.8 0.0 1.1 ! 3dhnnoe_new.259 1.80378 strong
assign (resid 60 and name hn) (resid 60 and name hb2) 1.8 0.0 1.7 ! 3dhnnoe_new.240 0.93875 medium
An example of a dihedral restraint file (generated by TALOS):
! 1. Q 57 Phi -103.79 +/- 70.98 (-174.77 to -32.81)
assign (resid 56 and name C) (resid 57 and name N)
(resid 57 and name CA) (resid 57 and name C) 1.0 -103.79 70.98 2
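Before starting a calculation it can help to confirm that each restraint file contains the number of restraints you expect. The sketch below counts assign statements while ignoring anything after an exclamation mark; count_restraints is a hypothetical helper, and it assumes every restraint starts with the keyword assign at the beginning of a line, as in the examples above.

```shell
#!/bin/bash
# Sketch: count the restraints in a CNS .tbl file.
# Assumption: each restraint begins with "assign" at the start of a line.

count_restraints() {
    local tbl="$1"
    # Remove "!" comments first so commented-out restraints are not counted,
    # then count lines that begin with "assign". "|| true" keeps the exit
    # status at 0 when the count is zero (grep -c still prints 0).
    sed 's/!.*$//' "$tbl" \
        | grep -c -i -E '^[[:space:]]*assign([[:space:]]|\()' || true
}

# Example call:
#   count_restraints unambig.tbl
```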
⢠the simulated annealing is run with the script annealing.sh
⢠the annealing.sh file generates the CNS input file annealing.inp and run.cns
(containing the restraints). It then generates refineLong.inp files, each one
calculating one pdb file.
⢠go thoroughly through the script to specify parameters:
#!/bin/bash
# run as: annealing.sh <entries>
#
# script for calculating an NMR ensemble with MDSA
# per model only one job is generated
#
# Aart Nederveen 2003, Utrecht University
############ settings for CNS calculation #############
# The following files should be present in project directory:
# project_cns.pdb
# project_cns.mtf
# project_cns_extended.pdb
# simply rename your mtf and extended pdb files to <your protein>_cns.mtf and <your
# protein>_cns_extended.pdb. I don't know why it needs a second pdb file, so I just
# copied my doc1_cns_extended.pdb file to doc1_cns.pdb. That seemed to work, but no
# guarantee!
# directory for models that are calculated
dirRefined='str'
# directory for scripts
dirScripts='/users1/riedinger/doc1/RECOORD'
# directory for CNS output
dirCalc='cnsRef'
# submit command for cluster
# if 'csh', then your own computer is used
# submit='ssub linux_cluster'
# submit='csh'
# either choose 'csh' for your own computer, or if you are using synapse, specify:
submit='qsub -V -cwd'
# number of models that are generated
# all models are generated with the same protocol; only the seed number differs
number=2
# select as you wish
# if deletePrevious is 0 then no calculation is performed if coordinatefile is already
# present
deletePrevious=0
# sleep time between successive jobs to make cluster happy
sleepTime='3s'
# cns executable
# cnsExec='/software/cns_1.1/cns_solve_1.1/intel-i686-linux_g77/bin/cns'
cnsExec='/packages/cns/cns_solve_1.1/intel-i686-linux/bin/cns'
# enter your correct path
# settings for symmetric dimer (ncs + symmetry restraints)
# mind that segid names and residuenumbers are grepped from $entry.pdb
symDimerOn=0
# 0 = off, if you have a monomer
# choose longer protocol; double number of steps, default 0
doubleSteps=0
...
• finally, to run the script, you need to be one directory above your project
directory
• to run the script:
# annealing.sh <your project directory>
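A quick pre-flight check of the project directory can catch a missing or misnamed file before any jobs are submitted. The sketch below assumes that the file prefix equals the name of the project directory (doc1 in this document's examples); check_project is a hypothetical helper and not one of the RECOORD scripts.

```shell
#!/bin/bash
# Sketch: verify that a project directory holds the files annealing.sh expects.
# Assumption: files are named <project>_cns.*, where <project> is the directory name.

check_project() {
    local dir="${1%/}"
    local name missing=0 f
    name=$(basename "$dir")
    # The three files listed in the settings section of annealing.sh.
    for f in "${name}_cns.pdb" "${name}_cns.mtf" "${name}_cns_extended.pdb"; do
        if [ ! -s "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            missing=1
        fi
    done
    # Restraint tables are only reported; which ones you need depends on your data.
    for f in unambig.tbl ambig.tbl dihedrals.tbl hbonds.tbl methyls.tbl; do
        [ -s "$dir/$f" ] && echo "found restraints: $f"
    done
    return "$missing"
}

# Example call from one directory above the project directory:
#   check_project doc1 && annealing.sh doc1
```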
5. Analysing Violations
⢠First, use the Script calcViol.sh, which will generate and run calcViol_all.inp
⢠Before you run it for the first time, make sure you have entered the correct
CNS executable for your installation
⢠You have to run this script from within your project directory:
# calcViol.sh doc1_cns str violations 0.3 doc1_cns.mtf > calcViol.out
⢠There wonât be an output file if there are no errors
⢠The exact input variables are explained in the script itself, the above is just an
example
⢠When youâre done, run the second script, called analys Viol.sh
â˘
Useful tips:
⢠If running a script again after an error, remove every file that it has created.
⢠Check the CNS executable is stated correctly