CS23D is a web server to generate 3D structural models from NMR chemical shifts. CS23D combines maximal fragment assembly with chemical shift threading, de novo structure generation, chemical shift-based torsion angle prediction, and chemical shift refinement. CS23D makes use of RefDB and ShiftX.
CS23D input formats
CS23D accepts chemical shift files in either SHIFTY or BMRB formats.
CS23D options
A user can
Exclude a protein from being used as the template
Ignore high-identity homologs in the list of available templates
Change the number of models in the final ensemble
Change the number of model optimization steps
CS23D output
CS23D output consists of a set of 10 best-score PDB coordinates. A hyperlink to the single best score structure is also provided. The overall CS23D score, knowledge-based score, chemical shift score, Ramachandran plot statistics, correlations between the observed and calculated shifts before and after refinement are displayed. A conclusion about structure reliability is given to the user.
CS23D protocol
Homology search: The query sequence is used to find homologous proteins or/and protein fragments in a non-redundant database of PDB sequences and secondary structures of PPT-DB using BLAST. Homology modelling:Homology modelling is done by the Homodeller program, which is a part of the PROTEUS2 program. The proteins that are identified during the homology search step are used as the templates in homology modelling. Chemical shift re-referencing: Chemical shifts are re-referenced by the RefCor, which is a part of the RCI webserver backend. Secondary structure prediction from chemical shifts:Secondary structure is predicted from chemical shifts by CSI. Torsion angle prediction from chemical shifts: Torsion angles are predicted from chemical shifts by PREDITOR. Chemical shift threading: Backbone Phi and Psi torsion angles predicted from chemical shifts by PREDITOR are mapped into nine different regions in Ramachandran space, each of which are assigned specific letters. A protein can represented by a sequence of these nine "torsion angle letters". Thrifty is using these sequences of torsion angle letters to identify good templates in a database of ∼18 500 nonredundant PDB structures that have had their structures converted to the nine-letter Ramachandran "alphabet". In a similar manner, chemical shift threading is additionally done using three-letter secondary structure alphabet and secondary structure predicted from chemical shifts by the CSI program. Model assembly: Subfragments identified by homology modelling and chemical shift threading steps are assembled into initial 3D models using CS23D SFassembler. The initial models are evaluated by the GAFolder scoring function and the best model is further refined by GAFolder. Ab initio folding:Ab initio folding is done by Rosetta when no template was identified by the homology modelling and chemical shift threading steps. Rosetta models are evaluated by GAFolder scoring function and the best Rosetta models are refined by GAFolder. Model optimization: Model optimization in CS23D is done by a torsion-angle-based minimizer GAfolder that uses a genetic algorithm to sample conformation space. The method is similar to that employed by GENFOLD. GAFolder makes torsion angles moves within the ranges defined by the values and uncertainties of torsion angles predicted by PREDITOR. GAFolder evaluates protein models by the scoring function described below. Scoring function:Scoring function of GAFolder consists of knowledge based scores and chemical shift scores. The knowledge-based scores include:
CS23D is a template-based method. Therefore, its performance depends on sequence identity of the selected template, see the adjacent picture. Likewise, Rosetta is a fragment-biased method. Its performance depends on the quality of selected fragments. Fragment quality and, thus, Rosetta performance can be improved by using chemical shifts during the fragment selection step. For a structural solution that is not biased by a template structure or fragment structure, one may want to consider obtaining NOE-based distance restraints and using them with the GeNMR program in its ab initio mode.