Python-based Hierarchical ENvironment for Integrated Xtallography
Ensemble creation with Ensembler
Author(s)
Purpose
Ensembler can be used to superpose multiple chains for use as an ensemble search model for molecular replacement.
Usage
Ensembler can be run from the PHENIX GUI or from the command line; the only difference is the way commands are taken from the user.
Input files
Command line
    phenix.ensembler \
        [ command-line switches ] \
        [ PHIL-format parameter files ] \
        [ PHIL command-line assignments ] \
        [ PDB-files ] \
        [ alignment files ]
Command-line switches:
    -h, --help         show this help message and exit
    --show-defaults    print PHIL and exit
    -i, --stdin        read PHIL from stdin as well
    -v, --verbosity    set verbosity level (DEBUG,INFO,WARNING,VERBOSE)
PHIL arguments:
Everything not starting with a dash ('-') is interpreted as a PHIL argument. This can be a PHIL-format file containing parameters, a command-line assignment, or a file whose type is recognized automatically from its extension (structure files and alignment files).
GUI
The graphical user interface makes all settings accessible, either in the main window (for frequently used options) or in the dialog box Ensemble generation settings.... Input files are specified either with the +/- button pair or by drag-and-drop onto the window area. File types are recognized automatically and added to the relevant input section.
Output files
The superposed chains can be written out either as a quasi multiple-model PDB file that Phaser can read directly (output style merged, output file name root_merged.pdb) or as a series of files, each containing one chain (output style separate, file name root_pdb_chain.pdb, where pdb is the name of the PDB file the chain was read from and chain is the chain identifier). The file name root can be changed via the root parameter of the output scope (default: ensemble).
Description
The workflow consists of several stages that can be configured independently. These are listed in order of execution. For a summary of all keywords with the corresponding defaults, see the Additional information section.
Residue mapping
Establishes the equivalence of residues among the input chains. There are several options available:
Atom mapping
Maps selected atoms within equivalent residues to each other. The mapping is done by atom name, so the order of atoms within a residue does not matter. If atoms are missing from certain residues (or if certain residues contain extra atoms), a gap will be filled in where necessary. Atom selection is controlled by the atoms parameter of the configuration scope. Default atom selection: CA.
Superposition
Equivalent positions are superposed iteratively to find a globally optimal solution. Two superposition algorithms are implemented, which differ primarily in how they handle gaps in equivalent positions.
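The name-based atom mapping described above can be illustrated with a minimal sketch. This is not Ensembler's implementation: the function name map_atoms, the dictionary-based residue representation, and the use of None to mark a gap are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Coord = Tuple[float, float, float]
Residue = Dict[str, Coord]  # hypothetical representation: atom name -> coordinates


def map_atoms(
    equivalent_residues: Sequence[Sequence[Optional[Residue]]],
    atom_names: Sequence[str] = ("CA",),
) -> List[List[Optional[Coord]]]:
    """Return, for each chain, one site per (residue column, atom name).

    equivalent_residues holds one column per set of equivalent residues,
    with one entry per chain (None if the chain has no residue there).
    Atoms are looked up by name, so their order inside a residue is
    irrelevant; a missing atom or residue yields None (a gap).
    """
    n_chains = len(equivalent_residues[0]) if equivalent_residues else 0
    sites: List[List[Optional[Coord]]] = [[] for _ in range(n_chains)]
    for column in equivalent_residues:
        for name in atom_names:
            for chain_index, residue in enumerate(column):
                atom = residue.get(name) if residue is not None else None
                sites[chain_index].append(atom)
    return sites


# Two chains; the second lists its atoms in a different order and lacks residue 2.
chain_a = [{"N": (0, 0, 0), "CA": (1, 0, 0), "CB": (2, 0, 0)}, {"CA": (4, 0, 0)}]
chain_b = [{"CB": (2.1, 0, 0), "CA": (1.1, 0, 0)}, None]
sites = map_atoms(list(zip(chain_a, chain_b)), atom_names=("CA", "CB"))
```

Every chain ends up with the same number of sites, so position i in one chain always corresponds to position i in every other chain, and the gaps are explicit for the superposition step to handle.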
Both algorithms use Diamond's formulation to solve the pairwise rotational superposition problem (Diamond, 1988). An exception is raised if fewer than 3 sites are present for superposition. Multiple superposition is an iterative process consisting of a series of pairwise superpositions. The convergence criterion is controlled by the convergence parameter (in the superposition scope), which is the change in r.m.s. difference between two consecutive iterations.
Weighting
Automatic weighting can be used to improve superposition, either by amplifying highly homologous regions or by reducing the effect of incorrect site equivalence (which typically arises from a wrong alignment). The implemented weighting schemes are as follows:
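Returning to the pairwise superposition step of the Superposition section, the sketch below shows one way to solve the rotational problem with a quaternion eigenvector method. It follows the well-known closed-form quaternion solution, which is related to, but not necessarily identical with, Diamond's formulation, and it is not Ensembler's code. The function name superpose, the use of NumPy, and the unweighted treatment of sites are assumptions for illustration.

```python
import numpy as np


def superpose(moving, reference):
    """Least-squares fit of `moving` onto `reference` (both (N, 3) arrays).

    Returns (rotation, translation, rmsd) such that
    moving @ rotation.T + translation approximates reference.
    """
    moving = np.asarray(moving, dtype=float)
    reference = np.asarray(reference, dtype=float)
    if moving.ndim != 2 or moving.shape[1] != 3 or moving.shape != reference.shape:
        raise ValueError("expected two (N, 3) coordinate arrays of equal shape")
    if len(moving) < 3:
        raise ValueError("fewer than 3 sites present for superposition")

    # Remove the translational component by centring both site sets.
    mov_centre = moving.mean(axis=0)
    ref_centre = reference.mean(axis=0)
    s = (moving - mov_centre).T @ (reference - ref_centre)  # 3x3 correlation
    sxx, sxy, sxz = s[0]
    syx, syy, syz = s[1]
    szx, szy, szz = s[2]

    # Symmetric 4x4 matrix whose top eigenvector is the optimal unit quaternion.
    n = np.array([
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ])
    _, eigenvectors = np.linalg.eigh(n)  # eigenvalues come back in ascending order
    w, x, y, z = eigenvectors[:, -1]

    rotation = np.array([
        [w*w + x*x - y*y - z*z, 2*(x*y - w*z), 2*(x*z + w*y)],
        [2*(x*y + w*z), w*w - x*x + y*y - z*z, 2*(y*z - w*x)],
        [2*(x*z - w*y), 2*(y*z + w*x), w*w - x*x - y*y + z*z],
    ])
    translation = ref_centre - rotation @ mov_centre
    fitted = moving @ rotation.T + translation
    rmsd = float(np.sqrt(((fitted - reference) ** 2).sum(axis=1).mean()))
    return rotation, translation, rmsd
```

In a multiple superposition, a routine of this kind would be called repeatedly on pairs of site sets until the change in r.m.s. difference between two consecutive iterations falls below the convergence value.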
Weighting is iterated with superposition until the weights converge, which can be controlled by the convergence parameter of the weighting scope. In case of highly dissimilar structures (or incorrect residue mapping), weight determination may temporarily need to be damped to avoid divergence. This is done automatically (in steps controlled by the incremental_damping_factor parameter of the weighting scope) until a preset value (controlled by the max_damping_factor parameter of the weighting scope) is reached, at which point an exception is raised.
Cluster analysis
Hierarchical cluster analysis is performed using the pairwise r.m.s. differences as a distance measure. The clustering parameter of the configuration scope can be used to adjust cluster boundaries.
Chain trimming
This option trims residues from the final superposed model where the unweighted r.m.s.d. is above a certain threshold (threshold parameter of the trimming scope). It is useful for removing flexible loops, etc. Default: no trimming.
Sorting
After superposition is complete, the chains can be sorted by sequence identity (identity), fraction of common sites with respect to all aligned atom positions (overlap), weighted r.m.s.d. (wrmsd) or unweighted r.m.s.d. (unwrmsd). This is controlled by the sort parameter of the output scope. Default: input order (input).
Specific limitations and possible problems
Processing features
Warning and error messages
Literature
Additional information