Cryo-EM Structure Deposition

Why

Once a protein structure is solved, it is usually deposited in the Worldwide Protein Data Bank (wwPDB). Deposition is necessary for publication, as most journals require a PDB ID in any manuscript that describes a new structure. The wwPDB is an invaluable resource that contains over 190,000 structures (as of July 2021).

Procedure

The model files (PDB and mmCIF) generated by phenix.real_space_refine contain information that is required when depositing a structure to the wwPDB. However, the wwPDB currently accepts model files for deposition only in mmCIF format.

To generate a model file suitable for deposition, a two-step process is currently recommended:

  1. By default, phenix.real_space_refine will output model files in mmCIF and PDB format. If you turned off the option for mmCIF output, run a final cycle of phenix.real_space_refine that writes mmCIF files for model and data. This can be set in the Output section of the GUI.
  2. You can then process the model file (mmCIF) and a sequence file with the mmtbx.prepare_pdb_deposition program to create an mmCIF file that includes the sequence. This program requires the full sequence of the macromolecule. In the GUI, this program is in the "PDB Deposition" section of tools.
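
For scripted workflows, step 2 can also be launched from Python. The sketch below is only an illustration: it assumes that mmtbx.prepare_pdb_deposition accepts the model and sequence files as positional arguments and that Phenix is on your PATH. The helper names and the file names in the usage note are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_deposition_command(model_cif, sequence_file):
    """Return the argument list for mmtbx.prepare_pdb_deposition.

    Assumption: the program takes the model and sequence files as
    positional arguments. Check the program's help output to confirm.
    """
    model = Path(model_cif)
    # The wwPDB accepts only mmCIF models, so reject other extensions early.
    if model.suffix.lower() not in {".cif", ".mmcif"}:
        raise ValueError(f"expected an mmCIF model file, got {model.name}")
    return ["mmtbx.prepare_pdb_deposition", str(model), str(sequence_file)]


def run_deposition_prep(model_cif, sequence_file):
    """Run the program if it is on PATH; return the command either way."""
    cmd = build_deposition_command(model_cif, sequence_file)
    if shutil.which(cmd[0]) is None:
        # Phenix environment not sourced: report the command instead of failing.
        print("Phenix not found on PATH; would run:", " ".join(cmd))
        return cmd
    subprocess.run(cmd, check=True)
    return cmd
```

For example, run_deposition_prep("model_real_space_refined.cif", "sequence.fa") would process a hypothetical refined model together with a FASTA file that holds the full sequence.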

For details on how to use mmtbx.prepare_pdb_deposition, see the documentation for that program.

Related programs

References