Cryo-EM Structure Deposition

Background

Once a protein structure is solved, it is usually deposited in the Worldwide Protein Data Bank (wwPDB). Deposition is necessary for publication because most journals require a PDB ID in any manuscript under review that describes a new structure. The wwPDB is an invaluable resource that contains over 150,000 structures (as of June 2019).

Procedure

The model files (PDB and mmCIF) generated by phenix.real_space_refine contain information that is required when depositing a structure to the wwPDB. Starting in July 2019, the wwPDB will accept only mmCIF model files for deposition of crystallographic structures, while cryo-EM structures can still be deposited in PDB format. To ease the eventual transition to mmCIF-only deposition, we recommend that you start using mmCIF for deposition of cryo-EM structures as well.

To generate a model file suitable for deposition, a two-stage process is currently recommended:

  1. By default, phenix.real_space_refine writes model files in both mmCIF and PDB format. If you turned off the option for mmCIF output, run a final cycle of phenix.real_space_refine that writes mmCIF files for model and data (this can be set in the Output section of the GUI).
  2. The model file (mmCIF) and a sequence file can then be processed with the mmtbx.prepare_pdb_deposition program to create an mmCIF file that includes the sequence. This program requires the full sequence of the macromolecule to be provided. In the GUI, this program is in the "PDB Deposition" section of tools.
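
The second stage can also be scripted. The following is a minimal sketch, not the documented interface: it assumes that mmtbx.prepare_pdb_deposition accepts the model and sequence files as positional arguments, and the file names and the helper functions build_command and run_if_available are hypothetical. Check the program's own help output for the actual syntax before relying on it.

```python
import shutil
import subprocess
from pathlib import Path

PROGRAM = "mmtbx.prepare_pdb_deposition"


def build_command(model_cif, sequence_file):
    """Return the argument list for the deposition-preparation step.

    Assumption: the program takes the mmCIF model and the sequence file
    as positional arguments. This is not confirmed by the text above.
    """
    model = Path(model_cif)
    # The procedure calls for the mmCIF model, so reject other formats early.
    if model.suffix.lower() not in {".cif", ".mmcif"}:
        raise ValueError(f"expected an mmCIF model file, got {model.name!r}")
    return [PROGRAM, str(model), str(sequence_file)]


def run_if_available(command):
    """Run the command only if the program is on PATH; otherwise return None."""
    if shutil.which(command[0]) is None:
        return None
    return subprocess.run(command, check=True)
```

For example, build_command("refined.cif", "protein.fa") returns the argument list without running anything, which makes it easy to inspect the call before passing it to run_if_available.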

Related programs

References