Cryo-EM Structure Deposition


Once a protein structure is solved, it is usually deposited in the Worldwide Protein Data Bank (wwPDB). Deposition is generally necessary for publication, as most journals require a PDB ID to be included in any manuscript under review that describes a new structure. The wwPDB is an invaluable resource that contained over 150,000 structures as of June 2019.


The model files (PDB and mmCIF) generated by phenix.real_space_refine contain information that is required when depositing a structure to the wwPDB. Starting in July 2019, the wwPDB accepts only mmCIF model files for deposition of crystallographic structures, while cryo-EM structures can still be deposited in the PDB format. To ease the eventual transition to mmCIF-only deposition, we recommend that you use mmCIF for deposition of cryo-EM structures as well.

To generate a model file suitable for deposition, a two-stage process is currently recommended:

  1. By default, phenix.real_space_refine outputs model files in both mmCIF and PDB format. If you turned off the option for mmCIF output, run a final cycle of phenix.real_space_refine that writes mmCIF files for model and data (this can be set in the Output section of the GUI).
  2. The model file (mmCIF) and a sequence file can then be processed with the mmtbx.prepare_pdb_deposition program to create an mmCIF file that includes the sequence. This program requires the full sequence of the macromolecule to be provided. In the GUI, this program is in the "PDB Deposition" section of tools.
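
The second step can be scripted. The following is a minimal Python sketch, not a definitive recipe: it assumes that mmtbx.prepare_pdb_deposition accepts the model and sequence files as positional arguments, and the file names model.cif and sequence.fa are hypothetical placeholders. The helper has_sequence_records is likewise only an illustration; it looks for the standard mmCIF _entity_poly data names as a rough sign that sequence information is present, and which items the program actually writes is an assumption here.

```python
import shutil
import subprocess
from pathlib import Path

PROGRAM = "mmtbx.prepare_pdb_deposition"


def build_command(model_cif, sequence_file):
    """Return the argument list for the deposition-preparation step.

    Assumption: the program takes the mmCIF model and the sequence
    file as positional arguments.
    """
    model = Path(model_cif)
    if model.suffix.lower() not in {".cif", ".mmcif"}:
        raise ValueError(f"expected an mmCIF model file, got {model.name}")
    return [PROGRAM, str(model), str(sequence_file)]


def has_sequence_records(cif_text):
    """Rough check that mmCIF text carries polymer sequence information.

    Looks for data names starting with _entity_poly, which covers both
    the _entity_poly and _entity_poly_seq categories.
    """
    return any(
        line.strip().startswith("_entity_poly")
        for line in cif_text.splitlines()
    )


if __name__ == "__main__":
    # Hypothetical file names; substitute your own model and sequence.
    model, sequence = "model.cif", "sequence.fa"
    cmd = build_command(model, sequence)
    print(" ".join(cmd))
    # Run only if Phenix is on the PATH and both input files exist.
    inputs_exist = Path(model).is_file() and Path(sequence).is_file()
    if shutil.which(PROGRAM) and inputs_exist:
        subprocess.run(cmd, check=True)
```

Before uploading, has_sequence_records can be applied to the text of the resulting mmCIF file as a quick sanity check. It does not replace the validation performed by the wwPDB deposition system.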

