Thomas C. Terwilliger, Paul D. Adams, Pavel V. Afonine, Oleg V. Sobolev
This site contains the results of applying the Phenix tool phenix.map_to_model to 476 maps from the EMDB.
NOTES:
When viewing these autobuilt models in Chimera, you may want to turn off pseudo-bonds. To do this you can go to Tools/General Controls/Pseudobond Panel/ and uncheck "Distance monitor" and "Missing segments."
The purpose of these data is to illustrate what can (and cannot) be done automatically, and to provide a head start on model building for cryo-EM maps. Models produced with phenix.map_to_model are preliminary and only partially complete, and many contain chains that are built backwards or have incorrect parts or joins.
Some of the models contain a few very bad contacts. The worst such model is map_to_model_5tc1_8397.pdb. Most of these contacts can be removed by running the Phenix tool remove_clashes: phenix.remove_clashes map_to_model_5tc1_8397.pdb. The "all data" downloads contain trimmed models created in this way, labeled, for example, map_to_model_trimmed_5tc1_8397.pdb.
For each structure, the models present in "all data" are the full automatically-built model (e.g., for the structure EMD-2984, PDB entry 5a1a, map_to_model_5a1a_2984.pdb), the "init" model created using the trace-chain algorithm and iterative secondary structure optimization (model_init_PROTEIN_shifted_5a1a_2984.pdb), the helices-strands model (model_helices_strands_only_PROTEIN_shifted_5a1a_2984.pdb), and the RESOLVE model (model_standard_PROTEIN_shifted_5a1a_2984.pdb). The intermediate models are labeled "shifted" because they have been shifted from their positions to match the origin of the original map.
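The file-naming pattern described in the notes above can be sketched as a small helper. This is only an illustration, not part of Phenix: the function name expected_model_files and the dictionary keys are hypothetical, and the sketch assumes every entry follows the same pattern as the 5a1a/2984 and 5tc1/8397 examples (only the PROTEIN intermediate names are shown on this page).

```python
def expected_model_files(pdb_id: str, emdb_id: str) -> dict:
    """Build the model file names described in the notes for one PDB/EMDB pair.

    Hypothetical helper: the patterns are copied from the examples on this
    page (e.g. map_to_model_5a1a_2984.pdb) and assumed to hold for all entries.
    """
    tag = f"{pdb_id.lower()}_{emdb_id}"
    return {
        # Full automatically-built model
        "full": f"map_to_model_{tag}.pdb",
        # Model after phenix.remove_clashes
        "trimmed": f"map_to_model_trimmed_{tag}.pdb",
        # Trace-chain ("init") model
        "init": f"model_init_PROTEIN_shifted_{tag}.pdb",
        # Helices-strands model
        "helices_strands": f"model_helices_strands_only_PROTEIN_shifted_{tag}.pdb",
        # RESOLVE model
        "standard": f"model_standard_PROTEIN_shifted_{tag}.pdb",
    }


if __name__ == "__main__":
    for label, name in expected_model_files("5a1a", "2984").items():
        print(f"{label}: {name}")
```

A helper like this may be convenient for checking that an unpacked "all data" download contains the expected files for a given entry.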
For EMD-6272 (3j9s), the part of the map representing the deposited model was cut out from the deposited map because the full symmetry was not available. After running phenix.map_to_model, the map_to_model.pdb model was translated to match the original map. For this structure the intermediate files (e.g., model_init_PROTEIN_shifted_3j9s_6272.pdb, model_standard_PROTEIN_shifted_3j9s_6272.pdb, model_helices_strands_only_PROTEIN_shifted_3j9s_6272.pdb) match the cut-out map, not the original map.
For EMD-4054 (5lij) there is no value for sequence match because the deposited model is a poly-alanine chain.
You can download all the data on this site with all_tar_mtm.tgz.
A summary of data as reported in the paper is in the spreadsheet rmsd_estimates_2018-01-14b.xlsx.
The table below can be downloaded as a spreadsheet from MTM_summary_2018-01-20.xlsx.
Models generated for the comparison with MAINMAST and Rosetta, and the maps used to generate them, can be found in mtm_comparison_2018-05-07.tgz.
A table with additional structures analyzed (629 in total) can be found at summary_2018-03-09.html.
EMDB | EMDB ID and link to EMDB entry |
PDB | PDB ID and link to PDB entry |
Resolution | Resolution from EMDB |
CC (deposited) | Map-model correlation for deposited map and model using phenix.map_model_cc |
CC (map_to_model) | Map-model correlation for automatically-generated model and auto_sharpened map |
Symmetry | Symmetry from the EMDB. Note that this symmetry may or may not match the number of copies or the symmetry file. For example, the first entry in the table below, EMD-6272 (PDB 3j9s), is listed in the EMDB as having icosahedral symmetry, but the deposited map is only a portion of the molecule containing 3 chains. The number of copies in this case is 3 and the symmetry file contains 3 operators. |
Copies | Number of reconstruction symmetry operators used (normally obtained from meta-data in the PDB or from symmetry of PDB deposit) |
Residues (Protein/RNA) | Protein residues. This is the number used in comparisons between deposited and automatically-generated models. It can be either the unique or the total number. |
% Matching | Percentage of protein/RNA residues in the deposited model, or in the unique part of the deposited model, that are within 3 Å of a residue in the automatically-generated model (residues are represented by their CA or P atoms). |
% Seq match | Percentage of matching residues that have the same residue name. |
Download all data | Link to download all the data for this analysis. Includes: sequence, symmetry, resolution, automatically-generated model (map_to_model_xxxx_yyyy.pdb), model, intermediate models: model_helices_strands_only_PROTEIN_shifted_xxxx_yyyy.pdb (helices-strands model), model_init_PROTEIN_shifted_xxxx_yyyy.pdb (trace-chain model), model_standard_PROTEIN_shifted_xxxx_yyyy.pdb (resolve model) |
Model (map_to_model) | Automatically generated model |
Deposited model | Model from PDB with symmetry (if any) applied |
Symmetry file | File containing symmetry matrices used in analysis. Note: only the first symmetry group is used; others are ignored. NCS refers to symmetry. |
Sequence file | File containing sequence used in analysis. |
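The "% Matching" and "% Seq match" columns defined in the table above can be illustrated with a short sketch. This is not the code used to produce the table. It assumes each residue is reduced to a (residue name, CA or P coordinate) pair, that each deposited residue is compared with its nearest residue in the automatically-generated model, and that "within 3 Å" includes exactly 3 Å. The function name match_statistics and its arguments are hypothetical.

```python
import math


def match_statistics(deposited, built, cutoff=3.0):
    """Return (% matching, % seq match) for two lists of residues.

    Each residue is a tuple (residue_name, (x, y, z)), where the coordinate
    is that of the CA (protein) or P (RNA) atom. Assumption: a deposited
    residue "matches" if its nearest built residue lies within `cutoff` Å.
    """
    matched = 0
    same_name = 0
    for name, xyz in deposited:
        # Nearest residue in the automatically-generated model, if any
        nearest = min(built, key=lambda r: math.dist(xyz, r[1]), default=None)
        if nearest is not None and math.dist(xyz, nearest[1]) <= cutoff:
            matched += 1
            if nearest[0] == name:
                same_name += 1
    # % Matching is relative to all deposited residues
    pct_matching = 100.0 * matched / len(deposited) if deposited else 0.0
    # % Seq match is relative to the matching residues only
    pct_seq_match = 100.0 * same_name / matched if matched else 0.0
    return pct_matching, pct_seq_match


if __name__ == "__main__":
    deposited = [("ALA", (0, 0, 0)), ("GLY", (10, 0, 0)),
                 ("SER", (20, 0, 0)), ("LEU", (30, 0, 0))]
    built = [("ALA", (1, 0, 0)), ("VAL", (11, 0, 0))]
    print(match_statistics(deposited, built))
```

In this toy example two of the four deposited residues have a built residue within 3 Å, and one of those two has the same residue name. The brute-force nearest-neighbor search is adequate for a sketch; a spatial index would be preferable for large models.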