A few questions about model building and refinement of cryo-EM data
Hi guys,

Sorry for the detailed questions from a beginner! I am starting modeling and refinement work on a set of cryo-EM data.

Basic information about the data: phage data (looks like a ball, P1 space group), map size ~870 MB, resolution ~6 A, two X-ray structures available for model building.

General strategy: I chose COOT for model building, phenix.real_space_refine for real-space refinement with secondary structure restraints, and REFMAC for reciprocal-space refinement.

My questions are listed here.

1. The map was too big to open in COOT. I used phenix.map_to_structure_factors to obtain an MTZ file of ~120 MB (still a little big for my computer). I manually built the whole ball-shaped phage by rigid-body fitting in COOT (from two X-ray structures to 120 chains). My first question: in this case, should I crop the map in Chimera or other software and focus only on a small asymmetric unit for model building and the subsequent refinement?

2. I would like to do a real-space refinement after the model building.

Input files:
  A PDB file I just built in COOT
  The original MAP file
  A converted MTZ file
  Two restraint files from two X-ray structures by ProSMART (TXT format)

The refinement parameters I would like to select in the GUI:
  minimization_global, rigid_body, local_grid_search, adp
  Use secondary structure restraints
  Reference model restraints: use starting model as reference, main chain, side chain, fix outliers, secondary structure only
  Rotamer restraints
  Ramachandran restraints
  Show per residue

My second set of questions: Should I select the MAP file (~870 MB) or the MTZ file (~120 MB)? Is it necessary to add the two restraint files from ProSMART if I use the starting model as reference? Are the refinement parameters selected properly?

3. I gave phenix.real_space_refine a try. A first error showed up:

  Number of atoms with unknown nonbonded energy type symbols: 6840
    "ATOM 184 HG1 SER 1 12 .*. H "
    "ATOM 458 HG1 SER 1 30 .*. H "
    "ATOM 699 HG1 SER 1 45 .*. H "
    "ATOM 720 HG1 SER 1 47 .*. H "
    "ATOM 762 HG1 SER 1 50 .*. H "
    "ATOM 1241 HG1 SER 1 81 .*. H "
    "ATOM 1465 HG1 SER 1 95 .*. H "
    "ATOM 1747 HG1 SER 1 113 .*. H "
    "ATOM 1758 HG1 SER 1 114 .*. H "
    "ATOM 2173 HG1 SER 1 141 .*. H "
    ... (remaining 6830 not shown)

I tried phenix.ready_set to fix this problem, following a previous discussion, but it gave me another error: ENDMDL record missing at end of input. So my third question is: how do I fix the first error?

Thank you for your patience! I would really appreciate your help!

Bing
Hello Bing,
General strategy: I chose COOT for model building, phenix.real_space_refine for real-space refinement with secondary structure restraints, and REFMAC for reciprocal-space refinement.
What is the purpose of reciprocal-space refinement if you don't have any measured reflections? (This is cryo-EM data, and the data is a map!)
My questions are listed here. 1. The map was too big to open in COOT. I used phenix.map_to_structure_factors to obtain an MTZ file of ~120 MB (still a little big for my computer). I manually built the whole ball-shaped phage by rigid-body fitting in COOT (from two X-ray structures to 120 chains). My first question: in this case, should I crop the map in Chimera or other software and focus only on a small asymmetric unit for model building and the subsequent refinement?
Sometimes the box is much larger than the actual model. If that's the case, you can try

  phenix.map_box model.pdb map.mrc

You can also run

  phenix.map_box model.pdb map.mrc selection="chain A"

which will give you a box containing the map around the selected part of the model.
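To judge whether the map box really is much larger than the model, it can help to compute the extent of the model coordinates first. The following is only a rough sketch in plain Python, not part of Phenix: the function name model_bounding_box and the 5 A default padding are assumptions made for illustration, and it relies on the fixed-column PDB layout for x, y, z.

```python
def model_bounding_box(pdb_lines, padding=5.0):
    """Return (lower, upper) corners of a padded box around all atoms, in Angstrom.

    Hypothetical helper for illustration; phenix.map_box does the actual cropping.
    """
    xs, ys, zs = [], [], []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            # Fixed-column PDB format: x, y, z occupy columns 31-54.
            xs.append(float(line[30:38]))
            ys.append(float(line[38:46]))
            zs.append(float(line[46:54]))
    if not xs:
        raise ValueError("no ATOM/HETATM records found")
    lower = (min(xs) - padding, min(ys) - padding, min(zs) - padding)
    upper = (max(xs) + padding, max(ys) + padding, max(zs) + padding)
    return lower, upper
```

Comparing the size of this box with the map dimensions (grid points times voxel size) gives a quick idea of how much of the map is empty space around the model.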
2. I would like to do a real-space refinement after the model building. Input files: a PDB file I just built in COOT; the original MAP file
This is all you need for real-space refinement.
A converted MTZ file
This is fiction, you don't need it.
Two restraint files from two X-ray structures by ProSMART (TXT format)
Phenix does not recognize ProSMART files. You can use Phenix tools to generate secondary structure restraints instead, such as phenix.secondary_structure_restraints.
The refinement parameters I would like to select in GUI interface: minimization_global, rigid_body, local_grid_search, adp
At 6 A resolution you are not going to see side chains, so there is no need to run local_grid_search.
Use secondary structure restraints; Reference model restraints: use starting model as reference, main chain, side chain, fix outliers, secondary structure only; Rotamer restraints; Ramachandran restraints; Show per residue
I'd do a default run first to see what happens.
My second set of questions: Should I select the MAP file (~870 MB) or the MTZ file (~120 MB)?
If the MTZ file is a Fourier transform of the map (a full box of reflections, not a sphere), then both files (map and MTZ) contain exactly the same information, one in real space and the other in reciprocal space. I'd use the original data (map), not the manipulated data (MTZ).
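The point about a full box versus a sphere of reflections can be illustrated with a toy example. This is only a sketch in plain Python on a made-up 1-D "density" of eight grid points, not real map data and not how Phenix computes structure factors: with all Fourier coefficients kept, the inverse transform returns the original values, while dropping the high-frequency terms (analogous to cutting to a resolution sphere) does not.

```python
import cmath

def dft(values):
    """Forward discrete Fourier transform of a 1-D sequence."""
    n = len(values)
    return [
        sum(values[x] * cmath.exp(-2j * cmath.pi * x * h / n) for x in range(n))
        for h in range(n)
    ]

def inverse_dft(coeffs):
    """Inverse discrete Fourier transform (returns complex values)."""
    n = len(coeffs)
    return [
        sum(coeffs[h] * cmath.exp(2j * cmath.pi * x * h / n) for h in range(n)) / n
        for x in range(n)
    ]

def max_abs_difference(a, b):
    return max(abs(p - q) for p, q in zip(a, b))

# Made-up 1-D density values, for illustration only.
density = [0.0, 1.5, 4.0, 1.5, 0.0, -0.2, 0.1, 0.0]
coeffs = dft(density)

# Full set of coefficients: the round trip recovers the density.
recovered = [value.real for value in inverse_dft(coeffs)]
max_error = max_abs_difference(density, recovered)

# Keep only low-frequency terms (|h| <= 2): information is lost.
n = len(density)
truncated = [c if min(h, n - h) <= 2 else 0.0 for h, c in enumerate(coeffs)]
recovered_truncated = [value.real for value in inverse_dft(truncated)]
truncated_error = max_abs_difference(density, recovered_truncated)

print("full set, max error:     ", max_error)
print("truncated set, max error:", truncated_error)
```

The first error is at the level of floating-point rounding, while the second is clearly non-zero, which is why the equivalence of the two files holds only when the full box of reflections is kept.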
Is it necessary to add the two restraint files from ProSMART if I use the starting model as reference?
Phenix does not recognize ProSMART.
Are the refinement parameters selected properly?
See above.
3. I gave phenix.real_space_refine a try. A first error showed up: Number of atoms with unknown nonbonded energy type symbols: 6840 "ATOM 184 HG1 SER 1 12 .*. H " "ATOM 458 HG1 SER 1 30 .*. H " "ATOM 699 HG1 SER 1 45 .*. H " "ATOM 720 HG1 SER 1 47 .*. H " "ATOM 762 HG1 SER 1 50 .*. H " "ATOM 1241 HG1 SER 1 81 .*. H " "ATOM 1465 HG1 SER 1 95 .*. H " "ATOM 1747 HG1 SER 1 113 .*. H " "ATOM 1758 HG1 SER 1 114 .*. H " "ATOM 2173 HG1 SER 1 141 .*. H " ... (remaining 6830 not shown)
It looks like the PDB file is bad. A serine residue does not have an HG1 atom; it should be HG. Get rid of the hydrogens using

  phenix.reduce model.pdb -trim > model-no-h.pdb
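If Phenix is not at hand, the same idea (dropping all hydrogen records) can be sketched in plain Python. This is an illustration only and not what phenix.reduce does internally: the function name strip_hydrogens is hypothetical, deuterium is not handled, and the fallback on atom names is a heuristic that assumes a protein model (a heavy atom such as mercury with a blank element column would be misread).

```python
def strip_hydrogens(pdb_lines):
    """Return the PDB lines with hydrogen ATOM/HETATM records removed.

    Hypothetical helper for illustration; assumes fixed-column PDB format.
    """
    kept = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            element = line[76:78].strip().upper()   # element symbol, columns 77-78
            name = line[12:16].strip().upper()      # atom name, columns 13-16
            if element:
                is_hydrogen = element == "H"
            else:
                # Heuristic fallback when the element column is blank:
                # names such as HG1 or 1HB are treated as hydrogens.
                is_hydrogen = name.lstrip("0123456789").startswith("H")
            if is_hydrogen:
                continue
        kept.append(line)
    return kept

# Example usage (file names are placeholders):
# with open("model.pdb") as handle:
#     lines = handle.read().splitlines()
# with open("model-no-h.pdb", "w") as handle:
#     handle.write("\n".join(strip_hydrogens(lines)) + "\n")
```

Atom serial numbers are left as they were, so they will no longer be contiguous after the hydrogens are removed.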
I tried phenix.ready_set to fix this problem, following a previous discussion, but it gave me another error: ENDMDL record missing at end of input. So my third question is: how do I fix the first error?
It looks like your file contains several models (MODEL-ENDMDL records). This is not supported. Convert it into a multi-chain model:

  phenix.models_as_chains model.pdb
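For readers unfamiliar with the difference, the conversion amounts to dropping the MODEL/ENDMDL records and giving each former model its own chain ID. The sketch below shows that idea in plain Python under strong assumptions: each model is treated as a single chain, only 62 single-character chain IDs are available (fewer than the 120 chains mentioned above), and the function name models_to_chains is hypothetical. How phenix.models_as_chains assigns chain IDs is not described in this thread, so this is not its implementation.

```python
import string

# Single-character chain IDs available in the fixed-column PDB format.
CHAIN_IDS = string.ascii_uppercase + string.ascii_lowercase + string.digits

def models_to_chains(pdb_lines):
    """Drop MODEL/ENDMDL records and relabel each model as one chain.

    Hypothetical helper for illustration; assumes one chain per model.
    """
    out = []
    model_index = -1
    for line in pdb_lines:
        record = line[:6].strip()
        if record == "MODEL":
            model_index += 1
            if model_index >= len(CHAIN_IDS):
                raise ValueError("more models than single-character chain IDs")
            continue
        if record == "ENDMDL":
            continue
        if (record in ("ATOM", "HETATM", "ANISOU", "TER")
                and model_index >= 0 and len(line) > 21):
            # The chain identifier sits in column 22.
            line = line[:21] + CHAIN_IDS[model_index] + line[22:]
        out.append(line)
    return out
```

The limit on single-character chain IDs is one reason to prefer the Phenix tool for an assembly as large as a whole phage capsid.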
Thank you for your patience! I would really appreciate your help!
You are welcome! Let us know should you have any more questions or need help! Pavel
Participants (2): Pavel Afonine; Wang, Bing