Hello Bing,

General strategy: I use COOT for model building, phenix.real_space_refine for real-space refinement with secondary structure restraints, and REFMAC for reciprocal-space refinement.

What is the purpose of reciprocal-space refinement if you don't have any measured reflections? (This is cryo-EM data, which is a map!)

A few questions are listed here.
1. The map was too big to open in COOT, so I used phenix.map_to_structure_factors to obtain an MTZ file of ~120 MB (still a little big for my computer). I manually built the whole ball-shaped phage in COOT by rigid-body fitting (from two X-ray structures to 120 chains). My first question: in this case, should I crop the map in Chimera or other software and focus only on a small asymmetric unit for the model building and the subsequent refinement?

Sometimes the box is much larger than the actual model. If that's the case, you can try

phenix.map_box model.pdb map.mrc

Also, you can do

phenix.map_box model.pdb map.mrc selection="chain A"

That will give you a box containing the map and the selected part of the model.
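
To get a feel for how much smaller a box around one chain is than a box around everything, here is a minimal Python sketch that computes a padded bounding box from PDB coordinates. It is only an illustration and not what phenix.map_box does internally; the function names, the padding argument, and the toy coordinates are all made up for the example.

```python
# Fixed-column PDB ATOM line (x, y, z occupy columns 31-54, chain ID is column 22).
ATOM_FMT = ("ATOM  {:>5} {:<4} {:>3} {:1}{:>4}    "
            "{:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}          {:>2}")


def make_atom(serial, name, resname, chain, resseq, x, y, z, element):
    """Build a toy ATOM record (hypothetical helper, for the demo only)."""
    return ATOM_FMT.format(serial, name, resname, chain, resseq,
                           x, y, z, 1.0, 20.0, element)


def bounding_box(pdb_lines, chain_id=None, padding=5.0):
    """Return ((xmin, ymin, zmin), (xmax, ymax, zmax)) in angstroms.

    If chain_id is given, only atoms of that chain are considered.
    """
    coords = []
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        if chain_id is not None and line[21] != chain_id:
            continue
        coords.append((float(line[30:38]),
                       float(line[38:46]),
                       float(line[46:54])))
    if not coords:
        raise ValueError("no atoms matched the selection")
    lo = tuple(min(c[i] for c in coords) - padding for i in range(3))
    hi = tuple(max(c[i] for c in coords) + padding for i in range(3))
    return lo, hi


# Toy model: two atoms in chain A, one far-away atom in chain B.
lines = [
    make_atom(1, " CA ", "ALA", "A", 1, 0.0, 0.0, 0.0, "C"),
    make_atom(2, " CA ", "ALA", "A", 2, 10.0, 4.0, -2.0, "C"),
    make_atom(3, " CA ", "ALA", "B", 1, 100.0, 100.0, 100.0, "C"),
]
box_all = bounding_box(lines, padding=0.0)
box_a = bounding_box(lines, chain_id="A", padding=0.0)
print("whole model:", box_all)
print("chain A only:", box_a)
```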


2. I would like to do a real space refinement after the model building.
Input files:
- A PDB file I just built in COOT
- The original MAP file

This is all you need for real-space refinement.

- A converted MTZ file

This is fiction, you don't need it.

- Two restraint files from two X-ray structures by ProSMART (TXT format)

Phenix does not recognize ProSMART. You can use Phenix tools to generate secondary structure restraints, such as phenix.secondary_structure_restraints.

The refinement parameters I would like to select in the GUI:
- minimization_global, rigid_body, local_grid_search, adp

At 6 Å resolution you are not going to see side chains, so there is no need to do local_grid_search.

- Use secondary structure restraints
- Reference model restraints: use starting model as reference, main chain, side chain, fix outliers, secondary structure only
- Rotamer restraints
- Ramachandran restraints
- Show per residue

I'd do a default run first to see what happens.

My second set of questions: Should I select the MAP file (~870 MB) or the MTZ file (~120 MB)?

If the MTZ file is a Fourier transform of the map (a full box of reflections, not a sphere), then both files (map and MTZ) contain exactly the same information, one in real space and the other in reciprocal space. I'd use the original data (map), not the manipulated version (MTZ).
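
The box-versus-sphere point can be seen in a small one-dimensional Python sketch: transforming a toy density with all Fourier coefficients kept and transforming back recovers it to rounding error, while discarding the high-frequency coefficients (a crude stand-in for a resolution sphere) does not. The density values and the cutoff are arbitrary assumptions chosen for the illustration; this is not how any Phenix tool is implemented.

```python
import cmath


def dft(values):
    """Plain discrete Fourier transform of a 1D sequence."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * cmath.pi * k * j / n)
                for j, v in enumerate(values))
            for k in range(n)]


def idft(coeffs):
    """Inverse discrete Fourier transform."""
    n = len(coeffs)
    return [sum(c * cmath.exp(2j * cmath.pi * k * j / n)
                for k, c in enumerate(coeffs)) / n
            for j in range(n)]


def max_abs_diff(a, b):
    """Largest pointwise difference between two sequences."""
    return max(abs(x - y) for x, y in zip(a, b))


# Toy 1D "map": a single peak on a flat background.
density = [0.0, 1.0, 4.0, 9.0, 4.0, 1.0, 0.0, 0.0]
coeffs = dft(density)

# Full set of coefficients ("full box"): nothing is lost.
full_error = max_abs_diff(density, idft(coeffs))

# Keep only the lowest frequencies ("sphere"): information is lost.
n = len(coeffs)
truncated = [c if min(k, n - k) <= 1 else 0 for k, c in enumerate(coeffs)]
cut_error = max_abs_diff(density, idft(truncated))

print("round-trip error, all coefficients:", full_error)
print("round-trip error, truncated set:  ", cut_error)
```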

Is it necessary to add the two restraint files from ProSMART if I use the starting model as reference?

Phenix does not recognize ProSMART.

Are the refinement parameters selected properly?

See above.

3. I gave phenix.real_space_refine a try. A first error showed up:
Number of atoms with unknown nonbonded energy type symbols: 6840
    "ATOM    184  HG1 SER 1  12 .*.     H  "
    "ATOM    458  HG1 SER 1  30 .*.     H  "
    "ATOM    699  HG1 SER 1  45 .*.     H  "
    "ATOM    720  HG1 SER 1  47 .*.     H  "
    "ATOM    762  HG1 SER 1  50 .*.     H  "
    "ATOM   1241  HG1 SER 1  81 .*.     H  "
    "ATOM   1465  HG1 SER 1  95 .*.     H  "
    "ATOM   1747  HG1 SER 1 113 .*.     H  "
    "ATOM   1758  HG1 SER 1 114 .*.     H  "
    "ATOM   2173  HG1 SER 1 141 .*.     H  "
    ... (remaining 6830 not shown)

It looks like the PDB file is bad: serine does not have an HG1 atom; it should be HG. Get rid of the hydrogens using

phenix.reduce model.pdb -trim > model-no-h.pdb
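
If you want to check by hand what such a trim removes, here is a minimal Python sketch that drops hydrogen records from a list of PDB lines. It is an illustration only, not the reduce algorithm: it relies on the element field (columns 77-78) and, when that field is blank, on a crude atom-name heuristic that is an assumption of this sketch and can misjudge unusual HETATM names. The function names and the toy records are hypothetical.

```python
# Fixed-column PDB ATOM line (atom name in columns 13-16, element in 77-78).
ATOM_FMT = ("ATOM  {:>5} {:<4} {:>3} {:1}{:>4}    "
            "{:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}          {:>2}")


def is_hydrogen(line):
    """Heuristically decide whether an ATOM/HETATM line is a hydrogen."""
    element = line[76:78].strip().upper()
    if element:
        return element in ("H", "D")
    # Fallback (assumption): element field is blank, so look at the atom
    # name, ignoring a leading digit as in names like "1HB".
    name = line[12:16].strip().upper()
    return name.lstrip("0123456789").startswith("H")


def strip_hydrogens(pdb_lines):
    """Return the lines with hydrogen ATOM/HETATM records removed."""
    kept = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")) and is_hydrogen(line):
            continue
        kept.append(line)
    return kept


# Toy serine fragment containing the problematic HG1 record.
lines = [
    ATOM_FMT.format(182, " CA ", "SER", "1", 12, 1.0, 2.0, 3.0, 1.0, 20.0, "C"),
    ATOM_FMT.format(183, " OG ", "SER", "1", 12, 2.0, 3.0, 4.0, 1.0, 20.0, "O"),
    ATOM_FMT.format(184, " HG1", "SER", "1", 12, 2.5, 3.5, 4.5, 1.0, 20.0, "H"),
    "END",
]
heavy_only = strip_hydrogens(lines)
removed = len(lines) - len(heavy_only)
print("records removed:", removed)
```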

I tried phenix.ready_set to fix this problem, following a previous discussion, but it gave me another error: ENDMDL record missing at end of input.
Thus my third question: how do I fix the first error?

It looks like your file contains several models (MODEL-ENDMDL records). This is not supported. Convert it into a multi-chain model:

phenix.models_as_chains model.pdb
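
To confirm that this is what is going on in your file, a quick diagnostic along these lines may help. It is a minimal Python sketch that only counts MODEL and ENDMDL records; it does not convert anything, and the function name and the toy input are hypothetical.

```python
def model_record_summary(pdb_lines):
    """Count MODEL and ENDMDL records and report whether they pair up."""
    n_model = 0
    n_endmdl = 0
    for line in pdb_lines:
        record = line[:6].strip()  # record name lives in columns 1-6
        if record == "MODEL":
            n_model += 1
        elif record == "ENDMDL":
            n_endmdl += 1
    return {
        "MODEL": n_model,
        "ENDMDL": n_endmdl,
        "balanced": n_model == n_endmdl,
    }


# Toy input: the second model is never closed with ENDMDL.
lines = [
    "MODEL        1",
    "ATOM      1  CA  ALA A   1       0.000   0.000   0.000  1.00 20.00           C",
    "ENDMDL",
    "MODEL        2",
    "ATOM      1  CA  ALA A   1       1.000   1.000   1.000  1.00 20.00           C",
    "END",
]
summary = model_record_summary(lines)
print(summary)
```

More than one MODEL record means the file is a multi-model file, and an unbalanced count would be consistent with the ENDMDL error quoted above.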

Thank you for your patience! I would really appreciate your help!

You are welcome! Let us know should you have any more questions or need help!

Pavel