[phenixbb] [ccp4bb] Se SAD phasing map becomes worse after refinement

Tom Terwilliger tterwilliger at newmexicoconsortium.org
Tue Mar 24 06:46:26 PDT 2020

Hi Zhu,

I would suggest starting by stepping back and looking for problems all
along the way.

I notice your space group is shown as C222.  This is an uncommon space
group, occurring only 0.25% of the time in the PDB.  A much more common but
related space group is C222(1) which occurs 20 times more frequently (5%).
Have a very close look at your systematic absences and rerun pointless and
xtriage to check and see if you have the right space group.  Depending on
your cell dimensions there may be other possibilities for your space group
as well (these programs will list them for you).
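
As a rough illustration of what to look for in the systematic absences: in
C222(1) the 2(1) screw axis along c extinguishes the axial (0,0,l)
reflections with l odd, whereas in C222 they are allowed. The Python sketch
below is not part of pointless or xtriage. The function name and the toy
reflection list are made up for illustration, and it assumes you have
exported (h, k, l, I, sigI) values from your processed data.

```python
from statistics import mean


def axial_00l_summary(reflections):
    """Split (0, 0, l) reflections by parity of l; return mean I/sigma of each.

    `reflections` is an iterable of (h, k, l, intensity, sigma) tuples.
    This is a hypothetical helper, not a Phenix or CCP4 function.
    """
    even, odd = [], []
    for h, k, l, intensity, sigma in reflections:
        # Only axial reflections along c are affected by a 2(1) screw along c.
        if h == 0 and k == 0 and l != 0 and sigma > 0:
            (even if l % 2 == 0 else odd).append(intensity / sigma)
    return {
        "even": mean(even) if even else None,
        "odd": mean(odd) if odd else None,
    }


# Toy data, invented for illustration only.
toy = [
    (0, 0, 2, 900.0, 30.0),
    (0, 0, 3, 4.0, 8.0),
    (0, 0, 4, 1200.0, 40.0),
    (0, 0, 5, -2.0, 8.0),
    (1, 1, 3, 500.0, 20.0),  # not axial, ignored
]
summary = axial_00l_summary(toy)
print(summary)
```

If the odd-l axial reflections are consistently at noise level while the
even-l ones are strong, that points towards C222(1) rather than C222. With
only a handful of axial reflections measured the evidence can be weak, so
treat it as a hint and rely on the programs' own statistics.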

Look over your data processing output and your xtriage output carefully. Is
there translational non-crystallographic symmetry? Twinning? Are the
intensity statistics as expected? If there are any unusual situations or
characteristics, follow them up.

When you run autosol it will test the space group you specify and its
enantiomer (if any).  If there are other possibilities you would need to
run them separately.

For autobuilding you have two choices.  You can run autobuild, and as Randy
pointed out, autobuild will build only one type of chain at a time (see

Your other option is running map_to_model. This tool is intended for
cryo-EM but you can (sometimes) use it with crystallographic data. You can
just give it a try. It can build multiple chain types, uses
multiprocessing, and can take a long time.

With autobuild when you have DNA and protein the suggested approach is (see

   - The AutoBuild model-building can only build one type of chain at a
   time (default chain_type='PROTEIN'; other choices are RNA and DNA). If you
   supply a PDB file containing more than one type of chain for rebuilding,
   then all the residues that are not that type of chain are treated as
   ligands and are (by default, keep_input_ligands=True) included in
   refinement but not in rebuilding. Any input solvent molecules are (by
   default, keep_input_waters=False) ignored.

You can include more than one type of chain in rebuilding by supplying one
type of chain as ligands with input_lig_file_list and rebuilding another
type of chain:

chain_type=PROTEIN  # build only protein
input_lig_file_list=MyDNA.pdb  # just read in DNA coordinates and include in refinement

In this case only protein chains will be built, but the DNA coordinates in
MyDNA.pdb will be included in all refinements and will be written out to
the final coordinate file. You may wish to add the keyword:

keep_pdb_atoms=False  #keep the ligand atoms if model (pdb) and ligand overlap

which will tell AutoBuild that the ligand (DNA) atoms are to be kept if the
model that is being built (protein) overlaps with it. (The default is to
keep the model that is being built and to discard any ligand atoms that

This whole process is likely to require substantial editing of the PDB
files by hand because when you build DNA, a lot of chains are going to be
built into the protein region, and when you build protein, it is going to
be accidentally built into the DNA.
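
As a rough sketch of the two passes described above (protein built with the
DNA supplied as a ligand, then DNA built with the protein supplied as a
ligand), the Python below only assembles command lines and does not run
anything. The keywords chain_type, input_lig_file_list and keep_pdb_atoms
are the ones quoted above. The data= and seq_file= arguments and all file
names other than MyDNA.pdb are assumptions for illustration, so check them
against the AutoBuild documentation for your Phenix version.

```python
import shlex


def autobuild_command(data, seq_file, chain_type, ligand_file,
                      keep_pdb_atoms=False):
    """Return a phenix.autobuild command line as a string (nothing is run).

    Hypothetical helper: the data= and seq_file= arguments are assumed here.
    """
    args = [
        "phenix.autobuild",
        f"data={data}",
        f"seq_file={seq_file}",
        f"chain_type={chain_type}",            # build only this chain type
        f"input_lig_file_list={ligand_file}",  # other chains kept as ligands
        f"keep_pdb_atoms={keep_pdb_atoms}",    # False: ligand wins on overlap
    ]
    return " ".join(shlex.quote(a) for a in args)


# Pass 1: build protein, keep the hand-built DNA fixed as a "ligand".
protein_pass = autobuild_command(
    "data.mtz", "protein.seq", "PROTEIN", "MyDNA.pdb")

# Pass 2: build DNA, keep the (hand-edited) protein as a "ligand".
dna_pass = autobuild_command(
    "data.mtz", "dna.seq", "DNA", "MyProtein.pdb")

print(protein_pass)
print(dna_pass)
```

Between the two passes you would still need to inspect and edit the models
by hand, as described above, before feeding one of them back in as the
ligand file.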

All the best,

Tom T

On Tue, Mar 24, 2020 at 2:55 AM Randy Read <rjr27 at cam.ac.uk> wrote:

> Dear Zhu,
> Questions specific to Phenix should really go to the Phenix-BB, so I am
> cross-posting my reply there.  Here I’ll focus more on generic issues.
> There are also CCP4 tools that you could consider and presumably other
> people will offer advice on those.
> One point you raise comes up so much that we have an entry in our FAQ (
> https://www.phaser.cimr.cam.ac.uk/index.php/FAQ) about it: “Can I use
> Phaser to check the correctness of a model I have already built and
> refined?”  The answer is no, because once you’ve refined the model it
> becomes better than random at predicting the data and therefore achieves a
> high LLG score, regardless of whether or not it is correct.
> In this case, you might be able to use Phaser to help complete the model
> if one of the two copies of the complex is more completely modelled than
> the other.  If, say, you had a model for one copy you could fix that and
> search for a second copy: this should work because the refinement didn’t
> know anything about the second copy.
> Phenix.autosol attempts to determine the NCS operators, so you need to
> check whether that has succeeded.  If not, you might need to try some more
> manual approaches to defining the NCS, which would help a great deal in map
> improvement.  Tom Terwilliger might answer this in more detail (perhaps on
> the Phenix-BB), but I don’t think you can build protein and nucleic acid in
> the same job, so you should look at the documentation to see how to do that.
> Good luck!
> Randy Read
> -----
> Randy J. Read
> Department of Haematology, University of Cambridge
> Cambridge Institute for Medical Research
> The Keith Peters Building
> Hills Road
> Cambridge CB2 0XY, U.K.
> Tel: +44 1223 336500
> Fax: +44 1223 336827
> E-mail: rjr27 at cam.ac.uk
> www-structmed.cimr.cam.ac.uk
> > On 24 Mar 2020, at 04:31, Zhu Qiao <jasonqiao03 at GMAIL.COM> wrote:
> >
> > Dear All
> >
> > I am sorry for the long context.
> >
> > I have crystallized one protein (252 AAs, 2 Met) bound to double-stranded
> DNA (24 bp).  I collected Se-Met data from the crystal in C222 to 2.8
> angstrom. The space group was confirmed by running pointless.
> >
> > I used Phenix.Autosol to find the heavy atoms and got quite a nice
> map after density modification.  It seems there are two proteins and
> two DNA duplexes in one ASU. Phenix.Autobuild can build less than
> half of the protein sequence into the map and fills the likely DNA density
> with amino acids. The Rwork/Rfree is 0.40/0.46, with map CC=0.60.
> If I do MR with the initial model built by Autobuild, the result is
> TFZ=40, LLG=200+, which suggests that the initial model is partially
> correct.
> >
> > Here is the problem. In the map, I can see one of my protein domains
> and the clear features of a DNA double helix. But whenever I go further
> with manual building in Coot, such as building the DNA duplex into the map
> and building the resolved domain, the refinement statistics get worse, with
> R free ~0.50.
> >
> > I am wondering what is going wrong and why refinement does not
> improve the R factor.
> >
> > I have attached the relevant photos.
> >
> https://drive.google.com/drive/folders/1dJ4kn7CEHkCL3sMcCJtBqa5OsVFQngBx?usp=sharing
> >
> >
> > Sincerely
> > Zhu
> >

Thomas C Terwilliger
Laboratory Fellow, Los Alamos National Laboratory
Senior Scientist, New Mexico Consortium
100 Entrada Dr, Los Alamos, NM 87544
Email: tterwilliger at newmexicoconsortium.org
Tel: 505-431-0010