[phenixbb] How to reduce clashscore value

Pavel Afonine pafonine at lbl.gov
Sat Nov 19 23:20:51 PST 2011

Hi Tim,

> The method of using the ratio of gradients doesn't make sense in a 
> maximum likelihood context,

assuming that by "a maximum likelihood context" you mean refinement 
using a maximum-likelihood (ML) criterion as the X-ray term (or, more 
generally, the experimental data term, since it can be neutron data 
too, for instance), I find the statement above a little strange, since 
it mixes two unrelated things: the type of crystallographic data term, 
and the method of determining the relative scale (weight) between it 
and the other term (restraints).

I don't see how the choice of crystallographic data term (LS, ML, 
real-space or any other) is related to the method of this scale 
determination.

The only difference between LS and ML targets is that the latter 
accounts for model completeness and errors in a statistical manner. 
This difference is irrelevant to the choice of weight between the 
crystallographic and restraints terms. In fact, the ML target can even 
be approximated with LS (J. Appl. Cryst. (2003). 36, 158-159) without 
any noticeable loss. The ML target itself can be formulated in a few 
different ways, and that alone can result in optimal weight values that 
differ by an order of magnitude while showing no difference in 
refinement results (since the weight is a matter of relative scale 
between two functions, which can be totally arbitrary).

The ratio of gradient norms gives a good estimate of the optimal 
weight. In fact, if you look at the math, for a two-atom system it 
should be multiplied by cos(angle_between_gradient_vectors), which for 
a many-atom structure averages out to approximately 0.5 (this is what 
is used in CNS by default), if I remember all this correctly.
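
A minimal sketch of the estimate described above, assuming the total 
target is written as T_total = w * T_data + T_restraints, so that the 
weight multiplies the data term. The function names, the toy gradient 
values and the flat-list gradient layout are illustrative assumptions, 
not code from phenix.refine or CNS; only the 0.5 factor comes from the 
text above.

```python
import math


def gradient_norm(grad):
    """Euclidean norm of a flat list of gradient components."""
    return math.sqrt(sum(g * g for g in grad))


def estimate_weight(grad_data, grad_restraints, scale=0.5):
    """Estimate w in T_total = w * T_data + T_restraints.

    The estimate is scale * |grad_restraints| / |grad_data|, where
    scale stands in for the averaged cosine factor (~0.5 above).
    """
    norm_data = gradient_norm(grad_data)
    if norm_data == 0.0:
        raise ValueError("data-term gradient is zero; weight is undefined")
    return scale * gradient_norm(grad_restraints) / norm_data


# Toy gradients (hypothetical numbers, chosen for easy arithmetic).
grad_data = [3.0, 4.0, 0.0]          # norm = 5
grad_restraints = [0.0, 30.0, 40.0]  # norm = 50

weight = estimate_weight(grad_data, grad_restraints)
print(weight)  # 0.5 * 50 / 5
```

In a real refinement the two gradient vectors would be taken with 
respect to the same refined parameters (for example, all atomic 
coordinates) at the current model.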

If the data and restraints terms are normalized (it doesn't matter how), 
then the weight value becomes predictable. For example, the optimal 
weight between ML and stereochemistry restraints in phenix.refine ranges 
between 1 and 10, most of the time being ~5, and the ratio of gradient 
norms predicts this very well.

Furthermore, you can always normalize any crystallographic data term 
such that the optimal weight will be around 1.
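
To illustrate why the weight is only a matter of relative scale, here is 
a small sketch, again assuming T_total = w * T_data + T_restraints and 
using made-up gradient values. Multiplying the data term by a constant c 
multiplies its gradient by c and divides the estimated weight by c, so 
the weighted data gradient that the minimizer sees does not change. The 
helper names are hypothetical.

```python
import math


def norm(v):
    """Euclidean norm of a flat list of numbers."""
    return math.sqrt(sum(x * x for x in v))


def ratio_weight(grad_data, grad_restraints, scale=0.5):
    """Weight estimate: scale * |grad_restraints| / |grad_data|."""
    return scale * norm(grad_restraints) / norm(grad_data)


# Hypothetical gradients of the two terms at the current model.
grad_data = [3.0, 4.0, 0.0]
grad_restraints = [0.0, 30.0, 40.0]

w = ratio_weight(grad_data, grad_restraints)

# "Normalize" the data term by multiplying it by a constant c.
# Choosing c = w makes the new estimated weight come out as 1.
c = w
grad_data_normalized = [c * g for g in grad_data]
w_normalized = ratio_weight(grad_data_normalized, grad_restraints)

# The weighted data gradient is the same before and after rescaling.
weighted_before = [w * g for g in grad_data]
weighted_after = [w_normalized * g for g in grad_data_normalized]

print(w, w_normalized)
print(weighted_before, weighted_after)
```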

>     phenix.refine uses repulsion term only. Although one can imagine
>     reasons why attraction terms may be helpful, in reality they may
>     be counterproductive if the model geometry quality is not great
>     since attractive terms may lock wrong conformations and not let
>     them move towards correct positions dictated by the electron density.
> Refinement using a force field without electrostatics versus with 
> electrostatics was recently investigated 
> (http://dx.doi.org/10.1021/ct100506d), and found to favor its 
> inclusion across a range of models/resolutions.

I had a look at this and more recent papers. I apologize in advance if I 
missed it, but I couldn't find an example showing how the proposed 
methodology performs for poor models. I mean real working models 
(incomplete and with errors, like the one you get right out of an MR 
solution). The tests shown in (Acta Cryst. (2011). D67, 957-965) are all 
performed using models from the PDB, which are supposedly good already. 
Sure, these models may have small "cosmetic" problems, but as Joosten et 
al. demonstrated, there is always room for improvement of PDB-deposited 
models. This is partly because the methodology and tools keep improving. 
So re-refinement of PDB-deposited models using newer tools is very 
likely to yield better models, as you confirmed once again in your 
paper. What would be really interesting to see is how your new 
methodology performs in real-life routine cases, where a structure is 
far away from the good final one.

All the best!
