[phenixbb] Are sigma cutoffs for R-free reflections cheating?
krahn at niehs.nih.gov
Thu Dec 10 10:28:18 PST 2009
(I got busy and did not follow up earlier.)
I think the value of weak reflections is obvious once you think about it
the right way, and it is why I/sigma cutoffs have been discouraged for
many years now. The significance of a reflection is not its overall
amplitude, but how far it deviates from the expected value, which is
approximately the average value for its resolution bin. By that measure,
weak reflections are far more useful than average-intensity reflections.
If you picture the structure factor as a vector in the complex plane, a
weak reflection is very well defined because its phase matters much less:
the vector has to lie close to the origin whatever the phase is.
People get confused about I/sigma significance for two reasons. First,
weak reflections contribute very little to maps, and many rules of thumb
were developed from heavy-atom methods. Refinement is different.
Second, I/sigma is a useful validity measure for a set (resolution bin)
of reflections, because it indicates the significance of the expectation
value. If the I/sigma for a shell is small, then all reflections in that
shell are approximately the same as the expectation value.
In my experience, anisotropic data can be poorly behaved when many weak
reflections are missing in the low-resolution directions, because there
is nothing to prevent the model from refining to non-zero values there.
My point is that no matter what the argument for the value of culling
reflections, or any other sort of weighting scheme, it should never be
applied to the "true" R-free value.
In practice, some culling is sensible when scaling, but it should be
restricted to rejections based on multiple observations of the same
reflection, and not systematic culling such as I/sigma cutoffs. That is
why HKL sets the default sigma cutoff to -3.
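
To make the difference between a systematic cutoff and a near-no-op cutoff
concrete, here is a rough Python sketch. It is only an illustration under
stated assumptions, not anything taken from HKL or phenix: true
intensities in one weak shell are drawn from an exponential (acentric
Wilson) distribution, every measurement gets Gaussian noise with the same
sigma, and the function name shell_mean_intensity is made up for this
example. It compares the mean observed intensity of the shell with and
without an I/sigma cutoff.

```python
import random


def shell_mean_intensity(mean_true, sigma, cutoff, n=20000, seed=0):
    """Simulate one resolution shell and apply an I/sigma cutoff.

    Assumptions (illustrative only): acentric Wilson statistics, i.e. true
    intensities are exponentially distributed with mean `mean_true`, and
    every measurement carries Gaussian noise with the same `sigma`.

    Returns (mean of all observations, mean of kept observations,
    fraction of observations kept).
    """
    rng = random.Random(seed)
    observed = []
    for _ in range(n):
        i_true = rng.expovariate(1.0 / mean_true)  # exponential, mean = mean_true
        observed.append(i_true + rng.gauss(0.0, sigma))

    # Systematic culling: keep only observations with I/sigma above the cutoff.
    kept = [i_obs for i_obs in observed if i_obs / sigma > cutoff]

    mean_all = sum(observed) / len(observed)
    mean_kept = sum(kept) / len(kept)
    return mean_all, mean_kept, len(kept) / len(observed)


if __name__ == "__main__":
    # A weak shell: mean true intensity equal to sigma.
    for cutoff in (-3.0, 2.0):
        mean_all, mean_kept, frac = shell_mean_intensity(1.0, 1.0, cutoff)
        print(f"cutoff {cutoff:+.0f}: kept {frac:.1%}, "
              f"mean(all) = {mean_all:.3f}, mean(kept) = {mean_kept:.3f}")
```

With a cutoff of -3 almost nothing is rejected and the shell mean is left
essentially unchanged. With a cutoff of 2 the surviving reflections
necessarily average more than 2 sigma, so anything fit to them (a scale
factor or an ADP, for example) sees a shell that looks stronger than it
really is.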
Randy Read wrote:
> Pavel wanted some evidence of whether or not it makes a difference to
> omit very weak reflections. Here's one relevant paper. Hirshfeld and
> Rabinovich (Acta Cryst. A29: 510-513, 1973) showed in a numerical
> experiment that, if you omit weak intensities, there is a systematic
> error in refined scale and ADP parameters. They used least squares so
> it's possible that the results would be somewhat different with
> maximum likelihood targets, but at least here is an objective
> demonstration that the weak data can have a significant influence.