[phenixbb] R value difference in model vs data vs refinement

Thu Mar 22 15:07:33 PDT 2012

On Thu, Mar 22, 2012 at 2:54 PM, Christian Roth
<christian.roth at bbz.uni-leipzig.de> wrote:
> I am not sure I looked into the polygon. In the log it is stated that the R
> values are calculated after a resolution and sigma cutoff is applied.  If I
> understood the log correctly, the values taken from the PDB header are without
> a sigma cutoff. Maybe that's the reason for the difference. Does model_vs_data
> print the values without cutoff somewhere in the log file? I did not find it.
> However, does this mean that until the first OHS in phenix.refine a default
> cutoff is used, and then throughout the refinement no cutoff is used anymore?

After spending some time looking at similar cases today I am not sure
myself what is going on.  I do not think a sigma cutoff is applied,
unless perhaps the PDB header indicates that one was used previously
(this is a thoroughly antiquated practice).  However, outlier
filtering appears to be used throughout.  I found one example where
nearly 4000 reflections have amplitudes of zero (which is surely not
correct), and are discarded as outliers in phenix.refine.  This
reduces R-free by 0.03.  In model_vs_data, the same numbers appear
twice:

  Model_vs_Data:
    r_work(re-computed)                : 0.2030
    r_free(re-computed)                : 0.2639
...
  After applying resolution and sigma cutoffs:
    n_refl_cutoff : 31257
    r_work_cutoff : 0.2030
    r_free_cutoff : 0.2639

But this totally contradicts what I told you earlier, sorry.  I was
assuming that they would be different.
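
To make the effect of the zero-amplitude reflections concrete, here is a
minimal sketch of a conventional R-factor computed with and without them.
The amplitudes are made up, the scale factor is fixed at 1, and the filter
simply drops F-obs of zero; this is only an illustration, not how
phenix.refine actually identifies outliers.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|), with the scale fixed at 1."""
    numerator = sum(abs(o - c) for o, c in zip(f_obs, f_calc))
    denominator = sum(f_obs)
    return numerator / denominator


def drop_zero_amplitudes(f_obs, f_calc):
    """Keep only reflections whose observed amplitude is greater than zero."""
    kept = [(o, c) for o, c in zip(f_obs, f_calc) if o > 0.0]
    return [o for o, _ in kept], [c for _, c in kept]


# Hypothetical amplitudes; two reflections were (wrongly) recorded as zero.
f_obs = [100.0, 80.0, 0.0, 60.0, 0.0, 40.0]
f_calc = [90.0, 85.0, 12.0, 66.0, 9.0, 36.0]

r_all = r_factor(f_obs, f_calc)
r_filtered = r_factor(*drop_zero_amplitudes(f_obs, f_calc))

print("R with zero-amplitude reflections:    %.4f" % r_all)
print("R without zero-amplitude reflections: %.4f" % r_filtered)
```

A zero F-obs adds nothing to the denominator but adds its full F-calc to
the numerator, so discarding such reflections can only lower R.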

I do have a general piece of advice, however: ignore the discrepancy,
and just report the value that came out of refinement (because that is
what will end up in the PDB).  The difference in your case is
relatively small, probably less than what you'd see if you calculated
R-factors with (for instance) Refmac, because of different
implementations of bulk solvent correction, scaling, etc.*  (Even
different versions of Phenix aren't guaranteed to yield identical
R-factors, due to low-level changes.)  Considering how difficult it
can be to reproduce the statistics in published structures, a change
of 0.0004 isn't enough to worry about.
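
As a rough illustration of why implementations disagree, the sketch below
computes R for the same made-up amplitudes with two simple choices of
overall scale factor.  Neither is claimed to be what Phenix or Refmac
does (real programs also apply bulk solvent and anisotropic corrections);
the point is only that the scaling convention alone shifts the result.

```python
def r_factor(f_obs, f_calc, k):
    """R = sum(| |Fobs| - k*|Fcalc| |) / sum(|Fobs|) for a given scale k."""
    numerator = sum(abs(o - k * c) for o, c in zip(f_obs, f_calc))
    return numerator / sum(f_obs)


def scale_ratio_of_sums(f_obs, f_calc):
    """Scale chosen so that sum(Fobs) equals sum(k * Fcalc)."""
    return sum(f_obs) / sum(f_calc)


def scale_least_squares(f_obs, f_calc):
    """Scale minimizing sum((Fobs - k * Fcalc)^2)."""
    return sum(o * c for o, c in zip(f_obs, f_calc)) / sum(c * c for c in f_calc)


# Hypothetical amplitudes, for illustration only.
f_obs = [100.0, 80.0, 60.0, 40.0, 20.0]
f_calc = [95.0, 86.0, 55.0, 43.0, 17.0]

k_a = scale_ratio_of_sums(f_obs, f_calc)
k_b = scale_least_squares(f_obs, f_calc)
r_a = r_factor(f_obs, f_calc, k_a)
r_b = r_factor(f_obs, f_calc, k_b)

print("ratio-of-sums scale: k = %.5f, R = %.4f" % (k_a, r_a))
print("least-squares scale: k = %.5f, R = %.4f" % (k_b, r_b))
```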

-Nat

* http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2906258/?tool=pubmed