[phenixbb] R value difference in model vs data vs refinement

Pavel Afonine pafonine at lbl.gov
Fri Mar 23 23:22:54 PDT 2012


Hi Christian,

with my current working version of Phenix I'm getting:

phenix.refine:
     r_work : 0.1486
     r_free : 0.1752

phenix.model_vs_data:
     r_work : 0.1485
     r_free : 0.1752

My current working version should be available for download 
sometime next week.

The difference 0.1486 vs 0.1485 may be due to:

1) loss of precision during formatting: write out/read in (as Ralf 
explained); and/or
2) phenix.refine idealizes X-H geometry (if "hydrogens.refine=riding"), 
and phenix.model_vs_data does not do this.
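
To illustrate point 1, here is a minimal Python sketch (not Phenix code) of 
the round trip through PDB fixed-width fields. It assumes the standard PDB 
ATOM record formats (%8.3f for coordinates, %6.2f for occupancy and 
B-factor); the function name and the sample values are made up for 
illustration.

```python
def pdb_round_trip(x, y, z, occ, b):
    """Format values as PDB ATOM fields would store them, then parse them back."""
    coords = "%8.3f%8.3f%8.3f" % (x, y, z)   # columns 31-54 of an ATOM record
    occ_b = "%6.2f%6.2f" % (occ, b)          # columns 55-66 of an ATOM record
    xyz = tuple(float(coords[i:i + 8]) for i in (0, 8, 16))
    occ_new = float(occ_b[0:6])
    b_new = float(occ_b[6:12])
    return xyz + (occ_new, b_new)


# Hypothetical in-memory values (double precision), not taken from any real model
original = (12.3456789, -7.0004321, 101.9999499, 0.6666667, 23.456789)
restored = pdb_round_trip(*original)

for before, after in zip(original, restored):
    print("%.7f -> %.7f (diff %.1e)" % (before, after, abs(before - after)))
```

Each value comes back changed by up to half a unit in the last written 
decimal place, which is the kind of perturbation that can shift an R-factor 
in the fourth decimal.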

A larger difference (that you reported) in older versions may be due to 
minor inconsistencies between handling input data and performing scaling.
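
As a rough sketch of why this matters (again not Phenix code, and much 
simpler than what phenix.refine or phenix.model_vs_data actually do), the 
conventional R = sum|Fobs - k*Fcalc| / sum(Fobs) depends both on which 
reflections are included and on how the scale k is chosen. The amplitudes, 
function names, and the two scaling choices below are assumptions made up 
for illustration.

```python
def r_factor(f_obs, f_calc, scale):
    """Conventional R = sum|Fobs - k*Fcalc| / sum(Fobs)."""
    num = sum(abs(fo - scale * fc) for fo, fc in zip(f_obs, f_calc))
    return num / sum(f_obs)


def scale_sum_ratio(f_obs, f_calc):
    """Scale as the ratio of summed amplitudes."""
    return sum(f_obs) / sum(f_calc)


def scale_least_squares(f_obs, f_calc):
    """Scale minimizing sum (Fobs - k*Fcalc)^2."""
    num = sum(fo * fc for fo, fc in zip(f_obs, f_calc))
    den = sum(fc * fc for fc in f_calc)
    return num / den


# Made-up amplitudes; the last Fobs of 0.0 mimics a suspicious zero measurement
f_obs = [100.0, 80.0, 55.0, 30.0, 0.0]
f_calc = [95.0, 84.0, 50.0, 33.0, 20.0]

# Same reflections, two different scale definitions
r_all_ratio = r_factor(f_obs, f_calc, scale_sum_ratio(f_obs, f_calc))
r_all_ls = r_factor(f_obs, f_calc, scale_least_squares(f_obs, f_calc))

# Same scale definition, but the zero-amplitude reflection is filtered out first
kept = [(fo, fc) for fo, fc in zip(f_obs, f_calc) if fo > 0.0]
f_obs_kept = [fo for fo, _ in kept]
f_calc_kept = [fc for _, fc in kept]
r_filtered = r_factor(f_obs_kept, f_calc_kept,
                      scale_sum_ratio(f_obs_kept, f_calc_kept))

print("all reflections, sum-ratio scale    : %.4f" % r_all_ratio)
print("all reflections, least-squares scale: %.4f" % r_all_ls)
print("zero amplitudes removed             : %.4f" % r_filtered)
```

Two programs that differ only slightly in either step will therefore 
report slightly different R values for the same model and data.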

All these differences are rather cosmetic, and should not create any 
problem. However, I totally agree that there is no place for them in a 
self-consistent system, so hopefully they will disappear very soon.

Pavel


On 3/23/12 11:39 AM, Christian Roth wrote:
> Hi,
> I am sorry, I overlooked that 0.0004 is not the difference in my values.
> In my case the difference is Rref-Rmvs = 0.12 and Rfreeref-Rfreemvs = 0.09.
> Probably not rounding errors, but maybe outlier corrections done internally, as
> Nat mentioned. However, Pavel asked for the data and logs. Maybe he can
> finally tell me what the reason is. Perhaps I did something wrong with the
> parametrization in one of the jobs.
> Christian
>
> On Friday, 23 March 2012, 18:23:13, Ralf Grosse-Kunstleve wrote:
>> The R-factor difference of 0.0004 is what you have to expect as the result
>> of writing the coordinates, B-factors, and occupancies to the PDB file. In
>> memory floating-point numbers have >= 12 digits precision (we use double
>> precision for almost everything); in the PDB file you have only 7 digits
>> for the coordinates and just 5 digits for the B-factors and occupancies.
>> Ralf
>>
>> On Fri, Mar 23, 2012 at 9:49 AM, Christian Roth
>> <christian.roth at bbz.uni-leipzig.de> wrote:
>>> Hi Nat,
>>>
>>> I agree with you that the difference is very small and likely negligible.
>>> I just asked for curiosity if there might be any reason for this
>>> behaviour.
>>>
>>> Christian
>>>
>>> On Thursday, 22 March 2012, 23:07:33, Nathaniel Echols wrote:
>>>> On Thu, Mar 22, 2012 at 2:54 PM, Christian Roth
>>>>
>>>> <christian.roth at bbz.uni-leipzig.de>  wrote:
>>>>> I am not sure I looked into the polygon. In the log it is stated that
>>>>> the R values are calculated after a resolution and sigma cutoff are
>>>>> applied. If I understood the log correctly, the values taken from the
>>>>> pdb header are without sigma cutoff. Maybe that's the reason for the
>>>>> difference. Does model_vs_data somewhere print the values without
>>>>> cutoff in the log file? I did not find it. However, does this mean
>>>>> that until the first OHS in phenix.refine a default cutoff is used, and
>>>>> then throughout the refinement no cutoff is used anymore?
>>>> After spending some time looking at similar cases today I am not sure
>>>> myself what is going on.  I do not think a sigma cutoff is applied,
>>>> unless perhaps the PDB header indicates that one was used previously
>>>> (this is a thoroughly antiquated practice).  However, outlier
>>>> filtering appears to be used throughout.  I found one example where
>>>> nearly 4000 reflections have amplitudes of zero (which is surely not
>>>> correct), and are discarded as outliers in phenix.refine.  This
>>>> reduces R-free by 0.03.  In model_vs_data, the same numbers appear
>>>> twice:
>>>>
>>>>    Model_vs_Data:
>>>>      r_work(re-computed)                : 0.2030
>>>>      r_free(re-computed)                : 0.2639
>>>> ...
>>>>    After applying resolution and sigma cutoffs:
>>>>      n_refl_cutoff : 31257
>>>>      r_work_cutoff : 0.2030
>>>>      r_free_cutoff : 0.2639
>>>>
>>>> But this totally contradicts what I told you earlier, sorry.  I was
>>>> assuming that they would be different.
>>>>
>>>> I do have a general piece of advice, however: ignore the discrepancy,
>>>> and just report the value that came out of refinement (because that is
>>>> what will end up in the PDB).  The difference in your case is
>>>> relatively small, probably less than what you'd see if you calculated
>>>> R-factors with (for instance) Refmac, because of different
>>>> implementations of bulk solvent correction and scaling, etc.*.  (Even
>>>> different versions of Phenix aren't guaranteed to yield identical
>>>> R-factors, due to low-level changes.)  Considering how difficult it
>>>> can be to reproduce the statistics in published structures, a change
>>>> of 0.0004 isn't enough to worry about.
>>>>
>>>> -Nat
>>>>
>>>> * http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2906258/?tool=pubmed


