[phenixbb] Gap between R and Rfree

Felix Frolow mbfrolow at post.tau.ac.il
Fri Feb 26 12:18:33 PST 2010


Yeah Phil, you have got them.
I was accused several years ago of over-fitting Rfree!!!
The bad thing is that to publish you have to avoid arguing with these
post-crystallography era referees.
Today they ask meaningless questions: "I am a novice in crystallography, teach me how to minimize Rfree."
Tomorrow they referee your papers and make meaningless remarks about the gap between R and Rfree:
"Your R factor is 17% and Rfree is 24%, it raises a question."
Geeeeek!!! What question???
Dr  Felix Frolow
Professor of Structural Biology and Biotechnology
Department of Molecular Microbiology
and Biotechnology
Tel Aviv University 69978, Israel

Acta Crystallographica D, co-editor

e-mail: mbfrolow at post.tau.ac.il
Tel:           ++972 3640 8723
Fax:          ++972 3640 9407
Cellular:   ++972 547 459 608

On Feb 26, 2010, at 11:30 , Phil Evans wrote:

> 
> I do worry a bit about the fetishisation of R-factors, to the point where I get the impression that some people think that the object of refinement is minimisation of Rfree. The aim is to produce the "best" model consistent with the data, and the most important tool is looking at the maps on the graphics, in conjunction with validation tools analysing consistency with what we know about stereochemistry (Ramachandran, rotamers, bumps etc). Also is your model consistent with your initial experimental phased map, if available? The Rfree is a useful guide, but no more than that.
> 
> My (admittedly superficial) experience of changing weights, freeR selection, resolution limits, NCS, etc in refinement is that often you can spend some time doing this, maybe improve the R-factors a bit,  but end up with a model which is indistinguishable from the previous model when superimposed on it.
> 
> I'm fed up with referees telling me my structure must be wrong because the Rfree is "too high" (and I haven't followed their particular pet protocol), when I know that it is essentially correct, and as good as I can make it.
> 
> Phil
> 
> 
> On 26 Feb 2010, at 01:49, Pavel Afonine wrote:
> 
>> Hi Simon,
>> 
>> the next available PHENIX nightly build will have this command:
>> 
>> phenix.r_factor_statistics
>> 
>> This will output plain-text histograms for Rwork, Rfree and Rfree-Rwork. For example:
>> 
>> ***
>> phenix.r_factor_statistics 2.5 left_offset=0.2 right_offset=0.2 n_bins=5
>> Command used:
>> 
>> phenix.r_factor_statistics 2.500 left_offset=0.200 right_offset=0.200 n_bins=5
>> 
>> 
>> Histogram of Rwork for models in PDB at resolution 2.30-2.70 A:
>>   0.115 - 0.168      : 149
>>   0.168 - 0.220      : 2998
>>   0.220 - 0.273      : 1879
>>   0.273 - 0.325      : 67
>>   0.325 - 0.378      : 2
>> Histogram of Rfree for models in PDB at resolution 2.30-2.70 A:
>>   0.146 - 0.210      : 116
>>   0.210 - 0.274      : 3288
>>   0.274 - 0.337      : 1664
>>   0.337 - 0.401      : 26
>>   0.401 - 0.465      : 1
>> Histogram of Rfree-Rwork for all model in PDB at resolution 2.30-2.70 A:
>>   0.001 - 0.021      : 233
>>   0.021 - 0.041      : 1412
>>   0.041 - 0.060      : 2180
>>   0.060 - 0.080      : 1045
>>   0.080 - 0.100      : 225
>> Number of structures considered: 5095
>> ***
>> 
>> Running it without arguments will include all structures.
>> 
>> Pavel.
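The bin edges in the output above are consistent with equal-width bins spanning the smallest and largest value in the selected resolution window (2.5 with offsets of 0.2 gives 2.30-2.70 A). A minimal Python sketch of that idea follows. It is not the PHENIX implementation: the function names and the sample (resolution, Rwork, Rfree) tuples are made up for illustration.

```python
from typing import List, Sequence, Tuple

Entry = Tuple[float, float, float]  # (resolution, Rwork, Rfree), hypothetical layout


def select_by_resolution(entries: Sequence[Entry], d_min: float,
                         left_offset: float, right_offset: float) -> List[Entry]:
    """Keep entries whose resolution lies in [d_min - left, d_min + right]."""
    low, high = d_min - left_offset, d_min + right_offset
    return [e for e in entries if low <= e[0] <= high]


def text_histogram(values: Sequence[float],
                   n_bins: int) -> List[Tuple[float, float, int]]:
    """Equal-width bins between min(values) and max(values)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value would index one past the end, so clamp it.
        index = min(int((v - lo) / width), n_bins - 1) if width > 0 else 0
        counts[index] += 1
    return [(lo + i * width, lo + (i + 1) * width, counts[i])
            for i in range(n_bins)]


if __name__ == "__main__":
    # Made-up sample data, not taken from the PDB.
    sample = [
        (2.25, 0.19, 0.23),
        (2.40, 0.20, 0.25),
        (2.50, 0.21, 0.27),
        (2.60, 0.22, 0.30),
        (2.80, 0.24, 0.29),
    ]
    selected = select_by_resolution(sample, 2.5, 0.2, 0.2)
    gaps = [r_free - r_work for _, r_work, r_free in selected]
    print("Histogram of Rfree-Rwork:")
    for low, high, n in text_histogram(gaps, n_bins=2):
        print(f"  {low:.3f} - {high:.3f}      : {n}")
```

With real data, the list of tuples would come from whatever table of PDB statistics is available; the sketch only shows the selection and binning steps.
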
>> 
>> 
>> 
>> On 2/25/10 2:01 PM, Simon Kolstoe wrote:
>>> oh yes please Pavel (command line polygon/statistics overview) - could the output be a pdf or ps rather than just starting up the GUI?
>>> 
>>> Simon
>>> 
>>> 
>>> 
>>> On 25 Feb 2010, at 19:39, Pavel Afonine wrote:
>>> 
>>>> Hi Francis,
>>>> 
>>>> in PHENIX GUI: "Validation" -> "PDB Statistics Overview"
>>>> 
>>>> or:
>>>> 
>>>> in PHENIX GUI: "Validation" -> "POLYGON"
>>>> 
>>>> The underlying idea is published here:
>>>> 
>>>> Crystallographic model quality at a glance. L. Urzhumtseva, P. V. Afonine, P. D. Adams, A. Urzhumtsev Acta Cryst. D65, 297-300 (2009)
>>>> 
>>>> I can add a command line version if anyone is interested.
>>>> 
>>>> Pavel.
>>>> 
>>>> 
>>>> 
>>>> On 2/25/10 9:45 AM, Francis E Reyes wrote:
>>>>> Pavel
>>>>> 
>>>>> What phenix utility produces such information?
>>>>> Thx
>>>>> 
>>>>> FR
>>>>> 
>>>>> On Feb 25, 2010, at 10:43 AM, Pavel Afonine <PAfonine at lbl.gov> wrote:
>>>>> 
>>>>>> Hi Young-Jin,
>>>>>> 
>>>>>> the distribution of R-factors for all structures in the PDB refined at resolutions between 2.4 and 2.6 A is as follows:
>>>>>> 
>>>>>> Histogram of Rwork for all model in PDB at resolution 2.40-2.60:
>>>>>> 0.115 - 0.141      : 5
>>>>>> 0.141 - 0.168      : 69
>>>>>> 0.168 - 0.194      : 414
>>>>>> 0.194 - 0.220      : 955  <<<<< your structure
>>>>>> 0.220 - 0.246      : 695
>>>>>> 0.246 - 0.273      : 153
>>>>>> 0.273 - 0.299      : 25
>>>>>> 0.299 - 0.325      : 5
>>>>>> 0.325 - 0.352      : 0
>>>>>> 0.352 - 0.378      : 1
>>>>>> Histogram of Rfree for all model in PDB at resolution 2.40-2.60:
>>>>>> 0.146 - 0.178      : 3
>>>>>> 0.178 - 0.210      : 41
>>>>>> 0.210 - 0.242      : 404
>>>>>> 0.242 - 0.274      : 1104
>>>>>> 0.274 - 0.305      : 653   <<<<< your structure
>>>>>> 0.305 - 0.337      : 108
>>>>>> 0.337 - 0.369      : 8
>>>>>> 0.369 - 0.401      : 0
>>>>>> 0.401 - 0.433      : 0
>>>>>> 0.433 - 0.465      : 1
>>>>>> Histogram of Rfree-Rwork for all model in PDB at resolution 2.40-2.60:
>>>>>> 0.002 - 0.012      : 28
>>>>>> 0.012 - 0.022      : 83
>>>>>> 0.022 - 0.031      : 232
>>>>>> 0.031 - 0.041      : 430
>>>>>> 0.041 - 0.051      : 458
>>>>>> 0.051 - 0.061      : 493
>>>>>> 0.061 - 0.071      : 298
>>>>>> 0.071 - 0.080      : 186
>>>>>> 0.080 - 0.090      : 85   <<<<< your structure
>>>>>> 0.090 - 0.100      : 29
>>>>>> 
>>>>>> which I interpret as follows: your Rwork and Rfree are each most likely fine, but the Rfree-Rwork gap is too large, which suggests possible overfitting.
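To put a number on "too large", the reported values can be located in the Rfree-Rwork histogram quoted in this message. The short Python sketch below copies the bin edges and counts from that histogram and uses Rwork = 0.2140 and Rfree = 0.2982 from the original question; the function name is made up for illustration. Within this histogram, the bin containing the gap and the bins above it hold 114 of 2322 structures, roughly 5%.

```python
# Bin edges and counts copied from the Rfree-Rwork histogram (2.40-2.60 A).
GAP_BINS = [
    (0.002, 0.012, 28),
    (0.012, 0.022, 83),
    (0.022, 0.031, 232),
    (0.031, 0.041, 430),
    (0.041, 0.051, 458),
    (0.051, 0.061, 493),
    (0.061, 0.071, 298),
    (0.071, 0.080, 186),
    (0.080, 0.090, 85),
    (0.090, 0.100, 29),
]


def fraction_at_or_above_bin(bins, value):
    """Fraction of structures in the bin containing `value` or in any higher bin."""
    total = sum(count for _, _, count in bins)
    for i, (low, high, _) in enumerate(bins):
        if low <= value < high:
            return sum(count for _, _, count in bins[i:]) / total
    raise ValueError("value lies outside the histogram range")


r_work, r_free = 0.2140, 0.2982  # values reported in the original question
gap = r_free - r_work
fraction = fraction_at_or_above_bin(GAP_BINS, gap)
print(f"gap = {gap:.4f}, fraction in this bin or higher = {fraction:.3f}")
```

This only ranks the gap against other deposited structures at similar resolution; it says nothing by itself about whether the model is correct.
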
>>>>>> 
>>>>>> A few tips:
>>>>>> - if you have NCS - use it in refinement;
>>>>>> - optimize refinement target weights: a) automatically: "optimize_wxc=true optimize_wxu=true", or b) manually using wxc_scale and wxu_scale parameters;
>>>>>> - make sure you use the latest PHENIX version.
>>>>>> 
>>>>>> Let me know if you have any questions.
>>>>>> 
>>>>>> Pavel.
>>>>>> 
>>>>>> 
>>>>>> On 2/25/10 9:17 AM, Young-Jin Cho wrote:
>>>>>>> Hi everyone,
>>>>>>> I recently collected diffraction data and am trying to refine a structure against it. The problem is that whatever I do, the gap between R and Rfree stays large: 0.2140/0.2982 at around 2.5 A resolution (waters added, isotropic individual_adp refinement, and so on).
>>>>>>> I am retrying with many different approaches, but any comments or suggestions would be helpful.
>>>>>>> 
>>>>>>> Thanks in advance,
>>>>>>> 
>>>>>>> Young-Jin
>>>>>>> 
>>>>>>> 
>>>>>>> _______________________________________________
>>>>>>> phenixbb mailing list
>>>>>>> phenixbb at phenix-online.org
>>>>>>> http://phenix-online.org/mailman/listinfo/phenixbb
>>>>>>> 



