Hi Joe,
Normally, 5% for R-free is sufficient.
Did anyone study this and come to this conclusion (publication?)? I'm not aware of one. In fact, it is the absolute number that matters. The number of test reflections per relatively thin resolution shell should be no smaller than 50. This ensures that the determination of the maximum-likelihood target parameters (alpha/beta, or sigmaa) is well defined. More is better, but too many is not good either, since it means excluding too many reflections from refinement. If interpolation is used, then fewer than 50 can be used (I guess this is what CNS does), but I have reasons not to like it. When phenix.refine creates test reflections, by default it selects 10% of the reflections, but no more than 2000.
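The two rules above (10% capped at 2000, and at least 50 test reflections per shell) can be checked with a few lines of Python. This is only a rough sketch, not the phenix.refine code: the function names are made up for illustration, and the use of equal-population shells is an assumption about what "relatively thin resolution shell" means in practice.

```python
def choose_test_set_size(n_reflections, fraction=0.10, max_test=2000):
    """Number of test reflections: a fraction of the data, capped at max_test."""
    return min(int(round(n_reflections * fraction)), max_test)


def min_test_count_per_shell(d_spacings, test_flags, n_shells=10):
    """Smallest number of test reflections found in any resolution shell.

    Assumption: shells hold (nearly) equal numbers of reflections and are
    formed by sorting on d-spacing, lowest resolution (largest d) first.
    """
    order = sorted(range(len(d_spacings)),
                   key=lambda i: d_spacings[i], reverse=True)
    shell_size = len(order) / n_shells
    counts = [0] * n_shells
    for rank, i in enumerate(order):
        shell = min(int(rank / shell_size), n_shells - 1)
        if test_flags[i]:
            counts[shell] += 1
    return min(counts)
```

If min_test_count_per_shell returns less than 50 for the chosen number of shells, either the test set is too small or the shells are too thin for a well-defined alpha/beta estimate.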
Even though you may not do real-space refinement with free reflections, external tools can do that with the maps written out.
It is most efficient to combine local and global real-space refinement with reciprocal-space refinement. I call it dual-space refinement. This is why it is tightly integrated into phenix.refine: http://cci.lbl.gov/~afonine/fix_rotamers/fit_rotamers.pdf
I am sure that many people will use it when they find that real-space fitting and refinement tools lower R-free, unaware that they are cheating.
Sure. This is why the free-R flagged reflections are not used in map calculation for real-space refinement (see my previous email).
With 10% test reflections, I suspect that difference maps used to find waters can easily find a few noise peaks with significant R-free contributions.
- I'm not aware of any systematic study on this matter, although I can believe it in theory;
- phenix.refine uses very sophisticated filtering tools;
- I guess at some point I will switch to using Average Kick Maps for water picking. This should remove the noise peaks, and so eliminate the problem (I need to test all of this, though).
IMHO, using test reflections for anything but computing R-free should always be avoided unless you are unable to proceed using only the non-test. Using test reflections is always cheating to some extent, although trivial amounts of bias are probably removed during refinement. I just think it is better to be very strict about test reflections and avoid the possibility of bias.
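For reference, the one use of test reflections that nobody disputes, computing R-free, amounts to the following. This is a minimal sketch with a hypothetical helper name; it takes a single overall scale factor and plain lists of amplitudes, whereas real programs also apply bulk-solvent and anisotropic scaling.

```python
def r_work_and_r_free(f_obs, f_calc, test_flags, scale=1.0):
    """Return (R-work, R-free) with R = sum|Fobs - k*Fcalc| / sum(Fobs).

    test_flags[i] is True for reflections in the test (free) set.
    Both sets are assumed to be non-empty.
    """
    def r_factor(pairs):
        numerator = sum(abs(fo - scale * fc) for fo, fc in pairs)
        denominator = sum(fo for fo, _ in pairs)
        return numerator / denominator

    work = [(fo, fc) for fo, fc, t in zip(f_obs, f_calc, test_flags) if not t]
    test = [(fo, fc) for fo, fc, t in zip(f_obs, f_calc, test_flags) if t]
    return r_factor(work), r_factor(test)
```

The test reflections enter only the second number; nothing computed from them feeds back into the model.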
Test reflections are used for the calculation of m and D in 2mFo-DFc and mFo-DFc maps, as well as of the alpha/beta parameters of the ML target. This is inevitable.
Exclusion of test reflections ought to be an option. Ideally, deposited PDB files should report whether maps used for model building included test reflections.
Ideally yes, but I can name a hundred other similarly important parameters to report. *At the very least, the full set of parameters must be reported so that the published R-factors are 100% reproducible - I'm sure this is an easily doable goal to start with.*
Pavel.