I propose that the issue of using test reflections be addressed by more than just speculation. Create a 10% test set for refinement, but divide it into two 5% groups. One is kept as a "pure" test set, and the other is allowed to be used in non-refinement tasks. Over the course of an entire structure determination, one can then monitor whether R-free for the unbiased test set diverges from R-free for the biased test set.

My concern is that if the data are weak enough that including test reflections is required to interpret a part of the map, then the data are probably too weak to distinguish a correct model from model bias.

For density modification, it may be possible to converge on a good map if both missing and test reflections use "Fcalc" fill-in values from the previous density-modified transformation. Maybe the people who could not get good results without the test reflections used a DM method that reset missing values to zero on every cycle.

Pavel Afonine wrote:
Hi Joe,
Normally, 5% for R-free is sufficient.
Did anyone study this and come to this conclusion (publication?)? I'm not aware of one.
Is there a paper showing that 10% is required or sufficient? Maybe that needs to be studied as well.
...
I am sure that many people will use it when they find that real-space fitting and refinement tools lower R-free, unaware that they are cheating.
Sure. This is why free-R flags are not used in map calculation for real-space refinement (my previous email).
Yes, but people can still use maps from PHENIX with external real-space refinement.
With 10% test reflections, I suspect that difference maps used to find waters can easily find a few noise peaks with significant R-free contributions.
- I'm not aware of any systematic study on this matter, although I can believe it in theory;
- phenix.refine uses very sophisticated filtering tools;
- I guess at some point I will switch to using Average Kick Maps for water picking. This will remove the noise peaks, and so eliminate the problem (I need to test all of this, though).

I started wondering about test reflections in maps because I was removing some bad "waters" and found a bigger increase in R-free than in R-work. That made me wonder if they had a significant component of test-set density. But I also have not checked this in detail.
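
The split test set proposed at the top of this message could be sketched roughly as follows. This is only an illustration in plain NumPy, not anything PHENIX or CCP4 actually provides: the flag values (`WORK`, `PURE_TEST`, `BIASED_TEST`) and the function names are hypothetical, and the R-factor assumes that the calculated amplitudes are already on the same scale as the observed ones.

```python
import numpy as np

# Hypothetical flag values, chosen only for this sketch.
WORK, PURE_TEST, BIASED_TEST = 0, 1, 2


def assign_split_flags(n_reflections, test_fraction=0.10, seed=0):
    """Flag about test_fraction of the reflections as test reflections,
    half of them "pure" (never used anywhere) and half "biased"
    (excluded from refinement, but allowed in maps and other tasks)."""
    rng = np.random.default_rng(seed)
    flags = np.full(n_reflections, WORK, dtype=np.int8)
    n_test = int(round(n_reflections * test_fraction))
    test_idx = rng.choice(n_reflections, size=n_test, replace=False)
    half = n_test // 2
    flags[test_idx[:half]] = PURE_TEST
    flags[test_idx[half:]] = BIASED_TEST
    return flags


def r_factor(f_obs, f_calc, mask):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over the masked reflections.
    Assumes f_calc has already been scaled to f_obs."""
    fo = np.abs(np.asarray(f_obs, dtype=float)[mask])
    fc = np.abs(np.asarray(f_calc, dtype=float)[mask])
    return np.abs(fo - fc).sum() / fo.sum()


def split_r_factors(f_obs, f_calc, flags):
    """Return the R-factor for the work set and for each test subset."""
    return {
        "work": r_factor(f_obs, f_calc, flags == WORK),
        "pure_test": r_factor(f_obs, f_calc, flags == PURE_TEST),
        "biased_test": r_factor(f_obs, f_calc, flags == BIASED_TEST),
    }
```

Calling `split_r_factors` after each round of rebuilding would give the two test-set values side by side. If the biased-test value drifted well below the pure-test value, that would suggest that using test reflections outside refinement does introduce bias; how large a gap is meaningful with only 5% of the data in each subset is exactly what would need to be checked.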
Joe Krahn