Does PHENIX exclude Rfree reflections in maps?

Joe Krahn

30 Sep 2009

11:47 a.m.

Does PHENIX exclude R-free reflections when computing maps? IMHO, it is bad to use test reflections in maps, or anywhere else other than computing R-free, but opinions vary. If some people still want test reflections in maps, maybe it could be an option. Thanks, Joe Krahn


Tom Terwilliger

30 Sep

11:51 a.m.

Hi Joe, In PHENIX R-free reflections are included in your maps. All the best, Tom T

Thomas C. Terwilliger Mail Stop M888 Los Alamos National Laboratory Los Alamos, NM 87545 Tel: 505-667-0072 email: [email protected] Fax: 505-665-3024 SOLVE web site: http://solve.lanl.gov PHENIX web site: http://www.phenix-online.org ISFI Integrated Center for Structure and Function Innovation web site: http://techcenter.mbi.ucla.edu TB Structural Genomics Consortium web site: http://www.doe-mbi.ucla.edu/TB CBSS Center for Bio-Security Science web site: http://www.lanl.gov/cbss

Pavel Afonine

2:29 p.m.

Hi Joe, hi Tom,

until recently, all reflections (work and test) were used in map calculation. When I added a real-space refinement option to phenix.refine I spent a day thinking hard about why Rfree was almost always equal to Rwork after real-space refinement, until I realized that I have to exclude test reflections from map calculation, and that fixed the problem.

I think it depends on the task:

- for things like real-space refinement (where the map is heavily used) or automated model building, the test reflections have to be excluded from map calculation;

- for catching a tiny detail (looking at weak ligand density = slight map use) or fixing a few side chains, all reflections should be used, since 10% of data put aside may sometimes significantly affect map quality.

Pavel.
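To make "excluding test reflections from map calculation" concrete, here is a toy one-dimensional sketch in Python with NumPy. It is not phenix.refine code: the function name `work_only_map`, the random flagging, and the use of a plain FFT as a stand-in for map coefficients are all assumptions made for illustration. The test terms are simply set to zero before the inverse transform.

```python
import numpy as np


def work_only_map(density, free_fraction=0.10, seed=0):
    """Toy 1D 'map' computed with a random subset of Fourier terms left out."""
    rng = np.random.default_rng(seed)
    coeffs = np.fft.rfft(density)                  # stand-in for map coefficients
    free = rng.random(coeffs.size) < free_fraction  # hypothetical test-set flags
    free[0] = False                                # always keep the F(000) term
    work_coeffs = np.where(free, 0.0, coeffs)      # omit the test terms
    return np.fft.irfft(work_coeffs, n=density.size), free


# A single Gaussian "atom" on a 128-point grid.
density = np.exp(-0.5 * ((np.arange(128) - 40) / 3.0) ** 2)
full_map = np.fft.irfft(np.fft.rfft(density), n=density.size)
omit_map, free = work_only_map(density)

print("omitted terms:", int(free.sum()))
print("correlation with full map:", np.corrcoef(full_map, omit_map)[0, 1])
```

Comparing `omit_map` with `full_map` gives a feel for the trade-off described above: leaving the test terms out keeps the map independent of them, at the cost of some map quality.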

Joe Krahn

5:03 p.m.

Normally, 5% for R-free is sufficient.

Even though you may not do real-space refinement with free reflections, external tools can do that with the maps written out.

I am sure that many people will use it when they find that real-space fitting and refinement tools lower R-free, unaware that they are cheating.

With 10% test reflections, I suspect that difference maps used to find waters can easily find a few noise peaks with significant R-free contributions. Those should probably only use unbiased maps.

IMHO, using test reflections for anything but computing R-free should always be avoided unless you are unable to proceed using only the non-test reflections. Using test reflections is always cheating to some extent, although trivial amounts of bias are probably removed during refinement. I just think it is better to be very strict about test reflections and avoid the possibility of bias.

Exclusion of test reflections ought to be an option. Ideally, deposited PDB files should report whether maps used for model building included test reflections.

Joe

Pavel Afonine

6:09 p.m.

Hi Joe,

...

Normally, 5% for R-free is sufficient.

Did anyone study this and come to this conclusion (publication?)? I'm not aware of one. In fact, it is the absolute number that matters. The number of test reflections per relatively thin resolution shell should be no smaller than 50. This ensures that the maximum-likelihood target parameters (alpha/beta, or sigmaA) are well determined. More is better, but only up to a point, since excluding too many reflections is not good either. If interpolation is used, then fewer than 50 can be used (I guess this is what CNS does), but I have reasons not to like it. When phenix.refine creates test reflections, by default it selects 10%, but not more than 2000.
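The numbers in this paragraph are easy to play with. The sketch below is plain Python, not phenix.refine code; the function names are hypothetical, and only the defaults (10%, a cap of 2000, at least 50 test reflections per shell) are taken from the message. It assumes shells of equal population.

```python
def free_set_size(n_reflections, fraction=0.10, max_free=2000):
    """Number of test reflections: a fraction of the data, capped at max_free."""
    return min(int(round(n_reflections * fraction)), max_free)


def max_shells(n_free, min_per_shell=50):
    """Largest number of equal-population shells with at least
    min_per_shell test reflections in each."""
    return n_free // min_per_shell


for n in (8000, 20000, 150000):
    n_free = free_set_size(n)
    print(n, n_free, max_shells(n_free))
```

For a small data set the 10% rule sets the size of the test set; for a large one the cap of 2000 takes over, which is why a fixed percentage alone does not tell the whole story.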

...

Even though you may not do real-space refinement with free reflections, external tools can do that with the maps written out.

It is most efficient to combine local and global real-space refinement with reciprocal-space refinement. I call it dual-space refinement. This is why it is tightly integrated into phenix.refine: http://cci.lbl.gov/~afonine/fix_rotamers/fit_rotamers.pdf

...

I am sure that many people will use it when they find that real-space fitting and refinement tools lower R-free, unaware that they are cheating.

Sure. This is why test reflections are not used in map calculation for real-space refinement (see my previous email).

...

With 10% test reflection, I suspect that difference maps used to find waters can easily find a few noise peaks with significant R-free contributions.

- I'm not aware of any systematic study on this matter, although I can believe it in theory;

- phenix.refine uses very sophisticated filtering tools;

- I guess at some point I will switch to using Average Kick Maps for water picking. This will remove the noise peaks, and so eliminate the problem (I need to test this all, though).

...

IMHO, using test reflections for anything but computing R-free should always be avoided unless you are unable to proceed using only the non-test. Using test reflections is always cheating to some extent, although trivial amounts of bias are probably removed during refinement. I just think it is better to be very strict about test reflections and avoid the possibility of bias.

Test reflections are used for calculation of m and D in 2mFo-DFc and mFo-DFc maps, as well as in alpha/beta parameters of ML target. This is inevitable.
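For reference, the map coefficients in question have the usual sigmaA-weighted form, written here for acentric reflections. In this notation (an assumption of this note, not taken from the message) m is the figure of merit and D is the scale factor that accounts for errors in the model; both are estimated in resolution ranges, which is where the test reflections come in.

```latex
\begin{aligned}
F_{2mF_o - DF_c}(\mathbf{h}) &= \bigl(2m\,|F_o(\mathbf{h})| - D\,|F_c(\mathbf{h})|\bigr)\, e^{i\varphi_c(\mathbf{h})} \\
F_{mF_o - DF_c}(\mathbf{h})  &= \bigl(m\,|F_o(\mathbf{h})| - D\,|F_c(\mathbf{h})|\bigr)\, e^{i\varphi_c(\mathbf{h})}
\end{aligned}
```

So even a map computed from work reflections only still depends on the test set indirectly, through m and D.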

...

Exclusion of test reflections ought to be an option. Ideally, deposited PDB files should report whether maps used for model building included test reflections.

Ideally yes, but I can name a hundred other similarly important parameters to report. *At least the full set of parameters must be reported so that the published R-factors are 100% reproducible - I'm sure this is an easily doable goal to start with.*

Pavel.

Joe Krahn

1 Oct

9:05 a.m.

I propose that the issue of using test reflections be addressed by more than just speculation.

Create a 10% test set for refinement, but divide that into two 5% groups. One is used as a "pure" test set, and the other subgroup is allowed to be used in non-refinement tasks. During the course of an entire structure determination, one can monitor whether the unbiased test set differs from the biased test set.

My concern is that if the data is weak enough that including test reflections is required to interpret a part of the map, then the data is probably too weak to distinguish a correct model from model bias.

For density modification, it may be possible to converge on a good map if both missing and test reflections use "Fcalc" fill-in values from the previous density-modified transformation. Maybe the people who could not get good results without the test reflections used a DM method that reset missing values to zero on every cycle.

Pavel Afonine wrote:

...

Normally, 5% for R-free is sufficient.

did anyone studied this and came to this conclusion (publication?)? I'm not aware.

Is there a paper showing that 10% is required or sufficient? Maybe that needs to be studied as well.

...

I am sure that many people will use it when they find that real-space fitting and refinement tools lower R-free, unaware that they are cheating.

Sure. This is why free-R flags are not used in maps calculation for real-space refinement (my previous email).

Yes, but people can still use maps from PHENIX with external real-space refinement.

...

With 10% test reflection, I suspect that difference maps used to find waters can easily find a few noise peaks with significant R-free contributions.

- I'm not aware of any systematic study on this matter, although I can believe it in theory;

- phenix.refine uses very sophisticated filtering tools;

- I guess at some point I will switch to using Average Kick Maps for water picking. This will remove the noise peaks, and so eliminate the problem (I need to test this all, though).

I started wondering about test reflections in maps because I was removing some bad "waters" and found a bigger increase in Rfree than in Rwork. That made me wonder if they had a significant component of test-set density. But, I also have not checked this in detail.

Joe Krahn
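The bookkeeping for the proposed experiment is simple. The following is a minimal NumPy sketch, not an existing PHENIX feature: the names `split_test_set` and `r_factor` are hypothetical, the split is random, and `r_factor` assumes Fcalc has already been scaled to Fobs.

```python
import numpy as np


def split_test_set(free_flags, seed=0):
    """Randomly split the flagged test reflections into two disjoint halves.

    Returns two boolean masks: 'pure' (used only for R-free) and
    'shared' (allowed in maps and other non-refinement tasks).
    """
    rng = np.random.default_rng(seed)
    test_idx = np.flatnonzero(free_flags)
    rng.shuffle(test_idx)
    half = test_idx.size // 2
    pure = np.zeros(free_flags.shape, dtype=bool)
    shared = np.zeros(free_flags.shape, dtype=bool)
    pure[test_idx[:half]] = True
    shared[test_idx[half:]] = True
    return pure, shared


def r_factor(f_obs, f_calc, mask):
    """R = sum| |Fobs| - |Fcalc| | / sum|Fobs| over the selected reflections."""
    fo = np.abs(f_obs[mask])
    fc = np.abs(f_calc[mask])
    return np.sum(np.abs(fo - fc)) / np.sum(fo)
```

Tracking `r_factor(f_obs, f_calc, pure)` against `r_factor(f_obs, f_calc, shared)` over the course of a structure determination would show whether the two halves drift apart, which is the signal the proposal is looking for.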

Peter Grey

9:43 a.m.

Hi Joe, There was a very fruitful discussion about R-free and its size on ccp4bb, and it was summarized (including views from Read and from Cowtan) at http://strucbio.biologie.uni-konstanz.de/ccp4wiki/index.php/Test_set HTH, Peter.

Paul Adams

8:11 p.m.

Might be worth looking at: Separating model optimization and model validation in statistical cross-validation as applied to crystallography. Kleywegt GJ. Acta Crystallogr D Biol Crystallogr. 2007 Sep;63(Pt 9):939-40

-- Paul Adams Acting Division Director, Physical Biosciences Division, Lawrence Berkeley Lab Adjunct Professor, Department of Bioengineering, U.C. Berkeley Vice President for Technology, the Joint BioEnergy Institute Head, Berkeley Center for Structural Biology Building 64, Room 248 Tel: 1-510-486-4225, Fax: 1-510-486-5909 http://cci.lbl.gov/paul Lawrence Berkeley Laboratory 1 Cyclotron Road BLDG 64R0121 Berkeley, CA 94720, USA. Executive Assistant: Patty Jimenez [ [email protected] ] [ 1-510-486-7963 ] --

Edward Berry

2 Oct

9:37 p.m.

Joe Krahn wrote:

...

For density modification, it may be possible to converge on a good map if both missing and test reflections use "Fcalc" fill-in values from the previous density-modified transformation. Maybe the people who could not get good results without the test reflections used a DM method that reset missing values to zero on every cycle.

Those suggestions would be supported by results in: Rayment, I. Molecular replacement method at low resolution: optimum strategy and intrinsic limitations as determined by calculations on icosahedral virus models. Acta Crystallogr. A 39, 102–116 (1983). (Perhaps this is the paper B. Shaanan was thinking of?)

The authors used what is now called "fill-in" for the missing reflections, and they also tried intentionally omitting reflections and using the fill-in (Fcalc) values at each cycle. They found that when this was done the resulting F's approached the actual unused experimental values with cycles of averaging. If they did not fill in the missing values the averaging suffered, because in effect they were forcing those reflections to zero.

Ed

PS: NCS averaging can be seen as a numerical method for solving the M.R. equations, i.e. finding a set of phases that is consistent with the observed amplitudes and the known symmetry. NCS averaging with 5% of reflections omitted is finding a set of phases consistent with the known symmetry and observed working reflections *AND* with all of the free reflections having amplitude zero (which is probably not the set of phases you want).

For example, suppose you started with perfect data and perfect phases, and agreement with the NCS operators is very good. Now you make a 2Fo-Fc map with the test set omitted, i.e. replaced by zero. The missing sine waves do not in general obey the NCS, so each omitted one introduces asymmetry in the map to be averaged. Averaging then restores the symmetry, but with slightly different density values. Fc for the next round is calculated from the averaged map, and the omitted reflections will in general have non-zero values. If these Fc values are taken for the next round, by fill-in, they may eventually converge to the true values and the map will be correct. If they are rezeroed at each round, the only way for the process to restore symmetry is by modifying the phases of all the other reflections to obtain a symmetrical density which is different from the correct one obtained with all the reflections.
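The difference between fill-in and re-zeroing comes down to one step in each cycle: how the coefficients for the next map are assembled. Here is a minimal NumPy sketch of that step only. It is an illustration under stated assumptions, not code from any averaging or density-modification program; the function name and arguments are hypothetical, and the averaging step itself is not shown.

```python
import numpy as np


def next_cycle_coefficients(f_obs, f_mod, observed, fill_in=True):
    """Assemble map coefficients for the next cycle.

    f_obs    : observed amplitudes (values at omitted positions are ignored)
    f_mod    : complex structure factors computed from the modified
               (e.g. averaged) map of the current cycle
    observed : boolean mask, True where the amplitude is given to the procedure
    fill_in  : True  -> omitted reflections keep their calculated value
               False -> omitted reflections are reset to zero
    """
    phases = np.exp(1j * np.angle(f_mod))           # unit phasors from the modified map
    omitted_value = f_mod if fill_in else 0.0
    return np.where(observed, f_obs * phases, omitted_value)
```

With `fill_in=True` the omitted terms are free to move toward their true values from cycle to cycle; with `fill_in=False` every cycle starts again from the constraint that they are zero.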

Thomas C. Terwilliger

30 Sep

5:32 p.m.

Hi Pavel and Joe, I'm glad you put that option in, Pavel. However for model-building it is not so straightforward. Normally we are building into density-modified maps. Density modification works poorly when a significant set of reflections is excluded, so this becomes impractical. We could exclude those reflections in the map only, but they will have been used in density modification so they are no longer truly free. I don't have a good solution for that. I am guessing that for model-building, the free R is hardly affected, compared to refinement against the map, though I haven't tested this. All the best, Tom T


Alexandre OURJOUMTSEV

1 Oct

12:56 a.m.

New subject: Re : Re: Does PHENIX exclude Rfree reflections in maps?

Hi, Tom, Pavel and Joe,

...

I'm glad you put that option in, Pavel. However for model-building it is not so straightforward. Normally we are building into density-modified maps. Density modification works poorly when a significant set of reflections is excluded, so this becomes impractical.

As I remember, there was a paper (by Blanc, ..., Bricogne? Am I wrong? I failed to find it right now) where they faced exactly this problem. Again, as I remember, they suggested excluding the test-set reflections from the map calculation but, to minimize the map deformation, including in their place the average Fobs for the corresponding resolution shell.

Is it a way to answer the original question / problem?

As I understand, Pavel already uses a similar technique to calculate maps when some Fobs are missing.

Best regards, Sacha
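A rough sketch of that substitution in Python with NumPy follows. It is written from the description above, not from the paper or from any PHENIX code; the function name, the binning in 1/d^3, and the default number of shells are assumptions.

```python
import numpy as np


def fill_test_with_shell_mean(f_obs, d_spacing, free_flags, n_shells=10):
    """Replace each test-set amplitude by the mean work-set amplitude
    of its resolution shell. Work-set amplitudes are left unchanged."""
    s3 = 1.0 / d_spacing ** 3                       # equal steps in 1/d^3 give shells of similar volume
    edges = np.linspace(s3.min(), s3.max(), n_shells + 1)
    shell = np.clip(np.digitize(s3, edges) - 1, 0, n_shells - 1)
    filled = f_obs.astype(float).copy()
    for i in range(n_shells):
        in_shell = shell == i
        work = in_shell & ~free_flags
        if work.any():                              # a shell without work reflections is left untouched
            filled[in_shell & free_flags] = f_obs[work].mean()
    return filled
```

The test reflections then contribute only an average amplitude to the map, so the individual measured values stay out of it, while the map avoids the holes that zeroed terms would leave.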

Tom Terwilliger

8:11 a.m.

New subject: Re : Re: Does PHENIX exclude Rfree reflections in maps?

Hi Sacha, Yes, that would help. In parallel, to get a little more data on the practical aspects of all this, I think I will do a test to see how much difference taking out a super-test-set of reflections completely makes in overall structure determination from start to finish using fully automated structure determination with phenix.autosol and phenix.autobuild. All the best, Tom T

