phenix.map_to_model input mtz file failure
Dear Phenix bb,
I intend to build an initial model with phenix.map_to_model. The command line is as follows:
phenix.map_to_model_1.12rc0-2787 map_coeffs_file=../rep_dm.mtz map_coeffs_labels="'FP,SIGFP' 'PHIDM' 'FOMDM'" seq_file=../resolve.seq is_crystal=True use_sg_symmetry=True density_select=False truncate_at_d_min=True
and the feedback is:
Sorry: No initial assignment made for map_coeffs.
Labels used: FP,SIGFP PHIDM FOMDM.
Available labels: ['PHIB', 'FOM', 'HLA,HLB,HLC,HLD', 'FP,SIGFP', 'PHIDM', 'FOMDM', 'FDM', 'HLADM,HLBDM,HLCDM,HLDDM']
NOTE: grouped labels like 'FP,SIGFP' must stay together, have commas, and have no spaces. If they come from an MTZ file, they must be in adjacent columns as well.
Suggested labels to use: PHIDM FOMDM
I tried many other input formats for map_coeffs_labels, such as
map_coeffs_labels="FP,SIGFP PHIDM FOMDM"
map_coeffs_labels=["FP,SIGFP PHIDM FOMDM"]
... ...
but the result is the same. Can anyone tell me how to fix this problem?
Thanks a lot.
--
Wei Ding
P.O.Box 603
The Institute of Physics,Chinese Academy of Sciences
Beijing,China
100190
Tel: +86-10-82649083
E-mail: [email protected]
Hi Wei,
I'm sorry for the trouble!
If you supply an MTZ file that has FWT,PHFWT or similar labels, then you can skip the "labels=...." statement and it should run.
Let me know if that does not work!
All the best,
Tom T
________________________________
From: [email protected]
Dear Thomas,
I used CAD to rename the labels (FDM->FWT, PHIDM->PHFWT), then submitted the job again (without map_coeffs_labels=...), and everything seems OK.
Thank you very much for your help.
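The exact CAD input was not posted, so the following is only a sketch of what the relabeling step might look like. It builds the keyword text for CCP4's CAD from a mapping of old to new column labels. The helper name `cad_relabel_keywords` is hypothetical, and the `LABIN`/`LABOUT` lines follow the usual CAD keyword form as I recall it; please check them against the CAD documentation for your CCP4 version before use.

```python
def cad_relabel_keywords(renames, file_number=1):
    """Build CAD keyword input that copies columns and renames them.

    renames: dict mapping existing MTZ column labels to new labels,
             in the order the columns should be listed.
    This helper is hypothetical; it only assembles text and does not run CAD.
    """
    if not renames:
        raise ValueError("need at least one column to rename")
    # E1, E2, ... pair each input column with its output label by position
    labin = " ".join(f"E{i}={old}" for i, old in enumerate(renames, start=1))
    labout = " ".join(
        f"E{i}={new}" for i, new in enumerate(renames.values(), start=1)
    )
    return (
        f"LABIN FILE {file_number} {labin}\n"
        f"LABOUT FILE {file_number} {labout}\n"
        "END\n"
    )


# The renaming described in the message above: FDM->FWT, PHIDM->PHFWT
keywords = cad_relabel_keywords({"FDM": "FWT", "PHIDM": "PHFWT"})
print(keywords)
```

The printed text would then be fed to CAD on standard input, with the input and output MTZ files given on the command line (for example via HKLIN1 and HKLOUT); the file names are whatever you use locally.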
Best!
--
Wei Ding
P.O.Box 603
The Institute of Physics,Chinese Academy of Sciences
Beijing,China
100190
Tel: +86-10-82649083
E-mail: [email protected]
Hi Wei,
I want to give a word of caution about how to use phenix.map_to_model on crystallographic data. The bottom line is that you should remove the test set from your map coefficients before running phenix.map_to_model on X-ray data. Here is why:
phenix.map_to_model uses real-space refinement, which is refinement against the map. If you supply map coefficients that include your test reflections, then you will be refining against data that is in your test set. This will make your Rfree invalid when you go back and refine your model against the original crystallographic data.
To remove the test set from your map coefficients you can use:
phenix.remove_free_from_map map_coeffs=my_map_coeffs.mtz free_in=my_data_file_with_freeR_flags.mtz mtz_out=my_map_coeffs_no_free.mtz
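To make the idea concrete, here is a toy sketch of what "removing the test set from the map coefficients" means conceptually. It is not the implementation of phenix.remove_free_from_map, and it does not say whether that program omits the free reflections or fills them in. The function name, the dict-based data layout, and the choice of 1 as the test-set flag value are all assumptions for illustration; the flag value that marks the test set differs between packages.

```python
def remove_free_reflections(map_coeffs, free_flags, test_flag_value=1):
    """Return only the map coefficients whose reflection is not in the test set.

    map_coeffs: dict mapping Miller index (h, k, l) -> complex map coefficient
    free_flags: dict mapping Miller index (h, k, l) -> integer flag
    test_flag_value: flag value marking the test set (assumed to be 1 here)
    Reflections without a flag entry are kept, i.e. treated as working set.
    """
    return {
        hkl: coeff
        for hkl, coeff in map_coeffs.items()
        if free_flags.get(hkl) != test_flag_value
    }


# Toy data: four reflections, two of them flagged as test set
map_coeffs = {
    (1, 0, 0): 10 + 2j,
    (0, 1, 0): 5 - 1j,
    (0, 0, 1): 7 + 0j,
    (1, 1, 0): 3 + 4j,
}
free_flags = {(1, 0, 0): 0, (0, 1, 0): 1, (0, 0, 1): 0, (1, 1, 0): 1}

work_only = remove_free_reflections(map_coeffs, free_flags)
print(sorted(work_only))  # [(0, 0, 1), (1, 0, 0)]
```

A map computed from `work_only` contains no information from the flagged reflections, so refining a model against that map does not fit the test set directly.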
Also note that phenix.map_to_model uses a fixed map (it does not do density modification). Consequently for most crystallographic data at moderate resolution or higher phenix.autobuild is going to do much better than phenix.map_to_model.
All the best,
Tom T
________________________________
From: [email protected]
In reply to Terwilliger, Thomas Charles (06/08/2017 07:28 AM):
Hi, Tom,
Please forgive what may be a silly question from an outsider who hasn't really kept up with the crystallography literature or even all the Phenix newsletters. What is the evidence that including the free set in real space refinement biases R-free of the resulting model? Is this Rfree also biased when map coefficients use "fill-in" for the excluded free reflections (and is that what phenix.remove_free_from_map does?).
My point is that literally excluding the free reflections, as opposed to substituting their values with Fc, will bias the free set toward grossly incorrect values (namely zero) and therefore greatly worsen R-free. Thus if the evidence for bias is that you get worse R-free when you exclude the free set, you have to think about how much of that difference results from bias towards the observed values (when the reflections are included) and how much is from bias towards zero (when the free set is excluded).
(Again, I realize this may be all very well understood by the crystallography community and properly taken care of in phenix; I'm just asking for my own information)
eab
In reply to Edward A. Berry (Thursday, June 8, 2017 12:42:53 PM CEST):
Hi Ed,
Including the 'free' reflections in the map for modelling does not taint the value of Rfree. That is a misconception that is very persistent (as prejudices usually are). I believe it was Ian Tickle who formulated that when you simply refine long enough towards convergence, all reflections excluded from refinement will become independent, i.e. you can assign a new set for Rfree every time you refine, if you wish so.
This concept is the reason why Rcomplete (the "better" equivalent to Rfree for small data sets with < 10,000 unique reflections), introduced by Axel Brunger, works, as we could demonstrate in doi: 10.1073/pnas.1502136112
So nothing to worry about when including all reflections in map calculations.
Cheers, Tim
--
Paul Scherrer Institut
Tim Gruene
- persoenlich -
OFLC/104
CH-5232 Villigen PSI
phone: +41 (0)56 310 5297
GPG Key ID = A46BEE1A
In reply to Tim Gruene (6/8/17 10:05):
Hi Ed,
Including free-R reflections in the map calculation and then using such a map in real-space refinement of the entire model will affect Rfree. Here is a simple example that illustrates my statement, step-by-step:
1) Get data and model from PDB:
phenix.fetch_pdb 1f8t --mtz
2) Compute two 2mFo-DFc maps: one includes all reflections, the other has no free-R terms:
phenix.python run.py 1f8t.{pdb,mtz}
This will create an MTZ file (map_coeffs.mtz) that contains Fourier map coefficients for both maps.
3) Shake the model a bit:
phenix.dynamics 1f8t.pdb number_of_steps=500
4) Run real-space refinement using the two maps:
phenix.real_space_refine map_coeffs.mtz 1f8t_shaken.pdb label="work,PHIwork" ncs_constraints=false output.file_name_prefix=work
phenix.real_space_refine map_coeffs.mtz 1f8t_shaken.pdb label="all,PHIall" ncs_constraints=false output.file_name_prefix=all
5) Compute R-factors using the data and the real-space refined models:
phenix.model_vs_data 1f8t.mtz all_real_space_refined.pdb
r_work(re-computed) : 0.2419
r_free(re-computed) : 0.2441
phenix.model_vs_data 1f8t.mtz work_real_space_refined.pdb
r_work(re-computed) : 0.2444
r_free(re-computed) : 0.2756
The result is self-explanatory and is in line with Tom's reply to Wei.
All files necessary to reproduce the calculations above are here: http://cci.lbl.gov/~afonine/tmp/
All the best,
Pavel
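For readers less familiar with the numbers in step 5, the sketch below shows how R-work and R-free are conventionally defined, R = sum(||Fobs| - |Fcalc||) / sum(|Fobs|), evaluated separately over working and test reflections. It is a toy illustration on made-up amplitudes, not what phenix.model_vs_data does internally; it assumes the amplitudes are already on a common scale and ignores bulk-solvent and scaling corrections. The function names are hypothetical. The last lines simply subtract the R values quoted above to show the R-free/R-work gap in each case.

```python
def r_factor(pairs):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over (Fobs, Fcalc) pairs.

    Assumes both amplitudes are already on the same scale (a simplification).
    """
    denom = sum(fo for fo, _ in pairs)
    if denom == 0:
        raise ValueError("no reflections (or all-zero Fobs) in this set")
    return sum(abs(fo - fc) for fo, fc in pairs) / denom


def r_work_r_free(f_obs, f_calc, is_free):
    """Split reflections by the test-set flag and return (R-work, R-free)."""
    work = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, is_free) if not free]
    free = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, is_free) if free]
    return r_factor(work), r_factor(free)


# Made-up amplitudes, purely for illustration
f_obs = [100.0, 80.0, 60.0, 40.0]
f_calc = [90.0, 85.0, 50.0, 44.0]
is_free = [False, False, True, True]

r_work, r_free = r_work_r_free(f_obs, f_calc, is_free)
print(f"{r_work:.4f} {r_free:.4f}")  # 0.0833 0.1400

# R-free minus R-work for the two refinements quoted in step 5
gap_all = 0.2441 - 0.2419   # refined against the map with all reflections
gap_work = 0.2756 - 0.2444  # refined against the map without free reflections
print(f"{gap_all:.4f} {gap_work:.4f}")  # 0.0022 0.0312
```

The much smaller gap for the "all" map is the point of the example: when the test reflections contribute to the map used for refinement, R-free tracks R-work closely.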
In reply to Pavel Afonine (6/13/2017 11:15 AM):
The difference of opinion between Tim and Pavel is that Tim is assuming that the real space refinement will be followed by reciprocal space refinement with the test set withheld. This will be the case for model building in Coot, which is usually followed by full-blown refinement in one of the classic packages.
The story is different for cryo-EM maps where there is no reciprocal space refinement. While I would question the wisdom of using R values at all, when both the model and the original data live in real space, people do seem to like it. I think it would be better to develop cross-validation metrics designed with cryo-EM's experimental setup in mind and not simply reuse those designed for diffraction experiments.
Dale Tronrud
Hi Dale,
> The difference of opinion between Tim and Pavel is that Tim is assuming that the real space refinement will be followed by reciprocal space refinement with the test set withheld. This will be the case for model building in Coot which is usually followed by full-blown refinement in one of the classic packages.
Agreed! Also, in the scenario you describe, typically only a (small) part of the model is likely to be real-space refined in Coot, while in my example I refined the entire model against the map.
> The story is different for cryo-EM maps where there is no reciprocal space refinement.
Also agreed! In the case of cryo-EM there are no structure factors, crystallographic R-factors, etc.
> While I would question the wisdom of using R values at all, when both the model and the original data live in real space, people do seem to like it. I think it would be better to develop cross-validation metrics designed with cryo-EM's experimental setup in mind and not simply reuse those designed for diffraction experiments.
Can't agree more! In fact, refinement against cryo-EM data in Phenix does not require structure factors. The phenix.real_space_refine program refines the model against the map directly.
All the best,
Pavel
Dale Tronrud
On 6/13/2017 11:15 AM, Pavel Afonine wrote:
Hi Ed,
Including free-r reflections into map calculation and then using such map in real-space refinement of entire model will affect Rfree. Here is a simple example that illustrates my statement, step-by-step:
1) Get data and model from PDB:
phenix.fetch_pdb 1f8t --mtz
2) Compute two 2mFo-DFc maps: one includes all reflections the other one has no free-r terms:
phenix.python run.py 1f8t.{pdb,mtz}
This will create an MTZ file (map_coeffs.mtz) that contains Fourier map coefficients for both maps.
3) Shake model a bit:
phenix.dynamics 1f8t.pdb number_of_steps=500
4) Run real-space refinement using two maps:
phenix.real_space_refine map_coeffs.mtz 1f8t_shaken.pdb label="work,PHIwork" ncs_constraints=false output.file_name_prefix=work
phenix.real_space_refine map_coeffs.mtz 1f8t_shaken.pdb label="all,PHIall" ncs_constraints=false output.file_name_prefix=all
5) Compute R-factors using data and real-space refined models:
phenix.model_vs_data 1f8t.mtz all_real_space_refined.pdb r_work(re-computed) : 0.2419 r_free(re-computed) : 0.2441
phenix.model_vs_data 1f8t.mtz work_real_space_refined.pdb r_work(re-computed) : 0.2444 r_free(re-computed) : 0.2756
The result is self-explicable and is inline with Tom's reply to Wei.
All files necessary to reproduce calculations above are here: http://cci.lbl.gov/~afonine/tmp/
All the best, Pavel
On 6/8/17 10:05, Tim Gruene wrote:
Hi Ed,
including the 'free' reflections in the map for modelling does not taint the value of Rfree. That is a misconception that i s very persistent (as prejudice usually are). I believe it was Ian Tickle who formulated that when you simply refine long enough towards convergence, all reflections excluded from refinement will become independent, i.e. you can assign a new set for Rfree every time you refine, if you wish so.
This concept is the reason why Rcomplete (the "better" equivalent to Rfree for small data sets with < 10,000 unique reflections), introduced by Axel Brunger, works, as we could demonstrate in doi: 10.1073/pnas.1502136112
So nothing to worry about when including all reflections in map calculations.
Cheers, Tim
On Thursday, June 8, 2017 12:42:53 PM CEST Edward A. Berry wrote:
Hi, Tom, Please forgive what may be a silly question from an outsider who hasn't really kept up with the crystallography literature or even all the Phenix newsletters- What is the evidence that including the free set in real space refinement biases R-free of the resulting model? Is this Rfree also biased when map coefficients use "fill-in" for the excluded free reflections (and is that what phenix.remove_free_from_map does?).
My point is that literally excluding the free reflections, as opposed to substituting their values with Fc, will bias the free set toward grossly incorrect values (namely zero) and therefore greatly worsen R-free. Thus if the evidence for bias is that you get worse R-free when you exclude the free set, you have to think about how much of that difference results from bias towards the observed values (when the reflections are included) and how much is from bias towards zero (when the free set is excluded). (Again, I realize this may be all very well understood by the crystallography community and properly taken care of in phenix; I'm just asking for my own information) eab
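The three treatments of the free set being contrasted here (use the observed value, set it to zero, or fill in with Fc) can be written down as a small sketch. This is a deliberate simplification that handles amplitudes only and ignores the weighting in real 2mFo-DFc coefficients; the function name and strategy labels are hypothetical, and nothing here is claimed about what phenix.remove_free_from_map actually does.

```python
def map_coefficients(f_obs, f_calc, is_free, strategy):
    """Return one map amplitude per reflection.

    Work reflections always use f_obs. Free reflections use:
      "include" -> f_obs  (free set leaks into the map)
      "zero"    -> 0.0    (free set excluded outright)
      "fill"    -> f_calc (free set replaced by the model value)
    """
    if strategy not in ("include", "zero", "fill"):
        raise ValueError(f"unknown strategy: {strategy}")
    out = []
    for fo, fc, free in zip(f_obs, f_calc, is_free):
        if not free or strategy == "include":
            out.append(fo)
        elif strategy == "zero":
            out.append(0.0)
        else:
            out.append(fc)
    return out


# Made-up amplitudes; the second reflection is flagged as free.
f_obs = [10.0, 20.0, 30.0]
f_calc = [11.0, 19.0, 28.0]
is_free = [False, True, False]
for strategy in ("include", "zero", "fill"):
    print(strategy, map_coefficients(f_obs, f_calc, is_free, strategy))
```

Comparing Rfree after refinement against the "include" map with Rfree after refinement against the "zero" map mixes two effects, which is the concern raised above; the "fill" map is the third case that would help separate them.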
On 06/08/2017 07:28 AM, Terwilliger, Thomas Charles wrote:
Hi Wei,
I want to give a word of caution about how to use phenix.map_to_model on crystallographic data...The bottom line is you should remove the test set from your map coefficients before running phenix.map_to model on X-ray data. Here is why:
phenix.map_to_model uses real-space refinement, which is refinement against the map. If you supply map coefficients that include your test reflections, then you will be refining against data that is in your test set. This will make your Rfree invalid when you go back and refine your model against the original crystallographic data.
To remove the test set from your map coefficients you can use:
phenix.remove_free_from_map map_coeffs=my_map_coeffs.mtz free_in=my_data_file_with_freeR_flags.mtz mtz_out=my_map_coeffs_no_free.mtz
Also note that phenix.map_to_model uses a fixed map (it does not do density modification). Consequently for most crystallographic data at moderate resolution or higher phenix.autobuild is going to do much better than phenix.map_to_model.
All the best,
Tom T
_______________________________________________ phenixbb mailing list [email protected] http://phenix-online.org/mailman/listinfo/phenixbb Unsubscribe: [email protected]
I've done (but never bothered to publish) essentially this experiment, on a 3.6A crystal model that was eventually deposited as 5jpn (if I recall correctly). In my case I was applying MDFF rather than RSR, and all I did was settle the finished model at 0K into either of two maps: one with the free set included, and one with it excluded and filled with Fcalc.
The result was two functionally indistinguishable models (differing by less than 0.1A essentially throughout), except that the former had Rfree about 2% lower than the latter. After refinement to convergence in PHENIX, the stats were still substantially different: the former model had ~1% lower Rfree and 1% higher Rwork than the latter. Even so, there remained no meaningful physical difference between the two models - validation statistics were essentially identical, and visually the differences still mostly amounted to sub-pm shifts in individual atoms.
It would be interesting to see the results of many rounds of this starting from a bad model.
Tristan Croll
Research Fellow
Cambridge Institute for Medical Research
University of Cambridge CB2 0XY
On 13 Jun 2017, at 19:30, Dale Tronrud wrote:
The difference of opinion between Tim and Pavel is that Tim is assuming that the real space refinement will be followed by reciprocal space refinement with the test set withheld. This will be the case for model building in Coot which is usually followed by full-blown refinement in one of the classic packages. The story is different for cryo-EM maps where there is no reciprocal space refinement.
While I would question the wisdom of using R values at all, when both the model and the original data live in real space, people do seem to like it. I think it would be better to develop cross-validation metrics designed with cryo-EM's experimental setup in mind and not simply reuse those designed for diffraction experiments.
Dale Tronrud
Thanks, Pavel, I really appreciate your taking the time to generate the example.
While I agree with Tim and Ian that refinement to convergence should remove the bias making it perhaps not a serious problem, my question was in fact whether there is any bias immediately after the refinement.
I will need to study this example a bit, but one thing I notice is that you are doing exactly what I was guessing, comparing Rfree after real-space refinement with and without using the free set. Then, I still think, we have to think about how much of that difference results from bias towards the observed values (when the reflections are included) and how much is from bias towards zero (when the free set is excluded).
Things I need to look at:
- What are R and R-free for the original refined model?
- What are R and R-free after shaking (did RSR lower R but not Rfree, or did it raise Rfree)?
- What if RSR is done using a map made with the fill-in strategy?
Ed
Another test: do RSR, excluding the free set, on the fully refined model without shaking. You cannot expect to decrease R or R-free, but if R-free is not being biased, at least it should not go up. If the free reflections are being biased toward zero, Rfree should increase with this.
Ed
On 6/13/2017 12:30 PM, Edward A. Berry wrote:
Thanks, Pavel, I really appreciate your taking the time to generate the example.
While I agree with Tim and Ian that refinement to convergence should remove the bias making it perhaps not a serious problem, my question was in fact whether there is any bias immediately after the refinement.
I will need to study this example a bit, but one thing I notice is that you are doing exactly what I was guessing, comparing Rfree after real-space refinement with and without using the free set. Then, I still think, we have to think about how much of that difference results from bias towards the observed values (when the reflections are included) and how much is from bias towards zero (when the free set is excluded).
Of course the model is refined as though the test set Fourier components were equal to zero. In reciprocal space refinement, when you leave a reflection out of the "sum over all reflections" when calculating the difference map, you are saying that you have no opinion about the amplitude of that reflection. When you calculate a real space map from Fourier coefficients you can't not have an opinion, i.e. you can't leave a term out of the sum; you can only set that term to zero. If your model produces a prediction for that term which is not equal to zero it will be penalized. (If you set that term to Fcalc you tie your model to its starting point. To unbias you would have to calculate a new map with current Fcalc's for every iteration of the model, but this method would not take into account the neighborhood correlation present in experimental maps.)
What this means is that Rfree is not a meaningful statistic for assessing overfitting of real space refinement. This is hardly a surprise. A test of a refinement protocol has to be based on the mathematics of that protocol, not the protocol you happened to have used yesterday. If you want an unbiased estimate of the quality of a real space refinement you have to leave out a region of the map and then see how well the model fits that region. This is harder to do in an automated fashion and there will be a lot of caveats about your results (e.g. you know about the ability to fit one region, but does that generalize to other areas?). If you recall, there are a lot of caveats about Rfree too - we have just stopped worrying about them (e.g. low resolution vs. high resolution reflections, choosing based on shells or randomly, what to do about ncs...).
I think you should consider yourself on the wrong track if you come up with a statistical test, but haven't given any thought to the actual experiment that produced your map.
Dale Tronrud
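As a rough illustration of the point that a term left out of a Fourier synthesis is effectively a term set to zero, here is a minimal one-dimensional sketch in plain Python. Everything in it (the eight-point "density", the choice of h=2 and its Friedel mate h=6 as the "test set", the helper names `dft` and `synthesis`) is invented for illustration and is not taken from Phenix or from anyone's actual code in this thread.

```python
import cmath

def dft(values):
    """Unnormalized forward transform: real-space samples -> coefficients."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * cmath.pi * h * x / n)
                for x, v in enumerate(values))
            for h in range(n)]

def synthesis(coeffs, omit=()):
    """Fourier synthesis; terms whose index is in `omit` are skipped."""
    n = len(coeffs)
    return [sum(c * cmath.exp(2j * cmath.pi * h * x / n)
                for h, c in enumerate(coeffs) if h not in omit).real / n
            for x in range(n)]

# Toy "true" density on 8 grid points (made-up numbers).
rho_true = [0.0, 1.0, 4.0, 2.0, 0.0, 0.0, 1.0, 0.0]
f_true = dft(rho_true)

# Hypothetical test set: one reflection and its Friedel mate.
test_set = {2, 6}

# Map A: the test-set terms are left out of the sum.
map_omitted = synthesis(f_true, omit=test_set)

# Map B: the test-set terms are explicitly set to zero.
f_zeroed = [0j if h in test_set else c for h, c in enumerate(f_true)]
map_zeroed = synthesis(f_zeroed)

# Transforming map A back shows the "opinion" it carries about h=2.
back = dft(map_omitted)

# Real-space misfit of a *perfect* model against map A.
penalty = sum((a - b) ** 2 for a, b in zip(rho_true, map_omitted))

# By Parseval, that misfit is exactly the power in the omitted terms.
expected = sum(abs(f_true[h]) ** 2 for h in test_set) / len(rho_true)

print(max(abs(a - b) for a, b in zip(map_omitted, map_zeroed)))
print(abs(back[2]), penalty, expected)
```

The two maps are the same, and even a model that reproduces the true density exactly is penalized in real space by the power that was in the omitted terms, which is the "bias toward zero" being discussed.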
Things I need to look at:
- What are R and R-free for the original refined model?
- What are R and R-free after shaking (did RSR lower R but not Rfree, or did it raise Rfree)?
- What if RSR is done using a map made with fill-in strategy?
Ed
On 06/13/2017 02:15 PM, Pavel Afonine wrote:
Hi Ed,
Including free-R reflections in the map calculation and then using that map for real-space refinement of the entire model will affect Rfree. Here is a simple example that illustrates my statement, step by step:
1) Get data and model from PDB:
phenix.fetch_pdb 1f8t --mtz
2) Compute two 2mFo-DFc maps: one includes all reflections, the other has no free-R terms:
phenix.python run.py 1f8t.{pdb,mtz}
This will create an MTZ file (map_coeffs.mtz) that contains Fourier map coefficients for both maps.
3) Shake model a bit:
phenix.dynamics 1f8t.pdb number_of_steps=500
4) Run real-space refinement using two maps:
phenix.real_space_refine map_coeffs.mtz 1f8t_shaken.pdb label="work,PHIwork" ncs_constraints=false output.file_name_prefix=work
phenix.real_space_refine map_coeffs.mtz 1f8t_shaken.pdb label="all,PHIall" ncs_constraints=false output.file_name_prefix=all
5) Compute R-factors using data and real-space refined models:
phenix.model_vs_data 1f8t.mtz all_real_space_refined.pdb
r_work(re-computed) : 0.2419
r_free(re-computed) : 0.2441
phenix.model_vs_data 1f8t.mtz work_real_space_refined.pdb
r_work(re-computed) : 0.2444
r_free(re-computed) : 0.2756
The result is self-explanatory and is in line with Tom's reply to Wei.
All files necessary to reproduce calculations above are here: http://cci.lbl.gov/~afonine/tmp/
All the best, Pavel
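For readers who want to see what the two numbers in step 5 measure, the following is a minimal sketch of the conventional R-factor, R = sum(||Fobs| - |Fcalc||) / sum(|Fobs|), evaluated separately over the working and free reflections. It is not how phenix.model_vs_data is implemented: real programs also apply overall and bulk-solvent scaling, which is omitted here, and the amplitudes and flags below are made-up numbers.

```python
def r_factor(f_obs, f_calc):
    """Conventional R-factor over the reflections passed in (no scaling)."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator

def r_work_and_free(f_obs, f_calc, is_free):
    """Split reflections by the free flag and return (R-work, R-free)."""
    work = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, is_free) if not free]
    free = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, is_free) if free]
    r_work = r_factor([fo for fo, _ in work], [fc for _, fc in work])
    r_free = r_factor([fo for fo, _ in free], [fc for _, fc in free])
    return r_work, r_free

# Made-up amplitudes: three working reflections and one free reflection.
f_obs = [10.0, 20.0, 30.0, 40.0]
f_calc = [12.0, 18.0, 30.0, 30.0]
is_free = [False, False, False, True]

r_work, r_free = r_work_and_free(f_obs, f_calc, is_free)
print(round(r_work, 4), round(r_free, 4))
```

In the example above, the gap between r_work and r_free is what changes depending on whether the free reflections were present in the map used for real-space refinement.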
On 6/8/17 10:05, Tim Gruene wrote:
Hi Ed,
including the 'free' reflections in the map for modelling does not taint the value of Rfree. That is a misconception that is very persistent (as prejudices usually are). I believe it was Ian Tickle who formulated that when you simply refine long enough towards convergence, all reflections excluded from refinement will become independent, i.e. you can assign a new set for Rfree every time you refine, if you wish so.
This concept is the reason why Rcomplete (the "better" equivalent to Rfree for small data sets with < 10,000 unique reflections), introduced by Axel Brunger, works, as we could demonstrate in doi: 10.1073/pnas.1502136112
So nothing to worry about when including all reflections in map calculations.
Cheers, Tim
On Thursday, June 8, 2017 12:42:53 PM CEST Edward A. Berry wrote:
Hi, Tom, Please forgive what may be a silly question from an outsider who hasn't really kept up with the crystallography literature or even all the Phenix newsletters- What is the evidence that including the free set in real space refinement biases R-free of the resulting model? Is this Rfree also biased when map coefficients use "fill-in" for the excluded free reflections (and is that what phenix.remove_free_from_map does?).
My point is that literally excluding the free reflections, as opposed to substituting their values with Fc, will bias the free set toward grossly incorrect values (namely zero) and therefore greatly worsen R-free. Thus if the evidence for bias is that you get worse R-free when you exclude the free set, you have to think about how much of that difference results from bias towards the observed values (when the reflections are included) and how much is from bias towards zero (when the free set is excluded). (Again, I realize this may be all very well understood by the crystallography community and properly taken care of in phenix; I'm just asking for my own information) eab
On 06/08/2017 07:28 AM, Terwilliger, Thomas Charles wrote:
Hi Wei,
I want to give a word of caution about how to use phenix.map_to_model on crystallographic data. The bottom line is you should remove the test set from your map coefficients before running phenix.map_to_model on X-ray data. Here is why:
phenix.map_to_model uses real-space refinement, which is refinement against the map. If you supply map coefficients that include your test reflections, then you will be refining against data that is in your test set. This will make your Rfree invalid when you go back and refine your model against the original crystallographic data.
To remove the test set from your map coefficients you can use:
phenix.remove_free_from_map map_coeffs=my_map_coeffs.mtz free_in=my_data_file_with_freeR_flags.mtz mtz_out=my_map_coeffs_no_free.mtz
Also note that phenix.map_to_model uses a fixed map (it does not do density modification). Consequently for most crystallographic data at moderate resolution or higher phenix.autobuild is going to do much better than phenix.map_to_model.
All the best,
Tom T
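The two-step procedure described above (strip the test set from the map coefficients, then build) could be scripted roughly as follows. This sketch only assembles the command lines and prints them; it does not run anything. The parameter names are the ones that appear in this thread (map_coeffs, free_in, mtz_out for phenix.remove_free_from_map; map_coeffs_file and seq_file for phenix.map_to_model). The helper name `build_commands` and the file names are placeholders, and whether further options (such as those in Wei's original command, or a labels setting for the stripped file) are needed is an open assumption.

```python
import shlex

def build_commands(map_coeffs, data_with_free_flags, seq_file,
                   stripped="my_map_coeffs_no_free.mtz"):
    """Return the two argv lists; nothing is executed here."""
    remove_free = [
        "phenix.remove_free_from_map",
        f"map_coeffs={map_coeffs}",
        f"free_in={data_with_free_flags}",
        f"mtz_out={stripped}",
    ]
    # Assumption: the stripped file carries labels that map_to_model
    # recognizes on its own, as discussed earlier in the thread.
    build = [
        "phenix.map_to_model",
        f"map_coeffs_file={stripped}",
        f"seq_file={seq_file}",
    ]
    return remove_free, build

remove_free, build = build_commands(
    "my_map_coeffs.mtz",
    "my_data_file_with_freeR_flags.mtz",
    "resolve.seq",
)

for argv in (remove_free, build):
    print(shlex.join(argv))
```

Keeping the commands as lists makes it easy to hand them to subprocess.run later without shell quoting problems.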
Just got back from my afternoon walk...
If you are insistent on placating the unruly mob and still want to
publish an Rfree for real space refinement, here is a reasonably
efficient way to avoid bias.
Calculate your MAPcalc by first calculating Fcalc and follow up by a
Fourier synthesis. Before the synthesis replace all the test set Fcalc
equivalents with their Fobs counterparts. F's are complex numbers here
so you are putting in both the observed amplitude and phase. When you
compare this map to the observed map the model will never be penalized
for disagreeing about test set reflections and is free to do whatever it
wants.
This does not really add any more time to the calculation since you
have to calculate Fcalcs anyway if you are going to report R values. It
also allows you to use the experimental map w/o ever having to modify
it. As a general rule I think experimental results should be kept in
read-only files during interpretation.
Dale Tronrud
P.S. I still think the whole idea of Rfree in real-space refinement is
not a good one.
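The substitution described above can be sketched in one dimension with plain Python. The idea shown is only the one stated in the message: before the synthesis of MAPcalc, the test-set Fcalc terms are replaced by the corresponding complex Fobs, so the map comparison cannot penalize the model for those terms. The grid, the densities, the "test set" {2, 6} and all function names are invented for illustration.

```python
import cmath
import math

def dft(values):
    """Unnormalized forward transform: real-space samples -> coefficients."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * cmath.pi * h * x / n)
                for x, v in enumerate(values))
            for h in range(n)]

def synthesis(coeffs):
    """Fourier synthesis back to real space (real part, normalized)."""
    n = len(coeffs)
    return [sum(c * cmath.exp(2j * cmath.pi * h * x / n)
                for h, c in enumerate(coeffs)).real / n
            for x in range(n)]

def map_calc_with_test_set_substituted(model, f_obs, test_set):
    """Fcalc from the model, with test-set terms replaced by complex Fobs."""
    f_calc = dft(model)
    mixed = [f_obs[h] if h in test_set else f_calc[h]
             for h in range(len(f_calc))]
    return synthesis(mixed)

def residual(map_a, map_b):
    """Sum of squared differences between two maps."""
    return sum((a - b) ** 2 for a, b in zip(map_a, map_b))

n = 8
test_set = {2, 6}  # hypothetical test reflection and its Friedel mate

# Made-up "observed" map and its coefficients.
map_obs = [0.0, 1.0, 4.0, 2.0, 0.0, 0.0, 1.0, 0.0]
f_obs = dft(map_obs)

# Two made-up models that differ only in their test-set Fourier terms.
model_a = [0.0, 1.5, 3.0, 2.5, 0.0, 0.0, 0.5, 0.0]
model_b = [v + 0.7 * math.cos(2 * math.pi * 2 * x / n)
           for x, v in enumerate(model_a)]

# Plain comparison: the models are scored differently.
plain_a = residual(map_obs, model_a)
plain_b = residual(map_obs, model_b)

# With substitution: the test-set terms no longer affect the score.
sub_a = residual(map_obs, map_calc_with_test_set_substituted(model_a, f_obs, test_set))
sub_b = residual(map_obs, map_calc_with_test_set_substituted(model_b, f_obs, test_set))

print(plain_a, plain_b)
print(sub_a, sub_b)
```

Note that the observed map is never modified in this scheme; only the calculated map is patched, which matches the remark about keeping experimental results read-only.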
-------- Forwarded Message --------
Subject: Re: [phenixbb] phenix.map_to_model input mtz file failure --caution on using map_to_model with X-ray data
Date: Tue, 13 Jun 2017 14:40:57 -0700
From: Dale Tronrud
To unbias you would have to calculate a new map with current Fcalc's for every iteration of the model, but this method would not take into account the neighborhood correlation present in experimental maps.)
Thanks, Dale,
Could you explain this "neighborhood correlation"? My very simple (maybe too simple) understanding of how real space would bias reflections is as follows:
You make a map using phases (and Fc?) from the current model and Fobs, but you omit the free set. Now if you take the Fourier transform of that unmodified map, you would get back exactly the coefficients you put in: 2Fo-Fc (?) for the working reflections, and zero for the free set.
Then you make modifications to the model to make its density match as nearly as possible the density of the map. If you were able to make the density of the model exactly match that of the map, then the Fc for the model would be that of the map. Of course you can never make the density of the model exactly match that of the map - modelization is the severest form of density modification. But, to the extent that you make the model's density more nearly like that of the map, the Fourier transform of the new model will be more like that of the map.
That means for the working reflections, Fc will get closer to 2Fo-Fc, which brings them closer to Fo, and the R-work improves. (If there is error in the Fobs, that will be reflected in the map, and the Fcalc of the model will tend toward these erroneous Fobs (fitting the error), and Rwork will get better than it should (bias).) Free reflections will move closer to zero, and most likely Rfree will get worse.
I think that's all consistent with what you wrote, but then I had the impression that the bias could be prevented by making the map with Fc for the test set (proposed in an old paper by Ivan Rayment). That way the free reflections follow the process through their coupling to neighboring reflections in reciprocal space (neighborhood correlation?), the same way they do in reciprocal space refinement, rather than the Fobs being used. The information in these free Fcalc is coming from the neighboring working reflections due to redundancy of information in a finely sampled molecular transform.
Ed
On 06/13/2017 05:40 PM, Dale Tronrud wrote:
On 6/13/2017 12:30 PM, Edward A. Berry wrote:
Thanks, Pavel, I really appreciate your taking the time to generate the example.
While I agree with Tim and Ian that refinement to convergence should remove the bias making it perhaps not a serious problem, my question was in fact whether there is any bias immediately after the refinement.
I will need to study this example a bit, but one thing I notice is that you are doing exactly what I was guessing, comparing Rfree after real-space refinement with and without using the free set. Then, I still think, we
have to think about how much of that difference results from bias towards the observed values (when the reflections are included) and how much is from bias towards zero (when the free set is excluded).
Of course the model is refined as though the test set Fourier components were equal to zero. In reciprocal space refinement when you leave a reflection out of the "sum over all reflections" when calculating the difference map you are saying that you have no opinion about the amplitude of that reflection. When you calculate a real space map from Fourier coefficients you can't not have an opinion, i.e. you can't leave a term out of the sum you can only set that term to zero. If your model produces a prediction for that term which is not equal to zero it will be penalized. (If you set that term to Fcalc you tie your model to its starting point. To unbias you would have to calculate a new map with current Fcalc's for every iteration of the model, but this method would not take into account the neighborhood correlation present in experimental maps.)
What this means is that Rfree is not a meaningful stat for assessing overfitting of real space refinement. This is hardly a surprise. A test of a refinement protocol has to be based on the mathematics of that protocol, not the protocol you happened to have used yesterday. If you want an unbiased estimate of the quality of a real space refinement you have to leave out a region of the map and then see how well the model fits that region. This is harder to do in an automated fashion and there will be a lot of caveats about your results (e.g. you know about the ability to fit one region but does that generalize to other areas?). If you recall there are a lot of caveats about Rfree too - we have just stopped worrying about them. (e.g. low resolution vrs high resolution reflections, choosing based on shells or randomly, what to do about ncs...)
I think you should consider yourself on the wrong track if you come up with a statistical test, but haven't given any thought to the actual experiment that produced your map.
Dale Tronrud
Things I need to look at- What are R and R-free for the original refined model What are R and R-free after shaking (did RSR lower R but not Rfree, or did it raise Rfree? What if RSR is done using a map made with fill-in strategy?
Ed
On 06/13/2017 02:15 PM, Pavel Afonine wrote:
Hi Ed,
Including free-r reflections into map calculation and then using such map in real-space refinement of entire model will affect Rfree. Here is a simple example that illustrates my statement, step-by-step:
1) Get data and model from PDB:
phenix.fetch_pdb 1f8t --mtz
2) Compute two 2mFo-DFc maps: one includes all reflections the other one has no free-r terms:
phenix.python run.py 1f8t.{pdb,mtz}
This will create an MTZ file (map_coeffs.mtz) that contains Fourier map coefficients for both maps.
3) Shake model a bit:
phenix.dynamics 1f8t.pdb number_of_steps=500
4) Run real-space refinement using two maps:
phenix.real_space_refine map_coeffs.mtz 1f8t_shaken.pdb label="work,PHIwork" ncs_constraints=false output.file_name_prefix=work
phenix.real_space_refine map_coeffs.mtz 1f8t_shaken.pdb label="all,PHIall" ncs_constraints=false output.file_name_prefix=all
5) Compute R-factors using data and real-space refined models:
phenix.model_vs_data 1f8t.mtz all_real_space_refined.pdb
  r_work(re-computed) : 0.2419
  r_free(re-computed) : 0.2441
phenix.model_vs_data 1f8t.mtz work_real_space_refined.pdb
  r_work(re-computed) : 0.2444
  r_free(re-computed) : 0.2756
The result is self-explanatory and is in line with Tom's reply to Wei.
All files necessary to reproduce calculations above are here: http://cci.lbl.gov/~afonine/tmp/
All the best, Pavel
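For readers comparing the two runs, the gap between Rfree and Rwork can be worked out directly from the four numbers reported above. This is plain arithmetic on the values quoted in the message; the dictionary keys "all" and "work" simply follow the output prefixes used in step 4.

```python
# (r_work, r_free) as reported in the message above for the two refined models.
reported = {
    "all": (0.2419, 0.2441),   # refined against the map that includes free reflections
    "work": (0.2444, 0.2756),  # refined against the map without free reflections
}

# Rfree - Rwork for each run, rounded to the precision of the reported values.
gaps = {name: round(r_free - r_work, 4) for name, (r_work, r_free) in reported.items()}

for name, gap in gaps.items():
    print(f"{name:>4}: Rfree - Rwork = {gap:.4f}")
```

Rwork is nearly the same in both runs, so the difference between them is almost entirely in Rfree.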
On 6/8/17 10:05, Tim Gruene wrote:
Hi Ed,
including the 'free' reflections in the map for modelling does not taint the value of Rfree. That is a misconception that is very persistent (as prejudices usually are). I believe it was Ian Tickle who formulated that when you simply refine long enough towards convergence, all reflections excluded from refinement will become independent, i.e. you can assign a new set for Rfree every time you refine, if you wish so.
This concept is the reason why Rcomplete (the "better" equivalent to Rfree for small data sets with < 10,000 unique reflections), introduced by Axel Brunger, works, as we could demonstrate in doi: 10.1073/pnas.1502136112
So nothing to worry about when including all reflections in map calculations.
Cheers, Tim
On Thursday, June 8, 2017 12:42:53 PM CEST Edward A. Berry wrote:
Hi, Tom, Please forgive what may be a silly question from an outsider who hasn't really kept up with the crystallography literature or even all the Phenix newsletters- What is the evidence that including the free set in real space refinement biases R-free of the resulting model? Is this Rfree also biased when map coefficients use "fill-in" for the excluded free reflections (and is that what phenix.remove_free_from_map does?).
My point is that literally excluding the free reflections, as opposed to substituting their values with Fc, will bias the free set toward grossly incorrect values (namely zero) and therefore greatly worsen R-free. Thus if the evidence for bias is that you get worse R-free when you exclude the free set, you have to think about how much of that difference results from bias towards the observed values (when the reflections are included) and how much is from bias towards zero (when the free set is excluded). (Again, I realize this may be all very well understood by the crystallography community and properly taken care of in phenix; I'm just asking for my own information) eab
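The distinction raised here (bias towards the observed values versus bias towards zero) can be sketched with a few made-up amplitudes. The sketch is amplitude-only: it ignores phases and the m and D weights, the numbers are invented, and the strategy names "include", "exclude" and "fill_in" are labels chosen for this example rather than Phenix options. It does not claim to describe what phenix.remove_free_from_map does.

```python
def r_factor(f_obs, f_calc):
    """R = sum|Fobs - Fcalc| / sum(Fobs), computed on amplitudes."""
    return sum(abs(o - c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)

def free_targets(f_obs, f_calc, strategy):
    """Amplitude that the map 'asks for' at each free reflection."""
    if strategy == "include":   # free reflections kept in the map as 2Fo-Fc
        return [2 * o - c for o, c in zip(f_obs, f_calc)]
    if strategy == "exclude":   # free reflections left out, i.e. set to zero
        return [0.0 for _ in f_obs]
    if strategy == "fill_in":   # free reflections replaced by the current Fcalc
        return list(f_calc)
    raise ValueError(f"unknown strategy: {strategy}")

def pull(f_calc, targets, t):
    """Move each Fcalc a fraction t of the way towards its map target."""
    return [c + t * (g - c) for c, g in zip(f_calc, targets)]

# Hypothetical free-set amplitudes before real-space refinement.
f_obs_free = [100.0, 60.0, 40.0]
f_calc_free = [80.0, 70.0, 30.0]

r_start = r_factor(f_obs_free, f_calc_free)
r_after = {
    s: r_factor(f_obs_free, pull(f_calc_free, free_targets(f_obs_free, f_calc_free, s), 0.25))
    for s in ("include", "exclude", "fill_in")
}
print(round(r_start, 3), {s: round(r, 3) for s, r in r_after.items()})
```

In this toy, "include" lowers the free R, "exclude" raises it, and "fill_in" leaves it where it started, so a difference in Rfree between the first two runs mixes both effects.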
On 06/08/2017 07:28 AM, Terwilliger, Thomas Charles wrote:
Hi Wei,
I want to give a word of caution about how to use phenix.map_to_model on crystallographic data... The bottom line is you should remove the test set from your map coefficients before running phenix.map_to_model on X-ray data. Here is why:
phenix.map_to_model uses real-space refinement, which is refinement against the map. If you supply map coefficients that include your test reflections, then you will be refining against data that is in your test set. This will make your Rfree invalid when you go back and refine your model against the original crystallographic data.
To remove the test set from your map coefficients you can use:
phenix.remove_free_from_map map_coeffs=my_map_coeffs.mtz free_in=my_data_file_with_freeR_flags.mtz mtz_out=my_map_coeffs_no_free.mtz
Also note that phenix.map_to_model uses a fixed map (it does not do density modification). Consequently for most crystallographic data at moderate resolution or higher phenix.autobuild is going to do much better than phenix.map_to_model.
All the best,
Tom T
_______________________________________________ phenixbb mailing list [email protected] http://phenix-online.org/mailman/listinfo/phenixbb Unsubscribe:[email protected]
First we have to agree on exactly what we are talking about. I presumed we were talking about real space refinement against an experimental map such as one gets in cryo-EM. In that case there are no Fobs. Reciprocal space is a fiction and should be avoided.
If you are working, instead, with a 2Fo-Fc map then just do reciprocal space refinement. I don't know of any reason to do whole-molecule real space refinement when you are working with crystal diffraction data. Reciprocal space is where the experiment lives and the analysis should be done there.
In cases of model building it is computationally quicker to do a local real space refinement to touch up a model just so you can see if it looks reasonable before going back into reciprocal space. This real space refinement is quick-and-dirty and any flaws will be erased by the proper reciprocal space refinement that follows.
As for "neighborhood correlation" I was thinking of cryo-EM maps. Since the individual measurements (pictures) are real space in nature, I can't imagine an experimental error in the voltage of one voxel wouldn't tend to show up similarly in its neighbors. The whole group of voxels will be illuminated by electrons who all had very similar histories passing through the microscope.
We have similar situations with diffraction data. A reflection whose neighbor is in a shadow has a much higher chance of being shadowed itself. Our spots, however, are much further apart on the detector than the voxels of an EM image.
There is another type of correlation that is probably more important. Our diffraction spots are separated enough that you cannot predict the intensity of a reflection based on its neighbors. You can make a very good prediction of the darkness of a voxel based on its neighbors. If you leave out one voxel, as a test set member, you could easily deduce its hidden value without even building a molecular model - just interpolate. You can't do that with diffraction data.
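The interpolation argument can be tried on a toy example. The sketch below assumes a smooth 1D Gaussian "density" sampled on 21 points (an invented stand-in for a finely sampled map, not real cryo-EM data) and compares hiding a single voxel with hiding a contiguous block.

```python
import math

# Smooth toy density: a Gaussian peak centred at x = 10, sampled finely.
density = [math.exp(-((x - 10) / 4.0) ** 2) for x in range(21)]

# Hide a single voxel and guess it from its two immediate neighbours.
hidden = 8
guess_single = 0.5 * (density[hidden - 1] + density[hidden + 1])
error_single = abs(guess_single - density[hidden])

# Hide a contiguous block (voxels 6..14) and guess its centre by linear
# interpolation between the nearest voxels that are still visible (5 and 15).
left, right, centre = 5, 15, 10
weight = (centre - left) / (right - left)
guess_block = (1 - weight) * density[left] + weight * density[right]
error_block = abs(guess_block - density[centre])

print(f"single voxel error: {error_single:.3f}")
print(f"block centre error: {error_block:.3f}")
```

The single hidden voxel is recovered closely from its neighbours, while the centre of the hidden block is not, which is the motivation for holding out many contiguous voxels.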
This means if you want to leave out a chunk of map data for a test set you have to pull out a big enough piece (many contiguous voxels) that you can't deduce anything about their opaqueness from the remaining image. To do this you have to know something about how the microscope works.
Dale Tronrud
On 6/13/2017 4:08 PM, Edward A. Berry wrote:
> To unbias you would have to calculate a new map with current Fcalc's for every iteration of the model, but this method would not take into account the neighborhood correlation present in experimental maps.)
Thanks, Dale, Could you explain this "neighborhood correlation"?
My very simple (maybe too simple) understanding of how real space would bias reflections is as follows:
You make a map using phases (and Fc?) from the current model and Fobs. But you omit the free set.
Now if you take the Fourier transform of that unmodified map, you would get back exactly the coefficients you put in: 2Fo-Fc (?) for the working reflections, and zero for the free set.
Then you make modifications to the model to make its density match as nearly as possible the density of the map. If you were able to make the density of the model exactly match that of the map, then the Fc for the model would be that of the map.
Of course you can never make the density of the model exactly match that of the map - modelization is the severest form of density modification. But, to the extent that you make the model's density more nearly like that of the map, the Fourier transform of the new model will be more like that of the map.
That means for the working reflections, Fc will get closer to 2Fo-Fc, which brings them closer to Fo, and the R-work improves. (If there is error in the Fobs, that will be reflected in the map, and the Fcalc of the model will tend toward these erroneous Fobs (fitting the error) and Rwork will get better than it should (bias).) Free reflections will move closer to zero, and most likely Rfree will get worse.
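The "fitting the error" remark can be made concrete for a single working reflection. The sketch below is amplitude-only and rests on several assumptions that are not in the message: the map is recomputed from the current Fcalc on every cycle, the model moves a fixed fraction t towards the 2Fo-Fc target each time, and the values of f_true, f_obs and the starting Fcalc are invented.

```python
def refine_against_map(f_calc, f_obs, t=0.25, cycles=20):
    """Repeatedly pull Fcalc a fraction t towards the current 2Fo-Fc target."""
    for _ in range(cycles):
        target = 2.0 * f_obs - f_calc      # map coefficient from the current model
        f_calc += t * (target - f_calc)    # model moves part of the way to the map
    return f_calc

f_true = 100.0   # hypothetical error-free amplitude
f_obs = 104.0    # measured amplitude, including a measurement error of +4
f_start = 80.0   # amplitude calculated from the starting model

f_final = refine_against_map(f_start, f_obs)
print(f"|Fcalc - Fobs|  = {abs(f_final - f_obs):.5f}")
print(f"|Fcalc - Ftrue| = {abs(f_final - f_true):.5f}")
```

Each cycle scales Fcalc - Fobs by (1 - 2t), so for 0 < t < 1 the model converges on the observed amplitude, error included, rather than on the true one.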
I think that's all consistent with what you wrote, but then I had the impression that the bias could be prevented by making the map with Fc for the test set (proposed in an old paper by Ivan Rayment). That way the free reflections follow the process through their coupling to neighboring reflections in reciprocal space (neighborhood correlation?), the same way they do in reciprocal space refinement, rather than through the Fobs being used. The information in these free Fcalc comes from the neighboring working reflections, due to redundancy of information in a finely sampled molecular transform.
Ed
On 06/13/2017 05:40 PM, Dale Tronrud wrote:
On 6/13/2017 12:30 PM, Edward A. Berry wrote:
Thanks, Pavel, I really appreciate your taking the time to generate the example.
While I agree with Tim and Ian that refinement to convergence should remove the bias making it perhaps not a serious problem, my question was in fact whether there is any bias immediately after the refinement.
I will need to study this example a bit, but one thing I notice is that you are doing exactly what I was guessing, comparing Rfree after real-space refinement with and without using the free set. Then, I still think, we have to think about how much of that difference results from bias towards the observed values (when the reflections are included) and how much is from bias towards zero (when the free set is excluded).
On 06/13/2017 07:44 PM, Dale Tronrud wrote:
First we have to agree on exactly what we are talking about. I presumed we were talking about real space refinement against an experimental map such as one gets in cryo-EM. In that case there are no Fobs. Reciprocal space is a fiction and should be avoided.
If you are working, instead, with a 2Fo-Fc map then just do reciprocal space refinement. I don't know of any reason to do whole-molecule real space refinement when you are working with crystal diffraction data. Reciprocal space is where the experiment lives and the analysis should be done there.
In cases of model building it is computationally quicker to do a local real space refinement to touch up a model just so you can see if it looks reasonable before going back into reciprocal space. This real space refinement is quick-and-dirty and any flaws will be erased by the proper reciprocal space refinement that follows.
I have no argument with that! And I do realize how significant cryo-EM has become to structural biology with the advent of direct electron detectors. I was still stuck on the perhaps not so important theoretical question of "to exclude or not to exclude" the free-R set in real-space refinement prior to reciprocal space refinement.
As for "neighborhood correlation" I was thinking of cryo-EM maps. Since the individual measurements (pictures) are real space in nature, I can't imagine an experimental error in the voltage of one voxel wouldn't tend to show up similarly in its neighbors. The whole group of voxels will be illuminated by electrons who all had very similar histories passing through the microscope.
We have similar situations with diffraction data. A reflection whose neighbor is in a shadow has a much higher chance of being shadowed itself. Our spots, however, are much further apart on the detector than the voxels of an EM image.
There is another type of correlation that is probably more important. Our diffraction spots are separated enough that you cannot predict the intensity of a reflection based on its neighbors. You can make a very good prediction of the darkness of a voxel based on its neighbors. If you leave out one voxel, as a test set member, you could easily deduce its hidden value without even building a molecular model - just interpolate. You can't do that with diffraction data.
This means if you want to leave out a chunk of map data for a test set you have to pull out a big enough piece (many contiguous voxels) that you can't deduce anything about their opaqueness from the remaining image. To do this you have to know something about how the microscope works.
Dale Tronrud
On 6/13/2017 4:08 PM, Edward A. Berry wrote:
To unbias you would have to calculate a new map with current Fcalc's for every iteration of the model, but this method would not take into account the neighborhood correlation present in experimental maps.)
Thanks, Dale, Could you explain this "neighborhood correlation"?
My very simple (maybe too simple) understanding of how real space would bias reflections is as follows:
You make a map using phases (and Fc?) from the current model and Fobs. But you omit the free set.
Now if you take the fourier transform of that unmodified map, you would get back exactly the coefficients you put in: 2Fo-Fc (?) for the working reflections, and zero for the free set.
Then you make modifications to the model to make its density match as nearly as possible the density of the map. If you were able to make the density of the model exactly match that of the map, then the Fc for the model would be that of the map.
Of course you can never make the density of the model exactly match that of the map - modelization is the severest form of density modification. But, to the extent that you make the model's density more nearly like that of the map, the Fourier transform of the new model will be more like that of the map.
That means for the working reflections, Fc will get closer to 2Fo-Fc which brings them closer to Fo; and the R-work improves. (If there is error in the Fobs, that will be reflected in the map, and the Fcalc of the model will tend toward these eromeous Fobs (fitting the error) and Rwork will get better than it should (bias).) Free reflections will move closer to zero, and most likely Rfree will get worse.
I think that's all consistent with what you wrote, but then I had the impression that the bias could be prevented by making the map with Fc for the test set (proposed in an old paper by Ivan Rayment. That way the free reflections get are following the process by their coupling to neighboring reflections in reciprocal space (neighborhood correlation?), the same way they do in reciprocal space refinement, rather than the Fobs being used. The information in these free Fcalc is coming from the neighboring working reflections due to redundancy of information in a finely sampled molecular transform.
Ed
On 06/13/2017 05:40 PM, Dale Tronrud wrote:
On 6/13/2017 12:30 PM, Edward A. Berry wrote:
Thanks, Pavel, I really appreciate your taking the time to generate the example.
While I agree with Tim and Ian that refinement to convergence should remove the bias making it perhaps not a serious problem, my question was in fact whether there is any bias immediately after the refinement.
I will need to study this example a bit, but one thing I notice is that you are doing exactly what I was guessing, comparing Rfree after real-space refinement with and without using the free set. Then, I still think, we
> have to think about how much of that difference results from > bias towards the observed values (when the reflections are included) > and > how much is from bias towards zero (when the free set is excluded).
Of course the model is refined as though the test set Fourier components were equal to zero. In reciprocal space refinement when you leave a reflection out of the "sum over all reflections" when calculating the difference map you are saying that you have no opinion about the amplitude of that reflection. When you calculate a real space map from Fourier coefficients you can't not have an opinion, i.e. you can't leave a term out of the sum you can only set that term to zero. If your model produces a prediction for that term which is not equal to zero it will be penalized. (If you set that term to Fcalc you tie your model to its starting point. To unbias you would have to calculate a new map with current Fcalc's for every iteration of the model, but this method would not take into account the neighborhood correlation present in experimental maps.)
What this means is that Rfree is not a meaningful stat for assessing overfitting of real space refinement. This is hardly a surprise. A test of a refinement protocol has to be based on the mathematics of that protocol, not the protocol you happened to have used yesterday. If you want an unbiased estimate of the quality of a real space refinement you have to leave out a region of the map and then see how well the model fits that region. This is harder to do in an automated fashion and there will be a lot of caveats about your results (e.g. you know about the ability to fit one region but does that generalize to other areas?). If you recall there are a lot of caveats about Rfree too - we have just stopped worrying about them. (e.g. low resolution vrs high resolution reflections, choosing based on shells or randomly, what to do about ncs...)
I think you should consider yourself on the wrong track if you come up with a statistical test, but haven't given any thought to the actual experiment that produced your map.
Dale Tronrud
Things I need to look at:
- What are R and R-free for the original refined model?
- What are R and R-free after shaking (did RSR lower R but not Rfree, or did it raise Rfree)?
- What if RSR is done using a map made with a fill-in strategy?
Ed
On 06/13/2017 02:15 PM, Pavel Afonine wrote:
Hi Ed,
Including free-R reflections in the map calculation and then using such a map in real-space refinement of the entire model will affect Rfree. Here is a simple example that illustrates my statement, step-by-step:
1) Get data and model from PDB:
phenix.fetch_pdb 1f8t --mtz
2) Compute two 2mFo-DFc maps: one includes all reflections, the other one has no free-R terms:
phenix.python run.py 1f8t.{pdb,mtz}
This will create an MTZ file (map_coeffs.mtz) that contains Fourier map coefficients for both maps.
3) Shake model a bit:
phenix.dynamics 1f8t.pdb number_of_steps=500
4) Run real-space refinement using two maps:
phenix.real_space_refine map_coeffs.mtz 1f8t_shaken.pdb label="work,PHIwork" ncs_constraints=false output.file_name_prefix=work
phenix.real_space_refine map_coeffs.mtz 1f8t_shaken.pdb label="all,PHIall" ncs_constraints=false output.file_name_prefix=all
5) Compute R-factors using data and real-space refined models:
phenix.model_vs_data 1f8t.mtz all_real_space_refined.pdb
  r_work(re-computed) : 0.2419
  r_free(re-computed) : 0.2441
phenix.model_vs_data 1f8t.mtz work_real_space_refined.pdb
  r_work(re-computed) : 0.2444
  r_free(re-computed) : 0.2756
The result is self-explanatory and is in line with Tom's reply to Wei.
All files necessary to reproduce calculations above are here: http://cci.lbl.gov/~afonine/tmp/
All the best, Pavel
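For readers who want to see what the r_work and r_free figures above measure, here is a minimal sketch of the conventional R-factor, R = sum(| |Fobs| - k|Fcalc| |) / sum(|Fobs|), evaluated separately over the work and test reflections. This is an editorial illustration under simplifying assumptions: a single overall scale k taken from the work set, made-up amplitudes, and a hypothetical helper name `r_work_r_free`. It is not how phenix.model_vs_data is implemented; the real program presumably applies more elaborate scaling.

```python
def r_work_r_free(f_obs, f_calc, is_free):
    """Return (R-work, R-free) for lists of amplitudes and a list of test-set flags.

    A single scale factor k is derived from the work reflections only and
    applied to both sets (a simplifying assumption of this sketch).
    """
    work = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, is_free) if not free]
    test = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, is_free) if free]

    k = sum(fo for fo, _ in work) / sum(fc for _, fc in work)

    def r(pairs):
        return sum(abs(fo - k * fc) for fo, fc in pairs) / sum(fo for fo, _ in pairs)

    return r(work), r(test)

# Made-up amplitudes, for illustration only; the last reflection is "free".
f_obs = [100.0, 80.0, 60.0, 40.0, 20.0]
f_calc = [90.0, 85.0, 55.0, 45.0, 25.0]
is_free = [False, False, False, False, True]

r_work, r_free = r_work_r_free(f_obs, f_calc, is_free)
print(f"r_work = {r_work:.4f}  r_free = {r_free:.4f}")
```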
On 6/8/17 10:05, Tim Gruene wrote:
Hi Ed,
including the 'free' reflections in the map for modelling does not taint the value of Rfree. That is a misconception that is very persistent (as prejudices usually are). I believe it was Ian Tickle who formulated that when you simply refine long enough towards convergence, all reflections excluded from refinement will become independent, i.e. you can assign a new set for Rfree every time you refine, if you so wish.
This concept is the reason why Rcomplete (the "better" equivalent to Rfree for small data sets with < 10,000 unique reflections), introduced by Axel Brunger, works, as we could demonstrate in doi: 10.1073/pnas.1502136112
So nothing to worry about when including all reflections in map calculations.
Cheers, Tim
On Thursday, June 8, 2017 12:42:53 PM CEST Edward A. Berry wrote:
> Hi, Tom,
> Please forgive what may be a silly question from an outsider who hasn't really kept up with the crystallography literature or even all the Phenix newsletters. What is the evidence that including the free set in real space refinement biases R-free of the resulting model? Is this Rfree also biased when map coefficients use "fill-in" for the excluded free reflections (and is that what phenix.remove_free_from_map does?).
>
> My point is that literally excluding the free reflections, as opposed to substituting their values with Fc, will bias the free set toward grossly incorrect values (namely zero) and therefore greatly worsen R-free. Thus if the evidence for bias is that you get worse R-free when you exclude the free set, you have to think about how much of that difference results from bias towards the observed values (when the reflections are included) and how much is from bias towards zero (when the free set is excluded). (Again, I realize this may be all very well understood by the crystallography community and properly taken care of in phenix; I'm just asking for my own information.) eab
>
> On 06/08/2017 07:28 AM, Terwilliger, Thomas Charles wrote:
>> Hi Wei,
>>
>> I want to give a word of caution about how to use phenix.map_to_model on crystallographic data... The bottom line is you should remove the test set from your map coefficients before running phenix.map_to_model on X-ray data. Here is why:
>>
>> phenix.map_to_model uses real-space refinement, which is refinement against the map. If you supply map coefficients that include your test reflections, then you will be refining against data that is in your test set. This will make your Rfree invalid when you go back and refine your model against the original crystallographic data.
>>
>> To remove the test set from your map coefficients you can use:
>>
>> phenix.remove_free_from_map map_coeffs=my_map_coeffs.mtz free_in=my_data_file_with_freeR_flags.mtz mtz_out=my_map_coeffs_no_free.mtz
>>
>> Also note that phenix.map_to_model uses a fixed map (it does not do density modification). Consequently for most crystallographic data at moderate resolution or higher phenix.autobuild is going to do much better than phenix.map_to_model.
>>
>> All the best,
>> Tom T
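To make the idea of "removing the test set from the map coefficients" concrete, here is a purely conceptual sketch in plain Python. It is an editorial illustration, not the implementation of phenix.remove_free_from_map: the dictionary layout, the example values, and the function name `remove_free_terms` are all assumptions, and the thread does not state whether the real tool drops the terms or treats them in some other way.

```python
def remove_free_terms(map_coeffs, free_flags):
    """Return map coefficients with all test-set (free) reflections dropped.

    map_coeffs: dict mapping (h, k, l) -> complex map coefficient
    free_flags: dict mapping (h, k, l) -> True if the reflection is in the test set
    Reflections without a flag are treated as work reflections (an assumption).
    """
    return {hkl: f for hkl, f in map_coeffs.items() if not free_flags.get(hkl, False)}

# Made-up example values, for illustration only.
map_coeffs = {(1, 0, 0): 12 + 5j, (0, 1, 0): -3 + 4j, (0, 0, 1): 7 - 1j}
free_flags = {(1, 0, 0): False, (0, 1, 0): True, (0, 0, 1): False}

work_only = remove_free_terms(map_coeffs, free_flags)
print(sorted(work_only))
```

A map synthesized from `work_only` carries no information from the test reflections, which is the property Tom's advice is aiming for.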
On 06/13/2017 02:15 PM, Pavel Afonine wrote:
> The result is self-explanatory and is in line with Tom's reply to Wei.

Thanks, Pavel. Yes, that is convincing. I filled in some of the other results I wanted, and they do not change the conclusion.

In case anyone is interested:

After shaking, R and Rfree are very high and not significantly different (values below).

RS refinement against any of the maps generated from the refined structure greatly reduces both. (So R-free is NOT increasing due to free reflections being zero in the transform of the map.) If free reflections are excluded in making the map then R-free is higher than Rwork.

Using fill-in resulted in an R-free gap similar to that with excluding reflections. R and R-free were both a little lower with fill-in, but they were lower by about the same amount when the free set was included, so this may be more the effect of filling in the truly missing reflections than filling the free reflections. It could also be considered Fc-Phic bias towards the excellent map from the fully refined structure.

Refining the original refined model from the PDB by RSR (against its own maps) resulted in R and R-free increasing to the same values obtained starting with the shaken models. This is like the increase seen when you solve a structure with a high-resolution search model and refine against a low-resolution dataset.

Making the maps using Fobs and the Fc, Phi-c from the shaken model rather than the final refined model, which is more relevant for a crystallographer (in case any crystallographer should ignore the advice given already in this thread and use real-space-refine/map-to-model instead of autobuild), still gave a significant decrease in both R's, and a similar R-Rfree gap (or lack thereof) in all four cases. Fill-in gave higher R's this time (Fc-Phic bias toward the original shaken model?).

Specifically:

model                                    R-work   R-free
Original fully refined PDB               0.171    0.221   (from header)
After dynamics shaking:                  0.3923   0.3917
After RS refinement of shaken:
  Using all reflections                  0.2409   0.2434
  Using all, + Fc for missing            0.2352   0.2409
  excluding free                         0.2419   0.2731
  Fc for free (and missing)              0.2387   0.2606
After RS refinement of 1f8t:
  Using all reflections                  0.2421   0.2453
  Using all, + Fc for missing            0.2425   0.2479
  excluding free                         0.2453   0.2742
  Fc for free (and missing)              0.2393   0.2585
After RS refinement of shaken against map made from shaken:
  Using all reflections                  0.2870   0.2915
  Using all, + Fc for missing            0.3008   0.3034
  excluding free                         0.2917   0.3236
  Fc for free (and missing)              0.3026   0.3290

It is surprising to me that making the map with fill-in Fc from the refined model vs. the bad starting model makes a significant difference, but making the map with Fc=0 doesn't seem to hurt.
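The pattern in the reported R values is easier to see as R-free minus R-work gaps. The short Python sketch below is an editorial addition: the numbers are copied from the R-work/R-free figures reported in this message, while the dictionary layout, the shortened series titles, and the helper name `gaps` are assumptions made for illustration.

```python
# (R-work, R-free) pairs copied from the values reported in the message.
results = {
    "RSR of shaken model": {
        "all reflections": (0.2409, 0.2434),
        "all + Fc for missing": (0.2352, 0.2409),
        "excluding free": (0.2419, 0.2731),
        "Fc for free (and missing)": (0.2387, 0.2606),
    },
    "RSR of 1f8t": {
        "all reflections": (0.2421, 0.2453),
        "all + Fc for missing": (0.2425, 0.2479),
        "excluding free": (0.2453, 0.2742),
        "Fc for free (and missing)": (0.2393, 0.2585),
    },
    "RSR of shaken model, map made from shaken model": {
        "all reflections": (0.2870, 0.2915),
        "all + Fc for missing": (0.3008, 0.3034),
        "excluding free": (0.2917, 0.3236),
        "Fc for free (and missing)": (0.3026, 0.3290),
    },
}

def gaps(series):
    """R-free minus R-work for each map type, rounded to four decimals."""
    return {name: round(r_free - r_work, 4) for name, (r_work, r_free) in series.items()}

for title, series in results.items():
    print(title)
    for name, gap in gaps(series).items():
        print(f"  {name:28s} gap = {gap:.4f}")
```

In all three series the gap stays below 0.01 when the free reflections contribute their observed values to the map, and is roughly 0.02 to 0.03 when they are excluded or replaced by Fc.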
How about this for an explanation of why setting Fcalc to zero doesn't hurt so much?

Let's assume that Fcalc is correlated with its corresponding Fobs. (This is reasonable since your R values are 20 to 30%.) I presume you are refining against some sort of (2Fo-Fc, PhiC) map. If you put in the real Fc you will get amplitudes that are roughly equal to (Fo, PhiC) since the Fc roughly cancels out one of the Fo. For reflections where Fc is zeroed out the coefficient will be, roughly, (2Fo, PhiC). 2Fo is still pretty strongly correlated to Fo. Certainly large Fo's will produce large 2Fo's and small Fo's will produce small 2Fo's. A reflection with twice the amplitude but the same phase will have peaks and valleys in exactly the same place, so I wouldn't expect huge differences in the location of atoms.

Now I have to ask exactly how this "set Fcalc to zero" works. The simple-minded thing would be to set Fc to zero in 2Fo-Fc and get 2Fo, but I don't think this is the best thing to do. The logic for "(2Fo-Fc, Phic)" is that this is a good estimate of the "true" Fourier coefficient given that you know the Fo amplitude and a (Fc, Phic) from a model. If you say, instead, that you know only the Fo amplitude and Phic, you find that the centroid of that probability distribution is (Fo, Phic).

Of course it is a little odd to say that you have a high confidence in Phic but no confidence in Fc... Since they are so tightly coupled I would rather assume I don't know either if I wanted to leave the test reflection out of the map calculation. In that case the entire coefficient has to be set to zero. Maybe that is what is being done here.

Dale Tronrud

On 6/18/2017 7:31 PM, Edward A. Berry wrote:
> It is surprising to me that making the map with fillin Fc from the refined model vs the bad starting model makes a significant difference, but making the map with Fc=0 doesn't seem to hurt.
_______________________________________________ phenixbb mailing list [email protected] http://phenix-online.org/mailman/listinfo/phenixbb Unsubscribe: [email protected]
On 06/19/2017 01:24 AM, Dale Tronrud wrote:
In that case the entire
coefficient has to be set to zero. Maybe that is what is being done here.
Yes. The reflection is omitted entirely in making the map. Phenix.maps makes map coefficients in which the 2mFo-DFc stuff is already incorporated: one amplitude and one phase for each reflection. Looking in that file, when free reflections are excluded both phase and amplitude are "?" (the missing-number flag). This is what is given to real-space-refine, so it has no way to generate a coefficient for that reflection.
I verified this by using ccp4 fft to make a map with the same coefficients that I gave RSR, then sfall to calculate structure factors from the map. I then looked at half a dozen missing reflections, and Fcalc from the map was precisely zero.
A cleaner way to do it would have been to tell phenix.maps to actually make the maps (real-space-refine takes either maps or map coefficients); then I could run sfall on precisely the map that real-space-refine is using. But if real-space-refine used the map-coeff labels I told it to, it has no access to Fc and so could not be filling in.
ed
On 06/19/2017 01:24 AM, Dale Tronrud wrote:
How about this for an explanation of why setting Fcalc to zero doesn't hurt so much?
Let's assume that Fcalc is correlated with its corresponding Fobs. (This is reasonable since your R values are 20 to 30%.) I presume you are refining against some sort of (2Fo-Fc, PhiC) map. If you put in the real Fc you will get amplitudes that are roughly equal to (Fo, PhiC), since the Fc roughly cancels out one of the Fo. For reflections where Fc is zeroed out the coefficient will be, roughly, (2Fo, PhiC). 2Fo is still pretty strongly correlated with Fo. Certainly large Fo's will produce large 2Fo's and small Fo's will produce small 2Fo's.
A reflection with twice the amplitude but the same phase will have peaks and valleys in exactly the same place, so I wouldn't expect huge differences in the location of atoms.
Now I have to ask exactly how this "set Fcalc to zero" works. The simple-minded thing would be to set Fc to zero in 2Fo-Fc and get 2Fo, but I don't think this is the best thing to do. The logic for "(2Fo-Fc,Phic)" is that this is a good estimate of the "true" Fourier coefficient given that you know the Fo amplitude and a (Fc,Phic) from a model. If you say, instead, that you know only the Fo amplitude and Phic, you find that the centroid of that probability distribution is (Fo, Phic). Of course it is a little odd to say that you have a high confidence in Phic but no confidence in Fc... Since they are so tightly coupled I would rather assume I don't know either if I wanted to leave the test reflection out of the map calculation. In that case the entire coefficient has to be set to zero. Maybe that is what is being done here.
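The amplitude argument above can be checked with a few made-up numbers. What follows is only a minimal Python sketch, assuming unweighted 2Fo-Fc amplitudes (no m or D weights) and invented Fo/Fc values; it is not how Phenix builds its map coefficients.

```python
# Toy amplitudes: invented values in which Fc roughly tracks Fo,
# as one would expect for a model with R around 20-30%.
fo = [120.0, 80.0, 45.0, 20.0, 5.0]
fc = [110.0, 90.0, 40.0, 25.0, 4.0]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


# Map amplitude with the real Fc: 2Fo - Fc, which stays close to Fo.
with_fc = [2 * o - c for o, c in zip(fo, fc)]
# Map amplitude with Fc set to zero: exactly 2Fo.
fc_zeroed = [2 * o for o in fo]

print("2Fo-Fc   :", with_fc)
print("2Fo      :", fc_zeroed)
print("corr(Fo, 2Fo-Fc):", pearson(fo, with_fc))
print("corr(Fo, 2Fo)   :", pearson(fo, fc_zeroed))
```

Doubling an amplitude while keeping the phase changes the weight of that term in the map but not where its peaks and valleys fall, which is the point made above.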
Dale Tronrud
On 6/18/2017 7:31 PM, Edward A. Berry wrote:
On 06/13/2017 02:15 PM, Pavel Afonine wrote:
The result is self-explanatory and is in line with Tom's reply to Wei.
Thanks, Pavel. Yes, that is convincing. I filled in some of the other results I wanted, and they do not change the conclusion.
in case anyone is interested:
After shaking, R and Rfree are very high and not significantly different (values below).
RS refinement against any of the maps generated from the refined structure greatly reduces both. (So R-free is NOT increasing due to free reflections being zero in the transform of the map.) If free reflections are excluded in making the map, then R-free is higher than R-work.
Using fill-in resulted in an R-free gap similar to that with excluding reflections. R and R-free were both a little lower with fill-in, but they were lower by about the same amount when the free set was included, so this may be more the effect of filling in the truly missing reflections than of filling in the free reflections, and could be considered Fc-Phic bias towards the excellent map from the fully refined structure.
Refining the original refined model from the pdb by RSR (against its own maps) resulted in R, R-free increasing to the same values obtained starting with the shaken models. This is like the increase seen when you solve a structure with a high-resolution search model and refine against a low-resolution dataset.
Making the maps using Fobs and the Fc Phi-c from the shaken model rather than final refined model, which is more relevant for a crystallographer (in case any crystallographer should ignore the advice given already in this thread and use real-space-refine/map-to-model instead of autobuild) still gave a significant decrease in both R's, and a similar R-Rfree gap (or lack thereof) in all four cases. Fill-in gave higher R's this time (Fc-Phic bias toward the original shaken model?)
Specifically:
model                                          R-work   R-free
Original fully refined PDB (from header)       0.171    0.221
After dynamics shaking                         0.3923   0.3917
After RS refinement of shaken:
  Using all reflections                        0.2409   0.2434
  Using all, + Fc for missing                  0.2352   0.2409
  excluding free                               0.2419   0.2731
  Fc for free (and missing)                    0.2387   0.2606
After RS refinement of 1f8t:
  Using all reflections                        0.2421   0.2453
  Using all, + Fc for missing                  0.2425   0.2479
  excluding free                               0.2453   0.2742
  Fc for free (and missing)                    0.2393   0.2585
After RS refinement of shaken against map made from shaken:
  Using all reflections                        0.2870   0.2915
  Using all, + Fc for missing                  0.3008   0.3034
  excluding free                               0.2917   0.3236
  Fc for free (and missing)                    0.3026   0.3290
It is surprising to me that making the map with fill-in Fc from the refined model vs the bad starting model makes a significant difference, but making the map with Fc=0 doesn't seem to hurt.
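To make the R-free gap in the numbers above easier to compare, here is a small Python sketch that simply tabulates R-free minus R-work for each run. The values are copied from the results listed above; the dictionary keys are shorthand labels introduced only for this illustration.

```python
# (R-work, R-free) pairs copied from the results reported above.
results = {
    "shaken, map from refined model": {
        "all reflections": (0.2409, 0.2434),
        "all + Fc for missing": (0.2352, 0.2409),
        "excluding free": (0.2419, 0.2731),
        "Fc for free (and missing)": (0.2387, 0.2606),
    },
    "1f8t, map from refined model": {
        "all reflections": (0.2421, 0.2453),
        "all + Fc for missing": (0.2425, 0.2479),
        "excluding free": (0.2453, 0.2742),
        "Fc for free (and missing)": (0.2393, 0.2585),
    },
    "shaken, map from shaken model": {
        "all reflections": (0.2870, 0.2915),
        "all + Fc for missing": (0.3008, 0.3034),
        "excluding free": (0.2917, 0.3236),
        "Fc for free (and missing)": (0.3026, 0.3290),
    },
}


def gap(r_work, r_free):
    """R-free minus R-work, rounded to the precision of the input values."""
    return round(r_free - r_work, 4)


gaps = {
    run: {map_type: gap(*values) for map_type, values in maps.items()}
    for run, maps in results.items()
}

for run, by_map in gaps.items():
    print(run)
    for map_type, g in by_map.items():
        print(f"  {map_type:28s} {g:+.4f}")
```

In every run the gap is a few tenths of a percent when the free reflections are in the map and two to three percent when they are excluded or replaced by Fc.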
On 06/13/2017 02:15 PM, Pavel Afonine wrote:
Hi Ed,
Including free-r reflections into map calculation and then using such map in real-space refinement of entire model will affect Rfree. Here is a simple example that illustrates my statement, step-by-step:
1) Get data and model from PDB:
phenix.fetch_pdb 1f8t --mtz
2) Compute two 2mFo-DFc maps: one includes all reflections the other one has no free-r terms:
phenix.python run.py 1f8t.{pdb,mtz}
This will create an MTZ file (map_coeffs.mtz) that contains Fourier map coefficients for both maps.
3) Shake model a bit:
phenix.dynamics 1f8t.pdb number_of_steps=500
4) Run real-space refinement using two maps:
phenix.real_space_refine map_coeffs.mtz 1f8t_shaken.pdb label="work,PHIwork" ncs_constraints=false output.file_name_prefix=work
phenix.real_space_refine map_coeffs.mtz 1f8t_shaken.pdb label="all,PHIall" ncs_constraints=false output.file_name_prefix=all
5) Compute R-factors using data and real-space refined models:
phenix.model_vs_data 1f8t.mtz all_real_space_refined.pdb
  r_work(re-computed) : 0.2419
  r_free(re-computed) : 0.2441
phenix.model_vs_data 1f8t.mtz work_real_space_refined.pdb
  r_work(re-computed) : 0.2444
  r_free(re-computed) : 0.2756
The result is self-explanatory and is in line with Tom's reply to Wei.
All files necessary to reproduce calculations above are here: http://cci.lbl.gov/~afonine/tmp/
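For readers less familiar with what the R-factors in step 5 measure, the following Python sketch shows the conventional definition, R = sum(| |Fo| - k|Fc| |) / sum(|Fo|), evaluated separately over working and free reflections. It is only an illustration under stated assumptions: a single overall scale factor k taken from the working set, toy amplitudes, and hypothetical function names. phenix.model_vs_data uses more elaborate scaling (for example bulk-solvent correction), so this is not its implementation.

```python
def r_factor(f_obs, f_calc, k):
    """Conventional R = sum(|Fo - k*Fc|) / sum(Fo) for amplitude lists."""
    return sum(abs(o - k * c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)


def r_work_free(f_obs, f_calc, is_free):
    """Return (R-work, R-free) given amplitudes and per-reflection free flags.

    Assumption: one overall scale k = sum(Fo) / sum(Fc) from the working set.
    """
    work = [(o, c) for o, c, f in zip(f_obs, f_calc, is_free) if not f]
    free = [(o, c) for o, c, f in zip(f_obs, f_calc, is_free) if f]
    k = sum(o for o, _ in work) / sum(c for _, c in work)
    r_work = r_factor([o for o, _ in work], [c for _, c in work], k)
    r_free = r_factor([o for o, _ in free], [c for _, c in free], k)
    return r_work, r_free


# Toy data (invented): six reflections, the last two flagged as free.
f_obs = [100.0, 50.0, 80.0, 20.0, 60.0, 40.0]
f_calc = [90.0, 55.0, 85.0, 15.0, 70.0, 30.0]
is_free = [False, False, False, False, True, True]

r_work, r_free = r_work_free(f_obs, f_calc, is_free)
print(f"R-work = {r_work:.4f}, R-free = {r_free:.4f}")
```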
All the best, Pavel
On 6/8/17 10:05, Tim Gruene wrote:
Hi Ed,
including the 'free' reflections in the map for modelling does not taint the value of Rfree. That is a misconception that is very persistent (as prejudices usually are). I believe it was Ian Tickle who formulated that when you simply refine long enough towards convergence, all reflections excluded from refinement will become independent, i.e. you can assign a new set for Rfree every time you refine, if you wish so.
This concept is the reason why Rcomplete (the "better" equivalent to Rfree for small data sets with < 10,000 unique reflections), introduced by Axel Brunger, works, as we could demonstrate in doi: 10.1073/pnas.1502136112
So nothing to worry about when including all reflections in map calculations.
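As an illustration of the idea of assigning a new free set each time, here is a minimal Python sketch that splits reflection indices into k disjoint test sets so that every reflection is left out exactly once. This is my reading of how a complete cross-validation such as Rcomplete partitions the data, stated as an assumption; the function name, the random shuffling, and the choice of k are hypothetical and not taken from any particular program.

```python
import random


def k_fold_test_sets(n_reflections, k, seed=0):
    """Split indices 0..n_reflections-1 into k disjoint test sets.

    Each reflection lands in exactly one test set, so refining once per set
    (against all the other reflections) leaves every reflection out once.
    """
    indices = list(range(n_reflections))
    random.Random(seed).shuffle(indices)  # reproducible random assignment
    return [sorted(indices[i::k]) for i in range(k)]


test_sets = k_fold_test_sets(n_reflections=1000, k=20)
print(len(test_sets), "test sets of", len(test_sets[0]), "reflections each")
```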
Cheers, Tim
On Thursday, June 8, 2017 12:42:53 PM CEST Edward A. Berry wrote:
Hi, Tom, Please forgive what may be a silly question from an outsider who hasn't really kept up with the crystallography literature or even all the Phenix newsletters- What is the evidence that including the free set in real space refinement biases R-free of the resulting model? Is this Rfree also biased when map coefficients use "fill-in" for the excluded free reflections (and is that what phenix.remove_free_from_map does?).
My point is that literally excluding the free reflections, as opposed to substituting their values with Fc, will bias the free set toward grossly incorrect values (namely zero) and therefore greatly worsen R-free. Thus if the evidence for bias is that you get worse R-free when you exclude the free set, you have to think about how much of that difference results from bias towards the observed values (when the reflections are included) and how much is from bias towards zero (when the free set is excluded). (Again, I realize this may be all very well understood by the crystallography community and properly taken care of in phenix; I'm just asking for my own information.)
eab
On 06/08/2017 07:28 AM, Terwilliger, Thomas Charles wrote:
Hi Wei,
I want to give a word of caution about how to use phenix.map_to_model on crystallographic data... The bottom line is you should remove the test set from your map coefficients before running phenix.map_to_model on X-ray data. Here is why:
phenix.map_to_model uses real-space refinement, which is refinement against the map. If you supply map coefficients that include your test reflections, then you will be refining against data that is in your test set. This will make your Rfree invalid when you go back and refine your model against the original crystallographic data.
To remove the test set from your map coefficients you can use:
phenix.remove_free_from_map map_coeffs=my_map_coeffs.mtz free_in=my_data_file_with_freeR_flags.mtz mtz_out=my_map_coeffs_no_free.mtz
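Conceptually, removing the test set means dropping every map coefficient whose reflection is flagged as free, so that those terms contribute nothing to the map. The Python sketch below illustrates just that idea with hypothetical plain dictionaries keyed by Miller index; it is not the implementation of phenix.remove_free_from_map, and the flag value that marks the test set is an assumption (the convention differs between programs).

```python
def remove_free(map_coeffs, free_flags, test_flag_value=1):
    """Return a copy of map_coeffs without the test-set reflections.

    map_coeffs:  {(h, k, l): (amplitude, phase_in_degrees)}
    free_flags:  {(h, k, l): flag}
    Assumption: reflections that have no flag at all are kept.
    """
    return {
        hkl: coeff
        for hkl, coeff in map_coeffs.items()
        if free_flags.get(hkl) != test_flag_value
    }


# Toy data (invented): four reflections, one of them in the test set.
map_coeffs = {
    (1, 0, 0): (120.0, 35.0),
    (0, 1, 0): (80.0, 170.0),
    (0, 0, 1): (45.0, 260.0),
    (1, 1, 0): (20.0, 90.0),
}
free_flags = {(1, 0, 0): 0, (0, 1, 0): 1, (0, 0, 1): 0, (1, 1, 0): 0}

no_free = remove_free(map_coeffs, free_flags)
print(sorted(no_free))
```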
Also note that phenix.map_to_model uses a fixed map (it does not do density modification). Consequently for most crystallographic data at moderate resolution or higher phenix.autobuild is going to do much better than phenix.map_to_model.
All the best,
Tom T
Dear Thomas,
Thank you for the caution. Now I always convert the mtz file to a ccp4 map (with phenix.mtz2map) before using phenix.map_to_model, so there is no test set in my data. I am not sure whether this is a good method or not.
But phenix.map_to_model always stops at the RESOLVE step:
######################################################################################
[RESOLVE log plot: "Probability that grid points are in protein region"; y axis p(protein) from 0.0 to 1.0, x axis "Percentile of grid points"]
-------------------------------------------------------------------------------
Number of reflections in input refl_db: 0
Number of reflections in current array: 1
Discarding all extra reflections from output array
Number of reflections in output array with values: 0. Without: 0
Writing mask to memory (entire unit cell)
CC of prob map with current map: 1.000000
resolve exit_info:
source_file: /home/builder/slave/phenix-nightly-intel-linux-2_6-x86_64-centos6/modules/solve_resolve/resolve/aaa_resolve_main.cpp
source_line: 1630
status: 0
EndOfResolve
#####################################################################################
I checked the CPU and found that it is still running, but nothing more is output, and after hours of waiting the procedure stops by itself.
The input is as follows:
phenix.map_to_model_1.12rc0-2787 RpeHgSIR_0_oasis_1.ccp4 ../resolve.seq resolution=2.8 nproc=6 segment=True is_crystal=True auto_sharpen=False use_sg_symmetry=True density_select=False truncate_at_d_min=True space_group=R3 verbose=True wang_radius=20
Are there any problems with my data or input parameters?
Thank you for your time!
--
Wei Ding
P.O.Box 603
The Institute of Physics,Chinese Academy of Sciences
Beijing,China
100190
Tel: +86-10-82649083
E-mail: [email protected]
participants (7)
- Dale Tronrud
- dancingdream@163.com
- Edward A. Berry
- Pavel Afonine
- Terwilliger, Thomas Charles
- Tim Gruene
- Tristan Croll