In upgrading to CCP4 6.2.0 and Phenix-1.7.2-869 (from CCP4 6.1.13 and Phenix-1.7.1-743, respectively), a bug has crept into the reading of MTZ files. We tend to scale and merge XDS-processed data with Scala in CCP4, then use the resulting data in CCP4 and Phenix (with great success, I might add).

However, in testing the upgraded CCP4 and Phenix on OS X 10.7.2 (Lion), I just noticed that the MTZ file is no longer read correctly. In the example below, scaling and merging non-anomalous data from the CCP4i GUI gives, at the end of FREERFLAG, some 47,000 reflections. Putting this file into xtriage or the reflection file editor in the Phenix GUI, one finds that the data for both the F and the DANO columns are used; selecting the Fs alone is impossible. The end result is a doubling of the reflection number and the anomalous flag being switched on. Processing with earlier versions did not do this except when actually using anomalous data.

Why this has happened is not clear to me, but the column order in the MTZ file changed slightly between CCP4 6.1.13 and CCP4 6.2.0. Any ideas? Am I missing something, or is this a push to only refine with intensities?

Regards,
Michael

*****************
From Scala (3.3.20)
Total number unique : 46813

From FREERFLAG
Number of Reflections = 47044 (including non-observed NaN reflections)

MTZ column labels
H K L FreeR_flag F_S13E_X2 SIGF_S13E_X2 DANO_S13E_X2 SIGDANO_S13E_X2 F_S13E_X2(+) SIGF_S13E_X2(+) F_S13E_X2(-) SIGF_S13E_X2(-) ISYM_S13E_X2 IMEAN_S13E_X2 SIGIMEAN_S13E_X2 I_S13E_X2(+) SIGI_S13E_X2(+) I_S13E_X2(-) SIGI_S13E_X2(-)
******************
In Phenix:

Xtriage output choosing "Imean"
Miller array info: /Users/garavito/ccp4_projects/sus1_S13/reflections_tmp.mtz:IMEAN_S13E_X2,SIGIMEAN_S13E_X2
Observation type: xray.amplitude
Type of data: double, size=46772
Type of sigmas: double, size=46772
Number of Miller indices: 46772
Anomalous flag: False

Xtriage output choosing "Fs"
Miller array info: /Users/garavito/ccp4_projects/sus1_S13/reflections_tmp.mtz:F_S13E_X2,SIGF_S13E_X2,DANO_S13E_X2,SIGDANO_S13E_X2
Observation type: xray.amplitude
Type of data: double, size=88486
Type of sigmas: double, size=88486
Number of Miller indices: 88486
Anomalous flag: True

****************************************************************
R. Michael Garavito, Ph.D.
Professor of Biochemistry & Molecular Biology
513 Biochemistry Bldg.
Michigan State University
East Lansing, MI 48824-1319
Office: (517) 355-9724 Lab: (517) 353-9125
FAX: (517) 353-9334 Email: [email protected]
****************************************************************
On Thu, Oct 20, 2011 at 12:14 PM, R.M. Garavito wrote:

However, in testing the upgraded CCP4 and Phenix on OS X 10.7.2 (Lion), I just noticed that the MTZ file is no longer read correctly. In the example below, scaling and merging non-anomalous data from the CCP4i GUI gives, at the end of FREERFLAG, some 47,000 reflections. Putting this file into xtriage or the reflection file editor in the Phenix GUI, one finds that the data for both the F and the DANO columns are used; selecting the Fs alone is impossible. The end result is a doubling of the reflection number and the anomalous flag being switched on. Processing with earlier versions did not do this except when actually using anomalous data. Why this has happened is not clear to me, but the column order in the MTZ file changed slightly between CCP4 6.1.13 and CCP4 6.2.0. Any ideas? Am I missing something, or is this a push to only refine with intensities?

It's actually more an inherent problem with representing data as merged amplitudes and anomalous differences (and how we handle them) - PHENIX does not use these data as such, and instead converts them into what Ralf calls "reconstructed" amplitudes, i.e. F(+) SIGF(+) F(-) SIGF(-). There is potential for loss of precision in this conversion, which is why we recommend against it (the Phaser-EP GUI will actually complain if you try to run it with those data). The reason this changed is probably that F SIGF are now adjacent to DANO SIGDANO in the MTZ file - if they were previously separated by other columns, they would not have been automatically combined into a single data array.

I'm not sure if this matters for Xtriage; it will still perform the analysis of anomalous signal based on the reconstructed Friedel pairs, but I don't know whether using separate F+/F- will affect the other analyses. It definitely makes a difference for the reflection file editor and any other program which outputs these data, because they will always end up as F(+) SIGF(+) F(-) SIGF(-). Unfortunately I didn't take into account that the reflection file editor was doing this until a few days ago, so in version 1.7.2 it will probably still label the columns F SIGF DANO SIGDANO (or the equivalent) by default. I think the latest nightly build should fix this bug (and several others).

This is probably not an ideal situation - it might make more sense to simply ignore DANO SIGDANO, just to avoid confusion, but I'm worried that users will complain about PHENIX not accepting their anomalous data.

-Nat
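To make the "reconstructed" amplitudes concrete, here is a minimal Python sketch of the arithmetic involved. It assumes the usual definitions F = (F(+) + F(-)) / 2 and DANO = F(+) - F(-). The function name and the sigma propagation (independent errors on F and DANO) are assumptions made for illustration, not necessarily what PHENIX does internally.

```python
import math


def reconstruct_friedel_pair(f_mean, sig_f, dano, sig_dano):
    """Return (F(+), SIGF(+), F(-), SIGF(-)) from F, SIGF, DANO, SIGDANO.

    Assumes F = (F(+) + F(-)) / 2 and DANO = F(+) - F(-).
    The sigma formula is an assumption: F and DANO are treated as
    independent, so both mates receive the same propagated sigma.
    """
    f_plus = f_mean + 0.5 * dano
    f_minus = f_mean - 0.5 * dano
    sigma = math.sqrt(sig_f ** 2 + (0.5 * sig_dano) ** 2)
    return f_plus, sigma, f_minus, sigma


# Hypothetical values for one reflection: F=100, SIGF=3, DANO=8, SIGDANO=8
print(reconstruct_friedel_pair(100.0, 3.0, 8.0, 8.0))
```

Whatever sigmas F(+) and F(-) had when they were measured, the reconstruction in this sketch can only return one shared value for both. That is one way to picture the loss of precision mentioned above.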
Dear Nat and Ralf,

I did notice that CCP4 6.2.0 outputs the MTZ file with this column order:

H K L FreeR_flag F_S13E_X2 SIGF_S13E_X2 DANO_S13E_X2 SIGDANO_S13E_X2 F_S13E_X2(+) SIGF_S13E_X2(+) F_S13E_X2(-) SIGF_S13E_X2(-) ISYM_S13E_X2 IMEAN_S13E_X2 SIGIMEAN_S13E_X2 I_S13E_X2(+) SIGI_S13E_X2(+) I_S13E_X2(-) SIGI_S13E_X2(-)

But CCP4 6.1.13 outputs the MTZ file with a slightly different column order:

H K L FreeR_flag F_S13E_X2 SIGF_S13E_X2 F_S13E_X2(+) SIGF_S13E_X2(+) F_S13E_X2(-) SIGF_S13E_X2(-) DANO_S13E_X2 SIGDANO_S13E_X2 IMEAN_S13E_X2 SIGIMEAN_S13E_X2 I_S13E_X2(+) SIGI_S13E_X2(+) I_S13E_X2(-) SIGI_S13E_X2(-)

Note that DANO_S13E_X2 and SIGDANO_S13E_X2 have shifted position and ISYM_S13E_X2 is now included.

Thanks for the quick response,
Michael

On Oct 20, 2011, at 3:30 PM, Nathaniel Echols wrote:
It's actually more an inherent problem with representing data as merged amplitudes and anomalous differences (and how we handle them) - PHENIX does not use these data as such, and instead converts them into what Ralf calls "reconstructed" amplitudes, i.e. F(+) SIGF(+) F(-) SIGF(-). There is potential for loss of precision in this conversion, which is why we recommend against it (the Phaser-EP GUI will actually complain if you try to run it with those data). The reason this changed is probably F SIGF being adjacent to DANO SIGDANO in the MTZ file - if they were previously separated by other columns, they would not be automatically combined into a single data array.
I'm not sure if this matters for Xtriage; it will still perform the analysis of anomalous signal based on the reconstructed Friedel pairs, but I don't know whether using separate F+/F- will affect the other analyses. It definitely makes a difference for the reflection file editor and any other program which outputs these data, because they will always end up as F(+) SIGF(+) F(-) SIGF(-). Unfortunately I didn't take into account that the reflection file editor was doing this until a few days ago, so in version 1.7.2 it will probably still label the columns F SIGF DANO SIGDANO (or the equivalent) by default. I think the latest nightly build should fix this bug (and several others).
This is probably not an ideal situation - it might make more sense to simply ignore DANO SIGDANO, just to avoid confusion, but I'm worried that users will complain about PHENIX not accepting their anomalous data.
-Nat
Hi Michael,
I did notice that CCP4 6.2.0 outputs the mtz file with this column order:
H K L FreeR_flag F_S13E_X2 SIGF_S13E_X2 DANO_S13E_X2 SIGDANO_S13E_X2 F_S13E_X2(+) SIGF_S13E_X2(+) F_S13E_X2(-) SIGF_S13E_X2(-) ISYM_S13E_X2 IMEAN_S13E_X2 SIGIMEAN_S13E_X2 I_S13E_X2(+) SIGI_S13E_X2(+) I_S13E_X2(-) SIGI_S13E_X2(-)
This new F,SIGF,DANO,SIGDANO order explains why the four MTZ columns are combined into an anomalous array. (The phenix tools have been doing this for many years.)

Is this an actual problem for you?

You could use phenix.reflection_file_convert --non-anomalous to obtain a non-anomalous array.

In phenix.refine you can say force_anomalous_flag_to_be_equal_to=False.

Ralf
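The effect of the column order can be illustrated with a small Python sketch that checks whether a merged F, SIGF pair is immediately followed by DANO, SIGDANO in each of the two label lists from this thread. The label-based test and the function names are assumptions made for illustration; the actual reader in the phenix tools is not reproduced here and may rely on other information, such as MTZ column types.

```python
ORDER_CCP4_6_2_0 = (
    "H K L FreeR_flag F_S13E_X2 SIGF_S13E_X2 DANO_S13E_X2 SIGDANO_S13E_X2 "
    "F_S13E_X2(+) SIGF_S13E_X2(+) F_S13E_X2(-) SIGF_S13E_X2(-) ISYM_S13E_X2 "
    "IMEAN_S13E_X2 SIGIMEAN_S13E_X2 I_S13E_X2(+) SIGI_S13E_X2(+) "
    "I_S13E_X2(-) SIGI_S13E_X2(-)"
).split()

ORDER_CCP4_6_1_13 = (
    "H K L FreeR_flag F_S13E_X2 SIGF_S13E_X2 F_S13E_X2(+) SIGF_S13E_X2(+) "
    "F_S13E_X2(-) SIGF_S13E_X2(-) DANO_S13E_X2 SIGDANO_S13E_X2 "
    "IMEAN_S13E_X2 SIGIMEAN_S13E_X2 I_S13E_X2(+) SIGI_S13E_X2(+) "
    "I_S13E_X2(-) SIGI_S13E_X2(-)"
).split()


def is_merged(label, prefix):
    """True for a merged column such as F_..., but not F_...(+) or F_...(-)."""
    return label.startswith(prefix) and not label.endswith(("(+)", "(-)"))


def f_adjacent_to_dano(labels):
    """True if a merged F, SIGF pair is directly followed by DANO, SIGDANO."""
    prefixes = ("F_", "SIGF_", "DANO_", "SIGDANO_")
    for i in range(len(labels) - 3):
        window = labels[i:i + 4]
        if all(is_merged(lab, pre) for lab, pre in zip(window, prefixes)):
            return True
    return False


print("CCP4 6.2.0 :", f_adjacent_to_dano(ORDER_CCP4_6_2_0))
print("CCP4 6.1.13:", f_adjacent_to_dano(ORDER_CCP4_6_1_13))
```

Under this heuristic only the CCP4 6.2.0 order has the four columns in sequence, which matches the explanation given above for why the anomalous array appears with the newer files.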
Ralf,

No, this is no real problem, as we have now solved it locally by deleting or rearranging columns in sftools. Using phenix.reflection_file_convert would work just as well.

The point is that input and output formats tend to be obscured for the GUI users (who tend to be the younger users). Having a Phenix page or wiki which briefly informs users about file handling and input/output data order would be helpful. Some of our efforts involve comparing CCP4 and Phenix results, which means that compatible MTZ files are required. The nuts and bolts of CCP4 tend to be a little more transparent than in Phenix, but I like the data handling organization in Phenix. As the two program suites diverge in philosophy and implementation, the differences will need to be highlighted for users who want to use both.

Thanks for the help.

Cheers,
Michael

On Oct 21, 2011, at 6:04 PM, Ralf Grosse-Kunstleve wrote:
Hi Michael,
I did notice that CCP4 6.2.0 outputs the mtz file with this column order:
H K L FreeR_flag F_S13E_X2 SIGF_S13E_X2 DANO_S13E_X2 SIGDANO_S13E_X2 F_S13E_X2(+) SIGF_S13E_X2(+) F_S13E_X2(-) SIGF_S13E_X2(-) ISYM_S13E_X2 IMEAN_S13E_X2 SIGIMEAN_S13E_X2 I_S13E_X2(+) SIGI_S13E_X2(+) I_S13E_X2(-) SIGI_S13E_X2(-)
This new F,SIGF,DANO,SIGDANO order explains why the four MTZ columns are combined into an anomalous array. (The phenix tools have been doing this for many years.)
Is this an actual problem for you?
You could use phenix.reflection_file_convert --non-anomalous to obtain a non-anomalous array.
In phenix.refine you can say force_anomalous_flag_to_be_equal_to=False.
Ralf
On Mon, Oct 24, 2011 at 5:36 AM, R.M. Garavito wrote:

No, this is no real problem, as we have now solved it locally by deleting or rearranging columns in sftools. Using phenix.reflection_file_convert would work just as well. The point is that input and output formats tend to be obscured for the GUI users (who tend to be the younger users). Having a Phenix page or wiki which briefly informs users about file handling and input/output data order would be helpful. Some of our efforts involve comparing CCP4 and Phenix results, which means that compatible MTZ files are required. The nuts and bolts of CCP4 tend to be a little more transparent than in Phenix, but I like the data handling organization in Phenix. As the two program suites diverge in philosophy and implementation, the differences will need to be highlighted for users who want to use both. Thanks for the help.

Agreed, this is confusing for us sometimes, and we wrote the code! For the editor, I have slightly expanded the documentation here:

https://www.phenix-online.org/version_docs/dev-891/reflection_file_editor.ht...

On reading it again, I think I need to clarify what constitutes an "array" in PHENIX, since this is usually *not* the same as a "column" in CCP4.

-Nat
Do you still have both Phenix versions (1.7.1 and 1.7.2)?
Could you run this command with both versions and send me the outputs?
iotbx.reflection_file_reader reflections_tmp.mtz
Could you do this with mtz files from both CCP4 versions?
Alternatively, if you send me the two mtz files I'll try to find out what
causes the difference in behavior.
Ralf
participants (3)
- Nathaniel Echols
- R.M. Garavito
- Ralf Grosse-Kunstleve