Reflection file woes phenix/ccp4/PDB
Y'all,
While depositing a structure, the PDB 'complained' about my structure-factor file, saying it doesn't contain R-free flags or is otherwise incomplete. The comment was prompted by their Refmac R-factor test failing.
The structure-factor file I submitted came from phenix.refine (xxx_data.mtz). I am using version 1.7.3-928.
Indeed, using this file in CCP4 fails with the message: "CCP4MTZfile: Mtz column type mismatch: I-obs(+) K-J CCP4MTZfile: Mtz column type mismatch: SIGI-obs(+) M-Q"
The file contains anomalous data. I have no problems with files that do not contain anomalous data.
Needless to say, Phenix uses the structure-factor file just fine when reading it back into, say, its validation utility.
When I use iotbx.reflection_file_reader, I cannot detect any anomalies.
I am at a loss as to how to resolve this issue. I would like to deposit the data as prepared by Phenix so that the PDB tools can use them.
Has anybody run into this issue before? Any ideas for how to resolve it?
Many thanks in advance!
MM
On Thu, 2011-12-15 at 16:40 +0000, Machius, Mischa Christian wrote:
Indeed, using this file in CCP4 fails with the message: "CCP4MTZfile: Mtz column type mismatch: I-obs(+) K-J CCP4MTZfile: Mtz column type mismatch: SIGI-obs(+) M-Q"
According to this http://www.ccp4.ac.uk/html/mtzformat.html#coltypes the I-obs(+) column should be of type K, and SIGI-obs(+) of type M. It's possible that phenix does not follow that convention.
As for R-free flags, did you generate them in phenix? If so, the refmac run with default parameters that the PDB does during validation would refine against only about 5% of the reflections, because the test set is flagged 1 in phenix (test_value=1) but 0 in refmac.
I guess the general way to avoid such a calamity is to generate the test set with freerflag initially. At this point, you can definitely use something like sftools to reset the flags so that your model passes PDB validation with flying colors.
Cheers, Ed.
-- "I'd jump in myself, if I weren't so good at whistling." Julian, King of Lemurs
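To make the error message easier to read, here is a minimal sketch that decodes it using the one-letter column type codes from the CCP4 MTZ format page linked above. The function name `explain_mismatches` is made up for illustration, and the reading of "K-J" as "type stored in the file, then type the program asked for" is an assumption (it does agree with the phenix.mtz.dump listing later in this thread, which reports K and M for these columns).

```python
import re

# One-letter MTZ column type codes (subset), per the CCP4 MTZ format notes.
MTZ_COLUMN_TYPES = {
    "H": "index h,k,l",
    "J": "intensity",
    "F": "structure amplitude F",
    "D": "anomalous difference",
    "Q": "standard deviation of J, F, D or other",
    "G": "F(+) or F(-)",
    "L": "standard deviation of G",
    "K": "I(+) or I(-)",
    "M": "standard deviation of K",
    "I": "integer",
}

# Matches e.g. "Mtz column type mismatch: I-obs(+) K-J"
MISMATCH = re.compile(r"Mtz column type mismatch: (\S+) ([A-Z])-([A-Z])")


def explain_mismatches(message):
    """Return (label, type_in_file, type_requested) for each mismatch.

    Assumption: in "K-J" the first letter is the type stored in the file
    and the second is the type the reading program requested.
    """
    results = []
    for label, found, wanted in MISMATCH.findall(message):
        results.append((
            label,
            MTZ_COLUMN_TYPES.get(found, "unknown"),
            MTZ_COLUMN_TYPES.get(wanted, "unknown"),
        ))
    return results


message = ("CCP4MTZfile: Mtz column type mismatch: I-obs(+) K-J "
           "CCP4MTZfile: Mtz column type mismatch: SIGI-obs(+) M-Q")

for label, found, wanted in explain_mismatches(message):
    print(f"{label}: file has '{found}', program asked for '{wanted}'")
```

Read this way, the CCP4 program was asked for a plain intensity column (J) and its sigma (Q) but was pointed at anomalous columns (K and M), which would fit the observation that only files with anomalous data fail.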
On Thu, Dec 15, 2011 at 9:05 AM, Ed Pozharski
As for R-free flags, did you generate them in phenix? If so, the refmac run with default parameters that pdb does during validation would run against about 5% of reflections (in phenix test_value=1, in refmac it's zero).
I thought the PDB finally fixed this? Their R-factor validation is nearly useless anyway, but this seems like such an obvious and easy mistake to avoid.
I guess general way to avoid such calamity is to generate test set with freeflag initially. At this point, you can definitely use something like sftools to reset the flags so that your model passes pdb validation with flying colors.
Does freerflag take into account the possibility of twinning? FYI, you can also use the reflection file editor in the GUI to convert the flags to something CCP4 can understand, with the caveat that any flags other than '0' in the output file are not guaranteed to be evenly distributed or reproducible. -Nat
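The flag mismatch discussed above can be illustrated with a small pure-Python sketch. It is not what sftools or the reflection file editor actually do; the helper names and the toy flag list are hypothetical, and it only assumes the two conventions stated in this thread (test set flagged 1 in phenix, 0 in refmac).

```python
def to_ccp4_style(flags, test_value=1):
    """Map flags so that 0 marks the test set and 1 marks the working set."""
    return [0 if flag == test_value else 1 for flag in flags]


def fraction_flagged(flags, value):
    """Fraction of reflections whose flag equals `value`."""
    return sum(1 for flag in flags if flag == value) / len(flags)


# Hypothetical toy data: 20 reflections, one of them (5%) in the test set,
# flagged the phenix way (1 = test, 0 = work).
phenix_flags = [0] * 19 + [1]

# Read correctly (test_value=1), 5% of the reflections are free.
print(fraction_flagged(phenix_flags, 1))

# Misread with 0 as the test value, 95% are treated as free, so the
# refinement would run against only the remaining 5%.
print(fraction_flagged(phenix_flags, 0))

# After conversion, a program that treats 0 as free sees the intended 5%.
ccp4_flags = to_ccp4_style(phenix_flags)
print(fraction_flagged(ccp4_flags, 0))
```

Note that this conversion puts every working reflection in a single set labelled 1, which is in line with the caveat above that flags other than '0' are not guaranteed to be evenly distributed.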
On Thu, 2011-12-15 at 10:21 -0800, Nathaniel Echols wrote:
I thought the PDB finally fixed this? Their R-factor validation is nearly useless anyway, but this seems like such an obvious and easy mistake to avoid.
The refmac-based validation that the PDB runs is not without problems. For instance, I recently had about a 7% jump in R values upon deposition, which turned out to be due to fixed TLS from the header being applied to a file which (per PDB requirements) had the full B-factors listed already. However, I disagree that it is useless. It surely does guard against accidentally depositing the wrong dataset. It also warns you if you have a poor electron density match for heteroatoms, which does help to keep "creative" electron density interpretation in check.
Does freerflag take into account the possibility of twinning?
Not to my knowledge. -- Oh, suddenly throwing a giraffe into a volcano to make water is crazy? Julian, King of Lemurs
Hi Mischa,
It may be helpful to share the header and first several lines of the
.mtz. There might be something weird about the column labels that's
causing the problem.
-bob
_______________________________________________
phenixbb mailing list
[email protected]
http://phenix-online.org/mailman/listinfo/phenixbb
Hi Mischa,
The comment was prompted by their Refmac R-factor test failing.
I guess this is the root of the problem in the first place. It is strange to test a file's content with a refinement program. When I was depositing a structure a while ago it also failed that test, obviously because that program does not have a neutron scattering dictionary.
For PDB deposition you can use either of two files created by phenix.refine: xxx_data.mtz or the one that has Fmodel and map coefficients. Both files contain the original reflection data (Iobs or Fobs, whatever the input was) and R-free flags. It is very unlikely that there is a bug in either of these files, since they have been so heavily used in so many places for such a long time...
Also, you can run "phenix.mtz.dump file.mtz" to see the content of file.mtz.
Anyway, if you suspect a problem, please send me the file and I will quickly investigate.
Pavel
Hi Mischa,
thanks for sending the data file.
1) The file passes this simple sanity check:
python run.py
where run.py contains the code enclosed between "******":
******
from iotbx import reflection_file_reader

def run():
  reflection_file = reflection_file_reader.any_reflection_file(
    file_name="data.mtz")
  miller_arrays = reflection_file.as_miller_arrays()
  cntr = 0
  for ma1 in miller_arrays:
    # pick the array whose label mentions "free" (the R-free flags)
    if(ma1.info().labels[0].count("free")):
      # flags must be strictly 0 or 1 ...
      assert ma1.data().count(1)+ma1.data().count(0) == ma1.data().size()
      # ... and the test set (1) must be smaller than the working set (0)
      assert ma1.data().count(1) < ma1.data().count(0)
      cntr+=1
      # every array must share the Miller indices of the flag array
      for ma2 in miller_arrays:
        assert ma1.indices().all_eq(ma2.indices())
  # exactly one R-free flag array is expected
  assert cntr == 1

if (__name__ == "__main__"):
  run()
******
2) I don't see any anomalies in the output of the "phenix.mtz.dump data.mtz" command:
Processing: data.mtz
Title: /Users/machius/Work/_Projects/Kuhlman/bd2/03-ReflectionFiles/bd2-h35e-
Space group symbol from file: P212121
Space group number from file: 19
Space group from matrices: P 21 21 21 (No. 19)
Point group symbol from file: 222
Number of crystals: 2
Number of Miller indices: 38549
Resolution range: 26.345 1.15068
History:
Crystal 1:
  Name: HKL_base
  Project: HKL_base
  Id: 0
  Unit cell: (37.241, 46.528, 62.277, 90, 90, 90)
  Number of datasets: 1
  Dataset 1:
    Name: HKL_base
    Id: 0
    Wavelength: 0
    Number of columns: 0
Crystal 2:
  Name: crystal
  Project: project
  Id: 2
  Unit cell: (37.241, 46.528, 62.277, 90, 90, 90)
  Number of datasets: 1
  Dataset 1:
    Name: dataset
    Id: 1
    Wavelength: 1
    Number of columns: 9
    label            #valid   %valid      min        max  type
    H                 38549  100.00%     0.00      32.00  H: index h,k,l
    K                 38549  100.00%     0.00      40.00  H: index h,k,l
    L                 38549  100.00%     0.00      54.00  H: index h,k,l
    I-obs(+)          38249   99.22%  -120.50  106929.00  K: I(+) or I(-)
    SIGI-obs(+)       38249   99.22%     1.00    3348.00  M: standard deviation
    I-obs(-)          34269   88.90%  -221.00  100983.00  K: I(+) or I(-)
    SIGI-obs(-)       34269   88.90%     1.90    3140.00  M: standard deviation
    R-free-flags(+)   38249   99.22%     0.00       1.00  I: integer
    R-free-flags(-)   34269   88.90%     0.00       1.00  I: integer
So I would say everything is ok with your data file, and you should probably talk to the PDB people to resolve the issue they have with your file.
Pavel
participants (5)
- Ed Pozharski
- Machius, Mischa Christian
- Nathaniel Echols
- Pavel Afonine
- Robert Immormino