I'm trying to run phenix.cif_as_mtz on a large number of PDB entries, and several entries give me an assertion error. The example below is from 8icz, but the error always seems to be the same.

Traceback (most recent call last):
  File "/media/funky/phenix-1.8-1069/build/intel-linux-2.6/../../cctbx_project/mmtbx/command_line/cif_as_mtz.py", line 544, in <module>
    run(sys.argv[1:])
  File "/media/funky/phenix-1.8-1069/build/intel-linux-2.6/../../cctbx_project/mmtbx/command_line/cif_as_mtz.py", line 135, in run
    incompatible_flags_to_work_set=command_line.options.incompatible_flags_to_work_set)
  File "/media/funky/phenix-1.8-1069/build/intel-linux-2.6/../../cctbx_project/mmtbx/command_line/cif_as_mtz.py", line 197, in process_files
    r_free_flags = r_free_flags).mtz_object()
  File "/media/funky/phenix-1.8-1069/cctbx_project/mmtbx/utils.py", line 2001, in __init__
    skip_twin_detection = True)
  File "/media/funky/phenix-1.8-1069/cctbx_project/mmtbx/utils.py", line 2113, in get_r_factor
    twin_laws = twin_laws)
  File "/media/funky/phenix-1.8-1069/cctbx_project/mmtbx/utils.py", line 1556, in fmodel_simple
    assert f_obs.sys_absent_flags().data().count(True)==0

Any ideas? Thanks. :)

Eric Williams
PhD candidate
Intelligent Systems Program
University of Pittsburgh
On Wed, Jun 13, 2012 at 7:36 AM, Eric Williams wrote:
> I'm trying to run phenix.cif_as_mtz on a large number of PDB entries, and several entries give me an assertion error. The example below is from 8icz, but the error always seems to be the same.

This is a bug (which one of us will fix ASAP), but if you omit the --use-model argument, it should avoid the step that's crashing. I haven't come across too many entries in the PDB that actually require this, especially among newer structures.

-Nat
Thanks. Good to know it's not just me. :)
How might I know which structures do and don't require --use-model? I'm
running cif_as_mtz on the whole PDB so I can then run model_vs_data. If I
know when to use that switch and when not to, I can mend my script
accordingly.
Thanks. :)
Eric
_______________________________________________
phenixbb mailing list
[email protected]
http://phenix-online.org/mailman/listinfo/phenixbb
On Wed, Jun 13, 2012 at 8:05 AM, Eric Williams wrote:
> How might I know which structures do and don't require --use-model? I'm running cif_as_mtz on the whole PDB so I can then run model_vs_data. If I know when to use that switch and when not to, I can mend my script accordingly.

Pavel might know this, since he's the person who normally runs this. Normally, I would also suggest that you take advantage of a resource in Phenix:

$PHENIX/chem_data/polygon_data/all_mvd.pickle

which contains model_vs_data results for all applicable PDB entries. However, this has not been updated since at least last August, and in the meantime, there have been thousands of new entries added to the PDB, plus Pavel recently changed the bulk solvent correction and scaling procedure, which tends to result in slightly lower R-factors, so I think it's officially obsolete right now.

As it happens, phenix.cif_as_mtz appears to offer an additional way around the crash, which is the argument "--remove_systematic_absences". Try adding this to your existing script (keeping "--use_model") and see if that fixes the problem. Just to be safe, I would also add "--map_to_asu" (which will move reflections around so the h,k,l indices are in the canonical setting) and "--merge" (which will merge non-unique reflections - keeping Friedel pairs separate, of course). Looks like we need to update our documentation.

-Nat
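For a batch script, the workaround Nat describes could be applied roughly as follows. This is only a sketch: the three flag spellings are copied verbatim from the message above, while the helper names (with_workaround, run_entry) and the idea of passing your existing argument list through unchanged are assumptions for illustration. The thread does not show how --use_model receives its model file, so the sketch does not try to construct that argument.

```python
import subprocess

# Flags quoted from Nat's message; spelling copied verbatim from the thread.
WORKAROUND_FLAGS = ("--remove_systematic_absences", "--map_to_asu", "--merge")


def with_workaround(existing_args):
    """Return a copy of an existing argument list with each workaround flag added once."""
    cmd = list(existing_args)
    for flag in WORKAROUND_FLAGS:
        if flag not in cmd:
            cmd.append(flag)
    return cmd


def run_entry(existing_args):
    """Run one conversion; return (exit code, stderr text) so failures can be inspected later."""
    proc = subprocess.run(with_workaround(existing_args),
                          capture_output=True, text=True)
    return proc.returncode, proc.stderr
```

Here existing_args would be whatever the script already passes, starting with "phenix.cif_as_mtz" and the structure-factor file. Keeping the stderr text per entry makes it easier to see afterwards which entries still fail and why.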
The work-around seems to work...er...around. Thanks. :)
Eric
Hi,

> Normally, I would also suggest that you take advantage of a resource in Phenix:
> $PHENIX/chem_data/polygon_data/all_mvd.pickle
> which contains model_vs_data results for all applicable PDB entries. However, this has not been updated since at least last August,

I will try to find time to update it next week. Since phenix.cif_as_mtz extracts all data arrays now (thanks Richard!), I will have to change my scripts to check these arrays and choose the one that was actually used to produce the final deposited structure.

> and in the meantime, there have been thousands of new entries added to the PDB, plus Pavel recently changed the bulk solvent correction and scaling procedure, which tends to result in slightly lower R-factors, so I think it's officially obsolete right now.

I wouldn't say it's obsolete; it just reflects the state as of August last year, and that's 50+k entries, which is good enough for statistical exploration. But of course it's not suitable if you want to look at entries from the past year.

Pavel
Hi Eric,

data sets in the PDB are not super clean. For example:
- a whole lot of entries are labeled as intensities but are in fact amplitudes, and the same is true the other way around;
- some neutron structures are annotated as X-ray;
- the worst case (I forgot the PDB id) is a neutron structure annotated as X-ray with intensities labeled as amplitudes.

So the --use-model option tries to deconvolute these cases simply by trying all possibilities and scoring the results by R-factor. This makes phenix.cif_as_mtz run a bit longer. If you really know what you are doing, then not using --use-model is probably totally OK.

Pavel
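To make Pavel's description of --use-model concrete, here is a toy sketch of "try all possibilities and score by R-factor". It is not the mmtbx implementation: the function names, the single overall scale factor, and the idea of passing pre-computed model amplitudes per scattering table are all assumptions made for illustration. The R-factor used is the conventional R = sum(| |Fobs| - k|Fcalc| |) / sum(|Fobs|).

```python
import math
from itertools import product


def r_factor(f_obs, f_calc):
    """Conventional R-factor with one overall scale k (a simplifying assumption)."""
    k = sum(f_obs) / sum(f_calc)
    return sum(abs(fo - k * fc) for fo, fc in zip(f_obs, f_calc)) / sum(f_obs)


def guess_data_type(values, f_calc_by_table):
    """Try every (data label, scattering table) pair and keep the lowest R-factor.

    values          -- deposited numbers whose label may be wrong
    f_calc_by_table -- hypothetical dict: table name -> model amplitudes
    Returns (r, label, table) for the best-scoring interpretation.
    """
    best = None
    for label, table in product(("amplitude", "intensity"), sorted(f_calc_by_table)):
        if label == "amplitude":
            f_obs = list(values)
        else:
            # Treat the values as intensities: |F| = sqrt(I), clamping negatives to zero.
            f_obs = [math.sqrt(max(v, 0.0)) for v in values]
        r = r_factor(f_obs, f_calc_by_table[table])
        if best is None or r < best[0]:
            best = (r, label, table)
    return best
```

With four interpretations per data array (two labels times two tables), each needing its own R-factor calculation, it is easy to see why the option makes the run somewhat longer.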
Might this also be a bug? When processing PDB entry 4a0e, I get
"CifParserError: Wrong number of data items for loop containing
_symmetry_equiv.id".
Eric
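Since a whole-PDB run can hit more than one kind of failure (the assertion above and this CifParserError so far), it may help to group failed entries by exception name before reporting them. The sketch below assumes the script has saved each entry's stderr text; the function names and the rule "last line that starts with a name ending in Error or Exception" are illustrative assumptions, not part of Phenix.

```python
import re
from collections import Counter

# Matches a leading exception name such as "AssertionError" or "pkg.SomeError".
_EXC_NAME = re.compile(r"^([A-Za-z_][\w.]*(?:Error|Exception))\b")


def classify(stderr_text):
    """Return the exception name from the last matching line, or "unknown"."""
    lines = [ln.strip() for ln in stderr_text.splitlines() if ln.strip()]
    for ln in reversed(lines):
        match = _EXC_NAME.match(ln)
        if match:
            return match.group(1)
    return "unknown"


def summarize(stderr_by_entry):
    """Count failures per exception name; stderr_by_entry maps PDB id -> stderr text."""
    return Counter(classify(text) for text in stderr_by_entry.values())
```

A summary like this makes it quicker to tell a new bug from one that has already been reported on the list.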
On Thu, Jun 14, 2012 at 9:13 AM, Eric Williams wrote:
> Might this also be a bug? When processing PDB entry 4a0e, I get "CifParserError: Wrong number of data items for loop containing _symmetry_equiv.id".
Richard's reply got swallowed (wrong email address!), but here is the answer:
---------- Forwarded message ----------
From: Richard Gildea
participants (3): Eric Williams, Nathaniel Echols, Pavel Afonine