[cctbxbb] Fwd: [Dials-support] dials.stills_process crash report

Aaron Brewster asbrewster at lbl.gov
Wed Dec 13 08:52:52 PST 2017


Thanks Rob.  I've switched dials.stills_process to using
easy_mp.multi_core_run.  I tested raising an error in one of the processes
and confirmed the program finishes, and that I can print out the error
message raised in the process.  This is great.
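
For anyone reading this in the archive, here is a minimal sketch of that kind of check using only the Python standard library. It is not the libtbx code and does not reproduce the easy_mp.multi_core_run interface. The names process_frame, safe_call and run_all are hypothetical, and frame 3 simply stands in for a problematic image. The idea is that the child formats its own traceback and returns it as text, so the parent can finish the run and still print what went wrong.

```python
import multiprocessing as mp
import traceback


def process_frame(frame_id):
    """Hypothetical per-frame job; frame 3 stands in for a problematic image."""
    if frame_id == 3:
        raise ValueError("bad frame %d" % frame_id)
    return frame_id * 2


def safe_call(frame_id):
    """Run the job in the child and return (argument, result, error_text)."""
    try:
        return frame_id, process_frame(frame_id), None
    except Exception:
        # Format the traceback here, in the child, where the stack still exists.
        return frame_id, None, traceback.format_exc()


def run_all(frame_ids, nproc=2):
    """Map safe_call over frame_ids in a pool of child processes."""
    # Assumption: the "fork" start method keeps the sketch self-contained;
    # it is not available on Windows.
    ctx = mp.get_context("fork")
    with ctx.Pool(processes=nproc) as pool:
        return pool.map(safe_call, frame_ids)


if __name__ == "__main__":
    for frame_id, result, err in run_all(range(5)):
        if err is None:
            print("frame %d -> %r" % (frame_id, result))
        else:
            print("frame %d failed:\n%s" % (frame_id, err))
```

Note that this only helps when the child raises a Python exception. A child that is killed from outside never reaches the except clause, so there is no traceback to return.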

Thanks,
-Aaron

On Wed, Dec 13, 2017 at 4:13 AM, R.D. Oeffner <rdo20 at cam.ac.uk> wrote:

> Hi Graeme,
>
> As far as I can tell from stills_process.py, you're using
> easy_mp.parallel_map() for your multiprocessing. You should use
> easy_mp.multi_core_run() instead. It is an alternative to
> easy_mp.parallel_map().
> See https://www.phenix-online.org/newsletter/CCN_2017_01.pdf#page=6
>
> One of its features is that it preserves stacktraces from individual child
> processes should they crash, which is helpful for debugging. If a child
> process crashes in parallel_map(), however, you will not get a stacktrace
> from the offending child, which makes correcting the misbehaviour more
> difficult.
>
> Rob
>
>
>
> On 13/12/2017 09:34, Graeme.Winter at diamond.ac.uk wrote:
>
>> Folks
>>
>> Once again, the lack of stack traces in easy_mp is a big annoyance.
>>
>> I forget the reasons why printing the actual error was a bad thing,
>> but I do remember a discussion some 18 months ago.
>>
>> Not wishing to replay the entire transaction, but could we revisit this?
>>
>> Thanks Graeme
>>
>> Begin forwarded message:
>>
>> From: Danny.Axford at Diamond.ac.uk
>> Subject: [Dials-support] dials.stills_process crash report
>> Date: 12 December 2017 at 17:43:41 GMT
>> To: dials-support at lists.sourceforge.net
>> Cc: martin.appleby at diamond.ac.uk
>>
>> Perhaps this means something to someone. This is the SACLA data again;
>> it seems to run happily and then hit a problematic frame. Martin and I
>> see the error at pretty much the same point in the dataset. We were
>> both running on local machines rather than the cluster.
>>
>> Cheers,
>> Danny
>>
>>
>>
>> Detector 1 RMSDs by panel:
>> ----------------------------------------------------
>> | Panel | Nref | RMSD_X  | RMSD_Y  | RMSD_DeltaPsi |
>> | id    |      | (px)    | (px)    | (deg)         |
>> ----------------------------------------------------
>> | 1     | 89   | 0.37442 | 0.48949 | 0.2192        |
>> | 2     | 69   | 0.49627 | 0.4847  | 0.21385       |
>> | 5     | 90   | 0.48267 | 0.59909 | 0.22839       |
>> | 6     | 70   | 0.6094  | 0.55592 | 0.20944       |
>> ----------------------------------------------------
>> Traceback (most recent call last):
>>   File "/dls/science/groups/scisoft/DIALS/CD/latest/dials-dev20171212/build/../modules/dials/command_line/stills_process.py", line 870, in <module>
>>     halraiser(e)
>>   File "/dls/science/groups/scisoft/DIALS/CD/latest/dials-dev20171212/build/../modules/dials/command_line/stills_process.py", line 868, in <module>
>>     script.run()
>>   File "/dls/science/groups/scisoft/DIALS/CD/latest/dials-dev20171212/build/../modules/dials/command_line/stills_process.py", line 383, in run
>>     preserve_exception_message=True)
>>   File "/dls/science/groups/scisoft/DIALS/CD/latest/dials-dev20171212/modules/cctbx_project/libtbx/easy_mp.py", line 623, in parallel_map
>>     result = res()
>>   File "/dls/science/groups/scisoft/DIALS/CD/latest/dials-dev20171212/modules/cctbx_project/libtbx/scheduling/result.py", line 119, in __call__
>>     self.traceback( exception = self.exception() )
>>   File "/dls/science/groups/scisoft/DIALS/CD/latest/dials-dev20171212/modules/cctbx_project/libtbx/scheduling/stacktrace.py", line 86, in __call__
>>     raise exception
>> RuntimeError: Please report this error to dials-support at lists.sourceforge.net: exit code = -9
>>
>>
>>
>>
>> _______________________________________________
>> DIALS-support mailing list
>> DIALS-support at lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/dials-support
>>
>>
>> _______________________________________________
>> cctbxbb mailing list
>> cctbxbb at phenix-online.org
>> http://phenix-online.org/mailman/listinfo/cctbxbb
>>
>
>

