[cctbxbb] Trapping NaN or not trapping NaN

Nicholas Sauter nksauter at lbl.gov
Wed Apr 13 11:28:15 PDT 2016


OK, code is deleted.

Nick

Nicholas K. Sauter, Ph. D.
Computer Staff Scientist, Molecular Biophysics and Integrated Bioimaging
Division
Lawrence Berkeley National Laboratory
1 Cyclotron Rd., Bldg. 33R0345
Berkeley, CA 94720
(510) 486-5713

On Wed, Apr 13, 2016 at 10:18 AM, Marcin Wojdyr <wojdyr at gmail.com> wrote:

> btw in env_config.py where you check if traps should be set you have:
> if (libtbx_cvs_root.lower().find("ccp") >= 0):
> which apparently was kindly added for CCP4 in 2005:
>
> https://github.com/xia2/cctbx/commit/79e5e718121855f5f663bdba570d6868b48ef199
>
> It can be safely removed, CCP4 is not using CVS now.
> Cheers
> Marcin
>
> On Wed, Apr 13, 2016 at 5:17 PM, Luc Bourhis <luc_j_bourhis at mac.com>
> wrote:
> > So, shall we make a move then?
> >
> > Richard, does disabling NaN trapping solve the problems for dials that
> you mentioned the other day? Basically, the question is what problems are
> left after disabling trapping so as to know which commits to revert.
> >
> >
> >> On 11 Apr 2016, at 10:11, Graeme.Winter at Diamond.ac.uk wrote:
> >>
> >> Luc
> >>
> >> Agree - compiler writers know more than I do about making fast code :)
> >>
> >> This trapping is a real pain and probably was useful back when SGI's
> and Alphas roamed the earth, but today I think this is just unhelpful - I
> vote for retiring this
> >>
> >> Will also help with graphics programs etc too! Often get crashes in
> matplotlib from this...
> >>
> >> Thanks Graeme
> >>
> >> -----Original Message-----
> >> From: cctbxbb-bounces at phenix-online.org [mailto:
> cctbxbb-bounces at phenix-online.org] On Behalf Of Luc Bourhis
> >> Sent: 11 April 2016 09:07
> >> To: cctbx mailing list
> >> Subject: Re: [cctbxbb] Trapping NaN or not trapping NaN
> >>
> >> Hi Paul,
> >>
> >>> On 11 Apr 2016, at 05:07, Paul Adams <pdadams at lbl.gov> wrote:
> >>>
> >>>
> >>> Okay, this sounds like a reasonable solution under the circumstances.
> Do you anticipate any problems with removing the trap? I only have the
> vaguest recollection of why it was in there, other than to make sure the
> code didn’t have errors that created NaN.
> >>
> >>
> >> First, let me remind we trap three floating point exceptions:
> division-by-zero, overflow (i.e. infinity), and NaN. And that we don’t do
> that for releases: only for development. The only point of trapping,
> really, is that the code fails as close to the origin of the problem as
> possible. Indeed if a NaN, or an +/- infinity was to make its way to our
> tests, we could catch it: one test would fail. Especially true for NaN,
> since if a is NaN then a > b, a <= b, etc are all false, for any value of
> b. But then we see a NaN or a +/- Infinity as the end result of what might
> be a long series of code, going through several layers of the cctbx and
> then the scitbx for example: it might be difficult to find where the
> problem actually occurs.
> >>
> >> The problem is mitigated by the fine granularity of our tests: if e.g.
> a division-by-zero occurs in matrix inversion in the scitbx and makes
> infinities appear in a cctbx test, I would say that it is most likely the
> scitbx test for matrix inversion would have caught it too. But there is
> still the risk of a hard to chase bug. But then the developer investigating
> the bug can switch trapping on again. With the caveat that he may then
> trigger spurious bugs like the one we are discussing. Considering all those
> remarks, I’d say it is a good as it gets.
> >>
> >> Thinking some more about it, the problem spreads to all 3 floating
> point exceptions. Indeed the problem we have now for NaN could appear for
> the other traps too. The following code:
> >>
> >> double x = x != 0 ? 1/x : 0;
> >>
> >> could be optimised by the compiler so as to compute 1/x regardless of
> the value of x and then pick either the result of that or 0 depending on
> the test x != 0. Since we trap division by 0, we would crash although the
> code returns a perfectly valid value. Same with the following code:
> >>
> >> double y = x > too_big ? plateau : std::exp(x);
> >>
> >> The compiler may decide to always evaluate “plateau” and std::exp(x) in
> parallel and then select the right result. Then we may get +infinity if x
> is too big.
> >>
> >> But let’s be conservative: we have only seen issues with NaN’s, so
> let’s disable that trap only for now, keeping in mind we may need to remove
> the others too.
> >>
> >>> I’m hoping that compilers may have moved on in the last few years…
> >>
> >> You have to realise that it is what we are doing that is esoteric.
> Trapping floating-point exceptions is not standard at all and the like of
> Boost certainly don’t routinely test in an environment with such traps.
> There are high-profile codes out there that do take advantage of infinities
> and division-by-zero to produce accurate results fast. Boost implementation
> of special functions and probability distribution thrives to be compliant
> with IEEE 754 and therefore gracefully handle infinities and zeroes for
> example.
> >>
> >> Best wishes,
> >>
> >> Luc
> >>
> >>
> >> _______________________________________________
> >> cctbxbb mailing list
> >> cctbxbb at phenix-online.org
> >> http://phenix-online.org/mailman/listinfo/cctbxbb
> >>
> >> --
> >> This e-mail and any attachments may contain confidential, copyright and
> or privileged material, and are for the use of the intended addressee only.
> If you are not the intended addressee or an authorised recipient of the
> addressee please notify us of receipt by returning the e-mail and do not
> use, copy, retain, distribute or disclose the information in or attached to
> the e-mail.
> >> Any opinions expressed within this e-mail are those of the individual
> and not necessarily of Diamond Light Source Ltd.
> >> Diamond Light Source Ltd. cannot guarantee that this e-mail or any
> attachments are free from viruses and we cannot accept liability for any
> damage which you may sustain as a result of software viruses which may be
> transmitted in or with the message.
> >> Diamond Light Source Limited (company no. 4375679). Registered in
> England and Wales with its registered office at Diamond House, Harwell
> Science and Innovation Campus, Didcot, Oxfordshire, OX11 0DE, United Kingdom
> >>
> >> _______________________________________________
> >> cctbxbb mailing list
> >> cctbxbb at phenix-online.org
> >> http://phenix-online.org/mailman/listinfo/cctbxbb
> >
> >
> > _______________________________________________
> > cctbxbb mailing list
> > cctbxbb at phenix-online.org
> > http://phenix-online.org/mailman/listinfo/cctbxbb
>
> _______________________________________________
> cctbxbb mailing list
> cctbxbb at phenix-online.org
> http://phenix-online.org/mailman/listinfo/cctbxbb
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://phenix-online.org/pipermail/cctbxbb/attachments/20160413/33376e27/attachment-0001.htm>


More information about the cctbxbb mailing list