[cctbxbb] pickling nan values

Billy Poon BKPoon at lbl.gov
Thu Jun 13 10:20:17 PDT 2019


Hi Rob,

We looked into this briefly, and it seems that the byte string interface
preserves NaN values. We're going to dig deeper into the serialization
code for a fix.

In the meantime, here is a code snippet that uses the byte string
interface. The byte string can be pickled, so a workaround for flex is
to convert the array to a byte string before pickling and to convert it
back after unpickling.

from __future__ import print_function

import pickle
from scitbx.array_family import flex

# NAN in flex.double
a = flex.random_double(5)
n = float('nan')
a[3] = n

# pickle test
p = pickle.dumps(a)
ap = pickle.loads(p)

# byte string test
b = a.copy_to_byte_str()
ab = flex.double_from_byte_str(b)

# pickled byte string test
pb = pickle.dumps(b)
apb = flex.double_from_byte_str(pickle.loads(pb))

# check arrays
print('Original:', list(a))
print('Pickle:  ', list(ap), 'Oh no!')
print('Bytes:   ', list(ab))
print('Bytes2:  ', list(apb))
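
To check the round trips programmatically rather than by eye, keep in
mind that NaN never compares equal to itself, so a plain element-wise
== test reports a mismatch even when the NaN was preserved. Below is a
minimal sketch of a NaN-aware comparison. It uses only the standard
library and plain Python lists, so it runs without cctbx. The helper
name same_with_nan and the sample values are made up for illustration
and are not part of flex.

```python
import math
import pickle


def same_with_nan(xs, ys):
    """Compare two float sequences element-wise, treating NaN as equal to NaN."""
    xs, ys = list(xs), list(ys)
    if len(xs) != len(ys):
        return False
    return all(
        (math.isnan(x) and math.isnan(y)) or x == y
        for x, y in zip(xs, ys)
    )


# Illustrative data: a plain list with one NaN entry.
original = [0.25, 0.5, float('nan'), 1.0]

# A plain list of floats keeps its NaN through pickle.
round_tripped = pickle.loads(pickle.dumps(original))

# What a lossy round trip would look like (NaN replaced by 0.0).
zeroed = [0.25, 0.5, 0.0, 1.0]

print('NaN kept:', same_with_nan(original, round_tripped))
print('NaN lost:', same_with_nan(original, zeroed))
```

With the flex arrays from the snippet above, the same helper could be
called as same_with_nan(a, ab) or same_with_nan(a, apb), since it
converts its arguments with list() first.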


--
Billy K. Poon
Research Scientist, Molecular Biophysics and Integrated Bioimaging
Lawrence Berkeley National Laboratory
1 Cyclotron Road, M/S 33R0345
Berkeley, CA 94720
Tel: (510) 486-5709
Fax: (510) 486-5909
Web: https://phenix-online.org


On Mon, Jun 10, 2019 at 7:10 AM Robert Oeffner <rdo20 at cam.ac.uk> wrote:

> Hi all,
>
> It seems that a flex.double array containing "nan" values doesn't
> survive pickling as in:
>
> oeffner at grove:~$ cctbx.python
> Python 2.7.15 (default, Sep 11 2018, 23:48:45)
> [GCC 4.4.7 20120313 (Red Hat 4.4.7-18)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>  >>>
>  >>> from cctbx.array_family import flex
>  >>> import pickle
>  >>> flex_nans = flex.double(3, float('nan'))
>  >>> list(flex_nans)
> [nan, nan, nan]
>  >>> list(pickle.loads( pickle.dumps(flex_nans)))
> [0.0, 0.0, 0.0]
>  >>>
>  >>>
>
> If the flex double is cast into a python list then it does survive
> pickling as in:
>
>  >>>
>  >>> list(pickle.loads( pickle.dumps(list( flex_nans ))))
> [nan, nan, nan]
>  >>>
>
> I would like to use nan values to indicate missing or invalid data
> values. Would anyone have a suggestion how to go about this?
>
>
> Many thanks,
>
>
> Rob
>
>
> --
> Robert Oeffner, Ph.D.
> Research Associate, The Read Group
> Department of Haematology,
> Cambridge Institute for Medical Research
> University of Cambridge
> Cambridge Biomedical Campus
> The Keith Peters Building
> Hills Road
> Cambridge CB2 0XY
>
> www.cimr.cam.ac.uk/investigators/read/index.html
> tel: +44(0)1223 763234