[cctbxbb] [Cctbx-cvs] SF.net SVN: cctbx: trunk/libtbx/env_config.py
bkpoon at lbl.gov
Thu Sep 8 11:45:01 PDT 2016
There is an issue with non-ASCII paths (unicode type) and basic Python
functions if the locale (like 'C') does not support UTF-8. Without UTF-8
support, these functions try to convert the unicode type into a str type
with the 'ascii' encoding, which triggers a UnicodeEncodeError. I attached
a script that tests it. With libtbx.python, the unicode path should fail
before my change and pass after it. Alternatively, change the LC_ALL setting
in the build/bin/libtbx.python dispatcher (if the en_US locale is
available, en_US will fail and en_US.UTF-8 will work).
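To illustrate the failure mode described above, here is a minimal sketch. It is not the attached test script, and the path is a made-up example. It only shows that converting a non-ASCII unicode string with the 'ascii' codec raises UnicodeEncodeError, while 'utf-8' succeeds; the implicit conversion inside the os.path functions under a non-UTF-8 locale is assumed to fail the same way.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

# Hypothetical non-ASCII path, used only for illustration.
path = u"/tmp/donn\u00e9es/file.txt"

def try_encode(text, encoding):
    """Return the encoded bytes, or None if the codec cannot represent text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None

# 'ascii' cannot represent U+00E9, so the conversion fails.
ascii_result = try_encode(path, "ascii")
# 'utf-8' can represent any unicode character, so the conversion succeeds.
utf8_result = try_encode(path, "utf-8")

print("ascii:", ascii_result)
print("utf-8:", utf8_result)
```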
An additional wrinkle is that LC_ALL=C works fine on my Mac (OS X 10.10.5).
Also, there is a "C.UTF-8" locale on Ubuntu, but not on CentOS.
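Since the set of installed locales differs between systems, one way to check what a given machine offers is to try each candidate with locale.setlocale, which raises locale.Error for a locale that is not available. This is only a sketch; the helper name available_locales and the candidate list are assumptions for illustration, not part of libtbx.

```python
from __future__ import print_function
import locale

def available_locales(candidates):
    """Return the candidates that locale.setlocale accepts on this system."""
    # Remember the current setting so it can be restored afterwards.
    saved = locale.setlocale(locale.LC_ALL)
    found = []
    try:
        for name in candidates:
            try:
                locale.setlocale(locale.LC_ALL, name)
                found.append(name)
            except locale.Error:
                # Locale is not installed (or not recognized) here.
                pass
    finally:
        locale.setlocale(locale.LC_ALL, saved)
    return found

# Hypothetical candidate list; results depend on the system.
print(available_locales(["C", "C.UTF-8", "en_US", "en_US.UTF-8"]))
```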
Basically, to support non-ASCII paths (unicode type) in basic Python
functions, any locale with UTF-8 or utf8 will work. The en_US part is not
What are the errors that you get? I ran the regression tests for dials
(libtbx.run_tests_parallel module=dials) and dials_regression
(module=dials_regression) and everything passes except for one test in
dials_regression (dials_regression/test.py). But the error seems to be
about a goniometer object. Do you have the en_US locale installed?
Right now, I'm just checking if LC_ALL is set in the user environment and
using that if it has the extra UTF-8 part. I can also check the LANG
environment variable. That might work better for users who do not have
the en_US locale installed.
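A rough sketch of that selection logic, extended with the LANG fallback, could look like the following. The LC_ALL check and the en_US.UTF-8 default follow the commit quoted below; the LANG fallback is only the idea proposed above and is not in the committed code, and the names has_utf8 and choose_lc_all are hypothetical.

```python
from __future__ import print_function
import os

DEFAULT_LOCALE = "en_US.UTF-8"  # default used in the commit

def has_utf8(name):
    """True if the locale name carries a UTF-8 suffix in either spelling."""
    return name is not None and ("UTF-8" in name or "utf8" in name)

def choose_lc_all(environ=None):
    """Pick the LC_ALL value to write into a dispatcher.

    Checks LC_ALL first, then LANG (an assumed extension), and falls
    back to DEFAULT_LOCALE if neither names a UTF-8 locale.
    """
    if environ is None:
        environ = os.environ
    for variable in ("LC_ALL", "LANG"):
        value = environ.get(variable)
        if has_utf8(value):
            return value
    return DEFAULT_LOCALE

print(choose_lc_all())
```

Passing a plain dict instead of os.environ makes the behavior easy to check, e.g. choose_lc_all({"LC_ALL": "C", "LANG": "en_GB.UTF-8"}) returns "en_GB.UTF-8".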
Billy K. Poon
Research Scientist, Molecular Biophysics and Integrated Bioimaging
Lawrence Berkeley National Laboratory
1 Cyclotron Road, M/S 33R0345
Berkeley, CA 94720
Tel: (510) 486-5709
Fax: (510) 486-5909
On Thu, Sep 8, 2016 at 2:26 AM, <markus.gerstel at diamond.ac.uk> wrote:
> I just spent some time tracking software crashes to this change. Is
> setting the default to en_US really appropriate and what we want?
> In particular it affects the output of downstream, external software we
> run from within python.
> What is the unicode issue you hint at in the commit message?
> Dr Markus Gerstel MBCS
> Postdoctoral Research Associate
> Tel: +44 1235 778698
> Diamond Light Source Ltd.
> Diamond House
> Harwell Science & Innovation Campus
> OX11 0DE
> -----Original Message-----
> From: bkpoon at users.sourceforge.net [mailto:bkpoon at users.sourceforge.net]
> Sent: 07 September 2016 00:54
> To: cctbx-cvs at lists.sourceforge.net
> Subject: [Cctbx-cvs] SF.net SVN: cctbx: trunk/libtbx/env_config.py
> Revision: 25333
> Author: bkpoon
> Date: 2016-09-06 23:54:29 +0000 (Tue, 06 Sep 2016)
> Log Message:
> Unicode support: set LC_ALL in dispatchers to the one in the user's
> environment (if available, and supports UTF-8), otherwise use the default
> setting of en_US.UTF-8; fixes unicode issue with python in Linux (e.g.
> os.path functions do not work correctly with unicode if LC_ALL=C)
> Modified Paths:
> Modified: trunk/libtbx/env_config.py
> --- trunk/libtbx/env_config.py 2016-09-06 21:15:34 UTC (rev 25332)
> +++ trunk/libtbx/env_config.py 2016-09-06 23:54:29 UTC (rev 25333)
> @@ -945,6 +945,15 @@
> def write_bin_sh_dispatcher(self,
> source_file, target_file, source_is_python_exe=False):
> + # determine LC_ALL from environment (Python UTF-8 compatibility in
> + LC_ALL = os.environ.get('LC_ALL') # user setting
> + if (LC_ALL is not None):
> + if ( ('UTF-8' not in LC_ALL) and ('utf8' not in LC_ALL) ):
> + LC_ALL = None
> + if (LC_ALL is None):
> + LC_ALL = 'en_US.UTF-8' # default
> f = target_file.open("w")
> if (source_file is not None):
> print >> f, '#! /bin/sh'
> @@ -975,7 +984,7 @@
> print >> f, '#'
> print >> f, _SHELLREALPATH_CODE
> print >> f, 'unset PYTHONHOME'
> - print >> f, 'LC_ALL=C'
> + print >> f, 'LC_ALL=' + LC_ALL
> print >> f, 'export LC_ALL'
> print >> f, 'LIBTBX_BUILD="$(shellrealpath "$0" && cd "$(dirname
> "$RESULT")/.." && pwd)"'
> print >> f, 'export LIBTBX_BUILD'
A non-text attachment was scrubbed...
Size: 501 bytes
Desc: not available