[cctbxbb] Race condition in bootstrap python installation

markus.gerstel at diamond.ac.uk
Wed May 27 08:28:34 PDT 2015


We have recently seen intermittent build failures with the bootstrap base installation at Diamond. They all occurred during the compilation of Python and were usually of the form:

: gcc -pthread -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -Wl,-rpath=\$ORIGIN/../lib Parser/acceler.o Parser/grammar1.o Parser/listnode.o Parser/node.o Parser/parser.o Parser/parsetok.o Parser/bitset.o Parser/metagrammar.o Parser/firstsets.o Parser/grammar.o Parser/pgen.o Objects/obmalloc.o Python/mysnprintf.o Python/pyctype.o Parser/tokenizer_pgen.o Parser/printgrammar.o Parser/pgenmain.o -lpthread -ldl  -lutil -o Parser/pgen
: Parser/node.o: file not recognized: File truncated
: collect2: ld returned 1 exit status
: make[1]: *** [Parser/pgen] Error 1
: make[1]: Leaving directory `/scratch/jenkins_slave/workspace/dials_bootstrap_platforms/compilationtarget/native/label/dials-ws154/build_dials/base_tmp/Python-2.7.8_cci'
: make: *** [Include/graminit.h] Error 2
: make: *** Waiting for unfinished jobs....
Traceback (most recent call last):
  File "modules/cctbx_project/libtbx/auto_build/install_base_packages.py", line 1194, in <module>
    installer(args=sys.argv, log=sys.stdout)
  File "modules/cctbx_project/libtbx/auto_build/install_base_packages.py", line 184, in __init__
    self.build_dependencies(packages=packages)
  File "modules/cctbx_project/libtbx/auto_build/install_base_packages.py", line 571, in build_dependencies
    getattr(self, 'build_%s'%i)()
  File "modules/cctbx_project/libtbx/auto_build/install_base_packages.py", line 622, in build_python
    self.call('make -j %s install'%(self.nproc), log=log, cwd=python_dir)
  File "modules/cctbx_project/libtbx/auto_build/install_base_packages.py", line 264, in call
    return call(args, log=log, verbose=self.verbose, **kwargs)
  File "/scratch/jenkins_slave/workspace/dials_bootstrap_platforms/compilationtarget/native/label/dials-ws154/build_dials/modules/cctbx_project/libtbx/auto_build/installer_utils.py", line 81, in call
    raise RuntimeError("Call to '%s' failed with exit code %d" % (args, rc))
RuntimeError: Call to 'make -j 4 install' failed with exit code 2

Other errors reported are
  /usr/bin/ld: final link failed: File truncated
or
  /usr/bin/ld: can not read symbols: File truncated

I suspect that these errors are manifestations of a race condition in the Python build process, which the bootstrap script now runs by default as a parallel make with -j 4. A truncated Parser/node.o at link time is what one would expect if the link step for Parser/pgen started while another job was still writing that object file.
To test this, we are now using --nproc=1 for the base installation step. We have not seen these build failures since.
I suggest that we never run the Python compilation with parallel make. Any other ideas?
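
As a rough sketch of what I mean (not a patch against install_base_packages.py): the build step could pick its -j value per package and force 1 for Python, while leaving the other packages parallel. The names SERIAL_ONLY_PACKAGES, make_command and run_make are made up for illustration and do not exist in libtbx.

```python
import subprocess

# Hypothetical list of packages whose makefiles we do not trust with -j > 1.
SERIAL_ONLY_PACKAGES = {"python"}


def make_command(package, nproc, target="install"):
    """Build the make invocation for a package, forcing -j 1 where needed."""
    if package in SERIAL_ONLY_PACKAGES:
        jobs = 1
    else:
        jobs = max(1, int(nproc))
    return ["make", "-j", str(jobs), target]


def run_make(package, nproc, cwd):
    """Run make in cwd; raises CalledProcessError on a non-zero exit code."""
    subprocess.check_call(make_command(package, nproc), cwd=cwd)
```

With this, make_command("python", 4) gives a serial make regardless of the requested nproc, so only the Python step gets slower.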

-Markus

