[cctbxbb] Race condition in bootstrap python installation

Nigel Moriarty nwmoriarty at lbl.gov
Wed May 27 09:20:13 PDT 2015


Markus

I have also seen the "make" error here on occasion. It works fine if I
force a build, so I agree with the race condition analysis. However, do we
have to drop to 1 when 2 or even 3 may give the same end result without
delaying the builds as much? Either way, we should make the default 1, and
then each site can use whichever value works best there.
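
Something along these lines might do it. This is only a rough sketch, not
the existing install_base_packages.py code: the helper name
python_make_command and the CCTBX_PYTHON_NPROC environment variable are
made up for illustration. It defaults the python build to a single make job
and lets a site override that value.

```python
import os


def python_make_command(site_jobs=None, default_jobs=1,
                        env_var="CCTBX_PYTHON_NPROC"):
    """Build the make command line for the python compilation step.

    site_jobs    -- explicit job count chosen by the site, or None
    default_jobs -- fallback job count (1 = no parallel make)
    env_var      -- hypothetical environment variable for a site override
    """
    if site_jobs is None:
        # No explicit value given: use the site override if set,
        # otherwise fall back to the serial default.
        site_jobs = os.environ.get(env_var, default_jobs)
    jobs = int(site_jobs)
    if jobs < 1:
        raise ValueError("number of make jobs must be at least 1")
    return "make -j %d install" % jobs
```

With nothing set this gives 'make -j 1 install'; a site that finds 2 or 3
jobs reliable could pass site_jobs=2 or set the environment variable.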

Cheers

Nigel

---
Nigel W. Moriarty
Building 64R0246B, Physical Biosciences Division
Lawrence Berkeley National Laboratory
Berkeley, CA 94720-8235
Phone : 510-486-5709     Email : NWMoriarty at LBL.gov
Fax   : 510-486-5909       Web  : CCI.LBL.gov

On Wed, May 27, 2015 at 8:28 AM, <markus.gerstel at diamond.ac.uk> wrote:

>  We have recently seen intermittent build failures with the bootstrap
> base installation at Diamond. These all occurred during the compilation of
> python and were usually of the form
>
> : gcc -pthread -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes
> -Wl,-rpath=\$ORIGIN/../lib Parser/acceler.o Parser/grammar1.o
> Parser/listnode.o Parser/node.o Parser/parser.o Parser/parsetok.o
> Parser/bitset.o Parser/metagrammar.o Parser/firstsets.o Parser/grammar.o
> Parser/pgen.o Objects/obmalloc.o Python/mysnprintf.o Python/pyctype.o
> Parser/tokenizer_pgen.o Parser/printgrammar.o Parser/pgenmain.o -lpthread
> -ldl  -lutil -o Parser/pgen
> : Parser/node.o: file not recognized: File truncated
> : collect2: ld returned 1 exit status
> : make[1]: *** [Parser/pgen] Error 1
> : make[1]: Leaving directory `/scratch/jenkins_slave/workspace/dials_bootstrap_platforms/compilationtarget/native/label/dials-ws154/build_dials/base_tmp/Python-2.7.8_cci'
> : make: *** [Include/graminit.h] Error 2
> : make: *** Waiting for unfinished jobs....
>
> Traceback (most recent call last):
>   File "modules/cctbx_project/libtbx/auto_build/install_base_packages.py", line 1194, in <module>
>     installer(args=sys.argv, log=sys.stdout)
>   File "modules/cctbx_project/libtbx/auto_build/install_base_packages.py", line 184, in __init__
>     self.build_dependencies(packages=packages)
>   File "modules/cctbx_project/libtbx/auto_build/install_base_packages.py", line 571, in build_dependencies
>     getattr(self, 'build_%s'%i)()
>   File "modules/cctbx_project/libtbx/auto_build/install_base_packages.py", line 622, in build_python
>     self.call('make -j %s install'%(self.nproc), log=log, cwd=python_dir)
>   File "modules/cctbx_project/libtbx/auto_build/install_base_packages.py", line 264, in call
>     return call(args, log=log, verbose=self.verbose, **kwargs)
>   File "/scratch/jenkins_slave/workspace/dials_bootstrap_platforms/compilationtarget/native/label/dials-ws154/build_dials/modules/cctbx_project/libtbx/auto_build/installer_utils.py", line 81, in call
>     raise RuntimeError("Call to '%s' failed with exit code %d" % (args, rc))
> RuntimeError: Call to 'make -j 4 install' failed with exit code 2
>
> Other indicated errors are
>
>   /usr/bin/ld: final link failed: File truncated
>
> or
>
>   /usr/bin/ld: can not read symbols: File truncated
>
> I suspect that these errors are manifestations of a race condition in the
> python build process, which the bootstrap script now runs with a parallel
> make (*-j 4*) by default.
>
> To test this we are now using *--nproc=1* for the base installation step.
> We haven’t seen these build failures since.
>
> I suggest we set the python compilation to never run with parallel make.
> Any other ideas?
>
>
>
> -Markus