Bugzilla – Bug 1142164
LTO: petsc:gnu-openmpi3-hpc build hang
Last modified: 2019-07-19 16:39:07 UTC
Created attachment 810999
petsc_gnu_openmpi3_hpc_standard_x86_64_201907190841.log

The petsc:gnu-openmpi3-hpc build hangs with LTO enabled, as reported in
https://build.opensuse.org/project/show/openSUSE:Factory:Staging:adi:26

Extract from the first log:
===
[ 120s] TESTING: checkMPICHorOpenMPI from config.packages.MPI(/home/abuild/rpmbuild/BUILD/petsc-3.8.3/config/BuildSystem/config/packages/MPI.py:431)
[ 120s] TESTING: checkSharedLibrary from config.packages.MPI(/home/abuild/rpmbuild/BUILD/petsc-3.8.3/config/BuildSystem/config/packages/MPI.py:128)
[ 5530s] TESTING: configureMPIEXEC from config.packages.MPI(/home/abuild/rpmbuild/BUILD/petsc-3.8.3/config/BuildSystem/config/packages/MPI.py:141)qemu-system-x86_64: terminating on signal 15 from pid 12869 (<unknown process>)
===

If I disable LTO with "%define _lto_cflags %{nil}", the build no longer fails, as shown in the second attached log.
Created attachment 811000
petsc_gnu_openmpi3_hpc_standard_x86_64_201907191102.log

This second log is from a build without LTO; the build completed.
While the petsc:gnu-openmpi3-hpc build passed for x86_64 and i586 in my own adi:26 branch (1), it failed for i586 in my own science branch (2), both with LTO disabled :(
So I am not sure this workaround is sufficient.

(1) https://build.opensuse.org/package/show/home:michel_mno:branches:openSUSE:Factory:Staging:adi:26/petsc
(2) https://build.opensuse.org/package/show/home:michel_mno:branches:science/petsc
Michel, we see these transient build failures quite often with HPC packages. Usually they are due to memory starvation: a lot of OBS machines are quite small in memory size. Of course, the problem gets amplified when more cores are available. For trilinos we had to play some fun games with macros to limit the number of cores to 4.

The way to deal with this problem is to use constraints:
https://openbuildservice.org/help/manuals/obs-reference-guide/cha.obs.build_job_constraints.html

One way to get numbers to put in there is to wait for a build to succeed and then download the binaries. The _statistics file contains numbers which should guide you. I will raise the constraints to 6G for petsc.
The downside of this approach: the higher the constraint, the fewer machines are available that meet it.
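
For reference, a minimal sketch of what such a _constraints file could look like, following the format described in the OBS reference guide linked above. It assumes that the 6G figure refers to the memory constraint; the file actually committed for petsc may differ and may contain further entries.

```xml
<!-- _constraints: illustrative sketch only, not the file committed for petsc -->
<constraints>
  <hardware>
    <!-- Assumption: 6G is the minimum memory a build worker must offer -->
    <memory>
      <size unit="G">6</size>
    </memory>
  </hardware>
</constraints>
```

The file sits next to the spec file in the package sources. Only workers that satisfy every listed constraint are eligible to run the build.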
OK, the openmpi3-hpc variant has now built in the devel project, which it hadn't before. Closing.