Bug 1077218 - llvm5: build often stalls obs workers
Status: RESOLVED FIXED
Classification: openSUSE
Product: openSUSE Tumbleweed
Component: Other
Version: Current
Hardware: Other Other
Priority: P5 - None
Severity: Normal
Target Milestone: ---
Assigned To: Michal Srb
Reported: 2018-01-23 12:35 UTC by Dominique Leuenberger
Modified: 2018-05-25 09:19 UTC (History)

Description Dominique Leuenberger 2018-01-23 12:35:49 UTC
Quite often, I see llvm5 builds failing because the workers apparently stall.

OBS then detects this (after 8 hours of inactivity) and marks the job as 'failed'.

I've seen the stalls in various locations, like for example:

[ 4817s] -- Installing: /home/abuild/rpmbuild/BUILDROOT/llvm5-5.0.1-3.2.x86_64/usr/bin/llvm-pdbutil
[ 4817s] -- Installing: /home/abuild/rpmbuild/BUILDROOT/llvm5-5.0.1-3.2.x86_64/usr/bin/llvm-readobj
[ 4817s] Creating llvm-readelf
[ 4817s] -- Installing: /home/abuild/rpmbuild/BUILDROOT/llvm5-5.0.1-3.2.x86_64/usr/bin/llvm-rtdyld
[ 4817s] -- Installing: /home/abuild/rpmbuild/BUILDROOT/llvm5-5.0.1-3.2.x86_64/usr/lib64/libLLVM.so.5.0.1

(The build had been going on for '4 hours' at that point.)
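
As a rough illustration of how such a stall shows up in the log, the sketch below reads the elapsed-time prefix from lines like the excerpt above and compares it with the time the job has been running. It assumes the bracketed number is seconds since the build started; the function names and the reuse of the 8-hour threshold are illustrative and not taken from OBS itself.

```python
import re

# Matches the elapsed-time prefix seen in the log excerpt, e.g. "[ 4817s] ...".
# Assumption: the number is seconds elapsed since the build started.
LOG_PREFIX = re.compile(r"^\[\s*(\d+)s\]")


def last_elapsed_seconds(log_lines):
    """Return the elapsed seconds of the last timestamped line, or None."""
    last = None
    for line in log_lines:
        match = LOG_PREFIX.match(line)
        if match:
            last = int(match.group(1))
    return last


def looks_stalled(log_lines, seconds_since_start, threshold_seconds=8 * 3600):
    """Hypothetical check: True if the log has been silent for the threshold."""
    last = last_elapsed_seconds(log_lines)
    if last is None:
        return False
    return seconds_since_start - last >= threshold_seconds
```

For the excerpt above, the last prefix is 4817s while the job had been running for about 4 hours (14400s), so roughly 9583s had passed without output: still short of the 8-hour limit, but already a long silence.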

Often it stalls during the rpm write process or during debuginfo extraction.

It is bad enough that llvm5 takes that long to build, but it is even worse if it takes more than one attempt to get going.
Comment 1 Dominique Leuenberger 2018-01-23 12:36:30 UTC
The stalls seem to correlate with the lamb77 and lamb78 workers; the cumulus workers do not seem to suffer from it.
Comment 2 Adrian Schröter 2018-01-23 13:56:12 UTC
They ran out of disk space; I have now reduced them from 3 to 2 parallel instances.
Comment 3 Michal Srb 2018-01-24 08:47:21 UTC
Does that mean LLVM needed more than the 50 GB of disk space required by _constraints (i.e. something I should fix in llvm)? Or did the worker actually not have the required space?
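
One quick way to tell the two cases apart on a worker would be to compare the free space on the build filesystem with the 50 GB figure. The following is only a minimal sketch: the helper names are hypothetical, the default path is a placeholder, and it assumes the figure is meant in GiB.

```python
import shutil

REQUIRED_GB = 50  # disk figure quoted above; treated as GiB here (assumption)


def has_required_space(free_bytes, required_gb=REQUIRED_GB):
    """Return True if free_bytes covers the required size in GiB."""
    return free_bytes >= required_gb * 1024**3


def check_path(path="."):
    """Hypothetical helper: check free space on the filesystem holding path."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return has_required_space(usage.free)
```

If this check fails before the build starts, the worker did not have the required space; if it passes and the build still runs out, the requirement itself is too low.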

Note that the llvm disk usage situation will get a bit better soon since I'll have to get rid of static libraries because of bug 1065464.
Comment 4 Michal Srb 2018-05-25 09:19:34 UTC
This bug just got reassigned to me. Many optimizations have been made to the llvm build since it was opened. The build is now faster, needs less disk space, and no longer seems to stall on OBS workers, so I am closing this as fixed.