Bug 1201273 - sundials fails to build on 1-core VM with openmpi4
Status: RESOLVED FIXED
Classification: openSUSE
Product: openSUSE Tumbleweed
Component: Other
Version: Current
Hardware: Other All
Priority: P5 - None
Severity: Normal
Assigned To: Klaus Kämpf
Reported: 2022-07-07 05:59 UTC by Bernhard Wiedemann
Modified: 2022-07-07 08:43 UTC (History)
Found By: Development


Description Bernhard Wiedemann 2022-07-07 05:59:20 UTC
While working on reproducible builds for openSUSE, I found that
our sundials package fails its tests in 3 of 6 configurations
when built on a 1-core VM.

To reproduce:
osc co openSUSE:Factory/sundials && cd $_
osc build -M mvapich2 --vm-type=kvm -j1 --clean --noservice standard
osc build -M openmpi3 --vm-type=kvm -j1 --clean --noservice standard
osc build -M openmpi4 --vm-type=kvm -j1 --clean --noservice standard


Actual results:
         15 - test_nvector_mpi_4_1000_0 (Failed)
         16 - test_nvector_mpimanyvector_parallel1_4_1000_200_0 (Failed)
         17 - test_nvector_mpimanyvector_parallel2_4_200_1000_0 (Failed)
         18 - test_nvector_mpiplusx_4_1000_9 (Failed)
         59 - test_sunlinsol_spgmr_parallel_100_1_1_50_1e-3_0 (Failed)
         60 - test_sunlinsol_spgmr_parallel_100_1_2_50_1e-3_0 (Failed)
         61 - test_sunlinsol_spgmr_parallel_100_2_1_50_1e-3_0 (Failed)
         62 - test_sunlinsol_spgmr_parallel_100_2_2_50_1e-3_0 (Failed)
         63 - test_sunlinsol_spfgmr_parallel_100_1_50_1e-3_0 (Failed)
         64 - test_sunlinsol_spfgmr_parallel_100_2_50_1e-3_0 (Failed)
         65 - test_sunlinsol_spbcgs_parallel_100_1_50_1e-3_0 (Failed)
         66 - test_sunlinsol_spbcgs_parallel_100_2_50_1e-3_0 (Failed)
         67 - test_sunlinsol_sptfqmr_parallel_100_1_50_1e-3_0 (Failed)
         68 - test_sunlinsol_sptfqmr_parallel_100_2_50_1e-3_0 (Failed)


Upstream is https://github.com/LLNL/sundials
Comment 1 Klaus Kämpf 2022-07-07 07:22:10 UTC
You're using it wrong! ;-)

15/76 Test #15: test_nvector_mpi_4_1000_0 ...........................***Failed    0.01 sec
[yoga:mpi_rank_0][MPIDI_CH3I_set_affinity] WARNING: You are running 4 MPI processes on a processor that supports up to 1 cores. If you still wish to run in oversubscribed mode, please set MV2_ENABLE_AFFINITY=0 and re-run the program.

Fatal error in MPI_Init:
Other MPI error, error stack:
MPIR_Init_thread(493)........: 
MPID_Init(400)...............: 
MPIDI_CH3I_set_affinity(3416): MV2_ENABLE_AFFINITY: oversubscribed cores.

[cli_0]: aborting job:
Fatal error in MPI_Init:
Other MPI error, error stack:
MPIR_Init_thread(493)........: 
MPID_Init(400)...............: 
MPIDI_CH3I_set_affinity(3416): MV2_ENABLE_AFFINITY: oversubscribed cores.
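
The abort above is MVAPICH2 refusing to pin 4 MPI processes onto 1 core. A minimal sketch of a pre-flight check for that condition follows. The rank count of 4 is read off the warning and the test names; the check itself and the use of `nproc` are assumptions for illustration and are not part of the sundials package.

```shell
# Hypothetical pre-flight check, not taken from the sundials spec file.
ranks=4            # MPI processes the failing tests start (from the warning above)
cores=$(nproc)     # cores visible to this process (coreutils)

if [ "$cores" -lt "$ranks" ]; then
    status="oversubscribed"
else
    status="ok"
fi
echo "ranks=$ranks cores=$cores status=$status"
```

On the 1-core VM from the report this would report the oversubscribed case.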
Comment 2 Klaus Kämpf 2022-07-07 08:02:42 UTC
Setting MV2_ENABLE_AFFINITY=0 in %check actually helps.
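
A rough sketch of what that could look like as a shell fragment inside %check. Only the variable name and value come from the MVAPICH2 warning; where exactly the line sits in the spec file, and the echo used to confirm it, are assumptions.

```shell
# Let MVAPICH2 start more MPI ranks than there are cores.
# Variable name and value are taken from the MVAPICH2 warning message.
export MV2_ENABLE_AFFINITY=0

# The package's existing test invocation would follow here, unchanged.
echo "MV2_ENABLE_AFFINITY=${MV2_ENABLE_AFFINITY}"
```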
Comment 3 Klaus Kämpf 2022-07-07 08:03:17 UTC
https://build.opensuse.org/request/show/987364
Comment 4 Klaus Kämpf 2022-07-07 08:43:39 UTC
While MV2_ENABLE_AFFINITY=0 is enough for mvapich2, openmpi{3,4} still fail in their "4 core" and "parallel" tests (for obvious reasons ;-).
I have now added a _constraints file instead, to make the CPU requirements explicit.
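
For readers unfamiliar with OBS build constraints, here is a sketch of what such a file might contain, written from the shell in the package checkout. The processor count of 4 is an assumption based on the 4 MPI processes the failing tests start; the file actually submitted is not shown in this report.

```shell
# Hypothetical _constraints content; the processor count is an assumption.
cat > _constraints <<'EOF'
<constraints>
  <hardware>
    <processors>4</processors>
  </hardware>
</constraints>
EOF
```

With a constraint like this, the build service would only schedule the package on workers that offer at least that many processors.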