Bug 1110294

Summary: vpp-18.07.1 build -j1 fails
Product: [openSUSE] openSUSE Tumbleweed
Reporter: Bernhard Wiedemann <bwiedemann>
Component: Development
Assignee: Marco Varlese <marco.varlese>
Status: RESOLVED FIXED
QA Contact: E-mail List <qa-bugs>
Severity: Normal
Priority: P5 - None
CC: jengelh, nirmoy.das
Version: Current
Target Milestone: ---
Hardware: Other
OS: openSUSE Factory
Whiteboard:
Found By: Development
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Attachments: WIP incomplete patch

Description Bernhard Wiedemann 2018-10-01 08:48:16 UTC
While working on reproducible builds for openSUSE, I found that
Factory vpp does not build with -j1


Steps To Reproduce:
osc co openSUSE:Factory/vpp && cd $_
osc build --no-service -j1

Actual Results:
Build fails with
  CC       vnet/bier/bier_disp_table.lo
  CC       vnet/bier/bier_bift_table.lo
  CCLD     libvnet.la
libtool:   error: cannot find the library 'libvnet_avx2.la' or unhandled argument 'libvnet_avx2.la'
make[4]: *** [Makefile:6349: libvnet.la] Error 1
make[4]: Leaving directory '/home/abuild/rpmbuild/BUILD/vpp-18.07.1/build-root/build-vpp-native/vpp'


This error goes away with

Index: vpp-18.07.1/src/vnet.am
===================================================================
--- vpp-18.07.1.orig/src/vnet.am
+++ vpp-18.07.1/src/vnet.am
@@ -1295,6 +1295,7 @@ libvnet_avx2_la_CFLAGS =                  \
        -DCLIB_MARCH_VARIANT=avx2
 noinst_LTLIBRARIES += libvnet_avx2.la
 libvnet_la_LIBADD += libvnet_avx2.la
+libvnet_la_DEPENDENCIES += libvnet_avx2.la
 endif

 if CC_SUPPORTS_AVX512
@@ -1307,6 +1308,7 @@ libvnet_avx512_la_CFLAGS =                        \
        -DCLIB_MARCH_VARIANT=avx512
 noinst_LTLIBRARIES += libvnet_avx512.la
 libvnet_la_LIBADD += libvnet_avx512.la
+libvnet_la_DEPENDENCIES += libvnet_avx512.la
 endif
 endif


But then a different, related error pops up
about doubly defined symbols from avx512.


vnet.am seems to be dropped from vpp in
commit 855e26868ff8b9e6d00ca4d69ce6c9fdc0f2e121
Author: Damjan Marion <damarion@cisco.com>
Date:   Fri Aug 24 13:37:45 2018 +0200

    Switch to cmake
    
    Change-Id: I982b69390c55b5ffbd744f355efc0aaf425b360c
    Signed-off-by: Damjan Marion <damarion@cisco.com>

so maybe it is easier to upgrade first and then re-try?
Comment 1 Marco Varlese 2018-10-01 08:59:35 UTC
(In reply to Bernhard Wiedemann from comment #0)

So, the latest stable version is 18.07.1.
We cannot update to anything beyond that because of cross-dependencies (e.g. OpenvSwitch) on DPDK: VPP uses DPDK, and the only common version we can use right now is 18.02.

Besides the actual bug, may I ask why you need to run the build with -j1?
With parallel builds VPP already takes 20-25 minutes to build... :(
Comment 2 Swamp Workflow Management 2018-10-01 16:20:06 UTC
This is an autogenerated message for OBS integration:
This bug (1110294) was mentioned in
https://build.opensuse.org/request/show/639426 Factory / vpp
Comment 3 Bernhard Wiedemann 2018-10-02 09:17:52 UTC
Created attachment 784868 [details]
WIP incomplete patch
Comment 4 Swamp Workflow Management 2018-10-02 12:00:07 UTC
This is an autogenerated message for OBS integration:
This bug (1110294) was mentioned in
https://build.opensuse.org/request/show/639575 Factory / vpp
Comment 5 Jan Engelhardt 2018-10-02 16:24:52 UTC
The patch currently posted to openSUSE:Factory makes absolutely no sense whatsoever.

>I found that Factory vpp does not build with -j1
>CCLD     libvnet.la
>libtool:   error: cannot find the library 'libvnet_avx2.la' or unhandled argument 'libvnet_avx2.la'

1. I cannot reproduce this. After branching network/vpp into home:jengelh:branches:network/vpp and commenting out the %patch2 call, the package *still* builds.

The only way I can raise the error is by forcing libvnet_avx2.la out of libvnet_la_LIBADD and into libvnet_la_DEPENDENCIES --- kind of what part of your patch does. So the patch actually broke stuff.

>Error goes away with
>+libvnet_la_DEPENDENCIES += libvnet_avx2.la

2. Such lines should not be needed: all words in the libvnet_la_LIBADD variable are implicitly counted as dependencies already if the word refers to a make target.
Comment 6 Marco Varlese 2018-10-02 16:59:07 UTC
(In reply to Jan Engelhardt from comment #5)

To hit the issue reported by Bernhard, you have to build using 1 core and 1 thread. You do not say how you built the package from network, but by default the build uses all available cores, and that does _NOT_ show the issue.
If you took a look at the patch upstream you would see that both the _avx2 and _avx512 libraries are actually part of the main libvnet.la
Interestingly, my patch was also merged upstream after _ALL_ tests passed for VPP integration...
Comment 7 Marco Varlese 2018-10-02 17:06:11 UTC
I wrongly said "libraries" in my previous statement
<<If you took a look at the patch upstream you would see that both the _avx2 and _avx512 libraries are actually part of the main libvnet.la>>

That should have been "symbols"...
Comment 8 Marco Varlese 2018-10-02 17:09:03 UTC
(In reply to Jan Engelhardt from comment #5)

Another interesting thing: you say that my patch breaks things, but how come the package actually builds successfully even in OBS? The issue you face is a compile-time error, so it should show up in OBS as well, right?
Comment 9 Jan Engelhardt 2018-10-02 17:42:56 UTC
Gee, thanks for pointing out vpp is a silly piece that ignores the job count inherited from rpmbuild, and - now that I can reproduce that error - also broke dependencies bigtime.

Despite

 libvnet_la_LIBADD += libvnet_avx2.la

the make call does not build avx2 first.

$ make libvnet.la
  CCLD     libvnet.la
libtool:   error: cannot find the library 'libvnet_avx2.la' or unhandled argument 'libvnet_avx2.la'

So that's a dead giveaway.
Find my patch in SR 639646.
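
One way to check an observation like this is to compare what the generated Makefile records as prerequisites of the library with what it passes to the linker. The sketch below is illustrative only. The helper name show_deps is hypothetical, and it assumes the generated Makefile contains plain assignments of the form "libvnet_la_DEPENDENCIES = ..." and "libvnet_la_LIBADD = ...", possibly continued with backslashes. It runs against a small made-up sample, not the real vpp build tree, so it does not establish what actually happens in vpp.

```shell
#!/bin/sh
# show_deps VAR FILE: print the words assigned to VAR in FILE, one per line.
# Hypothetical helper; handles "VAR = a b \" continuation lines, but not
# "+=" appends or $(...) expansion.
show_deps() {
    awk -v var="$1" '
        # Start collecting at "VAR =" (optional blanks before "=").
        !collecting && ($0 ~ ("^" var "[ \t]*=")) {
            sub("^" var "[ \t]*=", "")
            collecting = 1
        }
        collecting {
            more = sub(/\\$/, "")      # trailing backslash: value continues
            n = split($0, words, /[ \t]+/)
            for (i = 1; i <= n; i++)
                if (words[i] != "") print words[i]
            if (!more) exit
        }
    ' "$2"
}

# Sample input standing in for a generated Makefile (not taken from vpp).
sample=$(mktemp)
cat > "$sample" <<'EOF'
libvnet_la_DEPENDENCIES = libvppinfra.la \
	libsvm.la
libvnet_la_LIBADD = libvppinfra.la libsvm.la libvnet_avx2.la
EOF

deps=$(show_deps libvnet_la_DEPENDENCIES "$sample")
libadd=$(show_deps libvnet_la_LIBADD "$sample")
rm -f "$sample"

# Collect every LIBADD word that is not also listed as a dependency.
missing=""
for lib in $libadd; do
    case " $(echo $deps) " in
        *" $lib "*) ;;
        *) missing="$missing $lib" ;;
    esac
done
echo "not listed as dependency:${missing:- none}"
```

In the sample, libvnet_avx2.la appears in the link arguments but not among the prerequisites, so make would have no reason to build it before linking libvnet.la. A mismatch of that kind would be consistent with the error above.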
Comment 10 Marco Varlese 2018-10-03 06:45:02 UTC
(In reply to Jan Engelhardt from comment #9)

Your SR is not building successfully.
Comment 11 Marco Varlese 2018-10-03 06:47:17 UTC
(In reply to Jan Engelhardt from comment #9)
> Gee, thanks for pointing out vpp is a silly piece that ignores the job count
> inherited from rpmbuild, and - now that I can reproduce that error - also
> broke dependencies bigtime.
Well, not sure if "silly" is an appropriate word for it.
Its build system simply tries to be smart enough to detect how many cores are available. Having said that, the same build system lets the end user change that via the environment variable MAKE_PARALLEL_JOBS.
> 
> Despite
> 
>  libvnet_la_LIBADD += libvnet_avx2.la
> 
> the make call does not build avx2 first.
> 
> $ make libvnet.la
>   CCLD     libvnet.la
> libtool:   error: cannot find the library 'libvnet_avx2.la' or unhandled
> argument 'libvnet_avx2.la'
> 
> So that's a dead give away.
> Find my patch in SR 639646.
Comment 12 Jan Engelhardt 2018-10-03 14:41:37 UTC
>Well, not sure if "silly" is an appropriate word for it.

I think it fits because it goes against established practices: a user requesting execution with `make -j7` won't get 7 but some other number in most deployed environments.
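
As an illustration of that practice, a build wrapper can let an explicit request win and fall back to core detection only when nothing was requested. This is a sketch under stated assumptions, not vpp's actual logic. The function name pick_jobs is hypothetical, and only the variable name MAKE_PARALLEL_JOBS comes from comment 11; how vpp itself reads it is not shown in this report. The same idea would apply to a -j value handed down from rpmbuild.

```shell
#!/bin/sh
# pick_jobs: print the number of parallel make jobs to use.
# Order of preference (an assumption made for this sketch):
#   1. an explicit positive integer in MAKE_PARALLEL_JOBS
#   2. the number of online processors, if it can be detected
#   3. 1 as a conservative fallback
pick_jobs() {
    case "${MAKE_PARALLEL_JOBS:-}" in
        ''|*[!0-9]*) ;;                 # unset, empty or not a number
        *) if [ "$MAKE_PARALLEL_JOBS" -gt 0 ]; then
               echo "$MAKE_PARALLEL_JOBS"
               return
           fi ;;
    esac
    detected=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || detected=""
    case "$detected" in
        ''|*[!0-9]*) echo 1 ;;
        *) echo "$detected" ;;
    esac
}

# An explicit request is honoured as-is, including a serial build.
MAKE_PARALLEL_JOBS=1
echo "requested 1 -> $(pick_jobs)"
MAKE_PARALLEL_JOBS=7
echo "requested 7 -> $(pick_jobs)"

# With no request, fall back to detection.
unset MAKE_PARALLEL_JOBS
echo "no request  -> $(pick_jobs)"
```

The point of the ordering is that a request for 7 jobs yields 7 and a request for 1 yields a serial build, regardless of how many cores the build host has.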
Comment 13 Bernhard Wiedemann 2018-10-08 14:23:47 UTC
Build with kvm and -j1 is now working.
You may open another bug for the stupid auto-detection of available cores.