Bug 1197780 - rpm varies from parallelism
Summary: rpm varies from parallelism
Status: RESOLVED FIXED
Alias: None
Product: openSUSE Tumbleweed
Classification: openSUSE
Component: Development
Version: Current
Hardware: Other Other
Priority: P5 - None, Severity: Normal
Target Milestone: ---
Assignee: Dirk Mueller
QA Contact: E-mail List
 
Reported: 2022-03-31 02:37 UTC by Bernhard Wiedemann
Modified: 2023-04-06 10:21 UTC

Description Bernhard Wiedemann 2022-03-31 02:37:06 UTC
Today I observed a new class of issues that affects at least
adinatha-fonts arabic-fonts gdouros-akkadian-fonts icingaweb2 solaar xsane

I did not re-test all packages.
There are unaffected packages.
Affected package sizes range from 1MB to 5MB.

I made a minimal reproducer in
https://build.opensuse.org/package/show/home:bmwiedemann:reproducible:test/adinatha-fonts

Building it with 
for N in 1 2 ; do
 osc build --noservice --clean --vm-type=kvm -j$N \
 --keep-pkgs=pkgs.$N openSUSE_Tumbleweed x86_64 ; done
causes it to vary

It probably has to do with

filterdiff printrpmtags pkgs.[12]/adinatha-fonts-1.0-0.noarch.rpm
 LONGARCHIVESIZE=2146852
 LONGFILESIZES=0
-LONGSIGSIZE=1897928
+LONGSIGSIZE=1897768
 LONGSIZE=2146214


What recent change could cause this? Does it have to do with compression?
There have been previous cases where single-threaded compression gave different results, but afaik zstd avoided these by always using the multi-threaded code, even on single-core.
Comment 1 Bernhard Wiedemann 2022-03-31 02:43:48 UTC
http://w3.nue.suse.com/~bwiedemann/rb/boo1197780/ has sample binaries.

diff of
rpm --debug -qpvl
< ufdio:       6 reads,     6809 total bytes in 0.000008 secs
> ufdio:       6 reads,     6969 total bytes in 0.000008 secs

shows that there are 160 extra bytes (6969 - 6809) - probably in the header.
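
A quick arithmetic check of the figures quoted so far. This is only a sketch: it assumes the '-' side of the printrpmtags diff and the '<' side of the rpm --debug diff both belong to pkgs.1 (the -j1 build), and the variable names are made up for illustration.

```python
# Figures copied from the printrpmtags diff and the `rpm --debug -qpvl` diff above.
# Assumption: "j1" is pkgs.1 (the '-' / '<' side), "j2" is pkgs.2 (the '+' / '>' side).
longsigsize = {"j1": 1897928, "j2": 1897768}
ufdio_total_bytes = {"j1": 6809, "j2": 6969}

sig_delta = longsigsize["j1"] - longsigsize["j2"]
read_delta = ufdio_total_bytes["j2"] - ufdio_total_bytes["j1"]

print(f"LONGSIGSIZE differs by {sig_delta} bytes")
print(f"ufdio total bytes differ by {read_delta} bytes")
```

Both differences come out to the same number of bytes, which fits the idea that the variation sits in the header rather than in the compressed payload.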
Comment 2 Martin Liška 2022-04-04 10:04:54 UTC
All right, I can reproduce it. A first investigation shows that it happens even if I disable the ZSTD thread pool and use threads == 1. But it does need parallel processing of files in rpm (using OpenMP).
Comment 3 Martin Liška 2022-04-04 10:39:03 UTC
Do you think it's a recent regression?
Comment 4 Martin Liška 2022-04-04 10:52:55 UTC
All right, so the issue is not related to the ZSTD compression algorithm. I can see the same behavior when I switch either to xzdio or gzio:

$ rpm -qp --queryformat '%{PAYLOADCOMPRESSOR}\n' pkgs.1/adinatha-fonts-1.0-0.noarch.rpm 
gzip

$ rpm --debug -qpvl ./pkgs.1/adinatha-fonts-1.0-0.noarch.rpm 2>&1 | grep 'total'
ufdio:       6 reads,     6765 total bytes in 0.000005 secs

$ rpm --debug -qpvl ./pkgs.2/adinatha-fonts-1.0-0.noarch.rpm 2>&1 | grep 'total'
ufdio:       6 reads,     6925 total bytes in 0.000007 secs
Comment 5 Bernhard Wiedemann 2022-04-04 12:14:53 UTC
> Do you think it's a recent regression?

Yes, I first observed this on 2022-03-31, and I'm pretty sure it was not present at the end of February, when I spent some time on analysis for the previous monthly rb report.

The oldest such diff is from 2022-03-31 01:38

I reviewed the openSUSE changes and on the 30th, only this looked like a candidate
https://github.com/bmwiedemann/openSUSE/commit/7c9447751d9dfb
Comment 6 Bernhard Wiedemann 2022-04-06 12:39:09 UTC
I enhanced the reproducer and found that I can still trigger this issue when I keep just the first 154 bytes of one file and 17 of the other, but not with fewer.

This makes me suspect mime-type detection to be involved.

and indeed
filterdiff strings pkgs.*/*noarch.rpm
shows
 noarch-suse-linux
 directory
+TrueType Font data, 19 tables, 1st "Feat"
 utf-8

The version built on 1 core lacks the .otf metadata.
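
The comparison above can be approximated without the filterdiff helper (which is not shown in this report). The sketch below extracts printable strings roughly the way strings(1) does and diffs the two lists. The function names are hypothetical, and the two byte blobs are invented stand-ins built from the strings in the diff above, not real rpm headers.

```python
import difflib
import re


def printable_strings(blob: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII, roughly like strings(1)."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, blob)]


def strings_diff(a: bytes, b: bytes) -> list[str]:
    """Return only the added/removed lines of a unified diff of the strings."""
    lines = list(difflib.unified_diff(
        printable_strings(a), printable_strings(b), lineterm="", n=0))
    # Skip the two file-header lines, then keep '+' and '-' lines only.
    return [l for l in lines[2:] if l.startswith(("+", "-"))]


# Invented stand-ins for the two package headers (assumption, not real data).
one_core = b"\x00noarch-suse-linux\x00directory\x00utf-8\x00"
two_core = (b"\x00noarch-suse-linux\x00directory\x00"
            b'TrueType Font data, 19 tables, 1st "Feat"\x00utf-8\x00')

for line in strings_diff(one_core, two_core):
    print(line)
```

On real packages the same two functions could be fed the bytes of each .rpm file read with open(path, "rb").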
Comment 7 Martin Liška 2022-04-06 12:51:35 UTC
Nice reduction. Yes, that would be the output of the 'file' command:

$ file *.otf
Adinatha-Tamil-Brahmi.otf: TrueType Font data, 19 tables, 1st "Feat", 42 names, Macintosh, Typeface \251 (your company). 2010. All Rights ReservedRegularAdinatha Tamil Brahmi:Version 1.00
Comment 8 Martin Liška 2022-04-06 15:05:42 UTC
I've got a debugging patch:

diff --git a/build/rpmfc.c b/build/rpmfc.c
index eb51a36..51f0998 100644
--- a/build/rpmfc.c
+++ b/build/rpmfc.c
@@ -1251,7 +1252,16 @@ rpmRC rpmfcClassify(rpmfc fc, ARGV_t argv, rpm_mode_t * fmode)
 	    else if (slen >= fc->brlen+sizeof("/dev/") && rstreqn(s+fc->brlen, "/dev/", sizeof("/dev/")-1))
 		ftype = "";
 	    else
-		ftype = magic_file(ms, s);
+            {
+              ftype = magic_file(ms, s);
+              fprintf (stderr, "ftype called for %s with result=%s\n", s, ftype);
+
+              magic_t ms2 = magic_open(msflags);
+              int r = magic_load(ms2, NULL);
+              fprintf (stderr, "magic_load=%d\n", r);
+              const char *ftype2 = magic_file(ms2, s);
+              fprintf (stderr, "ftype2 called for %s with result=%s\n", s, ftype2);
+            }
 
 	    /* Silence errors from immaterial %ghosts */
 	    if (ftype == NULL && errno == ENOENT)

That shows for job=1:

[    8s] ftype called for /home/abuild/rpmbuild/BUILDROOT/adinatha-fonts-1.0-0.x86_64/usr/share/doc/packages/adinatha-fonts/Adinatha Tamil Brahmi Manual.pdf with result=PDF document, version 1.5
[    8s] magic_load=0
[    8s] ftype2 called for /home/abuild/rpmbuild/BUILDROOT/adinatha-fonts-1.0-0.x86_64/usr/share/doc/packages/adinatha-fonts/Adinatha Tamil Brahmi Manual.pdf with result=PDF document, version 1.5
[    8s] ftype called for /home/abuild/rpmbuild/BUILDROOT/adinatha-fonts-1.0-0.x86_64/usr/share/doc/packages/adinatha-fonts/Adinatha-Tamil-Brahmi.otf with result=data
[    8s] magic_load=0
[    8s] ftype2 called for /home/abuild/rpmbuild/BUILDROOT/adinatha-fonts-1.0-0.x86_64/usr/share/doc/packages/adinatha-fonts/Adinatha-Tamil-Brahmi.otf with result=TrueType Font data, 19 tables, 1st "Feat"

while for jobs>1:

[    8s] ftype called for /home/abuild/rpmbuild/BUILDROOT/adinatha-fonts-1.0-0.x86_64/usr/share/doc/packages/adinatha-fonts/Adinatha Tamil Brahmi Manual.pdf with result=PDF document, version 1.5
[    8s] magic_load=0
[    8s] ftype2 called for /home/abuild/rpmbuild/BUILDROOT/adinatha-fonts-1.0-0.x86_64/usr/share/doc/packages/adinatha-fonts/Adinatha Tamil Brahmi Manual.pdf with result=PDF document, version 1.5
[    8s] ftype called for /home/abuild/rpmbuild/BUILDROOT/adinatha-fonts-1.0-0.x86_64/usr/share/doc/packages/adinatha-fonts/Adinatha-Tamil-Brahmi.otf with result=TrueType Font data, 19 tables, 1st "Feat"
[    8s] magic_load=0
[    8s] ftype2 called for /home/abuild/rpmbuild/BUILDROOT/adinatha-fonts-1.0-0.x86_64/usr/share/doc/packages/adinatha-fonts/Adinatha-Tamil-Brahmi.otf with result=TrueType Font data, 19 tables, 1st "Feat"

Note that with jobs>1 we use OpenMP and thus:

    /* libmagic is not thread-safe, each thread needs to a private handle */
    magic_t ms = magic_open(msflags);
    magic_t mime = magic_open(mimeflags);

each thread has its own 'ms' and 'mime' handles. It seems that reusing 'ms' for a second file does not work for some reason.
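
The "private handle per thread" pattern from the quoted rpm comment can be sketched as follows. This is not rpm's code: the Handle class is a hypothetical stand-in for a libmagic handle, and its classification by file extension is a placeholder. The point is only that every worker thread lazily creates its own handle, and that the result should not depend on the number of jobs.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Handle:
    """Hypothetical stand-in for a libmagic handle; must not be shared."""

    def __init__(self) -> None:
        self.owner = threading.get_ident()

    def classify(self, name: str) -> str:
        # Refuse use from any thread other than the one that created it.
        assert threading.get_ident() == self.owner, "handle shared across threads"
        # Placeholder logic; a real handle would inspect the file contents.
        return "PDF document" if name.endswith(".pdf") else "font data"


_local = threading.local()


def classify(name: str) -> str:
    # Lazily create one private handle per worker thread.
    if not hasattr(_local, "handle"):
        _local.handle = Handle()
    return _local.handle.classify(name)


def classify_all(names: list[str], jobs: int) -> list[str]:
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(classify, names))


names = ["Manual.pdf", "Font.otf"]
print(classify_all(names, 1) == classify_all(names, 4))
```

With one job a single handle sees every file in sequence, while with several jobs each handle may see only one file. That is why state carried over inside a handle can make the output depend on parallelism.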
Comment 9 Martin Liška 2022-04-06 15:12:41 UTC
The issue might be caused by some recent change in libmagic.
Comment 10 Martin Liška 2022-04-06 15:36:57 UTC
The modified version of rpm package can be seen here:
home:marxin:stability2/rpm
Comment 11 Martin Liška 2022-04-06 15:52:04 UTC
The output looks like:

[    8s] ms=0x55c928947810 created
[    8s] ftype called with ms=0x55c928947810 for /home/abuild/rpmbuild/BUILDROOT/adinatha-fonts-1.0-0.x86_64/usr/share/doc/packages/adinatha-fonts/Adinatha Tamil Brahmi Manual.pdf with result=PDF document, version 1.5
[    8s] magic_load=0
[    8s] ftype2 called for /home/abuild/rpmbuild/BUILDROOT/adinatha-fonts-1.0-0.x86_64/usr/share/doc/packages/adinatha-fonts/Adinatha Tamil Brahmi Manual.pdf with result=PDF document, version 1.5
[    8s] ftype called with ms=0x55c928947810 for /home/abuild/rpmbuild/BUILDROOT/adinatha-fonts-1.0-0.x86_64/usr/share/doc/packages/adinatha-fonts/Adinatha-Tamil-Brahmi.otf with result=data
[    8s] magic_load=0
[    8s] ftype2 called for /home/abuild/rpmbuild/BUILDROOT/adinatha-fonts-1.0-0.x86_64/usr/share/doc/packages/adinatha-fonts/Adinatha-Tamil-Brahmi.otf with result=TrueType Font data, 19 tables, 1st "Feat"
[    8s] closing ms=0x55c928947810

'result=data' is the problematic output, while a run with jobs>1 prints:

[    9s] ms=0x562b74b3ef90 created
[    9s] ms=0x7f8364000b70 created
[    9s] ms=0x7f835c000b70 created
[    9s] ms=0x7f8354000b70 created
[    9s] ftype called with ms=0x7f8354000b70 for /home/abuild/rpmbuild/BUILDROOT/adinatha-fonts-1.0-0.x86_64/usr/share/doc/packages/adinatha-fonts/Adinatha-Tamil-Brahmi.otf with result=TrueType Font data, 19 tables, 1st "Feat"
[    9s] magic_load=0
[    9s] ftype called with ms=0x7f835c000b70 for /home/abuild/rpmbuild/BUILDROOT/adinatha-fonts-1.0-0.x86_64/usr/share/doc/packages/adinatha-fonts/Adinatha Tamil Brahmi Manual.pdf with result=PDF document, version 1.5
[    9s] magic_load=0
[    9s] ftype2 called for /home/abuild/rpmbuild/BUILDROOT/adinatha-fonts-1.0-0.x86_64/usr/share/doc/packages/adinatha-fonts/Adinatha-Tamil-Brahmi.otf with result=TrueType Font data, 19 tables, 1st "Feat"
[    9s] ftype2 called for /home/abuild/rpmbuild/BUILDROOT/adinatha-fonts-1.0-0.x86_64/usr/share/doc/packages/adinatha-fonts/Adinatha Tamil Brahmi Manual.pdf with result=PDF document, version 1.5
[    9s] closing ms=0x7f835c000b70
[    9s] closing ms=0x562b74b3ef90
[    9s] closing ms=0x7f8354000b70
[    9s] closing ms=0x7f8364000b70
Comment 12 Dr. Werner Fink 2022-04-07 06:38:02 UTC
Which version of the file package (and its subpackage libmagic1) are we talking about?


 Wed Mar 23 09:02:37 UTC 2022 - Dirk Müller <dmueller@suse.com>
 - add file-5.41-cache-regexps-locale-restore.patch to restore
   previous locale handling behavior 

 Sat Mar 19 18:00:32 UTC 2022 - Dirk Müller <dmueller@suse.com>
 - add file-5.41-cache-regexps.patch to cache regexp lookups 

 Thu Feb 24 10:05:17 UTC 2022 - Dr. Werner Fink <werner@suse.de>
 - Reenable libseccomp sandboxing
Comment 13 Dirk Mueller 2022-04-07 07:10:07 UTC
I will fix this.
Comment 14 Dirk Mueller 2022-04-13 20:45:48 UTC
The issue was rules with use/name. After one of those hit, the regexp caching offsets were miscalculated, which led to wrong cache entries and therefore mismatches.

Simple reproducer, no rpm(1) involved:

file /usr/share/doc/packages/adinatha-fonts/Adinatha\ Tamil\ Brahmi\ Manual.pdf \
/usr/share/fonts/truetype/Adinatha-Tamil-Brahmi.otf

The PDF was caching the wrong regexp under the offset that the TrueType font rule was executing, hence the font didn't match anymore.

Patch sent upstream.
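
The class of bug described in this comment can be illustrated with a toy model. Everything in it is an assumption made for illustration: the rule set, the Handle class and the cache keys are invented and do not reflect libmagic's real data structures or the actual patch. The sketch only shows how a compiled-regexp cache keyed by a miscalculated offset lets one file poison the lookup for the next, while a fresh handle still gives the right answer.

```python
import re

# Toy rule set, NOT libmagic's real magic database.
NAMED = {"pdf": [(rb"^%PDF-", "PDF document"), (rb"-(\d\.\d)", ", version {}")]}
MAIN = [("use", "pdf"), ("regex", rb"^TOYFONT", "TrueType Font data")]


class Handle:
    def __init__(self, buggy: bool) -> None:
        self.buggy = buggy
        self.cache: dict = {}

    def _regex(self, scope: str, index: int, pattern: bytes):
        # Buggy key: the index alone, so rules from different lists collide.
        key = index if self.buggy else (scope, index)
        if key not in self.cache:
            self.cache[key] = re.compile(pattern)
        return self.cache[key]

    def classify(self, data: bytes) -> str:
        for i, rule in enumerate(MAIN):
            if rule[0] == "use":
                out = ""
                # Sub-rules act as continuation lines: stop at the first miss.
                for j, (pat, desc) in enumerate(NAMED[rule[1]]):
                    m = self._regex(rule[1], j, pat).search(data)
                    if not m:
                        break
                    out += desc.format(*[g.decode() for g in m.groups()])
                if out:
                    return out
            elif self._regex("main", i, rule[1]).search(data):
                return rule[2]
        return "data"


PDF = b"%PDF-1.5\n(toy contents)"
FONT = b"TOYFONT\x00(toy contents)"  # made-up signature, not a real font header

reused = Handle(buggy=True)
print(reused.classify(PDF), "|", reused.classify(FONT))
print(Handle(buggy=True).classify(FONT))
```

In this model the reused handle mirrors the jobs=1 log (the font comes back as "data" after the PDF was classified), and the fresh handle mirrors the ftype2 lines. Keying the cache by both the rule list and the index, as Handle(buggy=False) does, removes the collision.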