Bugzilla – Bug 1094323
packages do not build reproducibly from pip install
Last modified: 2020-03-27 11:40:07 UTC
When working on reproducible builds for openSUSE, I found that some packages use pip install, which creates .pyc files that contain a random tmp path in them that is different for every build.

This affects at least:
python-cluster
python-jupyter_bqplot
python-jupyter_imatlab_kernel
python-PsyLab
python-sphinxcontrib-github-alt

Here is an example diff:

/usr/lib/python3.6/site-packages/cluster/method/__pycache__/kmeans.cpython-36.pyc
@@ -110,7 +110,7 @@
 000006d0 00 5a 0e 63 6f 6e 74 72 6f 6c 5f 6c 65 6e 67 74 |.Z.control_lengt|
 000006e0 68 da 04 69 74 65 6d a9 00 72 13 00 00 00 fa 3a |h..item..r.....:|
 000006f0 2f 74 6d 70 2f 70 69 70 2d 69 6e 73 74 61 6c 6c |/tmp/pip-install|
-00000700 2d 32 75 38 6a 6f 79 74 76 2f 63 6c 75 73 74 65 |-2u8joytv/cluste|
+00000700 2d 6e 69 6c 34 6b 39 77 73 2f 63 6c 75 73 74 65 |-nil4k9ws/cluste|
 00000710 72 2f 63 6c 75 73 74 65 72 2f 6d 65 74 68 6f 64 |r/cluster/method|
 00000720 2f 6b 6d 65 61 6e 73 2e 70 79 da 08 5f 5f 69 6e |/kmeans.py..__in|
 00000730 69 74 5f 5f 2e 00 00 00 73 20 00 00 00 00 01 06 |it__....s ......|

IMHO, those .pyc files should not contain any tmp path because it does not exist in the target system anyway.
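The effect in the diff above can be reproduced without pip. The following is a minimal sketch, assuming only the standard library: it compiles the same source text under two different file names and compares the marshalled code objects, which is the payload a .pyc file stores. The helper name compile_with_path and the two directory names are illustrative (the directory names are borrowed from the diff), not anything pip itself exposes.

```python
import marshal


def compile_with_path(source: str, path: str) -> bytes:
    """Compile source as if it lived at `path`; return the marshalled code object."""
    code = compile(source, path, "exec")
    return marshal.dumps(code)


SOURCE = "def item():\n    return 42\n"

# Hypothetical build directories, mimicking pip's random temp names.
build_a = compile_with_path(SOURCE, "/tmp/pip-install-2u8joytv/cluster/kmeans.py")
build_b = compile_with_path(SOURCE, "/tmp/pip-install-nil4k9ws/cluster/kmeans.py")

# The path given at compile time is stored inside the code object.
print(b"/tmp/pip-install-2u8joytv" in build_a)
# Identical source, different path: the serialized bytes differ.
print(build_a == build_b)
```

The first check prints True and the second prints False: the only input that changed is the file name passed to compile(), and it ends up in the serialized bytes.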
Feel free to convert them to setuptools, all packages that are using pip are wrong anyway. I've converted python-cluster.
I know too little about python-setuptools to do it. I was actually hoping that Todd could help there, since he originally added these packages.

Current list of affected packages in Factory is:
python-cluster
python-jupyter_bqplot
python-jupyter_imatlab_kernel
python-jupyter_jupyterlab_discovery
python-jupyter_jupyterlab_github
python-jupyter_jupyterlab_latex
python-jupyter_jupyterlab
python-jupyter_kernel_test
python-jupyter_matlab_kernel
python-jupyter_nbdime
python-jupyter_rise
python-jupyter_Video_Widget
python-PsyLab

and some with pip-build instead of pip-install:
python-jupyter_octave_kernel
python-jupyter_widgetsnbextension
python-sphinxcontrib-github-alt
This is an autogenerated message for OBS integration: This bug (1094323) was mentioned in https://build.opensuse.org/request/show/618015 Factory / python-cluster
The jupyter packages at the very least pretty much have to use pip. They aren't simple python packages, they also include configuration files, javascript code, and assets that are really hard to install reliably and can change randomly between releases in undocumented ways. Using pip guarantees that everything that is needed to run the code is installed and installed in the right place. In more and more cases this includes files pulled in specifically for the wheels that aren't part of the github releases (for example npm packages), which makes it effectively impossible to use the github releases directly.

If there is a problem with how pip installs stuff, it really needs to be fixed. More and more packages are going for wheel-only releases, and some are dropping setup.py entirely. Whether we like it or not, we are going to need to be able to deal with wheels.
So there are 2 problems:

1. The random tmp path in .pyc files shown in comment 0. That can be solved with a
   %python_expand find %{buildroot}%{$python_sitelib} -name \*.pyc | xargs rm
   Since (according to my measurements) .pyc files don't give a significant performance boost, we could just opt to omit them, but if someone insists on having them, we can probably call py_compile to create them again.

2. /usr/lib/python3.6/site-packages/PsyLab-0.4.7.12.dist-info/RECORD generated by pip install contains entries of .pyc files in nondeterministic filesystem order. I guess there is an os.listdir, os.walk or glob.glob call somewhere that needs a sort() or sorted() added, but finding the right line can take a while. I guess we don't even need the RECORD file, because RPM metadata already keeps track of installed files.
   => %python_expand rm %{buildroot}%{$python_sitelib}/*%{version}*/RECORD

But actually, it would be nicer if upstream pip would just do both right.
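To illustrate the kind of fix meant in point 2: os.walk yields directory entries in whatever order the filesystem returns them, so any file list built from it needs explicit sorting to be reproducible. This is a minimal sketch under that assumption. The function name list_files_sorted is made up for illustration and is not pip's actual code.

```python
import os
import tempfile


def list_files_sorted(root: str) -> list:
    """Return file paths under root, relative to root, in a deterministic order."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Sorting dirnames in place fixes the order in which os.walk descends.
        dirnames.sort()
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            found.append(os.path.relpath(full, root))
    return found


if __name__ == "__main__":
    # Small demonstration tree; files are created in non-alphabetical order.
    demo_root = tempfile.mkdtemp()
    os.makedirs(os.path.join(demo_root, "sub"))
    for rel in ("b.py", os.path.join("sub", "c.py"), "a.py"):
        with open(os.path.join(demo_root, rel), "w") as fh:
            fh.write("")
    print(list_files_sorted(demo_root))
```

The result no longer depends on creation order or on the filesystem: files in a directory come out sorted by name, and subdirectories are visited in sorted order.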
As of pip 10, RECORD should be deterministic [1], and for me with Tumbleweed it looks like it is in sorted order by file type. I am communicating with upstream about the pyc file issue [2].

[1] https://github.com/pypa/pip/pull/4667
[2] https://github.com/pypa/pip/issues/4371
Made a working pip patch for the RECORD ordering: https://github.com/pypa/pip/pull/5525
That pull request has been merged and it should be available in Tumbleweed's version.
While RECORD files are indeed reproducible now, .pyc files are still not, and they show the same type of diff as in comment 0.

IMHO, there is no good reason for them to contain a random path under /tmp/, because that path will not exist at runtime. If it was used for anything, it would be broken. And if it is not used for anything, it does not need to be in there.

Is there already an upstream pip bug tracking that?
On python-jupyter_imatlab_kernel I found a nice solution to this bug: We need to change the pip install call to use --no-compile instead of --compile and let the normal openSUSE python scripts handle the compilation instead.
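For illustration, here is a minimal sketch of what compiling after installation can look like: py_compile.compile accepts a dfile argument that is recorded as the source name instead of the real on-disk path, so the buildroot prefix never reaches the .pyc. Whether the openSUSE scripts do exactly this is an assumption; the helper names compile_for_target and recorded_filename are hypothetical.

```python
import marshal
import os
import py_compile
import tempfile


def compile_for_target(src: str, buildroot: str) -> str:
    """Byte-compile src, recording its path relative to buildroot as the source name.

    Returns the path of the written .pyc file.
    """
    target = "/" + os.path.relpath(src, buildroot).replace(os.sep, "/")
    return py_compile.compile(src, dfile=target, doraise=True)


def recorded_filename(pyc_path: str) -> str:
    """Return the source file name stored in a .pyc file."""
    with open(pyc_path, "rb") as fh:
        data = fh.read()
    # Assumes the 16-byte .pyc header used by CPython 3.7 and later.
    return marshal.loads(data[16:]).co_filename


if __name__ == "__main__":
    # A throwaway directory stands in for %{buildroot}.
    buildroot = tempfile.mkdtemp()
    pkg = os.path.join(buildroot, "usr", "lib", "site-packages", "cluster")
    os.makedirs(pkg)
    source = os.path.join(pkg, "kmeans.py")
    with open(source, "w") as fh:
        fh.write("def item():\n    return 42\n")
    print(recorded_filename(compile_for_target(source, buildroot)))
```

With this approach the recorded name is the final installed location, which is the same for every build regardless of where the build tree lives.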
(In reply to Bernhard Wiedemann from comment #10)
> On python-jupyter_imatlab_kernel I found a nice solution to this bug:
> We need to change the pip install call to use --no-compile instead of
> --compile
> and let the normal openSUSE python scripts handle the compilation instead.

You mean that the macro

%python3_install \
%{_python_use_flavor python3} \
%__python3 %{py_setup} %{?py_setup_args} install \\\
-O1 --skip-build --force --root %{buildroot} --prefix %{_prefix}

in /etc/rpm/macros.python_all should be changed by adding --no-compile to the setup.py install command? We can do that, I suppose.
(In reply to Matej Cepl from comment #11)
> (In reply to Bernhard Wiedemann from comment #10)
> > On python-jupyter_imatlab_kernel I found a nice solution to this bug:
> > We need to change the pip install call to use --no-compile instead of
> > --compile
> > and let the normal openSUSE python scripts handle the compilation instead.
>
> You mean that the macro
>
> %python3_install \
> %{_python_use_flavor python3} \
> %__python3 %{py_setup} %{?py_setup_args} install \\\
> -O1 --skip-build --force --root %{buildroot} --prefix %{_prefix}
>
> in /etc/rpm/macros.python_all should be changed by adding --no-compile to
> the setup.py install command?
>
> We can do that, I suppose.

Nope, what Bernhard means is really the pip call (also, as a side note, we should probably provide these as macros). From the spec:

%python_expand pip%{$python_bin_suffix} install --root %{buildroot} --prefix %{_prefix} --no-deps %{SOURCE0}
This is an autogenerated message for OBS integration: This bug (1094323) was mentioned in https://build.opensuse.org/request/show/698361 Factory / jupyter-imatlab
Is https://github.com/openSUSE/python-rpm-macros/commit/2ed22b611eba what you wanted?
Bernhard, what’s missing from this bug, once the changes in python-rpm-macros have been made?
I'm still not satisfied.

1: the new macros are only used in 1 package atm: python-tox

However, what is worse:
2: I patched python-jupyter-require.spec with

 %install
-%python_expand pip%{$python_bin_suffix} install --root=%{buildroot} %{SOURCE0}
+cp -a %{SOURCE0} .
+%pyproject_install

And that still resulted in .pyc files that have a /tmp/pip-install-XXXXX random path embedded, because the pyproject_install macro misses the --no-compile flag.

python-tox is similarly affected.
(In reply to Bernhard Wiedemann from comment #16)
> And that still resulted in .pyc files that have a /tmp/pip-install-XXXXX
> random path embedded, because the pyproject_install macro misses the
> --no-compile flag.

So, https://github.com/openSUSE/python-rpm-macros/pull/37 would be fixing this bug?
Works for me. IMHO those .pyc files did not belong in binary rpms anyway.
At least related to this: do you have any idea what's wrong with https://github.com/sdispater/poetry/issues/1645#issuecomment-559891651 ?
I submitted updates to all affected packages. The only one left is python-ipyscales, which is already reproducible because it compiles .pyc files itself.
This is an autogenerated message for OBS integration: This bug (1094323) was mentioned in https://build.opensuse.org/request/show/788450 15.2 / tensorflow2 https://build.opensuse.org/request/show/788451 Backports:SLE-15-SP2 / tensorflow2
This is an autogenerated message for OBS integration: This bug (1094323) was mentioned in https://build.opensuse.org/request/show/787674 Factory / tensorflow2
This is an autogenerated message for OBS integration: This bug (1094323) was mentioned in https://build.opensuse.org/request/show/788978 15.2 / tensorflow2