Bugzilla – Bug 746704
sudo doesn't run commands
Last modified: 2012-06-05 09:51:27 UTC
Sudo doesn't work in Factory, and hasn't for about 10 days. It just does nothing:

$ sudo echo ahoj
$

If one straces the process, one can see:

write(2, "sudo", 4sudo) = 4
write(2, ": ", 2: ) = 2
write(2, "must be setuid root", 19must be setuid root) = 19
write(2, "\n", 1

Otherwise the error cannot be seen. Why?

# rpm -V sudo
5S.T..... c /etc/sudoers
# ll /usr/bin/sudo
-rwsr-xr-x 1 root root 79768 1. úno 14.08 /usr/bin/sudo
The "must be setuid root" message is caused by running sudo under strace. From man strace:

BUGS
Programs that use the setuid bit do not have effective user ID privileges while being traced.

sudo then fails on this check:

if (geteuid() != 0)
    errorx(1, _("must be setuid root"));
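The check quoted above can be mirrored in a few lines of Python to see why a traced setuid binary gives up. This is only a sketch: check_setuid_root is a hypothetical helper, not part of sudo, and it accepts the effective UID as a parameter so that both outcomes can be exercised without root.

```python
import os


def check_setuid_root(euid=None):
    """Return an error message if the effective UID is not root, else None.

    Mirrors the geteuid() test quoted from sudo. The helper name is
    hypothetical and exists only for illustration.
    """
    if euid is None:
        euid = os.geteuid()  # the UID the process is really running with
    if euid != 0:
        return "must be setuid root"
    return None


if __name__ == "__main__":
    # While a process is traced, the setuid bit is not honoured, so a
    # setuid-root binary started by an ordinary user sees that user's UID.
    print(check_setuid_root() or "effective UID is 0")
```

Calling check_setuid_root(1000) returns the same message that strace revealed, while check_setuid_root(0) returns None.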
I've found some similar problems on systemd mailing list: http://lists.freedesktop.org/archives/systemd-devel/2012-February/004461.html
(In reply to comment #2)
> I've found some similar problems on systemd mailing list:
>
> http://lists.freedesktop.org/archives/systemd-devel/2012-February/004461.html

This is very likely it!
I don't have the issue on my Factory VM, with systemd from Factory or Base:System. Could it be a PAM issue caused by the systemd PAM module?
(In reply to comment #4)
> I don't have the issue on my Factory vm, with systemd from Factory or
> Base:System.

This is not reproducible on console. Only from xterm.

> could it be a pam issue caused by systemd pam module ?

I don't know. The only thing I see in messages is:

sudo: xslaby : TTY=pts/11 ; PWD=/home/xslaby ; USER=root ; COMMAND=/bin/echo ahoj
systemd-logind[2172]: New session c8 of user root.
systemd-logind[2172]: Removed session c8.
works fine in a xterm and gnome-terminal here too. which display manager and which desktop environment are you using ?
(In reply to comment #6)
> works fine in a xterm and gnome-terminal here too.
>
> which display manager and which desktop environment are you using ?

KDM+KDE4 and KDM+xfce4
https://bugs.freedesktop.org/show_bug.cgi?id=45670 KDE 4.8
Could it be a bug in kdm? It works from a regular console... and above it is stated that it works in xterm and gnome-terminal...
(In reply to comment #9)
> could it be a bug in kdm ?

It looks like that. With xdm it works.
strange, I don't have the issue with kdm here (running either GNOME or KDE, with either xterm, gnome-terminal or konsole).
(In reply to comment #11)
> strange, I don't have the issue with kdm here (running either GNOME or KDE,
> with either xterm, gnome-terminal or konsole).

Wild guess: maybe it is due to different kernel versions? I'm running 3.3.0rc3.
OK, it works with xdm, gdm... xterm, whatever, but not with kdm. CC'ing the KDE team.
please test with Factory kernel. I'm using a Factory only VM, nothing else.
(In reply to comment #14)
> please test with Factory kernel. I'm using a Factory only VM, nothing else.

Yes, 3.2.0 works. Even 3.2.0 vanilla. But if I build my own 3.2, it doesn't work. Neither 3.3-rc3 vanilla nor the default kernel works. I believe this is a kernel configuration change problem.
(In reply to comment #15)
> (In reply to comment #14)
> > please test with Factory kernel. I'm using a Factory only VM, nothing else.
>
> Yes, 3.2.0 works. Even 3.2.0 vanilla. But if I build my own 3.2, it doesn't
> work. Neither 3.3-rc3 vanilla nor default kernel works. I believe this is a
> kernel configuration change problem.

Great, it seems I'm not crazy yet :-) Can you take a look at it then?
same problem here on plain factory: kernel-desktop 3.2.0, kdm and konsole.
Still no problem on my VM with an up-to-date Factory, with both kernel-default and kernel-desktop 3.2.0-2 packages (with kdm and konsole). So something more is needed to trigger this bug. Is anybody having this issue on 32 bits? (My VM is.)
Could people check their /etc/systemd/systemd-logind.conf (everything should be commented out there)? Also, check /etc/pam.d/sudo and /etc/pam.d/common-session(-pc) (those should contain session optional pam_systemd.so).
(In reply to comment #19)
> could people check their /etc/systemd/systemd-logind.conf (everything should be
> commented there) ?

$ grep -v ^# /etc/systemd/systemd-logind.conf
[Login]

> and also, check /etc/pam.d/sudo and /etc/pam.d/common-session(-pc) (those
> should contains session optional pam_systemd.so)

# cat /etc/pam.d/sudo
...
session include common-session
# cat /etc/pam.d/common-session-pc
...
session optional pam_systemd.so
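For anyone who wants to run the same check on several machines, the manual inspection above can be approximated with a small script. This is a sketch under stated assumptions: the function names are hypothetical, and the parser only understands simple "type control module" lines, not the bracketed [value=action] control syntax.

```python
def session_entries(pam_text):
    """Return (control, module) pairs for every active 'session' line."""
    entries = []
    for raw in pam_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        fields = line.split()
        # A leading '-' on the type is allowed by PAM and ignored here.
        if len(fields) >= 3 and fields[0].lstrip("-") == "session":
            entries.append((fields[1], fields[2]))
    return entries


def has_optional_pam_systemd(pam_text):
    """True if the text contains 'session optional pam_systemd.so'."""
    return ("optional", "pam_systemd.so") in session_entries(pam_text)


if __name__ == "__main__":
    # Paths taken from the comments above; adjust if your layout differs.
    for path in ("/etc/pam.d/sudo", "/etc/pam.d/common-session-pc"):
        try:
            with open(path) as handle:
                print(path, session_entries(handle.read()))
        except OSError as exc:
            print(path, "not readable:", exc)
```

Note that /etc/pam.d/sudo only includes common-session, so the pam_systemd.so entry is expected to show up in the common-session(-pc) output rather than in the sudo one.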
My PAM configuration is also correct. I'm using the x86_64 version. Who is in charge of checking kernel configs? This still looks like a kernel bug.
I'm not sure it is a kernel issue, since coolo has the issue with kernel-desktop 3.2.0 and I don't with the same kernel.
Today I did a clean Factory install on one of the failing boxes. The result:

- 3.2.0-2-desktop x86_64
- systemd-42-2.1.x86_64

The sudo failure is only reproducible with kdm. I switched to xdm and it works again.
I'm also experiencing this bug in a Factory VM in Konsole on KDE. I just did zypper up, to see if it would solve the issue, but it's still present. Let me know if there's anything I can do to help debug this.
I have the same problem inside a running tmux session which survives a restart of X (blame Gnome 3 for that ;-)). Running sudo -s, sudo zypper lr, or sudo /usr/bin/build does not do anything inside the tmux session. Doing the same from xterm/gnome-terminal/konsole/whatever works well. I've exported all environment variables from a normal session into tmux, but it did not help. My config looks like the one in comment #20, no changes from default.

$ rpm -q systemd kernel-desktop gdm
systemd-42-2.1.x86_64
kernel-desktop-3.3.rc5-1.4.x86_64
gdm-3.2.1.1-3.1.x86_64

This is what my messages say:

Mar 12 15:20:49 zelva sudo: mvyskocil : TTY=pts/6 ; PWD=/home/mvyskocil/work/OBS/Java:packages/jakarta-commons-net ; USER=root ; COMMAND=/usr/bin/less /var/log/messages
Mar 12 15:20:49 zelva systemd[1]: Got D-Bus request: org.freedesktop.DBus.NameOwnerChanged() on /org/freedesktop/DBus
Mar 12 15:20:49 zelva systemd-logind[735]: New session 2 of user mvyskocil.
Mar 12 15:20:49 zelva systemd-logind[735]: Removed session 2.
Mar 12 15:20:49 zelva systemd[1]: Got D-Bus request: org.freedesktop.DBus.NameOwnerChanged() on /org/freedesktop/DBus

I straced an unsuccessful and a successful run, and it seems that in the buggy state sudo ends **after** the PAM:session_open call. In both cases the sequence of syscalls is the same, but just before the second call of getdents on the NETLINK socket used for PAM, the process is interrupted:

...
close(8) = 0
--- {si_signo=SIGHUP, si_code=SI_USER, si_pid=735, si_uid=0, si_value={int=2552531996, ptr=0x3bae198248c1c}} (Hangup) ---
<... select resumed> ) = ? ERESTARTNOHAND (To be restarted)
--- {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=31995, si_status=SIGHUP, si_utime=0, si_stime=0} (Child exited) ---
write(8, "\21", 1) = 1
rt_sigreturn(0x8) = -1 EINTR (Interrupted system call)

The normal run continues with execve:

close(8) = 0
getdents(3, /* 0 entries */, 32768) = 0
close(3) = 0
execve("/usr/bin/zypper", ["zypper", "lr"], [/* 19 vars */]) = 0
1426 32029 brk(0) = 0x763000

However, I did not find out what descriptor 8 is (closing it might be a root cause).
Created attachment 480953 [details] output of strace -f -o sudo-fail.out -s 4096 zypper lr from tmux session
Created attachment 480954 [details] output of strace -f -o sudo-correct.out -s 4096 zypper lr from gnome-terminal
write(8, "\21", 1) = 1

Hmm, now the question is where FD 8 comes from, and why EINTR is not handled...
Good catch, I did not see that there's a write(8) after close(8). However, I know what fd 8 is, but I did not find descriptor 7 ;-)

fcntl(7, F_DUPFD_CLOEXEC, 3) = 8

But the code below does not make sense, so I assume sudo is in a very strange state after EINTR:

close(8) = 0
write(8, "\21", 1) = 1
rt_sigreturn(0x8) = -1 EINTR (Interrupted system call)
select(8, [3 7], [], NULL, NULL) = 1 (in [7])
read(7, "\21", 1)
(In reply to comment #29)
> Good catch, I did not see there's a write(8) after close(8). However I know
> what is the fd 8, but I did not find descriptor 7 ;-)
>
> fcntl(7, F_DUPFD_CLOEXEC, 3) = 8
>
> But the code below does not make a sense, so I assume sudo is in very strange
> state after EINTR
>
> close(8) = 0
> write(8, "\21", 1) = 1
> rt_sigreturn(0x8) = -1 EINTR (Interrupted system call)
> select(8, [3 7], [], NULL, NULL) = 1 (in [7])
> read(7, "\21", 1)

The question is in which piece of the puzzle write() is interrupted. Unless I am missing something, EINTR is a temporary error condition and hence has to be handled with TEMP_FAILURE_RETRY(write(....)). However, I'm afraid that won't solve the problem, as fd 8 is closed *before* the write...
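The retry idea mentioned above, repeating a call for as long as it fails with EINTR, can be sketched as follows. This is an illustration in Python rather than the glibc macro itself, and retry_on_eintr is a hypothetical name. Python 3.5 and later already retry interrupted system calls on their own (PEP 475), so the loop is shown purely to make the pattern explicit.

```python
import errno


def retry_on_eintr(func, *args):
    """Call func(*args) again for as long as it fails with EINTR.

    Any other error, for example EBADF from a descriptor that has
    already been closed, is passed through unchanged.
    """
    while True:
        try:
            return func(*args)
        except OSError as exc:
            if exc.errno != errno.EINTR:
                raise
```

As the comment above points out, a loop like this only helps with interrupted calls. It does nothing for a write to a descriptor that was closed beforehand.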
I would say we are focused on the wrong part of the trace. The most important fact is that systemd-logind sent HUP to the process for some reason. I've found only one place where that can happen, logind-session.c:622 [1], but I don't know why it decides it is time to close the session.

[1] http://cgit.freedesktop.org/systemd/systemd/tree/src/login/logind-session.c#n622
please test systemd >= 44-222.1 from Base:System, it contains a potential fix for this problem.
No change with:

# rpm -q systemd --changelog|head
* Čt bře 22 2012 fcrozat@suse.com
- Update fixppc.patch with upstream patches
- Add comments from upstream in 0001-util-never-follow-symlinks-in-rm_rf_children.patch.
- Add logind-logout.patch: it should fix sudo / su with pam_systemd (bnc#746704).
(In reply to comment #32)
> please test systemd >= 44-222.1 from Base:System, it contains a potential fix
> for this problem.

I no longer see my issue, even with systemd-43 from openSUSE:Factory. Gnome3 still forces me to do some restarts, and all tmux sessions now behave correctly despite the detach/attach.
could people with the bug try to add "session required pam_loginuid.so" to their /etc/pam.d/sudo file, before "session include common-session" line ?
(In reply to comment #35)
> could people with the bug try to add
> "session required pam_loginuid.so" to their /etc/pam.d/sudo file, before
> "session include common-session" line ?

Sure: no change...
Same for me with systemd-44 (including the potential fix from comment #32). I see no change in behavior, in the strace output, or in /var/log/messages.
I see the problem with a KDE x86_64 system installed with the Live CD Build 0318, which I think is 12.2 MS3.
Same here with Factory from Monday, running osc build crashed X (using KDM, konsole)
Btw. what happens if you remove pam_systemd from the pam.d files (comment it out)? Does that work around the problem?
I can now run osc build again without crashes after removing the pam_systemd (no need to login/logout). Interesting, man pam_systemd points out that killing of user processes is disabled by default - and looking at the configs it is indeed. Frederic, is there a way to check which flags are really in use? Perhaps somebody changed the default without updating the manual...
you can try to add debug=1 on pam_systemd line in pam configuration
My X server is now killed by running sudo -s ..
I've been able to reproduce the X crash one time, after enabling debug, but so far, that's it.. If you have a way to reproduce the crash reliably, I'm all ears..
This also happens with LightDM and Terminal or xterm.
On my system I had the following yesterday and the day before (pasted from /var/log/messages).

X server running, remote login via ssh:

Apr 25 20:14:08 byrd sshd[2251]: Accepted keyboard-interactive/pam for aj from 10.203.0.15 port 41638 ssh2
Apr 25 20:14:08 byrd systemd-logind[634]: New session 130 of user aj.

And then I ran sudo remotely:

Apr 25 20:36:25 byrd sudo: aj : TTY=pts/27 ; PWD=/home/aj/build/osc-branches/my-factory-packages/linux-glibc-devel ; USER=root ; COMMAND=/usr/bin/vi /usr/include/asm-x86/unistd_32.h
Apr 25 20:36:25 byrd systemd-logind[634]: Removed session 2.
Apr 25 20:36:28 byrd dbus-daemon[926]: **** /proc/self/mountinfo changed
Apr 25 20:36:29 byrd kdm: :0[4656]: pam_systemd(xdm:session): Failed to release session: No such file or directory
Apr 25 20:36:29 byrd kdm: :0[4656]: pam_close_session() failed: Cannot make/remove an entry for the specified session

Next morning, I only had kdm running, login on X server:

Apr 26 08:48:17 byrd systemd-logind[634]: New session 184 of user aj.
Apr 26 08:48:17 byrd systemd-logind[634]: Linked /tmp/.X11-unix/X0 to /run/user/aj/X11-displa
Apr 26 08:48:18 byrd checkproc: checkproc: can not get session id for process 4890!
Apr 26 08:48:38 byrd pulseaudio[31375]: [pulseaudio] pid.c: Stale PID file, overwriting.

First tries to run osc build - nothing happened:

Apr 26 09:16:17 byrd sudo: aj : TTY=pts/27 ; PWD=/home/aj/build/osc-branches/my-factory-packages/linux-glibc-devel ; USER=root ; COMMAND=/usr/bin/build --root=/abuild/osc/buildroot_openSUSE_Factory-x86_64 --rpmlist=/tmp/rpmlist.kMTKbQ --dist=/home/aj/build/osc-branches/my-factory-packages/linux-glibc-devel/.osc/_buildconfig-openSUSE_Factory-x86_64 --arch=x86_64 --norootforbuild --changelog --jobs=8 --debug /home/aj/build/osc-branches/my-factory-packages/linux-glibc-devel/linux-glibc-devel.spec
Apr 26 09:16:17 byrd systemd-logind[634]: New session 2 of user aj.
Apr 26 09:16:17 byrd systemd-logind[634]: Removed session 2.
....
Apr 26 09:19:33 byrd sudo: aj : TTY=pts/27 ; PWD=/home/aj/build/osc-branches/my-factory-packages/linux-glibc-devel ; USER=root ; COMMAND=/usr/bin/build --root=/abuild/osc/buildroot_openSUSE_Factory-x86_64 --rpmlist=/tmp/rpmlist.3CVfs3 --dist=/home/aj/build/osc-branches/my-factory-packages/linux-glibc-devel/.osc/_buildconfig-openSUSE_Factory-x86_64 --arch=x86_64 --norootforbuild --clean --changelog --jobs=8 --debug /home/aj/build/osc-branches/my-factory-packages/linux-glibc-devel/linux-glibc-devel.spec
Apr 26 09:19:33 byrd systemd-logind[634]: New session 2 of user aj.
Apr 26 09:19:33 byrd systemd-logind[634]: Removed session 2.
...

Second try to build - this time a different package:

Apr 26 09:21:40 byrd sudo: aj : TTY=pts/10 ; PWD=/home/aj/build/osc-branches/my-factory-packages/glibc ; USER=root ; COMMAND=/usr/bin/build --root=/abuild/osc/buildroot_openSUSE_Factory-x86_64 --rpmlist=/tmp/rpmlist.SIQTVE --dist=/home/aj/build/osc-branches/my-factory-packages/glibc/.osc/_buildconfig-openSUSE_Factory-x86_64 --arch=x86_64 --norootforbuild --changelog --jobs=8 --debug /home/aj/build/osc-branches/my-factory-packages/glibc/glibc.spec
Apr 26 09:21:40 byrd systemd-logind[634]: Removed session 184.
Apr 26 09:21:42 byrd dbus-daemon[926]: **** /proc/self/mountinfo changed
Apr 26 09:21:43 dbus-daemon[926]: last message repeated 2 times
Apr 26 09:21:43 byrd kdm: :0[3924]: pam_systemd(xdm:session): Failed to release session: No such file or directory
Apr 26 09:21:43 byrd kdm: :0[3924]: pam_close_session() failed: Cannot make/remove an entry for the specified session
Apr 26 09:21:45 byrd polkitd(authority=local): Unregistered Authentication Agent for unix-session:/org/freedesktop/ConsoleKit/Session3 (system bus name :1.1982, object path /org/kde/PolicyKit1/AuthenticationAgent, locale en_US.UTF-8) (disconnected from bus)
Apr 26 09:21:45 byrd dbus-daemon[926]: **** /proc/self/mountinfo changed
Apr 26 09:21:45 byrd dbus-daemon[926]: **** /proc/self/mountinfo changed
Apr 26 09:21:47 byrd console-kit-daemon[4684]: rmdir: failed to remove `/var/run/dbus/at_console/aj': Directory not empty
Apr 26 09:21:48 byrd udevd[307]: RUN+="socket:..." support will be removed from a future udev release. Please remove it from: /etc/udev/rules.d/71-multipath.rules:7 and use libudev to subscribe to events.
Apr 26 09:21:49 byrd udevd[307]: RUN+="socket:..." support will be removed from a future udev release. Please remove it from: /etc/udev/rules.d/71-multipath.rules:7 and use libudev to subscribe to events.
Apr 26 09:21:50 byrd acpid: 1 client rule loaded
Apr 26 09:22:07 byrd systemd-logind[634]: New session 187 of user aj.
Apr 26 09:22:07 byrd systemd-logind[634]: Linked /tmp/.X11-unix/X0 to /run/user/aj/X11-display.
Apr 26 09:22:08 byrd checkproc: checkproc: can not get session id for process 31038!

So sudo killed the X server and I logged in again. Note that there's no "New session" this time; it kills the X server session...
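To pick these patterns out of a larger /var/log/messages, a small filter along the following lines may help. It is a rough sketch, not a tested tool: the function names are hypothetical, and the regular expression assumes the syslog layout seen above (timestamp, host, then systemd-logind[pid]).

```python
import re

# Matches e.g. "Apr 26 09:16:17 byrd systemd-logind[634]: New session 2 of user aj."
EVENT = re.compile(
    r"^(?P<when>\w{3} +\d+ \d\d:\d\d:\d\d) \S+ systemd-logind\[\d+\]: "
    r"(?P<action>New|Removed) session (?P<sid>\w+)"
)


def logind_events(lines):
    """Return (timestamp, action, session id) for each logind session line."""
    events = []
    for line in lines:
        match = EVENT.match(line.strip())
        if match:
            events.append(
                (match.group("when"), match.group("action"), match.group("sid"))
            )
    return events


def short_lived_sessions(events):
    """Ids of sessions that were created and removed in the same logged second."""
    created = {(when, sid) for when, action, sid in events if action == "New"}
    return [
        sid
        for when, action, sid in events
        if action == "Removed" and (when, sid) in created
    ]
```

Fed the 09:16:17 entries above, short_lived_sessions reports session 2, the case where sudo silently does nothing. The 09:21:40 removal of session 184 has no matching "New session" line in the same second, so it shows up only in the logind_events output.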
I've pushed a package with additional debug in it to http://download.opensuse.org/repositories/home:/fcrozat:/debug_sudo/openSUSE_Factory logs welcome ;)
The above URL is not valid currently. Did you disable publishing, give a wrong URL, or is it still building?
should be available now (obs was slow..)
OK, I found a 100% reliable way to get the session terminated (even on a tty):

- login as user
- su -l
- logout as root
- sudo -s => terminate session

From the systemd-logind logs, it looks like the session created by su -l (or su) is not closed at logout. Investigating why..
This is an autogenerated message for OBS integration: This bug (746704) was mentioned in https://build.opensuse.org/request/show/116331 Factory / systemd
I've disabled logind-logout.patch in Factory systemd (and Base:System) since it is causing the session crashes. This means we are back to comment 1, and I need a reproducible test case. I'm interested in /var/log/messages (with debug=1 on the pam_systemd line in /etc/pam.d/common-auth) from when sudo is failing.
I've been able to reproduce the tmux bug (and also with screen), but I'm not 100% sure it is the same bug as the one reported by coolo, since coolo doesn't use either of those tools. To reproduce the tmux/screen bug, you need to create a tmux/screen session, detach it, log out physically, log in again (so the logind session id is different) and re-attach the tmux/screen session. The session id is still the old one and any su or sudo call might fail (for su calls, you see a killed... message at logout). If you don't use screen / tmux and are still having the bug, then when you get the bug, please give the output of systemd-loginctl and echo $XDG_SESSION_ID, and the pam_systemd output from /var/log/messages (with debug=1 on the pam_systemd line in common-session).
I would also need the value of /proc/self/sessionid when you get the bug (and also say which kernel and arch you are running)
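The two session identifiers requested above can be collected in one go with a short script. This is only a sketch: session_ids is a hypothetical helper, and it assumes /proc/self/sessionid is readable and holds the kernel's audit session id, as the request suggests.

```python
import os


def session_ids(environ=None, sessionid_path="/proc/self/sessionid"):
    """Collect the logind session id and the kernel audit session id.

    Both parameters exist only so the helper can be tried against
    made-up inputs; by default the real environment and /proc are used.
    """
    if environ is None:
        environ = os.environ
    try:
        with open(sessionid_path) as handle:
            audit = handle.read().strip()
    except OSError:
        audit = None  # file missing or unreadable
    return {
        "XDG_SESSION_ID": environ.get("XDG_SESSION_ID"),
        "audit_session_id": audit,
    }


if __name__ == "__main__":
    for name, value in session_ids().items():
        print(name, "=", value)
```

Running it once in a fresh terminal and once inside the re-attached tmux/screen session should show whether the two environments disagree.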
OK, I've been able to reproduce the issue reliably, with autologin with kdm.

Could people with this bug try to edit /etc/pam.d/xdm-np and add

session required pam_loginuid.so

before the line:

session include common-session

This should fix the "sudo" bug (not the tmux/screen bug).
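The edit described above can also be expressed as a small text transformation, which may be handy when several machines need the same change. This is a sketch under stated assumptions: add_loginuid is a hypothetical helper that works on the file contents as a string and leaves reading and writing /etc/pam.d/xdm-np (as root, after a backup) to the caller.

```python
def add_loginuid(pam_text):
    """Insert the pam_loginuid.so session line before the common-session include."""
    new_line = "session required pam_loginuid.so"
    anchor = ["session", "include", "common-session"]
    lines = pam_text.splitlines()
    fields = [line.split() for line in lines]
    # Leave the text alone if an active session line already loads the module.
    if any(f[:1] == ["session"] and "pam_loginuid.so" in f for f in fields):
        return pam_text
    for index, f in enumerate(fields):
        if f == anchor:
            lines.insert(index, new_line)
            return "\n".join(lines) + "\n"
    raise ValueError("no 'session include common-session' line found")
```

Calling the function a second time on its own output returns the text unchanged, so it is safe to apply repeatedly.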
(In reply to comment #55)
> ok, I've been able to reproduce the issue reliably, with autologin with kdm.
>
> Could people with this bug try to edit /etc/pam.d/xdm-np
>
> and add
>
> session required pam_loginuid.so
>
>
> before the line:
> session include common-session
>
> This should fix the "sudo" bug (not the tmux/screen bug)

That worked, thank you very much for your work ;) Isn't pam_loginuid supposed to be included in common-session before pam_systemd, or am I missing something here?
(In reply to comment #55)
> ok, I've been able to reproduce the issue reliably, with autologin with kdm.
>
> Could people with this bug try to edit /etc/pam.d/xdm-np
>
> and add
>
> session required pam_loginuid.so
>
>
> before the line:
> session include common-session
>
> This should fix the "sudo" bug (not the tmux/screen bug)

Unfortunately it worked only once... sudo -s worked, then I issued sudo rm -rf directorytoclean and the X session crashed again... :(
The X session crash should be fixed by disabling logind-logout.patch (I'm not sure this has reached Factory yet). loginuid shouldn't be part of common-session, but we should fix xdm-np.
(In reply to comment #58)
> the X session crash should be fixed by disabling logind-logout.patch (I'm not
> sure this has reached Factory yes). loginuid shouldn't be part of common-
> session, but we should fix xdm-np.

done. SR #120615
A new sudo, 1.8.5, has been in Factory for about a week. Its PAM session is finally managed in one process, instead of opening the session in one process, then forking and closing the PAM session in the child process. This fixed problems with some PAM modules, such as pam_mount. Could you give it a try?
I no longer have the problem after xdm-np was fixed.
Anyone with the tmux/screen bug?
There is still one bug remaining: sudo (or su -l) fails when used in a screen / tmux session that has been detached and re-attached, since the audit session id is no longer valid. Maybe we should track this in a separate bug report?
Hmm, I thought sudo 1.8.5 was part of Factory but it hasn't been checked-in yet. I've installed it from Base:System and I can no longer reproduce the tmux / screen bug, even after re-applying upstream patch in systemd which was causing X session to crash.
closing bug as fixed, with latest Factory