Bug 904097 - Use of more than one VT produces silent system crash
Status: RESOLVED FIXED
Classification: openSUSE
Product: openSUSE Distribution
Component: Other
Version: 13.2
Hardware: x86-64 openSUSE 13.2
Priority: P5 - None  Severity: Critical
Assigned To: Takashi Iwai
Reported: 2014-11-05 19:24 UTC by Stakanov Schufter
Modified: 2018-07-03 20:52 UTC (History)


Attachments
output of dmesg after crash (70.21 KB, text/plain)
2014-11-08 00:02 UTC, Stakanov Schufter
output of /var/log/messages (from an event triggered with the keyboard) (266.12 KB, text/plain)
2014-11-08 21:19 UTC, Stakanov Schufter
output of journalctl (example for sata freeze) (10.76 KB, text/plain)
2014-11-08 21:21 UTC, Stakanov Schufter
Kernel trace pre and post system freeze (920.97 KB, text/plain)
2014-12-01 23:25 UTC, Bruno Pesavento
output of crash with both patches (160 bytes, text/plain)
2014-12-04 12:32 UTC, Stakanov Schufter
dmesg output of 04/12/2014 (90 bytes, text/plain)
2014-12-04 12:33 UTC, Stakanov Schufter

Description Stakanov Schufter 2014-11-05 19:24:28 UTC
Hardware: Lenovo X201.
Create two or more users in KDE.
Work in one session and leave the other(s) open. Wait for a while (e.g. watch a TV news stream, or read in the browser); the mouse moves normally. Then change VT with Ctrl+Alt+Fx (with my settings that is F7 and F8 for two users, etc.), i.e. switch from the VT you are on to the other one.

Result: immediate, repeatable, consistent: the system crashes silently. There is just a white cursor in the top-left corner of the VGA screen; HDMI stays off. And even the cursor appears only if you pull the HDMI cable and plug it in again, or if you detach the laptop from the UltraBase docking station. No key can wake up the machine. The fan continues to run at the same speed as before.
Trying to reset the machine: no reaction. Closing and reopening the lid: no reaction. Ctrl-Alt-Del: no reaction. You have to do a hard reset with the power button. Afterwards you find that, of course, data is gone and programs (e.g. Firefox) were not shut down correctly, so it really did crash.

I have no clue where the error could be found. So if you need logs, tell me what to look for. There are some ACPI messages in /var/log/messages, but no evident warning or anything that I could recover as a crash log.

This is a very crippling bug.
Comment 1 Stakanov Schufter 2014-11-05 19:46:10 UTC
Update: O.K. Does not seem ACPI. It seems connected with this bug:
http://bugzilla.opensuse.org/show_bug.cgi?id=865337

You can trigger this bug at any time:
open VTs 1, 2 and 3,
then switch between the VTs in quick succession. After the first three switches between users ... boom. System crash. So no waiting is necessary (although waiting makes it dead certain: after 10 minutes without changing user, the first switch puts the system into the "walking dead" state).
Comment 2 Stakanov Schufter 2014-11-06 08:26:31 UTC
I now also get occasional freezes on the system. I checked the RAM and it is fine. I checked SMART and the disk shows no sign of errors. But I found a "cut here" section in /var/log/messages from the time the freezes occur, which I hope may help.

------------[ cut here ]------------
6/11/2014 09:21:08			2014-11-06T09:19:49.330626+01:00 arabafenice kernel: [ 1625.864078] WARNING: CPU: 0 PID: 8755 at ../drivers/gpu/drm/i915/intel_display.c:3324 intel_crtc_wait_for_pending_flips+0x165/0x170 [i915]()
6/11/2014 09:21:08			2014-11-06T09:19:49.330629+01:00 arabafenice kernel: [ 1625.864083] Modules linked in: rfcomm fuse xt_pkttype xt_LOG xt_limit af_packet ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw ipt_REJECT iptable_raw xt_CT iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack ip6table_filter ip6_tables x_tables bnep qcserial usb_wwan usbserial usblp wacom ecb btusb bluetooth 6lowpan_iphc arc4 iwldvm mac80211 iTCO_wdt iTCO_vendor_support snd_hda_codec_hdmi snd_hda_codec_conexant snd_hda_codec_generic iwlwifi cfg80211 snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcm thinkpad_acpi snd_seq snd_seq_device snd_timer intel_powerclamp coretemp kvm crct10dif_pclmul joydev pcspkr serio_raw snd intel_ips lpc_ich mfd_core i2c_i801 shpchp e1000e mei_me ptp soundcore mei pps_core thermal wmi rfkill ac battery tpm_tis tpm acpi_cpufreq processor dm_crypt crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd i915 i2c_algo_bit drm_kms_helper drm video button dm_mirror dm_region_hash dm_log dm_mod sg
6/11/2014 09:21:08			2014-11-06T09:19:49.330633+01:00 arabafenice kernel: [ 1625.864178] CPU: 0 PID: 8755 Comm: Xorg Not tainted 3.16.6-2-desktop #1
6/11/2014 09:21:08			2014-11-06T09:19:49.330635+01:00 arabafenice kernel: [ 1625.864181] Hardware name: LENOVO 3680W1J/3680W1J, BIOS 6QET70WW (1.40 ) 10/11/2012
6/11/2014 09:21:08			2014-11-06T09:19:49.330636+01:00 arabafenice kernel: [ 1625.864184]  0000000000000009 ffffffff8161ab03 0000000000000000 ffffffff8105bad7
6/11/2014 09:21:08			2014-11-06T09:19:49.330638+01:00 arabafenice kernel: [ 1625.864188]  0000000000000000 ffff88022e7c5000 ffff88022e6d8210 ffff88022e7da000
6/11/2014 09:21:08			2014-11-06T09:19:49.330640+01:00 arabafenice kernel: [ 1625.864193]  ffff88022e7da000 ffffffffa010abc5 0000000000000000 ffff8801c813c090
6/11/2014 09:21:08			2014-11-06T09:19:49.330642+01:00 arabafenice kernel: [ 1625.864197] Call Trace:
6/11/2014 09:21:08			2014-11-06T09:19:49.330644+01:00 arabafenice kernel: [ 1625.864218]  [<ffffffff8100519e>] dump_trace+0x8e/0x350
6/11/2014 09:21:08			2014-11-06T09:19:49.330646+01:00 arabafenice kernel: [ 1625.864225]  [<ffffffff81005506>] show_stack_log_lvl+0xa6/0x190
6/11/2014 09:21:08			2014-11-06T09:19:49.330647+01:00 arabafenice kernel: [ 1625.864231]  [<ffffffff81006c01>] show_stack+0x21/0x50
6/11/2014 09:21:08			2014-11-06T09:19:49.330649+01:00 arabafenice kernel: [ 1625.864239]  [<ffffffff8161ab03>] dump_stack+0x49/0x6a
6/11/2014 09:21:08			2014-11-06T09:19:49.330651+01:00 arabafenice kernel: [ 1625.864248]  [<ffffffff8105bad7>] warn_slowpath_common+0x77/0x90
6/11/2014 09:21:08			2014-11-06T09:19:49.330653+01:00 arabafenice kernel: [ 1625.864285]  [<ffffffffa010abc5>] intel_crtc_wait_for_pending_flips+0x165/0x170 [i915]
6/11/2014 09:21:08			2014-11-06T09:19:49.330655+01:00 arabafenice kernel: [ 1625.864449]  [<ffffffffa010d5a0>] intel_crtc_disable_planes+0x30/0x1a0 [i915]
6/11/2014 09:21:08			2014-11-06T09:19:49.330657+01:00 arabafenice kernel: [ 1625.864604]  [<ffffffffa010db25>] ironlake_crtc_disable+0x45/0x920 [i915]
6/11/2014 09:21:08			2014-11-06T09:19:49.330659+01:00 arabafenice kernel: [ 1625.864761]  [<ffffffffa010ee17>] intel_crtc_update_dpms+0x67/0x90 [i915]
6/11/2014 09:21:08			2014-11-06T09:19:49.330660+01:00 arabafenice kernel: [ 1625.864918]  [<ffffffffa0121d42>] intel_crt_dpms+0x62/0xb0 [i915]
6/11/2014 09:21:08			2014-11-06T09:19:49.331564+01:00 arabafenice kernel: [ 1625.865220]  [<ffffffffa007c086>] drm_mode_obj_set_property_ioctl+0x396/0x3b0 [drm]
6/11/2014 09:21:08			2014-11-06T09:19:49.331574+01:00 arabafenice kernel: [ 1625.865336]  [<ffffffffa007c0ce>] drm_mode_connector_property_set_ioctl+0x2e/0x40 [drm]
6/11/2014 09:21:08			2014-11-06T09:19:49.331576+01:00 arabafenice kernel: [ 1625.865439]  [<ffffffffa006b8c7>] drm_ioctl+0x1c7/0x5b0 [drm]
6/11/2014 09:21:08			2014-11-06T09:19:49.331578+01:00 arabafenice kernel: [ 1625.865459]  [<ffffffff811c9d27>] do_vfs_ioctl+0x2e7/0x4c0
6/11/2014 09:21:08			2014-11-06T09:19:49.331580+01:00 arabafenice kernel: [ 1625.865484]  [<ffffffff811c9f81>] SyS_ioctl+0x81/0xa0
6/11/2014 09:21:08			2014-11-06T09:19:49.331582+01:00 arabafenice kernel: [ 1625.865494]  [<ffffffff8162182d>] system_call_fastpath+0x1a/0x1f
6/11/2014 09:21:08			2014-11-06T09:19:49.331583+01:00 arabafenice kernel: [ 1625.865504]  [<00007fafb3dfa397>] 0x7fafb3dfa396
6/11/2014 09:21:08			2014-11-06T09:19:49.331585+01:00 arabafenice kernel: [ 1625.865507] ---[ end trace 9abe1ceab4944ca8 ]---

drm...seems video to me.
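
For anyone who needs to cut such a section out of a large log, here is a minimal sketch. It assumes the log is a plain text file in which each warning starts with a "cut here" line and ends with an "end trace" line, as in the excerpt above; the helper name extract_traces is only illustrative.

```shell
#!/bin/sh
# Print every block between a "cut here" marker and the following
# "end trace" line of a log file (default: /var/log/messages).
# extract_traces is a hypothetical helper name, not an existing tool.
extract_traces() {
    sed -n '/\[ cut here \]/,/\[ end trace /p' "${1:-/var/log/messages}"
}

# Usage:
#   extract_traces /var/log/messages > i915-warning.txt
```

The output file can then be attached to the bug instead of the whole log.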
Comment 3 Takashi Iwai 2014-11-06 19:17:53 UTC
So, your guess is that it's a systemd-related problem?

If the machine hangs, does it still respond to ping or to a remote login?
If you can log in remotely, you'll have a good chance to collect kernel logs.
Comment 4 Stakanov Schufter 2014-11-07 11:22:26 UTC
It would be my guess. Further, I have found out the following:
if you use the keyboard combination Ctrl+Alt+Fx in rapid sequence, then it crashes.
If you use the software switch of the plasmoid eYaSDP (enhanced yet another shutdown plasmoid), then ... it doesn't happen.
It did not occur to me that there would be a difference between switching VT via KDE and via Ctrl-Alt-Fx. Is there? Or is it a matter of speed (you are slower with the mouse)?

I may have the possibility to "ping" it when I am at another location next week. Here I have only my laptop, so I would not know how to test that here.
Comment 5 Takashi Iwai 2014-11-07 12:01:49 UTC
You may try the following:
- Run "sysctl kernel.sysrq=1"
  (and/or edit /etc/sysctl.conf and reboot for the next boot)
- Trigger Alt-SysRq-9 once.  This will change the loglevel to 9 (full).
- When the problem happens, trigger Alt-SysRq-T, Alt-SysRq-S, Alt-SysRq-U and Alt-SysRq-B.  This will dump the full stack traces, sync the disks, remount the filesystems read-only and reboot.
- Check the previous kernel messages.

If the stalling place is what the kernel warning above (comment #2) indicates, you might need to wait 1 minute for the timeout.  If the stack traces can be caught by the actions above, we can see a bit more detail.
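
As an alternative to editing /etc/sysctl.conf by hand, the setting can be written to a sysctl drop-in file. This is only a sketch: it assumes the system reads /etc/sysctl.d at boot, and both the function name and the file name 90-sysrq.conf are made up for illustration.

```shell
#!/bin/sh
# Write a sysctl drop-in that enables all SysRq functions at boot.
# write_sysrq_conf and 90-sysrq.conf are hypothetical names; the
# target directory can be overridden with the first argument.
write_sysrq_conf() {
    dir=${1:-/etc/sysctl.d}
    printf 'kernel.sysrq = 1\n' > "$dir/90-sysrq.conf"
}

# Usage (as root):
#   write_sysrq_conf             # persistent setting for later boots
#   sysctl kernel.sysrq=1        # enable immediately on the running system
#   cat /proc/sys/kernel/sysrq   # should now print 1
```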
Comment 6 Takashi Iwai 2014-11-07 12:02:34 UTC
Also, it's worth trying later kernel versions, 3.17.x and 3.18-rc, available in the OBS Kernel:stable and Kernel:HEAD repos.
Comment 7 Stakanov Schufter 2014-11-07 21:40:55 UTC
I tried to get the dump. Unless I am making a mistake, there must be a setting that erases the previous kernel log: I only get the kernel log of the current boot, whether from the CLI or from ksyslog. I tried Alt-SysRq-9 and then Alt-SysRq-T, S, U and B in that order. Nothing happens and the machine does not reboot. Maybe I am doing something completely wrong. Is it Ctrl+Alt+SysRq+9 or just Alt+SysRq+9?
The machine does reboot on Ctrl-Alt-SysRq-B. But /var/log/messages, which is where the dump should end up, shows only the kernel log of the current boot.
Forgive me these questions; I have never done this before, so I am learning as I go. As for the status, it is exactly as in the bug described above: the monitor keeps switching between standby and trying to switch on (with a black screen), with the described "bop" sound from the speakers.
Comment 8 Takashi Iwai 2014-11-07 23:24:07 UTC
Whether you use Alt or Ctrl-Alt doesn't matter much.  When the key combo works, you should see the relevant message.  Start with Alt-SysRq-9.  You should see output in dmesg such as:
  SysRq : Changing Loglevel
  Loglevel set to 9

Then, try Alt-SysRq-W.  It prints a lot of text to dmesg.  Check "journalctl -x -b 0".  You can try other sysrq commands, too.  An invalid Alt-SysRq combo (e.g. Alt-SysRq-minus) shows the help in dmesg.

On some laptop keyboards, it's often difficult to hit the SysRq key properly.  If so, it'd be better to use a USB keyboard.

Once this works, try to reproduce the problem and collect the sysrq output.
After reboot, check the output of "journalctl -x -b -1".  This will give you all logs in the previous boot.
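
Note that "journalctl -b -1" can only show the previous boot if journald keeps its logs on disk. A minimal sketch of a check follows, assuming journald's default Storage=auto behaviour, where logs survive a reboot only if the directory /var/log/journal exists; the helper name is hypothetical.

```shell
#!/bin/sh
# Succeed if the persistent journal directory exists.
# journal_is_persistent is a hypothetical helper; the directory can be
# overridden with the first argument.
journal_is_persistent() {
    [ -d "${1:-/var/log/journal}" ]
}

# Usage (as root):
#   journal_is_persistent || mkdir -p /var/log/journal
#   systemctl restart systemd-journald
#   journalctl --list-boots      # boots for which logs are stored
```

Logs from boots before the directory was created are not recovered by this; only later boots are kept.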
Comment 9 Stakanov Schufter 2014-11-08 00:02:28 UTC
Created attachment 612869 [details]
output of dmesg after crash

I tried as you said and was able to get the combo to work. I attach the post-crash output of dmesg. But when I tried:
journalctl -x -b -1
Failed to look up boot -1: Cannot assign requested address
Any suggestions?
Comment 10 Takashi Iwai 2014-11-08 19:48:21 UTC
Well, the messages after the reboot don't help, unfortunately.  We need the kernel logs from the time the bug happens.

If you upgraded the system, you may still have /var/log/messages.  This might contain the messages of the previous boot.  Pick the relevant part by date/time and cut it out.
Also, if you run "journalctl -x", it'll print all messages.  Again, you can cut out the relevant part judging from the date/time.

Ideally, though, you would log in remotely while the VT is frozen.  You can then trigger sysrq directly by writing to the proc file /proc/sysrq-trigger, e.g.
  # echo t > /proc/sysrq-trigger
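
From a remote login, several sysrq commands can be sent in a row with a small wrapper. This is a sketch only: the function name and the SYSRQ_TRIGGER override variable are invented for illustration, and writing to the real trigger file needs root.

```shell
#!/bin/sh
# Send one or more SysRq command letters through the proc interface.
# sysrq_send and SYSRQ_TRIGGER are hypothetical names; by default the
# real /proc/sysrq-trigger file is used.
sysrq_send() {
    trigger=${SYSRQ_TRIGGER:-/proc/sysrq-trigger}
    for key in "$@"; do
        # Each letter is written separately, one command per write.
        printf '%s\n' "$key" > "$trigger" || return 1
    done
}

# Usage (as root, over ssh while the VT is frozen):
#   sysrq_send w t                # blocked tasks, then all tasks
#   dmesg > /tmp/sysrq-dump.txt   # save the result before rebooting
```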
Comment 11 Stakanov Schufter 2014-11-08 21:19:43 UTC
Created attachment 612885 [details]
output of /var/log/messages (from an event triggered with the keyboard)

This is from /var/log/messages at the moment of the screen crash, with the machine staying "black screen dead but alive" afterwards until the reboot. I cut it where the time stamp changed after the reboot.
The second attachment is from the systemd command you gave me.
Finally, FYI: I have now tried the kernel from stable, 3.17.2, and the freezes and the screen crash on VT switch seem to have stopped. I updated the kernel firmware too. I will be able to confirm after testing a bit longer.
However, if you see something that allows you to fix a bug, I am more than willing to drop back to the original kernel and crash the poor beast until it asks for mercy. You just tell me what you need and I will do it.

Current system is now:
uname -a
Linux arabafenice.site 3.17.2-2.g3788128-desktop #1 SMP PREEMPT Wed Nov 5 15:04:15 UTC 2014 (3788128) x86_64 x86_64 x86_64 GNU/Linux
and, up to now, it seems to be stable with this one (still to be confirmed, I guess).
Comment 12 Stakanov Schufter 2014-11-08 21:21:33 UTC
Created attachment 612886 [details]
output of journalctl (example for sata freeze)

This seems to have stopped too after the kernel update. The kernel was of course 3.16.6-2.
Comment 13 Takashi Iwai 2014-11-09 08:57:57 UTC
(In reply to Stakanov Schufter from comment #11)
> Created attachment 612885 [details]
> output of /var/log/messages (from an event triggered with the keyboard)
> 
> This is from the /var/log/messages at the moment of screen crash and the
> machine staying "black screen dead but alive" afterwards up to reboot. I did
> cut when the time stamp changed after reboot.

The output is cut off, and doesn't contain the important bits, unfortunately.
Maybe you can try alt-sysrq-w instead of alt-sysrq-t.

> Second attachment is from the systemd command you gave me. 
> Finally FYI, I tried now the kernel from stable, 3.17.2 and freezes and
> crashing of screen with VT seems to have stopped. I updated kernel firmware
> too. I will be able to confirm after trying a bit. 
> However, if you see something that allows to fix a bug, I am more than
> available to drop back to original kernel and crash the poor beast until it
> asks for mercy. You just tell what you need, I do. 
 
OK, could you clarify which kernel version caused the problem exactly?
The rpm -qi shows the git commit ID, too.
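
Besides "rpm -qi kernel-desktop", the commit ID can usually be read from the uname string. A rough sketch, assuming (as in the uname output quoted in comment #11) that the short git commit is the hex string in parentheses; the helper name is made up.

```shell
#!/bin/sh
# Extract a parenthesized hex string of 7 or more characters from a
# uname string. That this is the git commit ID is an assumption based
# on the uname output quoted in this report.
# kernel_commit is a hypothetical helper name.
kernel_commit() {
    printf '%s\n' "$1" | sed -n 's/.*(\([0-9a-f]\{7,\}\)).*/\1/p'
}

# Usage:
#   kernel_commit "$(uname -v)"
```

Nothing is printed if the string contains no such ID.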
Comment 14 Takashi Iwai 2014-11-09 08:58:36 UTC
(In reply to Stakanov Schufter from comment #12)
> Created attachment 612886 [details]
> output of journalctl (example for sata freeze)
> 
> This seems to have stopped too after kernel update. Kernel was naturally
> 3.16.6-2

This looks unrelated to this bug itself.  If it matters, please open another bug report.  Thanks.
Comment 15 Takashi Iwai 2014-11-18 14:22:16 UTC
Could you test i915 KMP in OBS home:tiwai:bnc904097/i915 repo?
  http://download.opensuse.org/repositories/home:/tiwai:/bnc904097/standard/

This contains a few fixes from upstream.  It might not fix the very first trigger of stall, but it shouldn't happen too frequently with these fixes.
Comment 16 Stakanov Schufter 2014-11-18 15:19:00 UTC
Thank you. I have it installed now. Which kernel do you want me to test with it? I think 3.16.6-2.1 from the standard repos right?
Comment 17 Takashi Iwai 2014-11-18 15:22:13 UTC
This should work with openSUSE 3.16.x kernels as long as kABI is kept.
After installing it, check "/sbin/modinfo i915 | grep filename" output.
If everything is OK, the path should contain "updates/" or "weak-updates/".
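
The same check can be scripted. A small sketch, with a hypothetical function name, that tests a module path for the "updates/" or "weak-updates/" component:

```shell
#!/bin/sh
# Succeed if the given module path lies under updates/ or weak-updates/,
# i.e. the module comes from a KMP rather than from the kernel package.
# is_kmp_module is a hypothetical helper name.
is_kmp_module() {
    case $1 in
        */updates/*|*/weak-updates/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   is_kmp_module "$(/sbin/modinfo -n i915)" && echo "KMP module in use"
```

Here "modinfo -n" prints only the file name of the module, so no grep is needed.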
Comment 18 Stakanov Schufter 2014-11-18 21:29:43 UTC
Gives /lib/modules/3.16.6-2-desktop/updates/drivers/gpu/drm/i915/i915.ko

So, I will check for performance and stability. It will take some time to get you the results on the UltraBase, as I am travelling and have only the laptop with me. In the meantime, no news is good news: if you do not get direct feedback after this, it means that (standalone, without the UltraBase) everything works fine.
Once I have a good impression of how it works with the UltraBase and multiple external monitors, in combination with VT switching via the function keys, I will update the info here.
Cheers.
Comment 19 Stakanov Schufter 2014-11-18 23:39:29 UTC
OK. I tried it and it crashes really fast (two quick VT switches and ....). Tomorrow I will try to give you the complete log file for the problem, using the method described above.
With the kernel 3.17 it is instead really stable. No problems at all. 
Will get back to you soon. 

P.S. it is only the video part crashing. I have noted that the WLAN e.g. continues to be connected when it happens. So it is purely a video issue it seems.
Comment 20 Takashi Iwai 2014-11-19 13:44:26 UTC
It looks like one upstream patch triggers a kernel BUG, so I disabled the 3.18 fixes and added another one from 3.17.  Please try the new package.  The changelog entry should look like:

* Wed Nov 19 2014 tiwai@suse.de
- use other patchsets

* Tue Nov 18 2014 tiwai@suse.de
- test fix
Comment 21 Stakanov Schufter 2014-12-01 16:39:29 UTC
O.K. Sorry for the slow reply. 
I tested this now, the current status is O.K. with 
3.16.6-2-desktop #1 SMP PREEMPT Mon Oct 20 13:47:22 UTC 2014 (feb42ea) x86_64 x86_64 x86_64 GNU/Linux
and your fixes applied. If you switch between the VTs lightning fast many times, it is still possible to crash it. But normal operation (e.g. change VT, check mail, change back to the other VT) is stable AFAIK.

If you wish, I can try to give you the output of a black screen, but the trigger is now very artificial (jumping 10 times from VT to VT in less than 30 seconds). I do not think that this is normal operation. So under normal use conditions it does not happen with the latest fix.
Comment 22 Takashi Iwai 2014-12-01 16:42:32 UTC
OK, at least, it's good to hear that something gets improved.

Could you re-confirm that you get the crash again after uninstalling the KMP (and initrd creation)?  This is just to make sure that the fix really comes from the KMP.

About the fast-switching crash: this might be a different issue from yours.  Do you get a similar back trace in that case, too?
Comment 23 Stakanov Schufter 2014-12-01 16:56:32 UTC
I will uninstall the patch as requested and report back.

On the other issue, it will be best that I provide you with the back trace, as I am not really able to judge it myself. The easiest is to do that tomorrow or the day after, as I will then have the complete use case again, with the docking station compared with the "naked notebook".

If it crashes with the patches and with the docking station, I will provide a backtrace.

If it crashes with the patch uninstalled, I will provide info.

If it is possible to produce a backtrace with the fast switch, I will provide that too.
I hope you can live with another 48 hours of doubt.

BTW: with 3.17 this function is sound. So either this problem came up between the two versions, or it was introduced with 3.16.x and then corrected in 3.17.x.
Comment 24 Bruno Pesavento 2014-12-01 23:25:51 UTC
Created attachment 615540 [details]
Kernel trace pre and post system freeze

Hi, since I see no answer to your request, I attach the kernel trace I got in what seems the same bug, or at least gives the same end result: black screen, unresponsive keyboard, only way out forcing power down.
Feel free to discard if you think this is something different.
This happens in OS 13.2 X86_64 Gnome, Core2Duo and Intel 965GM laptop (HP 6510b).
The attached trace was on Kernel 3.16.6-2-desktop with your i915-kmp-desktop-3.16_k3.16.6_2-4.1.x86_64 but I saw no visible difference on 3.17.3 or 3.17.4 from Kernel:stable.

To reproduce: Login user 1000 (grabs VT7 as usual); switch user;
Login user 1001 (grabs VT2 or VT3, I expected VT8), it works as usual; Logout (system freezes as described in this bug).
Reproducible: always.
Same freeze trying to switch to VT7 again instead of logging out user 1001.

Sometimes the system freezes gracefully enough to record a kernel trace according to your hint; attached is the best record I got out of some 20 tries.
@23:09:13 is a kernel trace with user 1001 working as usual.
@23:10:54 user 1001 (mmouse) logs out.
@23:11:04 is a kernel trace with the system frozen; I hope it helps (almost meaningless to me...)

It seems that gdm-Xorg-:1[1870] on VT2 segfaults on logout of user 1001 and control is never returned to the gdm-Xorg-:0[851] serving user 1000 on VT7.
The system keyboard is not exactly "dead", since sysrq commands get through, as do commands from wifi key etc. as originally reported.
Ready to do more testing if needed.
Comment 25 Bruno Pesavento 2014-12-01 23:33:03 UTC
(In reply to Stakanov Schufter from comment #23)
> BTW: with 3.17 this function is sound. So this problem came up between the
> two versions, or this was introduced with 3.16.x and then was corrected in
> 3.17.x

Sorry, apparently I missed comments 21 to 23. Maybe I witnessed something different, discard my notes.
Comment 26 Takashi Iwai 2014-12-02 16:04:46 UTC
(In reply to Bruno Pesavento from comment #25)
> (In reply to Stakanov Schufter from comment #23)
> > BTW: with 3.17 this function is sound. So this problem came up between the
> > two versions, or this was introduced with 3.16.x and then was corrected in
> > 3.17.x
> 
> Sorry, apparently I missed comments 21 to 23. Maybe I witnessed something
> different, discard my notes.

In your case, the problem might be unrelated to the page-flip stall.
As you already noticed, there was a crash of the X server.  This might be blocking further use of the graphics and consoles.

Did you try to log in remotely while testing it?  I guess the system is still alive and remotely reachable, but neither graphics nor VTs are controllable.
Comment 27 Bruno Pesavento 2014-12-02 21:58:21 UTC
(In reply to Takashi Iwai from comment #26)
> (In reply to Bruno Pesavento from comment #25)
 
> In your case, the problem might be unrelated to the page-flip stall.
> As you already noticed, there was a crash of the X server.  This might be
> blocking further use of the graphics and consoles.
> 
> Did you try to log in remotely while testing it?  I guess the system is still
> alive and remotely reachable, but neither graphics nor VTs are controllable.

Thanks for your clues, definitely an Xorg issue unrelated to page flip.
Updating to the latest xorg-server and booting with safe settings fixed the problem. I'll join/open another bug report if I find something useful.
Comment 28 Stakanov Schufter 2014-12-04 12:32:13 UTC
Created attachment 615891 [details]
output of crash with both patches

I have bad news. After 7 VT switches it crashes. When it comes up again, the logs are apparently lost (I have attached what was possible) and the graphics on the second monitor do not work anymore. Only after a complete hardware reset can you use the monitor again. Do you want me to install today's patch and see if it changes anything? How can I make sysctl or /var/log/messages persist? I tried what we discussed above, but it does not seem to work. There is a "sending sigterm to sysctl" message during boot.
There is also a warning message that graphics turbo cannot be activated.
After a complete hard reset, however, it is gone.
Comment 29 Stakanov Schufter 2014-12-04 12:33:23 UTC
Created attachment 615892 [details]
dmesg output of 04/12/2014
Comment 30 Takashi Iwai 2014-12-04 14:05:26 UTC
Yes, could you try the latest KMP again?  This contains a fix for the hanging drm_read().  Check the rpm changelog to see whether it contains today's change.

There are different causes leading to the X stall.  One is the vblank page-flip hang in the i915 driver, and another is a race in drm_read that stalls X.  Let's see whether what you're seeing is the second one, which I fixed today...
Comment 31 Bruno Pesavento 2014-12-09 16:00:01 UTC
(In reply to Takashi Iwai from comment #30)
> Yes, could you try the latest KMP again?  This contains a fix for the hanging
> drm_read().  Check the rpm changelog to see whether it contains today's change.
> 
> There are different causes leading to the X stall.  One is the vblank page-flip
> hang in the i915 driver, and another is a race in drm_read that stalls X.
> Let's see whether what you're seeing is the second one, which I fixed today...

This latest KMP adds stability to a fix for the crash described in comment #27, likely thanks to the drm_read fix.
That #27 problem with 965GM is fixed by explicitly loading the dri2 module _before_ glx gets loaded, or disabling dri entirely.
But without your last KMP, Xorg still crashes occasionally on VT switch.
I got similar stability with kernel 3.17.6, maybe because it includes a similar patch in i915.ko?
Comment 32 Bernhard Wiedemann 2014-12-12 10:00:34 UTC
This is an autogenerated message for OBS integration:
This bug (904097) was mentioned in
https://build.opensuse.org/request/show/264975 13.2 / kernel-source
Comment 33 Bruno Pesavento 2014-12-13 16:05:29 UTC
I tested kernel-desktop-3.16.7-18.1.g4643cc3.x86_64 and the VT-switch problem described in comment #27 and #31 is gone, with default settings:
no need for explicit dri2 loading, no need for extra xorg.conf file.
Good work, thanks
Comment 34 Takashi Iwai 2014-12-15 10:06:26 UTC
OK, I am closing this bug now in the hope that it is fixed in the upcoming update kernel.  If you still see the issue, feel free to reopen.
Comment 35 Swamp Workflow Management 2014-12-21 12:15:28 UTC
openSUSE-SU-2014:1678-1: An update that solves 8 vulnerabilities and has 22 fixes is now available.

Category: security (important)
Bug References: 665315,856659,897112,897736,900786,902346,902349,902351,902632,902633,902728,903748,903986,904013,904097,904289,904417,904539,904717,904932,905068,905100,905329,905739,906914,907818,908163,908253,909077,910251
CVE References: CVE-2014-3673,CVE-2014-3687,CVE-2014-3688,CVE-2014-7826,CVE-2014-7841,CVE-2014-8133,CVE-2014-9090,CVE-2014-9322
Sources used:
openSUSE 13.2 (src):    kernel-docs-3.16.7-7.2, kernel-obs-build-3.16.7-7.3, kernel-obs-qa-3.16.7-7.2, kernel-obs-qa-xen-3.16.7-7.2, kernel-source-3.16.7-7.1, kernel-syms-3.16.7-7.1