Bug 903874 - Illegal instruction on Core-i5 after s2ram with microcode 0x1c
Summary: Illegal instruction on Core-i5 after s2ram with microcode 0x1c
Status: RESOLVED FIXED
Alias: None
Product: openSUSE Distribution
Classification: openSUSE
Component: Other (show other bugs)
Version: 13.2
Hardware: x86-64 openSUSE 13.2
: P5 - None : Normal (vote)
Target Milestone: ---
Assignee: Borislav Petkov
QA Contact: E-mail List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-11-04 18:06 UTC by Michael Karbach
Modified: 2015-02-25 15:52 UTC (History)
15 users (show)

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
dmidecode.txt (22.33 KB, text/plain)
2014-11-21 06:44 UTC, Michael Karbach
Details
core file of a crashed "okular" binary after s2ram (3.26 MB, application/x-core)
2014-12-11 21:31 UTC, Joerg Reuter
Details
dmesg after wake-up (72.33 KB, text/plain)
2014-12-18 21:05 UTC, Joerg Reuter
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Karbach 2014-11-04 18:06:56 UTC
Upgrading from OS-13.1 to OS-13.2 on two different Computers, I observed on terminal:

 /sbin/yast2: line 440:  8269 Illegal instruction     $ybindir/y2base $module "$@" "$SELECTED_GUI" $Y2_GEOMETRY $Y2UI_ARGS

while using YaST.

I am also observing this 'Illegal instruction' sometimes by issuing a zypper command. Immediately after repeating the command, it mostly works. I think the same behavior/error appears by starting apps as a user, which sometimes simply do not start, only if one starts again, e.g. often upon starting okular.

This I have observed on two different computers.     


Michael
Comment 1 Bernhard Wiedemann 2014-11-05 08:01:59 UTC
Could you please give the output from cat /proc/cpuinfo

I remember that there was a problem with ruby and old processors
because of CPU detection happening at compile-time.

can you also reproduce this by calling yast2 manually?
Comment 2 Michael Karbach 2014-11-05 08:04:49 UTC
Here the output of: 

su-wptnb8 /root: cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 60
model name      : Intel(R) Core(TM) i5-4670S CPU @ 3.10GHz
stepping        : 3
microcode       : 0x1c
cpu MHz         : 1010.648
cache size      : 6144 KB
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 4
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid
bogomips        : 6186.22
clflush size    : 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual
power management:

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 60
model name      : Intel(R) Core(TM) i5-4670S CPU @ 3.10GHz
stepping        : 3
microcode       : 0x1c
cpu MHz         : 899.968
cache size      : 6144 KB
physical id     : 0
siblings        : 4
core id         : 1
cpu cores       : 4
apicid          : 2
initial apicid  : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid
bogomips        : 6186.22
clflush size    : 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual
power management:

processor       : 2
vendor_id       : GenuineIntel
cpu family      : 6
model           : 60
model name      : Intel(R) Core(TM) i5-4670S CPU @ 3.10GHz
stepping        : 3
microcode       : 0x1c
cpu MHz         : 926.730
cache size      : 6144 KB
physical id     : 0
siblings        : 4
core id         : 2
cpu cores       : 4
apicid          : 4
initial apicid  : 4
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid
bogomips        : 6186.22
clflush size    : 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual
power management:

processor       : 3
vendor_id       : GenuineIntel
cpu family      : 6
model           : 60
model name      : Intel(R) Core(TM) i5-4670S CPU @ 3.10GHz
stepping        : 3
microcode       : 0x1c
cpu MHz         : 991.273
cache size      : 6144 KB
physical id     : 0
siblings        : 4
core id         : 3
cpu cores       : 4
apicid          : 6
initial apicid  : 6
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid
bogomips        : 6186.22
clflush size    : 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual
power management:
Comment 3 Michael Karbach 2014-11-05 08:10:28 UTC
The problem is, it happens not all the time. As an observation, it happens often after a while not using this command. And indeed, I always start yast2 from terminal.

But as I mentioned, I have the feeling, that other programs are as well affected. Every once a while starting an app (e.g. firefox, okular) those programs do not start only the KDE-start-icon is jumping and only a second klick do really start the program.
Comment 4 Marcus Rückert 2014-11-05 08:43:08 UTC
the ruby problem was about SSE2. his machine has SSE2. and we patched the ruby build, to not include the SSE2 instructions. this smells more like broken memory/cpu.
Comment 5 Michael Karbach 2014-11-05 08:47:19 UTC
If it is a broken cpu/memory, it must be related in some sense to OS-13.2, because I am running this computer since OS-13.1 without any issue and it happened immediately after upgrading.
Comment 6 Martin Vidner 2014-11-05 15:39:04 UTC
FWIW, the SSE2 bug was https://bugzilla.opensuse.org/show_bug.cgi?id=889714

You may want to install memtest and reboot to run it from the boot menu:

sudo zypper in memtest86+
Comment 7 Michael Karbach 2014-11-05 16:46:53 UTC
Ok, I have made the memtest, and it completed without errors.
Comment 8 Bernhard Wiedemann 2014-11-05 18:44:52 UTC
What can also be done to further trace this down:
issue on the shell before the crashing commands:
ulimit -c unlimited

On crash this will produce a "core" file which you can attach here
and one can use it (with debuginfo packages) to locate the faulty instruction
Comment 9 Michael Karbach 2014-11-06 05:51:11 UTC
OK, I will try it with ulimit, here is another command with this issue:

/sbin/service: line 68: 15282 Illegal instruction systemctl "$2" --full "$1.service"

which has appeared while using zypper up.
Comment 10 Michael Karbach 2014-11-19 15:58:31 UTC
May be it is related to suspend/resume cycles? 

I have observed, if I do not suspend the computer, those errors actually do not appear. My computer is 24/7 on and after 7 days without suspending, everything works pretty well. As soon as I resume, the error appears within a few hours/minutes (depends on workload). This is more or less reproduce able.

M
Comment 11 Bernhard Wiedemann 2014-11-21 06:26:16 UTC
Is that suspend-to-RAM or suspend-to-disk?
Can you try out the other variant to see if that also causes this problem?
The command is s2ram or s2disk (needing some setup in /etc/suspend.conf)

My guess is that the problem is somewhere
between the hardware, ACPI and the kernel
maybe with some corruption of memory happening.

To give us a better idea about the hardware, please attach the output of
dmidecode
Comment 12 Michael Karbach 2014-11-21 06:44:42 UTC
Created attachment 614512 [details]
dmidecode.txt
Comment 13 Michael Karbach 2014-11-21 06:45:13 UTC
It is s2ram where problem appears. 

I have:


su-wptnb8 /root: free
             total       used       free     shared    buffers     cached
Mem:      16371812   11054944    5316868     274136     544536    6035040
-/+ buffers/cache:    4475368   11896444
Swap:      4192252          0    4192252

Attached is the dmidecode-output as a file.
Comment 14 Takashi Iwai 2014-11-21 10:33:21 UTC
Could you enable the core dump generation and analyze where it crashes?  This should be the first step.
Comment 15 Joerg Reuter 2014-12-11 21:10:55 UTC
I see the same issue on my home machine, similar CPU, same microcode level and stepping:

cpu family      : 6
model           : 60
model name      : Intel(R) Core(TM) i5-4570 CPU @ 3.20GHz
stepping        : 3
microcode       : 0x1c

Definitely corelated with suspend to RAM. I'll enable core dump writing now.
Comment 16 Joerg Reuter 2014-12-11 21:31:49 UTC
Created attachment 616798 [details]
core file of a crashed "okular" binary after s2ram

Funny enough, it will start flawlessly on the second attempt...
Comment 17 Michael Karbach 2014-12-12 07:24:57 UTC
This behaviour actually does happen for all commands you issue. On console you have to type sometimes two or three times and eventually it works. Ans sometimes you have two click more on okular or other apps.
Comment 18 Takashi Iwai 2014-12-13 18:19:22 UTC
(In reply to Joerg Reuter from comment #15)
> I see the same issue on my home machine, similar CPU, same microcode level
> and stepping:
> 
> cpu family      : 6
> model           : 60
> model name      : Intel(R) Core(TM) i5-4570 CPU @ 3.20GHz
> stepping        : 3
> microcode       : 0x1c
> 
> Definitely corelated with suspend to RAM. I'll enable core dump writing now.

Hrm, relevant with microcode?  What if you downgrade/upgrade ucode-intel package?

I suspected drm/i915 problem at first, but it was about S4 in the past...
Comment 19 Joerg Reuter 2014-12-13 23:18:58 UTC
I've now removed the ucode-intel package, recreated the initrd (IIRC the microcode gets loaded from there, but I may be mistaken), rebooted, lo and behold: microcode 0x16 as loaded by the BIOS does NOT seem to show this behaviour after each of two suspended sessions.

So this appears to be microcode related, indeed.
Comment 20 Takashi Iwai 2014-12-14 08:25:51 UTC
Oh, I didn't expect so much my blind shot really hits.

For checking whether it's a kernel side regression or a microcode regression, could you try the following?

- Test openSUSE 13.1 kernel and SLE12 kernel with the latest ucode-intel (20140913) enabled

- Downgrade to openSUSE 13.1-Update ucode-intel package (20130906):
  http://download.opensuse.org/update/13.1/x86_64/ucode-intel-20130906-6.2.x86_64.rpm

BTW, yes, recreating initrd is needed (unless it's triggered automatically by rpm).  You can check the output of lsinitrd whether it includes the early microcode or not.
Comment 21 Joerg Reuter 2014-12-17 18:34:58 UTC
I finally found the time to test it:

- With ucode-intel-20130906-6.2 the microcode level is still 0x16, thus no
  issues after resume (yes, tested.)

- With ucode-intel-20140913-1.1 / microcode level 0x1c, each
  - kernel-desktop-3.11.10-21.1 / oS 13.1
  - kernel-default-3.12.28-4.6 / SLE 12
  - kernel-desktop-3.16.6-2.1 / oS 13.2
  shows segfaults after s2ram.
Comment 22 Takashi Iwai 2014-12-17 19:26:35 UTC
Thanks Joerg.

So, this looks really like a regression of CPU microcode, at least, for certain model/revision.

Keve, could you ask your team to investigate this?
Comment 23 Andi N Kleen 2014-12-17 19:33:21 UTC
Can you please use gdb on any of the segfaulting programs and dump the faulting instruction?


x/i $pc

Thanks.
Comment 24 Joerg Reuter 2014-12-17 20:14:12 UTC
Each and every segfault I've seen happens at:

(gdb) x/i $pc
=> 0x7fa42aa9312b <__lll_lock_elision+75>:      xbeginq 0x7fa42aa93131 <__lll_lock_elision+81>
(gdb)
Comment 25 Joerg Reuter 2014-12-17 20:15:34 UTC
(__lll_lock_elision() is a function of /lib64/libpthread.so.0)
Comment 26 Joerg Reuter 2014-12-17 20:21:28 UTC
*google-di-google* https://lkml.org/lkml/2014/9/18/218 (via https://bugzilla.redhat.com/show_bug.cgi?id=1146967) *headdesk*
Comment 27 Andi N Kleen 2014-12-17 21:24:32 UTC
Ok this is a known problem.

What happens is that the ucode update disables TSX, so disables a bit in the CPUID. It also makes the TSX instructions fault.
 
But you have a TSX enabled platform and glibc pthread is using TSX. It only checks the CPUID at the beginning.

So if you load the ucode update after you started a program using pthreads it has already checked TSX CPUID and may be using it, and suddenly it gets disabled, so it gets these faults.

The recommended way to handle it is to load this update only with the early microcode update from the initrd. This does the update when nothing is running that could be already using TSX.

Alternatively you can disable TSX in the glibc. The first option is better.
Comment 28 Joerg Reuter 2014-12-17 21:48:24 UTC
In this case, the new microcode got loaded from the initrd already, though. Why is glibc using affected instructions after wake-up from suspend to RAM all of a sudden?
Comment 29 Andi N Kleen 2014-12-17 22:05:56 UTC
Just to clarify you start the programs immediately after s2ram and they fault?
If you start them like 30s later does it still happen?

If yes:

Somehow you must be racing with a microcode update.
When the CPU comes out of S3 the microcode has been reset, so it gets updated.
It could be that some of the CPUs being onlined update later.
Comment 30 Fenghua Yu 2014-12-17 23:15:56 UTC
Could you please upload dmesg after failure?
Comment 31 Joerg Reuter 2014-12-18 21:05:00 UTC
Created attachment 617916 [details]
dmesg after wake-up

Sorry that it takes a while, I'm reproducing this on my workstation at home.

BTW, I was still getting segfaults seven minutes after wake-up...
Comment 32 Andi N Kleen 2014-12-19 00:24:27 UTC
And the processes that segfault minutes later also have been started minutes later, or could they have started before or during the suspend?
Comment 33 Joerg Reuter 2014-12-19 10:04:58 UTC
As far as I can see only newly started processes are affected (the crashing sshd is that of a new SSH session, the sshd process that listens to incoming connections keeps working, as well as ssh sessions started before suspend)
Comment 34 Bernhard Wiedemann 2014-12-19 13:39:11 UTC
So how and when does glibc autotect if it can use TSX / xbeginq ?

And how and when is microcode re-loaded after resume from S3?

Because it apears as if glibc thinks, it is OK to use TSX,
but when it actually does, the CPU disagrees and the process gets a SIGILL.
Comment 35 Takashi Iwai 2014-12-19 17:29:01 UTC
(In reply to Bernhard Wiedemann from comment #34)
> So how and when does glibc autotect if it can use TSX / xbeginq ?

It checks CPUID (RTM bit) AFAIK.
 
> And how and when is microcode re-loaded after resume from S3?

The microcode is reloaded at enable_nonboot_cpu().  It invokes load_ucode_ap() in cpu_init() for each secondary CPUs, and this loads the early microcode in the end.  You can see the dmesg output:

% grep microcode dmesg-after-wakeup.txt 
[    0.000000] CPU0 microcode updated early to revision 0x1c, date = 2014-07-03
[    0.057385] CPU1 microcode updated early to revision 0x1c, date = 2014-07-03
[    0.073484] CPU2 microcode updated early to revision 0x1c, date = 2014-07-03
[    0.089577] CPU3 microcode updated early to revision 0x1c, date = 2014-07-03
[    0.779579] microcode: CPU0 sig=0x306c3, pf=0x2, revision=0x1c
[    0.779585] microcode: CPU1 sig=0x306c3, pf=0x2, revision=0x1c
[    0.779589] microcode: CPU2 sig=0x306c3, pf=0x2, revision=0x1c
[    0.779595] microcode: CPU3 sig=0x306c3, pf=0x2, revision=0x1c
[    0.779633] microcode: Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
[  106.931202] CPU1 microcode updated early to revision 0x1c, date = 2014-07-03
[  106.945339] CPU2 microcode updated early to revision 0x1c, date = 2014-07-03
[  106.959480] CPU3 microcode updated early to revision 0x1c, date = 2014-07-03

The last three entries should correspond to the microcode application at resume.

> Because it apears as if glibc thinks, it is OK to use TSX,
> but when it actually does, the CPU disagrees and the process gets a SIGILL.

Yes, that's the puzzling part.  All look fine only judging from the log, but something is missing...
Comment 36 Fenghua Yu 2014-12-19 17:37:47 UTC
[  106.931202] CPU1 microcode updated early to revision 0x1c, date = 2014-07-03
[  106.945339] CPU2 microcode updated early to revision 0x1c, date = 2014-07-03
[  106.959480] CPU3 microcode updated early to revision 0x1c, date = 2014-07-03
[  121.706665] do_trap: 129 callbacks suppressed

Illegal instr happens 15 seconds after microcode updated on CPU1-CPU3. So it's unlikely to be a race condition issue.

Does S4 suspend to disk hit the issue?
Comment 37 Andi N Kleen 2014-12-19 17:45:01 UTC
Perhaps not all cores get correctly updated.

Could you install msr-tools and run 

sudo /usr/sbin/rdmsr  -a 0x8b

before and after the suspend? The number should be always the same and show the same signature as grep microcode /proc/cpuinfo
Comment 38 Joerg Reuter 2014-12-19 21:38:44 UTC
Hibernate / suspend to disk doesn't work on this machine (swap partition too small.)

rdmsr -a 0x8b output before suspend:

1c00000000
1c00000000
1c00000000
1c00000000

after suspend:

1600000000
1c00000000
1c00000000
1c00000000

Woops.
Comment 39 Takashi Iwai 2014-12-20 08:39:32 UTC
(In reply to Joerg Reuter from comment #38)
> Hibernate / suspend to disk doesn't work on this machine (swap partition too
> small.)
> 
> rdmsr -a 0x8b output before suspend:
> 
> 1c00000000
> 1c00000000
> 1c00000000
> 1c00000000
> 
> after suspend:
> 
> 1600000000
> 1c00000000
> 1c00000000
> 1c00000000
> 
> Woops.

So the microcode update is needed for the boot CPU at S3 resume, too?  The dmesg showed only the microcode uploads of non-boot CPUs at resume indeed.

Adding more relevant people into the loop...
Comment 40 Borislav Petkov 2014-12-20 11:36:19 UTC
This should all be fixed upstream. It is only in Linus' tree currently
so if you're feeling particularly brave, you could try one of the
vanilla KOTDs:

http://kernel.opensuse.org/packages/vanilla
Comment 41 Joerg Reuter 2014-12-21 19:33:33 UTC
The current KotD doesn't even properly suspends my machine (screen goes dark, USB device get sent to sleep, but power LED stays on), thus I can't test resume with it...
Comment 42 Borislav Petkov 2014-12-21 22:56:22 UTC
I'm not surprised.

Ok, I'll have backports for 3.16 (oS 13.2) soon, you could test them
then ontop of the oS kernel instead. I'll let you know.
Comment 43 Borislav Petkov 2015-01-08 17:50:33 UTC
Ok, I've backported the upstream fixes, Joerg, can you test please:

http://beta.suse.com/private/bpetkov/kernel-3.16.7_default+-6.x86_64.rpm

?

Thanks.
Comment 44 Joerg Reuter 2015-01-11 21:36:12 UTC
Looks good, after three resumes so far all CPUs got their microcode updated correctly:

lycaon:~ # rdmsr -a 0x8b
1c00000000
1c00000000
1c00000000
1c00000000
lycaon:~ #
Comment 45 Borislav Petkov 2015-01-12 11:44:54 UTC
Good, thanks for testing. Closing.
Comment 46 Ryan Kingsbury 2015-02-23 00:35:48 UTC
I'm experiencing this bug on an up-to-date 13.2 x64 system with a very similar Haswell processor, despite the fact that it was fixed upstream. Could something be wrong with my configuration to cause this, or is it a possible regression?

kernel 3.16.7-7-desktop
ucode-intel 20140913-4.1

Output of cat /proc/cpuinfo:

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 60
model name      : Intel(R) Core(TM) i5-4670 CPU @ 3.40GHz
stepping        : 3
microcode       : 0x1c
cpu MHz         : 3699.093
cache size      : 6144 KB
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 4
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid
bogomips        : 6798.47
clflush size    : 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual
power management:

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 60
model name      : Intel(R) Core(TM) i5-4670 CPU @ 3.40GHz
stepping        : 3
microcode       : 0x1c
cpu MHz         : 2200.039
cache size      : 6144 KB
physical id     : 0
siblings        : 4
core id         : 1
cpu cores       : 4
apicid          : 2
initial apicid  : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid
bogomips        : 6798.47
clflush size    : 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual
power management:

processor       : 2
vendor_id       : GenuineIntel
cpu family      : 6
model           : 60
model name      : Intel(R) Core(TM) i5-4670 CPU @ 3.40GHz
stepping        : 3
microcode       : 0x1c
cpu MHz         : 2203.757
cache size      : 6144 KB
physical id     : 0
siblings        : 4
core id         : 2
cpu cores       : 4
apicid          : 4
initial apicid  : 4
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid
bogomips        : 6798.47
clflush size    : 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual
power management:

processor       : 3
vendor_id       : GenuineIntel
cpu family      : 6
model           : 60
model name      : Intel(R) Core(TM) i5-4670 CPU @ 3.40GHz
stepping        : 3
microcode       : 0x1c
cpu MHz         : 2305.625
cache size      : 6144 KB
physical id     : 0
siblings        : 4
core id         : 3
cpu cores       : 4
apicid          : 6
initial apicid  : 6
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid
bogomips        : 6798.47
clflush size    : 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual
power management:
Comment 47 Borislav Petkov 2015-02-23 08:53:54 UTC
(In reply to Ryan Kingsbury from comment #46)
> I'm experiencing this bug on an up-to-date 13.2 x64 system with a very
> similar Haswell processor, despite the fact that it was fixed upstream.
> Could something be wrong with my configuration to cause this, or is it a
> possible regression?

More likely the backported patches haven't trickled down to the kernel
update package. You can try the KOTD from here in the meantime:

http://kernel.opensuse.org/packages/openSUSE-13.2
Comment 48 Michael Karbach 2015-02-23 09:18:27 UTC
For my system the latest offical SuSE-patch resolves the problem and everything is working.
Comment 49 Ryan Kingsbury 2015-02-24 18:40:33 UTC
Hmm...latest official updates definitely do not solve it for me. I reinstalled kernel-desktop and ucode-intel just to be sure.

How can I determine whether the patches have actually made it into the official updates? This bug is very disruptive for my workflow and I don't want to wait around for an update if it's actually a different issue.
Comment 50 Michael Karbach 2015-02-24 18:45:27 UTC
Sory, for me it is also not solved. I have mad a mistake. The error usually appears for me only after I have used the computer for a while and a lot of memory is used. So the bug actually appears, right yesterday morning after a resume. 

Anyway, so far I have only applied the official OpenSuSE-13.2 patches, may be they are not yet included.
Comment 51 Borislav Petkov 2015-02-24 20:04:47 UTC
As I said already in comment#47, the patches are not in
the official updates yet. And as I said in that same
comment, you should try the Kernel Of The Day here:

http://kernel.opensuse.org/packages/openSUSE-13.2

There are instructions on that page how to install it.