Bug 913996 - Applying Critical Updates Causes System to no longer boot (ucode-intel / ucode-amd problem?)
Status: RESOLVED FIXED
Classification: openSUSE
Product: openSUSE Distribution
Component: Basesystem
Version: 13.2
Hardware: 64bit openSUSE 13.2
Priority: P2 - High, Severity: Critical (8 votes)
Assigned To: Borislav Petkov
Reported: 2015-01-21 00:58 UTC by Philip Brown
Modified: 2016-01-29 13:12 UTC

Attachments
- dmesg output after a successful boot (76.92 KB, text/plain), 2015-01-29 17:01 UTC, James Moe
- /proc/cpuinfo output (3.52 KB, text/plain), 2015-01-29 17:02 UTC, James Moe
- Requested log files (123.03 KB, text/plain), 2015-01-29 18:24 UTC, Philip Brown
- dmesg (180.60 KB, text/x-log), 2015-01-29 19:55 UTC, Charles-David Hebert
- cpuinfo (1.76 KB, text/plain), 2015-01-29 19:56 UTC, Charles-David Hebert
- dmesg-KOD (177.09 KB, text/x-log), 2015-01-30 13:33 UTC, Charles-David Hebert
- cpuinfo (1.77 KB, text/plain), 2015-03-05 00:52 UTC, Brian Richter
- Desktop CPU info (3.52 KB, text/plain), 2015-03-05 15:38 UTC, Brian Richter
- Output of command journalctl -a | grep -i microcode (6.77 KB, text/plain), 2015-04-16 18:54 UTC, Peter Kirchgeßner
- Output of dmesg command after reloading microcode in running kernel (58.24 KB, text/plain), 2015-05-06 18:56 UTC, Peter Kirchgeßner
- /proc/cpuinfo (3.43 KB, text/plain), 2015-05-11 23:26 UTC, James Moe
- cpuinfo for a failing machine (3.52 KB, text/plain), 2015-05-12 12:55 UTC, Adam Liebermann
- cpuinfo_pk.txt where setpci-command gives 00400010 (1.76 KB, text/plain), 2015-05-12 16:51 UTC, Peter Kirchgeßner
- pkg log (Guest) (1.02 KB, text/plain), 2015-08-22 19:35 UTC, Ashfaqur Rahman
- pkg log (Host) (6.15 KB, text/plain), 2015-08-22 19:37 UTC, Ashfaqur Rahman

Description Philip Brown 2015-01-21 00:58:44 UTC
- H/W: VirtualBox 4.3.20 r96997 with snippets from log:
00:00:03.959144 Log opened 2015-01-20T22:01:28.803982000Z
00:00:03.959146 Build Type: release
00:00:03.959149 OS Product: Windows XP Professional
00:00:03.959151 OS Release: 5.1.2600
00:00:03.959152 OS Service Pack: 3
00:00:04.664908 DMI Product Name: A740GM-M
00:00:04.671200 DMI Product Version: 8.x                   
00:00:04.671220 Host RAM: 1919MB total, 1061MB available
00:00:04.671225 Package type: WINDOWS_32BITS_GENERIC
00:00:04.804741 Guest OS type: 'OpenSUSE'
00:00:04.818327 File system of 'openSUSE-13.2-DVD-i586.iso' (DVD) is ntfs

- successfully run installation
- after installation run
ZYPPER='zypper --no-cd --non-interactive --no-gpg-checks'
$ZYPPER refresh
$ZYPPER update --auto-agree-with-licenses

- if I reboot at this point system hangs with:
Loading initial ramdisk...

Do not know update causing problem, but problem only occurred within last week
(today is 1/20/2015).  Probably within last 3 days.

Will attempt to gather more info if possible, but system fails after installing
updates.

Not sure how to get zypper/YaST to only install critical updates from a specific date.
Comment 1 Philip Brown 2015-01-21 13:56:29 UTC
Tracked this down:

openSUSE periodically pushes updates out to user machines. When a user machine detects the update, it pops up a message saying updates are available, and the user can then install the update or not. One such update is:
    recommended update for patterns-openSUSE...
    (openSUSE-2015-33)
    (patterns-openSUSE-apparmor Ver 20141007-5.1)
If you install this update and then reboot the system, the boot locks with the message:
    Loading initial ramdisk...
and the system is effectively bricked.

Tried to recover system using Rescue System, but while zypper rm appears to work, system can still no longer boot.  Only way I know out of this is a complete regen.

Recommend: Assuming other people bump into this as well, remove the update from the repositories and then change the priority of this bug as desired.
Comment 2 Marcus Meissner 2015-01-21 14:04:10 UTC
can you 

zypper rpm ucode-intel ucode-amd
mkinitrd

and then try rebooting

the update is for bringing in the above microcode update packages, but they should not break the system.
Comment 3 Philip Brown 2015-01-21 15:00:05 UTC
Not quite sure what you want here.

In rescue mode I do not have access to the internet.  (VirtualBox).  Do you want me to reinstall 13.2 and then do the commands you listed?

What is zypper rpm?  rescue did not recognize this.  Did you mean zypper in?
Comment 4 Marcus Meissner 2015-01-21 15:37:36 UTC
zypper rm    

or rpm -e ucode-intel ucode-amd
Comment 5 Philip Brown 2015-01-22 04:29:38 UTC
Hope I did what you wanted here.

Did a clean regen of system
zypper in ucode-intel ucode-amd
mkinitrd

reboot

system locks up with
Loading initial ramdisk...
Comment 6 Marcus Meissner 2015-01-22 06:30:40 UTC
No, I meant deinstalling aka removing these two ucode packages.

(as an update some days ago brought them in)

rpm -e ucode-intel ucode-amd
mkinitrd
Comment 7 Ladislav Slezák 2015-01-23 10:16:06 UTC
Um, I could not reproduce the problem, I have installed all updates in my testing VM. The updates installed ucode-intel and ucode-amd packages (were not installed before) but the VM still boots normally.

BTW my host system has Intel i7-2600 CPU, maybe it is CPU model dependent?

I guess you need to manually find out the problematic package. You could try to remove the ucode packages after installing updates (before rebooting).


Anyway, this is not a YaST issue, changing the component to Basesystem...
Comment 8 James Moe 2015-01-23 17:22:24 UTC
I have two computers with this issue. Worse, the Rescue System does not boot.

opensuse 13.2
amd athlon x4 630 CPU
linux 3.16.7-7-desktop x86_64

I updated some relatively unused computers as usual last Friday (16-Jan-2015), a once-a-week endeavor. Today I tried to start one. It got to the GRUB boot screen, timed out to the default, started loading, and... froze at "Loading initial ramdisk..." Uh oh.

Since this has happened to TWO computers, I feel it is not a hardware issue. It happens for the recovery option, and the previous option, linux v3.16.2.
I swapped memory sticks around. No change.
I used a Seagate tool, SeaTool, to test the disks. It is based on FreeDOS. It worked and ran the drive diagnostics; they passed.

The installation disk Rescue System does not load either. It loads the ramdisk but at some point quietly stops working.

I can start a 13.2 Installation Upgrade. It goes through everything, reboots, and... freezes. Since the network was not enabled, no updates were performed after the installation.

The failure does not affect all our computers; so far, two of four, including this one which was booted yesterday after the most recent updates.

boot: yes
- amd 2 core CPU; ati graphics
- amd 4 core CPU; nvidia graphics

boot: no
- amd 4 core CPU; ati graphics
- amd 4 core CPU; ati graphics
Comment 9 Jerry La Croix 2015-01-24 04:40:30 UTC
I have the same problem after I updated my Toshiba i5 64-bit dual-core machine. I try to boot up and it goes to a screen that says at the end ----

Enter password (with more)

and it has 
control - D

Now, after trying to fix my system, the home directory is messed up... So I have to reinstall openSUSE 13.2 64-bit again. Very maddening!!!!!

DeathRidder
Comment 10 James Moe 2015-01-25 05:15:44 UTC
I did a fresh installation on a spare disk drive. It booted normally. I then ran "zypper up" to update the various modules. It froze at boot time at "Loading initial ramdisk...".

Why is this update still available?
Comment 11 Martin Pluskal 2015-01-25 10:26:54 UTC
(In reply to James Moe from comment #10)
> I did a fresh installation on a spare disk drive. It booted normally. I then
> ran "zypper up" to update the various modules. It froze at boot time at
> "Loading initial ramdisk...".
Could you try booting with debug and nomodeset? In the bootloader, hit "e" to edit the boot entry, append "debug nomodeset" to the line containing linux so that it looks like, e.g., "linux   /boot/vmlinuz-3.18.3-1-desktop ... resume=/dev/sda1 splash=silent quiet showopts debug nomodeset", and boot the system with ctrl+x. I wonder if you see any message. Thanks
Comment 12 Martin Pluskal 2015-01-25 10:54:15 UTC
I tried reproducing the described issue: I installed openSUSE 13.2 i586 in VirtualBox (running on a 64-bit Core i7 machine);
1) on a fresh installation, installed ucode-amd and ucode-intel => VM reboots fine
2) installed all updates => VM reboots fine

Just to clarify, your issue occurs in a 32-bit VirtualBox VM and on two physical machines with AMD Athlon X4?
Comment 13 Philip Brown 2015-01-25 18:38:42 UTC
Comments - 01/25/2015

1.  The problem identified herein is that after applying:
       recommended update for patterns-openSUSE...
       (openSUSE-2015-33)
       (patterns-openSUSE-apparmor Ver 20141007-5.1)
    the system can no longer boot and hangs with the message:
       Loading initial ramdisk...

    If this is not the problem you are experiencing, you should
    open a new bug report.

2.  The following procedure will recover the system:
    (Thanks to Marcus Meissner)

    Boot into Rescue System (from the installation media or other)
      mount /dev/sda2 /mnt
        (change /dev/sda2 as appropriate for your system)
      mount /dev/sda1 /mnt/boot
      mount --bind /proc /mnt/proc
      mount --bind /sys /mnt/sys
      mount --bind /dev /mnt/dev
      chroot /mnt

      rpm -e ucode-intel ucode-amd
      mkinitrd
      exit

      reboot

3.  The system this problem was initially reported against is:
      AMD Phenom X4 9750 Quad Core Processor
      ECS A740GM-M Motherboard
      Crucial 2048MB PC6400 DDR2 Memory
      Windows XP SP3
      Oracle VirtualBox 4.3.20 r96997

4.  I moved beyond the original problem by regenning on a new
    VBox instance, but retaining the original instance to
    support debugging this problem.

5.  I also have a Toshiba Satellite A105-S2081 laptop running
    stand-alone openSUSE 13.2 (no VBOX, no alternate OS).
    I applied the same patch to that system and it worked with
    no problem.

6.  Per your request: Could you try booting with debug and
    nomodeset (in bootloader hit "e" to edit bootloader entry,
    append "debug nomodeset"  to line containing linux so that
    it looks like i.e "linux /boot/vmlinuz-3.18.3-1-desktop ...
    resume=/dev/sda1 splash=silent quiet showopts debug nomodeset",
    and boot system (ctrl+x)?

    This produced no messages - just rebooted to the same
       Loading initial ramdisk...
    message and locked.

Thanks for all your support,
Phil Brown
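
The recovery procedure in step 2 can be sketched as a small script. This is only an illustration under assumptions: the helper names `run` and `recover_ucode` are hypothetical, the device defaults simply repeat the /dev/sda2 and /dev/sda1 example from above, and the commands are only printed unless DRY_RUN=0 is set. It has not been tested on an affected machine.

```shell
#!/bin/sh
# Sketch of the rescue-system recovery from step 2 above.
# "run" and "recover_ucode" are hypothetical helper names.

# Print the command when DRY_RUN is 1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# recover_ucode [ROOT_DEV] [BOOT_DEV]
# Device names are assumptions; change them as appropriate for your system.
recover_ucode() {
    root_dev="${1:-/dev/sda2}"
    boot_dev="${2:-/dev/sda1}"

    run mount "$root_dev" /mnt
    run mount "$boot_dev" /mnt/boot
    for fs in proc sys dev; do
        run mount --bind "/$fs" "/mnt/$fs"
    done

    # Remove the microcode packages and rebuild the initrd inside the chroot.
    run chroot /mnt rpm -e ucode-intel ucode-amd
    run chroot /mnt mkinitrd
}
```

Calling `recover_ucode` with no arguments prints the seven commands for review. Running `DRY_RUN=0 recover_ucode /dev/sdXN /dev/sdXM` as root from the Rescue System would execute them, after which the machine can be rebooted.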
Comment 14 James Moe 2015-01-26 05:21:59 UTC
Ok, "rpm -e ucode-intel ucode-amd; mkinitrd; " worked. I now have both systems up and running again. 

For one system I could get Rescue System to boot; I used Philip Brown's nicely detailed method to undo the microcode issue. The other system I installed fresh on a new disk; the new disk is so much faster (at least 3x!) I am willing to stay with it, and rebuild the system.

Ack!
However. Every time I run Yast::Software Management, the "ucode-*" packages are restored into the set. There does not seem to be any way to prevent it. I presume this applies to zypper as well.

Will there be a notice here when it is once again safe to perform updates?
Comment 15 Ladislav Slezák 2015-01-26 08:15:40 UTC
(In reply to James Moe from comment #14)

> However. Every time I run Yast::Software Management, the "ucode-*" packages
> are restored into the set.

For now you can lock the ucode-* packages so they are not installed by accident.

Start the Yast software management and set the packages to state "taboo" (by right-clicking on them), or use "zypper addlock -t package ucode-intel ucode-amd".
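
A minimal sketch of the zypper variant, wrapped in two hypothetical helper functions (`lock_ucode` and `unlock_ucode` are names made up for illustration). The ZYPPER variable is an assumption added here so the commands can be previewed with ZYPPER=echo before running them for real as root.

```shell
#!/bin/sh
# Hypothetical helpers around the addlock command quoted above.
# Set ZYPPER=echo to preview the commands instead of running zypper.

# Lock both microcode packages, then list the active locks.
lock_ucode() {
    ${ZYPPER:-zypper} addlock -t package ucode-intel ucode-amd
    ${ZYPPER:-zypper} locks
}

# Remove the locks again once a fixed kernel is installed.
unlock_ucode() {
    ${ZYPPER:-zypper} removelock -t package ucode-intel ucode-amd
}
```

The `zypper locks` call is there only to confirm that both packages show up in the lock list.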
Comment 16 Martin Pluskal 2015-01-26 08:57:32 UTC
I have tried reproducing the issue on an AMD A6-4400M (x86_64) openSUSE system, with the same steps as in comment#12, and still was not able to reproduce it.
Comment 17 James Moe 2015-01-26 18:14:22 UTC
I do not know what the secret sauce is that causes some computers to freeze and others not. Of the four computers here, two had the problem; the other two boot smoothly.

All four have the same CPU (amd athlon x4 640) and motherboard (asus m3a78-em). There are slight variations in memory type (one ECC, the rest non-ECC) and quantity (4 - 8 GB), and disk drives. I note that the two that do boot have SCSI adapters and drives; the two that did not boot have SATA drives.
Comment 18 Philip Brown 2015-01-26 18:37:20 UTC
In reply to James Moe comment:

Au contraire -
You have 3 machines identified herein which are bricking.  All have AMD quad cores, although different chips, and different motherboards.  The commonality seems to be AMD quad core; but that's about it.
Comment 19 Thomas Renninger 2015-01-27 15:01:42 UTC
Sounds similar to what Boris fixed some time ago.
AMD machines not booting when a firmware update is applied...
Comment 20 Borislav Petkov 2015-01-27 15:45:12 UTC
Hmm,

so microcode loader fixes got backported to oS13.2 kernel recently.
Can anyone on this bug who experiences the issue try using the oS13.2
kernel-of-the-day from here:

http://kernel.opensuse.org/packages/openSUSE-13.2

and try booting with regenerated ucode-intel or ucode-amd packages in
the initrd?

If that still fails, does adding "dis_ucode_ldr" to the kernel command
line fix the booting of the machine?

Also, can you guys upload full dmesg from the working boots, i.e. boot
with "ignore_loglevel debug" on the kernel command line and collect
dmesg into a file and upload it?

Just to make sure - this bug is 32-bit but people see the issue on
64-bit installs too, correct?

Thanks.
Comment 21 Charles-David Hebert 2015-01-28 02:43:57 UTC
In partial response to comment 20, this bug also affects the 64-bit installs.
I run AMD Athlon(tm) II X2 B24 CPU.
Comment 22 Borislav Petkov 2015-01-28 08:45:31 UTC
(In reply to Charles-David Hebert from comment #21)
> In partial response to comment 20, this bug also affects the 64-bit installs.
> I run AMD Athlon(tm) II X2 B24 CPU.

And, can you give the kernel-of-the-day a try?
Comment 23 Philip Brown 2015-01-28 13:52:37 UTC
In response to Comment # 20 from Borislav Petkov:

I have a separate VM I left setup to debug this problem.  It currently has openSUSE 13.2 installed, the single patterns update which caused this problem, and the subsequent rpm -e which enabled this to boot.

If you want me to try anything for you, please be VERY SPECIFIC.  Provide me with either the sh lines you want me to run or a script, and what you want me to send back to you and how to obtain it.  I do not know how to do the things you requested in the SUSE environment.  And this is from someone who is very familiar with GNU and has retargeted/rehosted it.  However, this is the first time I am playing with openSUSE and it is different.

Thanks,
--Phil Brown
Comment 24 Charles-David Hebert 2015-01-28 15:21:03 UTC
In response to comment 22 and 20 of Borislav Petkov.

I need specifications to try out what you propose in comment 20. Is the following procedure ok?

1.) I will do a full regen of the 13.2 64-bit system.

2.) I will install all the updates including the problematic ones.

3.) Install kernel of the day: "zypper ar http://download.opensuse.org/repositories/Kernel:/openSUSE-13.2/standard \
    Kernel:openSUSE-13.2"
then
      "zypper in --from Kernel:openSUSE-13.2 kernel-desktop"

4.) In command line: "zypper rm ucode-intel ucode-amd"
then
                      "mkinitrd"

Try reboot.

5.) If that does not work, how do I add the line "dis_ucode_ldr" to the kernel line command? Do I boot in rescue mode and type that in the CLI?

Thank You
Comment 25 Martin Pluskal 2015-01-28 15:28:49 UTC
See comment#11 or just before rebooting edit "/etc/default/grub", append it to look like: GRUB_CMDLINE_LINUX_DEFAULT=" resume=/dev/sda1 ... dis_ucode_ldr" and run "grub2-mkconfig -o /boot/grub2/grub.cfg" - and then reboot.
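
The edit to /etc/default/grub can also be scripted. The following is a rough sketch, not a tested tool: `add_kernel_param` is a hypothetical helper name, it assumes the GRUB_CMDLINE_LINUX_DEFAULT value is on a single double-quoted line, and it assumes the parameter contains no `|` or `&` characters. A backup copy of the file is left next to it with a .bak suffix.

```shell
#!/bin/sh
# add_kernel_param PARAM [FILE]
# Hypothetical helper: append PARAM to GRUB_CMDLINE_LINUX_DEFAULT in FILE
# (default /etc/default/grub) unless it is already there.
add_kernel_param() {
    param="$1"
    file="${2:-/etc/default/grub}"

    # Extract the current value between the double quotes.
    current=$(sed -n 's/^GRUB_CMDLINE_LINUX_DEFAULT="\(.*\)"$/\1/p' "$file")

    # Do nothing if the parameter is already on the line.
    for word in $current; do
        [ "$word" = "$param" ] && return 0
    done

    # Insert the parameter just before the closing quote; keep a .bak copy.
    sed -i.bak "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\".*\)\"\$|\1 ${param}\"|" "$file"
}
```

After something like `add_kernel_param dis_ucode_ldr`, the configuration still has to be regenerated with the grub2-mkconfig command given above before rebooting.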
Comment 26 Borislav Petkov 2015-01-28 17:16:30 UTC
(In reply to Charles-David Hebert from comment #24)
> In response to comment 22 and 20 of Borislav Petkov.
> 
> I need specifications to try out what you propose in comment 20. Is the
> following procedure ok?
> 
> 1.) I will do a full regen of the 13.2 64-bit system.
> 
> 2.) I will install all the updates including the problematic ones.
> 
> 3.) Install kernel of the day: "zypper ar
> http://download.opensuse.org/repositories/Kernel:/openSUSE-13.2/standard \
>     Kernel:openSUSE-13.2"
> then
>       "zypper in --from Kernel:openSUSE-13.2 kernel-desktop"
> 
> 4.) In command line: "zypper rm ucode-intel ucode-amd"
> then
>                       "mkinitrd"
> 
> Try reboot.
> 
> 5.) If that does not work, how do I add the line "dis_ucode_ldr" to the
> kernel line command? Do I boot in rescue mode and type that in the CLI?

Actually, it should be even simpler:

* Install system *without* problematic updates, i.e. step 1)

* Install kernel of the day as in step 3)

* Install problematic updates

With the KOTD, the updated microcode should be loaded and applied fine.

Thanks.
Comment 27 Charles-David Hebert 2015-01-29 03:15:38 UTC
(In reply to Borislav Petkov from comment #26)

Installing the system, then the kernel of the day, and then the problematic patches worked fine in VirtualBox on the problematic hardware. However, on the real system, it did not fix the problem.

Still on reboot, freezes on "loading initial ramdisk...".
Comment 28 Borislav Petkov 2015-01-29 11:12:04 UTC
(In reply to Charles-David Hebert from comment #27)
> (In reply to Borislav Petkov from comment #26)
> 
> Installing the system, then the kernel of the day and then the problematic
> patches worked fine in Virtualbox on the problematic hardware. However, on
> the real system, it did not fix the problem.
> 
> Still on reboot, freezes on "loading initial ramdisk...".

Ok, does adding "dis_ucode_ldr" on the kernel command line fix the boot?

Also, what exact type is the real system? Can you boot a known-good
kernel on it with "debug ignore_loglevel log_buf_len=16M" on the kernel
command line, do

dmesg > dmesg.log

as root and upload it here.

Also please do

cat /proc/cpuinfo > cpuinfo.txt

and upload it too.

Thanks.
Comment 29 Charles-David Hebert 2015-01-29 13:42:44 UTC
(In reply to Borislav Petkov from comment #28)

No problem for the cpuinfo and the dmesg commands, but I need a little walkthrough for entering stuff on the kernel command line; I have never done that before, sorry.

I won't have any choice but to try that on yet another regen, because other kernels are not available, nor rescue mode when one follows the steps of comment 26.

1.) Regen 2.) Kernel of the day 3.) Updates (all) 4.) Add to the kernel 5.) reboot

right?
Thanks.
Comment 30 James Moe 2015-01-29 17:01:24 UTC
Created attachment 621376 [details]
dmesg output after a successful boot
Comment 31 James Moe 2015-01-29 17:02:09 UTC
Created attachment 621377 [details]
/proc/cpuinfo output
Comment 32 Borislav Petkov 2015-01-29 17:09:35 UTC
> 1.) Regen 2.) Kernel of the day 3.) Updates (all) 4.) Add to the kernel 5.)
> reboot
> 
> right?

Almost:

1) Regen

2) Edit /etc/default/grub as root and add at the end of the line starting
with GRUB_CMDLINE_LINUX_DEFAULT the command line parameters.

It should look like this after you've edited it.

GRUB_CMDLINE_LINUX_DEFAULT=" ... debug ignore_loglevel log_buf_len=16M"

Also, remove the "quiet" option from that same line.

3) Install kernel of the day: zypper in ...

4) Reboot

Now, when you're back in the boot loader, you have two kernels: the
default one which got installed and the kernel of the day aka KOTD.

5) The default one you simply boot by pressing Enter, when the box is up
you collect dmesg and /proc/cpuinfo and upload them as requested.

6) Then you reboot the machine a second time, this time into the KOTD.
Now it should freeze with "loading initial ramdisk...". Do that once to
confirm we're on the right track. Then reboot again.

7) Back in the boot loader boot menu you select "Advanced options for
openSUSE", press Enter, select the KOTD, press "e" to edit the kernel
command line, look for the line starting with

linux	/boot/vmlinuz....


go to the end of that line by pressing End, and type in " dis_ucode_ldr"

The preceding empty space is to separate this command line parameter
from the previous one but you know what I mean. Then press F10 to boot
that kernel.

I think this should be it. Ask if anything is unclear.

Thanks.
Comment 33 Borislav Petkov 2015-01-29 17:17:03 UTC
(In reply to James Moe from comment #31)
> Created attachment 621377 [details]
> /proc/cpuinfo output

Uh, goody, good old b0rked Greyhound:

[    0.031080] smpboot: CPU0: AMD Athlon(tm) II X4 630 Processor (fam: 10, model: 05, stepping: 02)

So James, does using the kernel of the day (comment #20) fix the issue
on your box?
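
For comparing machines, the family, model and stepping can be pulled out of a saved /proc/cpuinfo file in the same form as the smpboot line. This is a sketch with a hypothetical helper name (`cpu_signature`). It assumes the usual "key : value" layout of /proc/cpuinfo and, as far as I can tell, that the smpboot line prints these numbers in hex while /proc/cpuinfo shows them in decimal, so "cpu family : 16" corresponds to "fam: 10".

```shell
#!/bin/sh
# cpu_signature [FILE]
# Hypothetical helper: print vendor, family, model and stepping of the first
# CPU in FILE (default /proc/cpuinfo), with the numbers formatted as hex.
cpu_signature() {
    awk -F': *' '
        /^vendor_id/     { vendor = $2 }
        /^cpu family/    { family = $2 }
        /^model[ \t]*:/  { model = $2 }      # skips the "model name" line
        /^stepping/      { stepping = $2 }
        /^$/ && family != "" { exit }        # stop after the first CPU block
        END {
            printf "%s fam: %02x, model: %02x, stepping: %02x\n",
                   vendor, family, model, stepping
        }
    ' "${1:-/proc/cpuinfo}"
}
```

Running `cpu_signature cpuinfo.txt` on the uploaded files from several boxes gives one line per machine that is easy to compare.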
Comment 34 Philip Brown 2015-01-29 18:24:59 UTC
Created attachment 621384 [details]
Requested log files
Comment 35 Philip Brown 2015-01-29 18:29:43 UTC
This didn't quite do what you expected.  Below is your request marked up.  Pls note by comments:

1) Regen
   *** In order to access your comments I:
       mkdir /media
       mkdir -p /media/llphil_d
       mount -t vboxsf D_DRIVE /media/llphil_d

2) Edit /etc/default/grub as root and add at the end of the line starting
with GRUB_CMDLINE_LINUX_DEFAULT the command line parameters.

It should look like this after you've edited it.

GRUB_CMDLINE_LINUX_DEFAULT=" ... debug ignore_loglevel log_buf_len=16M"

Also, remove the "quiet" option from that same line.

3) Install kernel of the day:
zypper ar http://download.opensuse.org/repositories/Kernel:/openSUSE-13.2/standard \
    Kernel:openSUSE-13.2
zypper in --from Kernel:openSUSE-13.2 kernel-desktop

4) Reboot

   *** At this point vboxsf no longer works.  Not sure why.
       To get around problem I:
       mount -t cifs //10.0.0.1/ds$ /media/llphil_d ....
       and that worked.  Note that cifs is unstable if drive
       is shared out from vbox.

Now, when you're back in the boot loader, you have two kernels: the
default one which got installed and the kernel of the day aka KOTD.

5) The default one you simply boot by pressing Enter, when the box is up
you collect dmesg and /proc/cpuinfo and upload them as requested.
   dmesg > dmesg.log
   cat /proc/cpuinfo > cpuinfo.txt

   *** Uploaded these

6) Then you reboot the machine a second time, this time into the KOTD.
Now it should freeze with "loading initial ramdisk...". Do that once to
confirm we're on the right track. Then reboot again.

   *** I assume that what you meant is the following:
       Boot
       Select Advanced options...
       Select first kernel in list which had some strange name I
         hadn't seen before.
       At this point the system booted successfully.  I stopped here

7) Back in the boot loader boot menu you select "Advanced options for
openSUSE", press Enter, select the KOTD, press "e" to edit the kernel
command line, look for the line starting with

linux    /boot/vmlinuz....

go to the end of that line by pressing End, and type in " dis_ucode_ldr"
Comment 36 Borislav Petkov 2015-01-29 18:55:46 UTC
Ok, so you're playing with virtual box or whatever that thing is.
Because booting the kernel there is very funny:

[    0.066163] smpboot: weird, boot CPU (#0) not listed by the BIOS
[    0.066184] smpboot: SMP motherboard not detected
[    0.067000] smpboot: SMP disabled

so this vbox thing is doing something wrong.

I wonder why people are even using it - why can't you take a nice shiny
kvm+qemu and do all the virtualization fun with it. It works like a
charm for such purposes. :-D

Anyway, now you have to school me how exactly to use that vbox to
reproduce your issue. So feel free to send me a step-by-step thing,
private mail is fine too :-)

(In reply to Philip Brown from comment #35)
> This didn't quite do what you expected.  Below is your request marked up. 
> Pls note by comments:
> 
> 1) Regen
>    *** In order to access your comments I:
>        mkdir /media
>        mkdir -p /media/llphil_d
>        mount -t vboxsf D_DRIVE /media/llphil_d
> 
> 2) Edit /etc/default/grub as root and add at the end of the line starting
> with GRUB_CMDLINE_LINUX_DEFAULT the command line parameters.
> 
> It should look like this after you've edited it.
> 
> GRUB_CMDLINE_LINUX_DEFAULT=" ... debug ignore_loglevel log_buf_len=16M"
> 
> Also, remove the "quiet" option from that same line.
> 
> 3) Install kernel of the day:
> zypper ar
> http://download.opensuse.org/repositories/Kernel:/openSUSE-13.2/standard \
>     Kernel:openSUSE-13.2
> zypper in --from Kernel:openSUSE-13.2 kernel-desktop
> 
> 4) Reboot
> 
>    *** At this point vboxsf no longer works.  Not sure why.
>        To get around problem I:
>        mount -t cifs //10.0.0.1/ds$ /media/llphil_d ....
>        and that worked.  Note that cifs is unstable if drive
>        is shared out from vbox.

Hmm, maybe you should unmount the vbox drive first before booting the
guest... But this is just a stab in the dark - I've never used it.

> Now, when you're back in the boot loader, you have two kernels: the
> default one which got installed and the kernel of the day aka KOTD.
> 
> 5) The default one you simply boot by pressing Enter, when the box is up
> you collect dmesg and /proc/cpuinfo and upload them as requested.
>    dmesg > dmesg.log
>    cat /proc/cpuinfo > cpuinfo.txt
> 
>    *** Uploaded these
> 
> 6) Then you reboot the machine a second time, this time into the KOTD.
> Now it should freeze with "loading initial ramdisk...". Do that once to
> confirm we're on the right track. Then reboot again.
> 
>    *** I assume that what you meant is the following:
>        Boot
>        Select Advanced options...
>        Select first kernel in list which had some strange name I
>          hadn't seen before.
>        At this point the system booted successfully.  I stopped here

Was it something like this:

 ... 3.16.7-38.1.gba87edb.x86_64

?

If that is the new kernel which is *not* the default installed kernel,
then this confirms that it fixes your issue.

But if you want me to take a closer look, you'd have to tell me how
exactly to recreate your environment with vbox.

Thanks.
Comment 37 Charles-David Hebert 2015-01-29 19:55:36 UTC
Created attachment 621395 [details]
dmesg
Comment 38 Charles-David Hebert 2015-01-29 19:56:23 UTC
Created attachment 621396 [details]
cpuinfo
Comment 39 Charles-David Hebert 2015-01-29 20:01:47 UTC
(In reply to Borislav Petkov from comment #32)

After installing the KOD, and applying the updates, step 7 of comment 32 permitted to boot fine in the KOD and in the default kernel (the one that comes with the installation).

The files that I have provided are for before the booting problem. If you want the files after the problematic updates, let me know, and in which kernel.

Thanks.
Comment 40 Borislav Petkov 2015-01-30 12:03:24 UTC
(In reply to Charles-David Hebert from comment #39)
> After installing the KOD, and applying the updates, step 7 of comment 32
> permitted to boot fine in the KOD and in the default kernel (the one that
> comes with the installation).

Good.

> The files that I have provided are for before the booting problem. If you
> want the files after the problematic updates, let me know, and in which
> kernel.

Right, try installing the ucode-amd package now on the KOTD, i.e. this
one: "Linux version 3.16.7-38.gba87edb-desktop" and reboot into it.

Thanks.
Comment 41 Borislav Petkov 2015-01-30 12:04:24 UTC
> Right, try installing the ucode-amd package now on the KOTD, i.e. this
> one: "Linux version 3.16.7-38.gba87edb-desktop" and reboot into it.

... and upload dmesg again please.
Comment 42 Charles-David Hebert 2015-01-30 13:33:02 UTC
Created attachment 621471 [details]
dmesg-KOD
Comment 43 Borislav Petkov 2015-01-30 13:55:46 UTC
(In reply to Charles-David Hebert from comment #42)
> Created attachment 621471 [details]
> dmesg-KOD

Whoops, I forgot to tell you to remove "dis_ucode_ldr" from your command
line as we want to test whether the KOTD boots fine and applies the
microcode early.

Sorry, please retry without "dis_ucode_ldr" on the kernel command line.

Thanks.
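
Before collecting dmesg, it may help to confirm which parameters the running kernel was actually booted with. A small sketch, with `has_kernel_param` as a hypothetical helper name, that checks a command-line file (by default /proc/cmdline) for an exact parameter:

```shell
#!/bin/sh
# has_kernel_param PARAM [FILE]
# Hypothetical helper: succeed if PARAM appears as a whole word in FILE
# (default /proc/cmdline), fail otherwise.
has_kernel_param() {
    for word in $(cat "${2:-/proc/cmdline}"); do
        [ "$word" = "$1" ] && return 0
    done
    return 1
}
```

For example, `has_kernel_param dis_ucode_ldr && echo "microcode loader is disabled"` shows whether the current boot is still using the workaround.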
Comment 44 Charles-David Hebert 2015-01-30 14:12:35 UTC
(In reply to Borislav Petkov from comment #43)
> (In reply to Charles-David Hebert from comment #42)
> > Created attachment 621471 [details]
> > dmesg-KOD
> 
> Whoops, I forgot to tell you to remove "dis_ucode_ldr" from your command
> line as we want to test whether the KOTD boots fine and applies the
> microcode early.
> 
> Sorry, please retry without "dis_ucode_ldr" on the kernel command line.
> 
> Thanks.
 
There is no booting without the "dis_ucode_ldr" with the KOD
(3.16.7-38.gba87edb-desktop) or the default kernel. I have tried yet another time just now.
Comment 45 Borislav Petkov 2015-01-31 11:07:02 UTC
(In reply to Charles-David Hebert from comment #44)
> There is no booting without the "dis_ucode_ldr" with the KOD
> (3.16.7-38.gba87edb-desktop) or the default kernel. I have tried yet another
> time just now.

Hmm, so I did install openSUSE 13.2 32-bit on an AMD laptop I have here
and the KOTD from today (kernel-desktop-3.16.7-40.1.g22eab43.i686.rpm)
works fine - microcode gets updated.

[    0.000000] Linux version 3.16.7-40.g22eab43-desktop (geeko@buildhost) (gcc version 4.8.3 20140627 [gcc-4_8-branch revision 212064] (SUSE Linux) ) #1 SMP PREEMPT Thu Jan 29 19:48:17 UTC 2015 (22eab43)
...
[    1.733487] microcode: updated early to new patch_level=0x05000029
[    1.737969] microcode: CPU0: patch_level=0x05000029
[    1.738000] microcode: CPU1: patch_level=0x05000029

So it is strange why it still fails in your case. I'm going to try to
find a box similar to yours to try to reproduce it there.

In the meantime, you could try to retrace your steps and check whether
something's missing/wrong in your installation.

Then, you could also try running a 64-bit version of 13.2 to see whether
it still happens there. Independent of this, I'd switch to 64-bit anyway
like the rest of the world, unless you have a valid reason not to do
so. FWIW, 64-bit linux on x86 is much much much more widely tested than
32-bit.

HTH.
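
To check a saved dmesg for the same "microcode:" lines, something like the following sketch could be used. The helper name `microcode_patch_levels` is hypothetical, and the default file name dmesg.log is an assumption taken from the dmesg > dmesg.log command requested earlier in this bug.

```shell
#!/bin/sh
# microcode_patch_levels [FILE]
# Hypothetical helper: list the distinct patch_level values found on
# "microcode:" lines of a saved dmesg log (default dmesg.log).
microcode_patch_levels() {
    grep 'microcode:' "${1:-dmesg.log}" |
        sed -n 's/.*patch_level=\(0x[0-9a-fA-F]*\).*/\1/p' |
        sort -u
}
```

A single value in the output means every CPU reports the same patch level. No output at all means the log has no microcode lines with a patch_level, which would be expected when booting with dis_ucode_ldr.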
Comment 46 Borislav Petkov 2015-02-04 12:43:41 UTC
(In reply to Philip Brown from comment #35)
> 3) Install kernel of the day:
> zypper ar
> http://download.opensuse.org/repositories/Kernel:/openSUSE-13.2/standard \
>     Kernel:openSUSE-13.2
> zypper in --from Kernel:openSUSE-13.2 kernel-desktop
> 
> 4) Reboot
> 
>    *** At this point vboxsf no longer works.  Not sure why.
>        To get around problem I:
>        mount -t cifs //10.0.0.1/ds$ /media/llphil_d ....
>        and that worked.  Note that cifs is unstable if drive
>        is shared out from vbox.

I tried this here with your howto (thanks, btw) and it failed because
vbox has PAE disabled by default but the kernel is enabling it during
boot.

So I went and enabled it under Settings->System->Processor->Extended
Features [] Enable PAE/NX.

It boots fine then.

HTH.
Comment 47 Philip Brown 2015-02-06 01:53:51 UTC
I already had PAE/NX enabled.

It's even in the writeup I emailed you separately
(just checked it in my sent items folder.)
Comment 48 Borislav Petkov 2015-02-06 11:00:26 UTC
(In reply to Philip Brown from comment #47)
> I already had PAE/NX enabled.
> 
> It's even in the writeup I emailed you separately
> (just checked it in my sent items folder.)

Hmm, so the guest booted fine when I enabled PAE/NX. The only difference
then might be that you're using windoze as the host where virtual box
is running and I used virtual box on a linux machine, i.e. the linux
client.

We could try to debug this further but I don't think it is high in
priority on anyone's todo-list :-)

You've confirmed that the updated kernel fixes the microcode loader
issue for you - although you shouldn't need it at all when running
openSUSE as a guest. So while waiting for the updated kernel to appear
in the distribution channels, you can run the KOTD or, alternatively,
boot with "dis_ucode_ldr".

HTH.
Comment 49 Brian Richter 2015-03-04 20:25:25 UTC
I'm also having this same issue on my HP Pavilion DV4 at home. I'm going to try everything listed here and reply back with my findings.
Comment 50 Borislav Petkov 2015-03-04 20:27:47 UTC
(In reply to Brian Richter from comment #49)
> I'm also having this same issue on my HP Pavilion DV4 at home. I'm going to
> try everything listed here and reply back with my findings.

Just try the kernel of the day here:

http://kernel.opensuse.org/packages/openSUSE-13.2

Thanks.
Comment 51 Brian Richter 2015-03-04 20:36:50 UTC
(In reply to Borislav Petkov from comment #50)
> (In reply to Brian Richter from comment #49)
> > I'm also having this same issue on my HP Pavilion DV4 at home. I'm going to
> > try everything listed here and reply back with my findings.
> 
> Just try the kernel of the day here:
> 
> http://kernel.opensuse.org/packages/openSUSE-13.2
> 
> Thanks.

Currently I'm sitting at the loading initial ramdisk.... 

I will go ahead and regen,
then install the KOTD,
then apply all the updates needed.
Do you want me to grab any dmesg or cpuinfo before installing the KOTD, after installing it, and again after I apply the updates (assuming I'm not bricked again)?
Comment 52 Borislav Petkov 2015-03-04 21:09:42 UTC
(In reply to Brian Richter from comment #51)
> Currently I'm sitting at the loading initial ramdisk.... 

That's the broken kernel after the updates, correct?

> I will go ahead and regen
> then install the KOTD
> then apply all the updates needed
> Do you want me to grab any dmesg or cpuinfo prior to installing the KOTD and
> then possibly after install it and then also after i apply the updates
> (assuming i'm not bricked again)

Just boot into a working kernel, install the KOTD as the URL above says
and boot into that KOTD kernel. It should boot fine... (famous last
words :-))

Thanks.
Comment 53 Brian Richter 2015-03-05 00:42:34 UTC
(In reply to Borislav Petkov from comment #52)
> (In reply to Brian Richter from comment #51)
> > Currently I'm sitting at the loading initial ramdisk.... 
> 
> That's the broken kernel after the updates, correct?
> 
> > I will go ahead and regen
> > then install the KOTD
> > then apply all the updates needed
> > Do you want me to grab any dmesg or cpuinfo prior to installing the KOTD and
> > then possibly after install it and then also after i apply the updates
> > (assuming i'm not bricked again)
> 
> Just boot into a working kernel, install the KOTD as the URL above says
> and boot into that KOTD kernel. It should boot fine... (famous last
> words :-))
> 
> Thanks.

My steps. 

1. do a clean installation 
2. install the Kernel Of The Day 
zypper ar http://download.opensuse.org/repositories/Kernel:/openSUSE-13.2/standard \
    Kernel:openSUSE-13.2
zypper in --from Kernel:openSUSE-13.2 kernel-desktop
	kernel installed  3.16.7-51.gea5ed9c-desktop 
3. Reboot using the updated kernel: worked 
4. Run all the security updates and recommended updates that are listed (165) 
5. Reboot 
6. Failed to load initial ramdisk again. 
	- booting to recovery mode wouldn't work and also the old kernel didn't work. 
7. I then added dis_ucode_ldr to the kernel command line and it worked. 

For the time being I'm going to add it to the bootloader config so the system always boots this way until we can find a permanent fix.
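
For reference, making the workaround permanent could look like the sketch below. It assumes GRUB2 with its defaults in /etc/default/grub and a GRUB_CMDLINE_LINUX_DEFAULT line, as on a stock openSUSE 13.2 install. The helper name add_kernel_param is made up for illustration and is not part of any package.

```shell
#!/bin/sh
# Sketch only: add_kernel_param is a hypothetical helper.
# It appends PARAM to the GRUB_CMDLINE_LINUX_DEFAULT line of FILE unless the
# parameter is already there. PARAM must not contain "/" (sed delimiter).
add_kernel_param() {
    file=$1
    param=$2
    # Already present as a separate word inside the quoted value? Do nothing.
    if grep -Eq "^GRUB_CMDLINE_LINUX_DEFAULT=\"(.* )?${param}[\" ]" "$file"; then
        return 0
    fi
    # Insert the parameter just before the closing quote.
    sed -i "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 ${param}\"/" "$file"
}

# Intended use, as root (paths are assumptions for openSUSE 13.2 with GRUB2):
#   add_kernel_param /etc/default/grub dis_ucode_ldr
#   grub2-mkconfig -o /boot/grub2/grub.cfg
```

Calling the helper twice is harmless, since the second call finds the parameter and leaves the file alone.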
Comment 54 Brian Richter 2015-03-05 00:52:08 UTC
Created attachment 625451 [details]
cpuinfo
Comment 55 Borislav Petkov 2015-03-05 11:59:25 UTC
(In reply to Brian Richter from comment #53)
> 2. install the Kernel Of The Day 
> zypper ar
> http://download.opensuse.org/repositories/Kernel:/openSUSE-13.2/standard \
>     Kernel:openSUSE-13.2
> zypper in --from Kernel:openSUSE-13.2 kernel-desktop
> 	kernel installed  3.16.7-51.gea5ed9c-desktop 
> 3. Reboot using the updated kernel: worked 
> 4. Run all the security updates and recommended updates that are listed
> (165) 
> 5. Reboot 
> 6. Failed to load initial ramdisk again.

Could it be that the update overwrote your KOTD kernel which you
installed previously? When you reboot after applying the updates, which
kernel are you booting into?

Thanks.
Comment 56 Brian Richter 2015-03-05 13:43:10 UTC
(In reply to Borislav Petkov from comment #55)
> (In reply to Brian Richter from comment #53)
> > 2. install the Kernel Of The Day 
> > zypper ar
> > http://download.opensuse.org/repositories/Kernel:/openSUSE-13.2/standard \
> >     Kernel:openSUSE-13.2
> > zypper in --from Kernel:openSUSE-13.2 kernel-desktop
> > 	kernel installed  3.16.7-51.gea5ed9c-desktop 
> > 3. Reboot using the updated kernel: worked 
> > 4. Run all the security updates and recommended updates that are listed
> > (165) 
> > 5. Reboot 
> > 6. Failed to load initial ramdisk again.
> 
> Could it be that the update overwrote your KOTD kernel which you
> installed previously? When you reboot after applying the updates, which
> kernel are you booting into?
> 
> Thanks.

It doesn't look like it overwrote it; it's still pointing at the 3.16.7-51 that was pulled down. In the boot options I'm also making sure to select it explicitly.
May I ask what the ramifications are of adding dis_ucode_ldr to the boot options? Let me know what else you would like from that machine.
Comment 57 Philip Brown 2015-03-05 14:37:52 UTC
FWIW - You know that the bit about using KOTD never solved the original problem either.  The only way I got around it was to disable the ucode updates in zypper.
Comment 58 Brian Richter 2015-03-05 14:52:19 UTC
(In reply to Philip Brown from comment #57)
> FWIW - You know that the bit about using KOTD never solved the original
> problem either.  The only way I got around it was to disable the ucode
> updates in zypper.

I just wanted to try what was mentioned. As stated, I was able to get around it without disabling the ucode packages in zypper, just by adding dis_ucode_ldr to the kernel command line.

I'm actually performing another install right now on a desktop PC, to see whether it is affected as well.

Thanks Philip.
Comment 59 Brian Richter 2015-03-05 15:38:14 UTC
(In reply to Brian Richter from comment #58)
> (In reply to Philip Brown from comment #57)
> > FWIW - You know that the bit about using KOTD never solved the original
> > problem either.  The only way I got around it was to disable the ucode
> > updates in zypper.
> 
> just wanted to try what was mentioned. As stated I was able to get around it
> without disabling the ucode in zypper by just adding the dis_ucode_ldr to
> the kernel command line. 
> 
> i'm actually performing another install right now on a desktop pc. Going to
> see if it affects that as well. 
> 
> Thanks Philip.

No issues running a fresh install on my desktop PC. I ran all the updates after the install and didn't run into an issue. I didn't use the KOTD, just the default updates, and it worked.

Only my laptop is having the issue.
Comment 60 Brian Richter 2015-03-05 15:38:56 UTC
Created attachment 625558 [details]
Desktop CPU info
Comment 61 Borislav Petkov 2015-03-06 09:15:34 UTC
(In reply to Brian Richter from comment #56)
> It doesn't look like it overwrote it, it's still pointing at the 3.16.7-51
> that was pulled down. in the boot options I'm also making sure to specify it
> as well.

Let me make sure I understand you correctly here:

you're booting the 3.16.7-51 KOTD kernel and it fails. When you add
"dis_ucode_ldr" to that same kernel, it boots fine.

Correct?

> May i ask what the ramifications are of adding dis_ucode_ldr to the boot
> option?

It disables the early microcode loader.

> Let me know what else you want like from that machine.

Please boot the original kernel the system was installed with, adding

"log_buf_len=16M ignore_loglevel"

to its command line, then capture the full dmesg and upload it here.

Thanks.
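
A minimal sketch of the capture step, assuming it is run as root from a terminal after the machine has booted (the kernel keeps its boot messages in a ring buffer, which log_buf_len=16M enlarges). The function names and the output file name are made up for illustration.

```shell
#!/bin/sh
# cmdline_has CMDLINE PARAM: succeed if PARAM is one of the space-separated
# words in CMDLINE.
cmdline_has() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# capture_dmesg OUTFILE: warn if the debug parameters are missing from the
# running kernel's command line, then save the full kernel log to OUTFILE.
capture_dmesg() {
    for p in log_buf_len=16M ignore_loglevel; do
        cmdline_has "$(cat /proc/cmdline)" "$p" ||
            echo "warning: $p is not on the kernel command line" >&2
    done
    dmesg > "$1"
}

# Intended use, as root, once the system is up:
#   capture_dmesg dmesg-full.txt
```

The resulting file is what would be attached to the bug.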
Comment 62 Borislav Petkov 2015-03-06 12:11:37 UTC
FWIW, today's KOTD kernel-default-3.16.7-52.1.gccae9ec.x86_64.rpm boots just fine on an AMD laptop here:

[    2.259098] microcode: updated early to new patch_level=0x05000029
[    2.265954] microcode: CPU0: patch_level=0x05000029
[    2.266048] microcode: CPU1: patch_level=0x05000029
[    2.266433] microcode: Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
[    8.999619] [drm] Loading PALM Microcode
Comment 63 Brian Richter 2015-03-06 15:26:40 UTC
(In reply to Borislav Petkov from comment #61)
> (In reply to Brian Richter from comment #56)

>you're booting the 3.16.7-51 KOTD kernel and it fails. When you add
> "dis_ucode_ldr" to that same kernel, it boots fine.
> Correct? 

That is correct. After performing the system updates, even with the KOTD kernel, I'm unable to boot EITHER kernel without adding dis_ucode_ldr to the kernel line. The original kernel no longer boots without it either.

So my understanding is that once ucode-amd or ucode-intel is updated, booting breaks regardless of which kernel I try. On a system that was stuck, I followed the steps of using the rescue CD and running rpm -e ucode-intel ucode-amd, and I was then able to boot just fine without having to add dis_ucode_ldr to the kernel line.

I can go ahead and perform a regen on the spare hard drive I was working on and, before performing the updates, add "log_buf_len=16M ignore_loglevel" to the kernel line. My question regarding dmesg is: when do I run it? Do I run it after the machine has booted, by opening a terminal and typing it in? I want to make sure I capture the dmesg at the correct point. My understanding is that I would need to capture it DURING the boot process, not after. If so, I'll have to figure out how to do this.

All that said, after updating ucode-amd and ucode-intel I'm not sure I can ever capture a proper dmesg, because no kernel boots after the update without dis_ucode_ldr on the kernel line.
Comment 64 Borislav Petkov 2015-03-09 17:38:36 UTC
Hrrm, so we did find a dv4 but with an Intel CPU and early microcode loading
works just fine there:

[    0.000000] Linux version 3.16.7-7-desktop (geeko@buildhost) (gcc version 4.8.3 20140627 [gcc-4_8-branch revision 212064] (SUSE Linux) ) #1 SMP PREEMPT Wed Dec 17 18:00:44 UTC 2014 (762f27a)

...

[    0.000000] DMI: Hewlett-Packard HP Pavilion dv4 Notebook PC/18F4, BIOS F.01P02 05/15/2012

...

[    0.000000] CPU0 microcode updated early to revision 0x29, date = 2013-06-12
[    0.088957] CPU2 microcode updated early to revision 0x29, date = 2013-06-12
[    2.444200] microcode: CPU0 sig=0x206a7, pf=0x10, revision=0x29
[    2.444207] microcode: CPU1 sig=0x206a7, pf=0x10, revision=0x29
[    2.444218] microcode: CPU2 sig=0x206a7, pf=0x10, revision=0x29
[    2.444225] microcode: CPU3 sig=0x206a7, pf=0x10, revision=0x29

I still need to find a dv4 with an AMD CPU first. :-\
Comment 65 Borislav Petkov 2015-03-17 16:06:24 UTC
Ok, not a dv4 but with an AMD CPU close to the dv4 one:

processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 16
model           : 6
model name      : AMD Athlon(tm) II N370 Dual-Core Processor
stepping        : 3
microcode       : 0x10000c8

[    0.000000] Linux version 3.16.7-7-desktop (geeko@buildhost) (gcc version 4.8.3 20140627 [gcc-4_8-branch revision 212064] (SUSE Linux) ) #1 SMP PREEMPT Wed Dec 17 18:00:44 UTC 2014 (762f27a)
[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-3.16.7-7-desktop root=UUID=6902c02b-5689-446d-9b82-f2beb45d5f56 resume=/dev/sda1 splash=silent quiet showopts

[    1.418395] microcode: CPU0: patch_level=0x010000c8
[    1.418404] microcode: CPU1: patch_level=0x010000c8
[    1.418498] microcode: Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
[    3.099020] [drm] Loading RS780 Microcode

It is a stepping 3 one which gets a different microcode. It doesn't load
the microcode from the initrd because it already has the latest one in
the BIOS while on Brian's machine it should load it because he has older
microcode than what's in the patch file:

microcode	: 0x1000098

The patch file should have revision 0x010000c7 for the CPU in the dv4,
AFAICT.

Continuing searching...
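
The comparison described above (revision currently on the CPU versus revision in the patch file) can be sketched as follows. This only compares the two numbers; the real loader also has to match the patch to the CPU, which this sketch does not attempt. The function names are made up for illustration.

```shell
#!/bin/sh
# cpu_microcode_rev [FILE]: print the microcode revision of the first CPU
# listed in FILE (default /proc/cpuinfo), e.g. 0x1000098.
cpu_microcode_rev() {
    awk -F': *' '/^microcode/ { print $2; exit }' "${1:-/proc/cpuinfo}"
}

# ucode_is_older CURRENT PATCH: succeed if CURRENT is numerically lower than
# PATCH. Both are hex strings with a 0x prefix.
ucode_is_older() {
    [ "$(( $1 ))" -lt "$(( $2 ))" ]
}

# Example with the values from this comment:
#   ucode_is_older 0x1000098 0x010000c7 && echo "an update would be expected"
#   ucode_is_older "$(cpu_microcode_rev)" 0x010000c7
```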
Comment 66 Borislav Petkov 2015-03-17 18:14:47 UTC
Installed oS13.2 on another AMD box:

processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 16
model           : 10
model name      : AMD Phenom(tm) II X6 1055T Processor
stepping        : 0
microcode       : 0x10000dc

Updated box with zypper, installed latest microcode package and rebooted
box:

[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Initializing cgroup subsys cpuacct
[    0.000000] Linux version 3.16.7-7-default (geeko@buildhost) (gcc version 4.8.3 20140627 [gcc-4_8-branch revision 212064] (SUSE Linux) ) #1 SMP Wed Dec 17 18:00:44 UTC 2014 (762f27a)
[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-3.16.7-7-default root=UUID=5c45b3b1-3fcb-4abc-ab77-cbfb944d5e20 resume=/dev/sda1 splash=silent quiet showopts
...
[    1.208685] microcode: updated early to new patch_level=0x010000dc
...
[    1.214617] microcode: CPU0: patch_level=0x010000dc
[    1.214633] microcode: CPU1: patch_level=0x010000dc
[    1.214658] microcode: CPU2: patch_level=0x010000dc
[    1.214674] microcode: CPU3: patch_level=0x010000dc
[    1.214689] microcode: CPU4: patch_level=0x010000dc
[    1.214702] microcode: CPU5: patch_level=0x010000dc
[    1.214775] microcode: Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba

Previous microcode version was 0x010000bf so microcode did get updated.
So it all seems to work to me.
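
For anyone checking their own box the same way, a small sketch that pulls the relevant values out of the kernel log. It assumes the AMD message format shown above (Intel machines print different lines); the function names are made up for illustration.

```shell
#!/bin/sh
# early_patch_level: read kernel log text on stdin and print the patch level
# from the first "updated early" line, if there is one.
early_patch_level() {
    sed -n 's/.*microcode: updated early to new patch_level=\(0x[0-9a-fA-F]*\).*/\1/p' |
        head -n 1
}

# per_cpu_patch_levels: read kernel log text on stdin and print the distinct
# per-CPU patch levels reported by the microcode driver.
per_cpu_patch_levels() {
    sed -n 's/.*microcode: CPU[0-9]*: patch_level=\(0x[0-9a-fA-F]*\).*/\1/p' |
        sort -u
}

# Intended use:
#   dmesg | early_patch_level
#   dmesg | per_cpu_patch_levels
```

An empty result from early_patch_level means no early update was logged; a single line from per_cpu_patch_levels means all cores report the same level.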
Comment 67 Takashi Iwai 2015-03-23 15:26:09 UTC
I tested a few HP laptops with AMD CPU, too, but couldn't see any problem.

The most similar one has originally the following in /proc/cpuinfo:

processor	: 0
vendor_id	: AuthenticAMD
cpu family	: 16
model		: 6
model name	: AMD Athlon(tm) II P340 Dual-Core Processor
stepping	: 3
microcode	: 0x10000b6

After update with ucode-amd, it became:

microcode	: 0x10000c8

The kernel message shows:

[    0.000000] Linux version 3.16.7-7-desktop (geeko@buildhost) (gcc version 4.8.3 20140627 [gcc-4_8-branch revision 212064] (SUSE Linux) ) #1 SMP PREEMPT Wed Dec 17 18:00:44 UTC 2014 (762f27a)
[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-3.16.7-7-desktop root=UUID=7d88c9c2-21fe-4cfa-8a1b-45ee3b2f23e9 resume=/dev/sda1 splash=silent quiet showopts
....
[    0.548736] Unpacking initramfs...
[    1.614490] microcode: updated early to new patch_level=0x010000c8
....
[    1.622607] microcode: CPU0: patch_level=0x010000c8
[    1.622618] microcode: CPU1: patch_level=0x010000c8
[    1.622709] microcode: Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
Comment 68 Takashi Iwai 2015-03-23 15:33:03 UTC
FWIW, the machines I've tested are:

* HP Compaq 321
vendor_id	: AuthenticAMD
cpu family	: 16
model		: 6
model name	: AMD Athlon(tm) II P340 Dual-Core Processor
stepping	: 3
microcode	: 0x10000c8

* HP Compaq 6535s
vendor_id	: AuthenticAMD
cpu family	: 17
model		: 3
model name	: AMD Turion(tm)X2 Ultra DualCore Mobile ZM-80
stepping	: 1
microcode	: 0x2000032

* HP ProBook 4436s
vendor_id	: AuthenticAMD
cpu family	: 18
model		: 1
model name	: AMD A6-3400M APU with Radeon(tm) HD Graphics
stepping	: 0
microcode	: 0x3000027

* HP ProBook 4446s
vendor_id	: AuthenticAMD
cpu family	: 21
model		: 16
model name	: AMD A6-4400M APU with Radeon(tm) HD Graphics
stepping	: 1
microcode	: 0x6001119
Comment 69 Peter Kirchgeßner 2015-04-15 23:40:02 UTC
Just as additional information. My openSUSE 13.2 stopped also with the message "loading initial ramdisk", after the kernel-desktop-3.16.7-7.1.x86_64.rpm has been installed three days ago (March 15th). Switching off the graphics modes during boot and using the kernel-desktop-3.16.7-7 (recovery mode), it started booting, but stopped with message "switch to broadcast mode on CPU 0". With the hints in comment 13 (uninstall ucode-intel ucode-amd and rebuild initrd), I could get the kernel running. Thank you for that hint.
My cpuinfo:
vendor_id       : AuthenticAMD
cpu family      : 16
model           : 6
model name      : AMD Athlon(tm) II X2 240e Processor
stepping        : 2
microcode       : 0x100009f

On that machine only openSUSE is installed.
Now I have set ucode-intel and ucode-amd to taboo in yast.
Comment 70 Borislav Petkov 2015-04-16 07:53:22 UTC
(In reply to Peter Kirchgeßner from comment #69)
> Just as additional information. My openSUSE 13.2 stopped also with the
> message "loading initial ramdisk", after the
> kernel-desktop-3.16.7-7.1.x86_64.rpm has been installed three days ago
> (March 15th). Switching off the graphics modes during boot and using the
> kernel-desktop-3.16.7-7 (recovery mode), it started booting, but stopped
> with message "switch to broadcast mode on CPU 0". With the hints in comment
> 13 (uninstall ucode-intel ucode-amd and rebuild initrd), I could get the

Hmm, that's interesting.

Ok, question: has this machine ever upgraded microcode successfully
before?

Can you do

journalctl -a | grep -i microcode

Thanks.
Comment 71 Peter Kirchgeßner 2015-04-16 18:54:34 UTC
Created attachment 631345 [details]
Output of command journalctl -a | grep -i microcode
Comment 72 Peter Kirchgeßner 2015-04-16 19:49:34 UTC
Because I reinstalled the system completely from DVD yesterday, I have no old logs anymore. But after installation from DVD I worked for two hours with openSUSE (kernel-desktop-3.16.6). Then I let the online update install the more than 200 packages. After that, booting kernel-desktop-3.16.7 stopped with "loading initial ramdisk".
Then I found that I can boot the older kernel and learned how to remove ucode-intel and ucode-amd. So from that state of the system I can give you some information:

Output of journalctl -a | grep -i microcode is attached as journalctl_microcode.txt

But you need some time info because everything was done within 4 hours. So here are some excerpts from /var/log/zypp/history to see what happened when:

Kernel installed from DVD:
# 2015-04-15 19:23:13 kernel-desktop-3.16.6-2.1.x86_64.rpm installed ok
...
# *** Generating early-microcode cpio image ***

Then I worked with the system installing software with yast.
At about 21:11 I started the online update. Up to that point no ucode-amd or ucode-intel package appeared in the history file. Then

2015-04-15 21:35:20|install|ucode-amd|20141122git-5.1
...
2015-04-15 21:35:40|install|ucode-intel|20140913-4.1
...
# 2015-04-15 21:36:38 kernel-desktop-3.16.7-7.1.x86_64.rpm installed ok
...
# *** Generating early-microcode cpio image ***
...

2015-04-15 21:36:38|install|kernel-desktop|3.16.7-7.1|x86_64|
...

Then the history file shows the attempt to install ucode-amd, which seems to produce an error:

# 2015-04-15 21:38:09 Output of ucode-amd-20141122git-5.1.noarch.rpm %posttrans script:
#     /usr/lib/module-init-tools/regenerate-initrd-posttrans: line 46: mkinitrd: command not found
# 2015-04-15 21:38:09 ucode-amd-20141122git-5.1.noarch.rpm %posttrans script failed (returned 127)
# 2015-04-15 21:38:11 /var/adm/update-scripts/fonts-config-20140604-3.5.1-reconfigure-fonts-cjk executed
# 2015-04-15 21:38:11 /var/adm/update-scripts/fonts-config-20140604-3.5.1-reconfigure-fonts executed
# 2015-04-15 21:38:12 New update message /var/adm/update-messages/mariadb-10.0.13-2.6.1
# 2015-04-15 21:38:12 Error sending update message notification.

These were the last install messages from the online update. After that I tried to boot kernel-desktop-3.16.7-7 without success, until I booted kernel-desktop-3.16.6-2 again at 22:10. This worked.
The next messages in the zypper history file belong to the manual removal of ucode-amd and ucode-intel:

2015-04-16 00:24:13|remove |ucode-amd|20141122git-5.1|noarch|root@spica|
2015-04-16 00:24:13|remove |ucode-intel|20140913-4.1|x86_64|root@spica|

So the relevant microcode entries in the journalctl log are from 19:28, when the system installed from the DVD was booted, and then from 22:10, when I booted kernel 3.16.6-2 again. The first successful boot of 3.16.7-7 was at 00:47.


If you need more information, please ask. I have saved the zypp-history and the full journalctl-output if more information is needed.
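
A sketch of how such excerpts can be pulled out of the history file, assuming the pipe-separated record layout visible in the lines above (date|action|name|version|...) and comment lines starting with "#". The function names are made up for illustration.

```shell
#!/bin/sh
# pkg_events FILE: list install/remove records for ucode-* and kernel-*
# packages from a zypp history file as "date action name version".
pkg_events() {
    awk -F'|' '
        $2 ~ /^(install|remove) *$/ && $3 ~ /^(ucode|kernel)-/ {
            sub(/ +$/, "", $2)   # the action field can carry trailing blanks
            print $1, $2, $3, $4
        }' "$1"
}

# failed_scripts FILE: show rpm scriptlet failures recorded as comment lines.
failed_scripts() {
    grep -E '^# .*script failed' "$1"
}

# Intended use:
#   pkg_events /var/log/zypp/history
#   failed_scripts /var/log/zypp/history
```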
Comment 73 Peter Kirchgeßner 2015-04-16 20:32:42 UTC
And one more question to understand the background.
I was able to boot kernel 3.16.6-2 while ucode-amd was installed (was it really installed? Its %posttrans script reported an error).
I was not able to boot kernel 3.16.7-7 while ucode-amd was installed.
So it seems to me that the new CPU instructions used with that kernel (see https://software.opensuse.org/package/kernel-desktop) along with ucode-amd make some AMD CPUs hang. Is that conclusion right?

Thank you for looking into the issue.
Peter Kirchgeßner
Comment 74 Borislav Petkov 2015-04-18 04:13:12 UTC
(In reply to Peter Kirchgeßner from comment #73)
> And one more question to understand the background.
> I was able to boot kernel 3.16.6-2 while ucode-amd was installed (was it
> really installed? It's postrans-script reported an error).
> I was not able to boot kernel 3.16.7-7 while ucode-amd was installed.
> So it seems to me that the new CPU-instructions used with that kernel (see
> https://software.opensuse.org/package/kernel-desktop) along with ucode-amd
> make some AMD-CPUs hangup. Is that conclusion right?

Yeah, there's something fishy there but I can't put my finger on it yet.

We have tried microcode update on a bunch of AMD laptops we could get
our hands on (see earlier bugzilla entries) and we couldn't reproduce
the issue. So there's something else I'm missing... Hmm.

But let's do things one step at a time:

@Michal: can you please take a look at comment 72 and especially:

# 2015-04-15 21:38:09 Output of ucode-amd-20141122git-5.1.noarch.rpm %posttrans script:
#     /usr/lib/module-init-tools/regenerate-initrd-posttrans: line 46: mkinitrd: command not found
# 2015-04-15 21:38:09 ucode-amd-20141122git-5.1.noarch.rpm %posttrans script failed (returned 127)

shouldn't mkinitrd be a precondition for ucode-amd or is it that oS13.2
is using dracut or whatever else there's going on?

Thanks.
Comment 75 Peter Kirchgeßner 2015-04-18 05:30:29 UTC
This is how the script /usr/lib/module-init-tools/regenerate-initrd-posttrans looks on my machine around line 46:


if test -e "$dir/all"; then
        rm "$dir"/*
        run_mkinitrd_setup
        mkinitrd
        exit
fi

Here mkinitrd is called without its path /sbin. Other places in the script call it as /sbin/mkinitrd.
Comment 76 Borislav Petkov 2015-04-18 15:53:33 UTC
(In reply to Peter Kirchgeßner from comment #75)
> This is how the script
> /usr/lib/module-init-tools/regenerate-initrd-posttrans looks on my machine
> around line 46:
> 
> 
> if test -e "$dir/all"; then
>         rm "$dir"/*
>         run_mkinitrd_setup
>         mkinitrd
>         exit
> fi
> 
> Here mkinitrd is called without its path /sbin. Other places in the script
> call it as /sbin/mkinitrd.

Hmm, I see another invocation further down which is done without
"/sbin/" prepended either.

Ok, try adding "/sbin/" before those two and install *only* the
ucode-amd package - *not* the new kernel, i.e. without installing
kernel-desktop-3.16.7-7.

Then boot into the old kernel and once the box is up, do:

echo 1 > /sys/devices/system/cpu/microcode/reload

Then do:

dmesg | grep -i microcode

and paste the output from that command here. I want to see whether
microcode gets applied at all with the old kernel.

Once /usr/lib/module-init-tools/regenerate-initrd-posttrans is fixed,
you can try installing the new kernel and see whether it'll boot.
Judging by the current observations though, it should hang.

Thanks.
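
The script edit suggested above could be done by hand or with something like the sketch below. It assumes the bare mkinitrd calls each sit alone on their own line, as in the quoted excerpt; the second invocation is not shown in this thread, so check the result before relying on it. The function name is made up for illustration.

```shell
#!/bin/sh
# fix_mkinitrd_path FILE: rewrite lines that consist only of a bare "mkinitrd"
# call so that they call /sbin/mkinitrd instead. A copy of the original is
# kept as FILE.orig. Lines already using /sbin/mkinitrd are left untouched.
fix_mkinitrd_path() {
    cp -p "$1" "$1.orig" &&
    sed -i 's|^\([[:space:]]*\)mkinitrd[[:space:]]*$|\1/sbin/mkinitrd|' "$1"
}

# Intended use, as root:
#   fix_mkinitrd_path /usr/lib/module-init-tools/regenerate-initrd-posttrans
#   diff -u /usr/lib/module-init-tools/regenerate-initrd-posttrans.orig \
#           /usr/lib/module-init-tools/regenerate-initrd-posttrans
```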
Comment 77 Peter Kirchgeßner 2015-04-19 19:37:52 UTC
Ok, give me some time. I first have to do a backup of my system, because this is my one and only workhorse.
Comment 78 Peter Kirchgeßner 2015-04-21 19:43:48 UTC
Because kernel-desktop-3.16.7-7 was already installed, I installed a minimal system without a graphical desktop from the openSUSE 13.2 DVD, not using any graphics during boot (noplymouth, text mode). I fixed the regenerate-initrd-posttrans script and installed ucode-amd.

kernel-default-3.16.6-2 works with ucode-amd.

dmesg | grep -i microcode gave the following:

[    1.249243] microcode: updated early to new patch_level=0x010000c7
[    1.255070] microcode: CPU0: patch_level=0x010000c7
[    1.255167] microcode: CPU1: patch_level=0x010000c7
[    1.255324] microcode: Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba

Then I installed kernel-default-3.16.7-7, which also worked with ucode-amd.

dmesg | grep -i microcode gave the following:

[    1.246063] microcode: updated early to new patch_level=0x010000c7
[    1.251879] microcode: CPU0: patch_level=0x010000c7
[    1.251976] microcode: CPU1: patch_level=0x010000c7
[    1.252133] microcode: Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba

Then I installed the optimized kernel-desktop-3.16.7-7.
This stopped booting after these messages:
process: System has AMD C1E enabled
process: Switch to broadcast mode on CPU0
Comment 79 Borislav Petkov 2015-04-21 20:10:37 UTC
(In reply to Peter Kirchgeßner from comment #78)
> ...
> Then I installed the optimized kernel-desktop-3.16.7-7.
> This stopped booting after these messages:

Ha, that's very strange. So the two flavors differ in ways which should
not have any effect on microcode loading AFAICT...

> process: System has AMD C1E enabled
> process: Switch to broadcast mode on CPU0

... but I could just as well be missing something.

Can you try booting that same kernel-desktop-3.16.7-7 flavor but add
"dis_ucode_ldr" on the kernel command line before doing so, to see if it
boots fine then.

Thanks!
Comment 80 Peter Kirchgeßner 2015-04-22 18:42:27 UTC
In the meantime I reinstalled my working system with kernel-desktop-3.16.6-2 from the openSUSE DVD. So I did the following tests, which I hope show what you asked for:
I added dis_ucode_ldr to the kernel boot parameters.
Installed ucode-amd.
Installed kernel-desktop-3.16.7-21 (which is the current version offered for online updates).
Both kernel-desktop-3.16.6-2 and 3.16.7-21 booted fine.
Removed dis_ucode_ldr from the kernel boot parameters.
Neither kernel-desktop-3.16.6-2 nor 3.16.7-21 booted.
They both stopped at

process: Switch to broadcast mode on CPU0

So I have to revise my statement that at some point kernel-desktop-3.16.6-2 worked with ucode-amd while 3.16.7 did not. It seems that neither works with ucode-amd, at least on my machine. The tests were done with the corrected regenerate-initrd-posttrans.
Comment 81 Borislav Petkov 2015-05-04 17:04:10 UTC
So basically 13.2 has never applied microcode successfully on that box,
correct? Not even the installer kernel works.

Ok, here's what you can try:

* copy /lib/firmware/amd-ucode somewhere safe

* remove the ucode-amd package

* install the latest kernel-desktop or -default. dracut should not find
the microcode, pay attention to this message:

*** Generating early-microcode cpio image ***
*** Constructing AuthenticAMD.bin ****

it should *not* be printed.

* remove "dis_ucode_ldr" from the kernel parameters

* boot the new kernel

* The machine should be up, when it is, move the stashed
/lib/firmware/amd-ucode folder back in its place, i.e.
/lib/firmware/amd-ucode

* As root, do

echo 1 > /sys/devices/system/cpu/microcode/reload

* Send me dmesg

If this works out, this should show that microcode gets updated late on
that machine.

Thanks.
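
The stash, restore and reload steps above can be sketched as below. The stash location and the function names are assumptions made for illustration; the package removal, kernel install and reboot in between are left out and have to be done by hand.

```shell
#!/bin/sh
# Paths can be overridden from the environment; STASH is an assumed location.
FW_DIR=${FW_DIR:-/lib/firmware/amd-ucode}
STASH=${STASH:-/root/amd-ucode.stash}
RELOAD=${RELOAD:-/sys/devices/system/cpu/microcode/reload}

# stash_firmware: keep a copy of the microcode files before the package
# is removed.
stash_firmware() {
    cp -a "$FW_DIR" "$STASH"
}

# restore_and_reload: put the stashed files back and ask the kernel to do a
# late microcode reload.
restore_and_reload() {
    mkdir -p "$FW_DIR" &&
    cp -a "$STASH/." "$FW_DIR/" &&
    echo 1 > "$RELOAD"
}

# Intended use, as root:
#   stash_firmware
#   ... remove ucode-amd, install the kernel, boot it without dis_ucode_ldr ...
#   restore_and_reload
#   dmesg | grep -i microcode
```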
Comment 82 Peter Kirchgeßner 2015-05-06 18:53:20 UTC
(In reply to Borislav Petkov from comment #81)
> So basically 13.2 has never applied microcode successfully on that box,
> correct? Not even the installer kernel works.
> 
If "never applied microcode successfully" means applying microcode *and* having the kernel work afterwards, you are right. Up to now it seems to me that applying the microcode causes the kernel to hang.

Following your instructions, I did this:

Installed package ucode-amd.
Copied /lib/firmware/amd-ucode (containing 4 files) to a different name.
Uninstalled package ucode-amd.
Ran mkinitrd and saw the messages
*** Generating early-microcode cpio image ***
*** Constructing AuthenticAMD.bin ****
Because you wrote this should not be printed, I renamed the remaining /lib/firmware/amd-ucode (containing two files) to a different name.
Ran mkinitrd again. This showed only
*** Generating early-microcode cpio image ***
dis_ucode_ldr is not on the kernel parameters. So I rebooted.
Kernel 3.16.7-21-desktop booted properly.
Restored the amd-ucode directory that held the 4 files.
As root, ran echo 1 > /sys/devices/system/cpu/microcode/reload
The kernel still works.
The following has been added to dmesg:
[  182.862248] microcode: CPU0: new patch_level=0x010000c7
[  182.862288] microcode: CPU1: new patch_level=0x010000c7

Earlier in the log file you can find
[    1.323063] microcode: CPU0: patch_level=0x0100009f
[    1.323160] microcode: CPU1: patch_level=0x0100009f

The complete dmesg is attached as dmesg.after.
Comment 83 Peter Kirchgeßner 2015-05-06 18:56:09 UTC
Created attachment 633492 [details]
Output of dmesg command after reloading microcode in running kernel
Comment 84 Adam Liebermann 2015-05-11 18:07:25 UTC
In my opinion the updates (ucode-intel and ucode-amd) should be blocked, because they will crash a lot of systems and users will turn away.
Comment 85 Adam Liebermann 2015-05-11 18:08:24 UTC
(In reply to Adam Liebermann from comment #84)
> in my opinion the updated (ucode-intel and ucode-amd) should be blocked
> because it'll crash a lot of systems and users will turn away.

I mean until the issue is resolved..
Comment 86 Borislav Petkov 2015-05-11 19:00:35 UTC
No need, the observed failure happens only with specific AMD CPU models
which are not being sold anymore. And there's also the "dis_ucode_ldr"
parameter which can be used as a temporary workaround.
Comment 87 Borislav Petkov 2015-05-11 20:28:06 UTC
Ok, all people with the AMD machines where microcode fails applying - according
to my notes they are:

Charles-David Hebert
James Moe
Brian Richter
Peter Kirchgeßner

Can you guys all do

# setpci -s 18.3 0x188.l

as root and upload the result in here?

Thanks a lot.
Comment 88 James Moe 2015-05-11 20:53:08 UTC
$ setpci -s 18.3 0x188.l
00400000
Comment 89 James Moe 2015-05-11 20:58:25 UTC
On 2015-05-03 I upgraded our main server. It has the same motherboard as the other systems (asus m3a78-em) with the AMD CPU 630 4-core. It has exhibited the same microcode issue as the other systems but with a more restrictive twist.

Disabling the modules ucode-intel and ucode-amd in the Software Manager was not sufficient to prevent a catatonic system. It is a necessary requirement to include "dis_ucode_ldr" in the kernel parameters at boot time (GRUB2).
Comment 90 Borislav Petkov 2015-05-11 21:55:24 UTC
(In reply to James Moe from comment #89)
> On 2015-05-03 I upgraded our main server. It has the same motherboard as the
> other systems (asus m3a78-em) with the AMD CPU 630 4-core.

Can you get me /proc/cpuinfo from that box and run the setpci command on it too?

Thanks.
Comment 91 James Moe 2015-05-11 23:26:23 UTC
Created attachment 633876 [details]
/proc/cpuinfo

CPU info for asus m3a78-em + AMD CPU.
Comment 92 James Moe 2015-05-11 23:27:36 UTC
$ setpci -s 18.3 0x188.l
00400000

$ cat /proc/cpuinfo 
((See attachment))
Comment 93 Borislav Petkov 2015-05-12 08:03:40 UTC
Interesting.

Can you also read out the microcode version?

You'd need to install the package msr-tools, and then as root do:

# modprobe msr
# rdmsr 0x8b

Paste the output here too please.

Thanks.
Comment 94 Adam Liebermann 2015-05-12 12:54:20 UTC
(In reply to Borislav Petkov from comment #86)
> No need, the observed failure happens only with specific AMD CPU models
> which are not being sold anymore. And there's also the "dis_ucode_ldr"
> parameter which can be used as a temporary workaround.

Only a few people will find this bug and the workaround; the others will turn away and say that openSUSE is not stable. But I guess you have your policies :) Thanks anyway for your work. I appreciate the workaround and hopefully the fix later on :)

setpci -s 18.3 0x188.l
00400010

rdmsr 0x8b
10000af

cpuinfo -> see attachment
Comment 95 Adam Liebermann 2015-05-12 12:55:03 UTC
Created attachment 633960 [details]
cpuinfo for a failing machine
Comment 96 Borislav Petkov 2015-05-12 13:52:25 UTC
(In reply to Adam Liebermann from comment #94)
> (In reply to Borislav Petkov from comment #86)
> > No need, the observed failure happens only with specific AMD CPU models
> > which are not being sold anymore. And there's also the "dis_ucode_ldr"
> > parameter which can be used as a temporary workaround.
> 
> only few people will find this bug and workaround, the others will turn away
> and say that opensuse is not stable.

Why so negative?

Don't you think that once we've rootcaused this, we'll make sure there
is a proper fix and it goes to all *linux* distros, not only SUSE?

> but i guess you have your policies :)
> thanks anyway for your work, I appreciate the workaround and hopefully the
> fix later on :)
> 
> setpci -s 18.3 0x188.l
> 00400010

Oh, so your box is also affected - I didn't catch that.

So did that box upgrade microcode successfully at all, at any point
in time? Do you have old logs from it perhaps? Older distros, older
openSUSE, whatever...

> rdmsr 0x8b
> 10000af
> 
> cpuinfo -> see attachment

Thanks for reporting this!
Comment 97 Adam Liebermann 2015-05-12 14:17:04 UTC
(In reply to Borislav Petkov from comment #96)
> (In reply to Adam Liebermann from comment #94)
> > (In reply to Borislav Petkov from comment #86)
> > > No need, the observed failure happens only with specific AMD CPU models
> > > which are not being sold anymore. And there's also the "dis_ucode_ldr"
> > > parameter which can be used as a temporary workaround.
> > 
> > only few people will find this bug and workaround, the others will turn away
> > and say that opensuse is not stable.
> 
> Why so negative?
Well, I didn't find this workaround for a few months and therefore thought that it's just SUSE not working again. I tried installing again and again and there was no resolution. Now I see that the problem is known and the workaround is not distributed automatically.. Just a bit of frustration. Again, I like the product and I value your work and that's also the reason I am here :)

> 
> Don't you think that once we've rootcaused this, we'll make sure there
> is a proper fix and it goes to all *linux* distros, not only SUSE?
Yes, I think so. Didn't know it also affects other distros though. For me it was just a system breaking update, which generally is very bad IMO :)
> 
> > but i guess you have your policies :)
> > thanks anyway for your work, I appreciate the workaround and hopefully the
> > fix later on :)
> > 
> > setpci -s 18.3 0x188.l
> > 00400010
> 
> Oh, so your box is also affected - I didn't catch that.
> 
> So did that box upgrade microcode successfully at all, at any point
> in time? Do you have old logs from it perhaps? Older distros, older
> openSUSE, whatever...
I'm still working on opensuse 13.1, there were no issues. I don't know what exactly gets updated with the microcode, but after a clean install 13.2 worked, after the update it crashed. Installing it via netinst it crashed immediately. No logs, as I wouldn't know which ones you need. I can post whatever you need if you give me the commands.

> 
> > rdmsr 0x8b
> > 10000af
> > 
> > cpuinfo -> see attachment
> 
> Thanks for reporting this!
Comment 98 Adam Liebermann 2015-05-12 14:23:55 UTC
this is from my opensuse 13.1 installation.

journalctl -a | grep -i microcode
May 12 16:20:24 linux-u6y4 kernel: microcode: CPU0: patch_level=0x010000af
May 12 16:20:24 linux-u6y4 kernel: microcode: CPU1: patch_level=0x010000af
May 12 16:20:24 linux-u6y4 kernel: microcode: CPU2: patch_level=0x010000af
May 12 16:20:24 linux-u6y4 kernel: microcode: CPU3: patch_level=0x010000af
May 12 16:20:24 linux-u6y4 kernel: microcode: Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
May 12 16:20:24 linux-u6y4 kernel: microcode: CPU0: new patch_level=0x010000c8
May 12 16:20:24 linux-u6y4 kernel: microcode: CPU1: new patch_level=0x010000c8
May 12 16:20:24 linux-u6y4 kernel: microcode: CPU2: new patch_level=0x010000c8
May 12 16:20:24 linux-u6y4 kernel: microcode: CPU3: new patch_level=0x010000c8
Comment 99 Adam Liebermann 2015-05-12 14:33:06 UTC
This is my machine with openSUSE 13.2. Those two ucode-intel/amd updates are tabooed. I had to boot with dis_ucode_ldr, uninstall those updates and regenerate the initrd.

hope it helps..

journalctl -a | grep -i microcode
May 04 12:08:23 linux-z71q kernel: microcode: CPU0: patch_level=0x010000af
May 04 12:08:23 linux-z71q kernel: microcode: CPU1: patch_level=0x010000af
May 04 12:08:23 linux-z71q kernel: microcode: CPU2: patch_level=0x010000af
May 04 12:08:23 linux-z71q kernel: microcode: CPU3: patch_level=0x010000af
May 04 12:08:23 linux-z71q kernel: microcode: Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
May 05 18:27:56 linux-z71q kernel: microcode: CPU0: patch_level=0x010000af
May 05 18:27:56 linux-z71q kernel: microcode: CPU1: patch_level=0x010000af
May 05 18:27:56 linux-z71q kernel: microcode: CPU2: patch_level=0x010000af
May 05 18:27:56 linux-z71q kernel: microcode: CPU3: patch_level=0x010000af
May 05 18:27:56 linux-z71q kernel: microcode: Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
May 11 16:21:43 linux-z71q kernel: microcode: CPU0: patch_level=0x010000af
May 11 16:21:43 linux-z71q kernel: microcode: CPU1: patch_level=0x010000af
May 11 16:21:43 linux-z71q kernel: microcode: CPU2: patch_level=0x010000af
May 11 16:21:43 linux-z71q kernel: microcode: CPU3: patch_level=0x010000af
May 11 16:21:43 linux-z71q kernel: microcode: Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
May 11 18:40:10 linux-z71q kernel: microcode: CPU0: patch_level=0x010000af
May 11 18:40:10 linux-z71q kernel: microcode: CPU1: patch_level=0x010000af
May 11 18:40:10 linux-z71q kernel: microcode: CPU2: patch_level=0x010000af
May 11 18:40:10 linux-z71q kernel: microcode: CPU3: patch_level=0x010000af
May 11 18:40:10 linux-z71q kernel: microcode: Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
May 11 19:25:31 linux-z71q kernel: microcode: CPU0: patch_level=0x010000af
May 11 19:25:31 linux-z71q kernel: microcode: CPU1: patch_level=0x010000af
May 11 19:25:31 linux-z71q kernel: microcode: CPU2: patch_level=0x010000af
May 11 19:25:31 linux-z71q kernel: microcode: CPU3: patch_level=0x010000af
May 11 19:25:31 linux-z71q kernel: microcode: Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
May 11 19:36:39 linux-z71q kernel: microcode: CPU0: patch_level=0x010000af
May 11 19:36:39 linux-z71q kernel: microcode: CPU1: patch_level=0x010000af
May 11 19:36:39 linux-z71q kernel: microcode: CPU2: patch_level=0x010000af
May 11 19:36:39 linux-z71q kernel: microcode: CPU3: patch_level=0x010000af
May 11 19:36:39 linux-z71q kernel: microcode: Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
May 12 13:15:22 linux-z71q kernel: microcode: CPU0: patch_level=0x010000af
May 12 13:15:22 linux-z71q kernel: microcode: CPU1: patch_level=0x010000af
May 12 13:15:22 linux-z71q kernel: microcode: CPU2: patch_level=0x010000af
May 12 13:15:22 linux-z71q kernel: microcode: CPU3: patch_level=0x010000af
May 12 13:15:22 linux-z71q kernel: microcode: Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
May 12 13:46:38 linux-z71q kernel: microcode: CPU0: patch_level=0x010000af
May 12 13:46:38 linux-z71q kernel: microcode: CPU1: patch_level=0x010000af
May 12 13:46:38 linux-z71q kernel: microcode: CPU2: patch_level=0x010000af
May 12 13:46:38 linux-z71q kernel: microcode: CPU3: patch_level=0x010000af
May 12 13:46:38 linux-z71q kernel: microcode: Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
May 12 16:25:25 linux-z71q kernel: microcode: CPU0: patch_level=0x010000af
May 12 16:25:25 linux-z71q kernel: microcode: CPU1: patch_level=0x010000af
May 12 16:25:25 linux-z71q kernel: microcode: CPU2: patch_level=0x010000af
May 12 16:25:25 linux-z71q kernel: microcode: CPU3: patch_level=0x010000af
May 12 16:25:25 linux-z71q kernel: microcode: Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
Comment 100 Borislav Petkov 2015-05-12 16:48:35 UTC
(In reply to Adam Liebermann from comment #97)
> Well, i didn't find this workaround for a few months and therefore
> thought that it's just suse not working again. And it's just that
> I tried installing a few times again and again and there was no
> resolution. Now I see that the problem is known and the workaround is
> not distributed automatically.. Just a bit of frustration. Again, I
> like the product and I value your work and that's also the reason I am
> here :)

Thanks, that's always nice to hear. :-)

Yeah, so this is a one-time, exceptional thing as *everything* else
works, see comment #68 and the ones before it. We tested it on
everything we could get our hands on and it was fine. I've been running
those changes for a while with the upstream kernel to make sure they're
good to go before backporting them to oS.

But, as it happens according to Murphy, only those boxes are funny. And
I'm working on a solution, it just takes time.

But I certainly understand you - a system shouldn't break from an update
but it seems we've poked the beast too deep this time :-)

> Yes, I think so. Didn't know it also affects other distros though. For

Yeah, paint is still wet about the whole issue but once I have concrete
info, I'll put it here.

> I'm still working on opensuse 13.1, there were no issues. I don't
> know what exactly gets updated with the microcode, but after a clean
> install 13.2 worked, after the update it crashed. Installing it via
> netinst it crashed immediately.

Right, so early microcode update wasn't working on AMD at all. I've
fixed it upstream and we pulled it into openSUSE because it is a Good
Thing(tm) to have. And this issue was unexpected.

> No logs, as I wouldn't know which ones you need. I can post whatever
> you need if you give me the commands.

No need, the other two messages you pasted are enough. It basically
shows that 13.1 did late microcode update (which has always worked fine
but was happening too late) and 13.2 didn't update at all.

The version 0x010000af is the one which your system gets from the BIOS
and 0x010000c8 is the latest one which AMD distributes.

Thanks for now and stay tuned.
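
The distinction between the two logs can be checked mechanically. The following is only a sketch in Python, assuming the message format shown in comments #98 and #99 ("patch_level=" at boot, "new patch_level=" after an update); the function names are hypothetical.

```python
import re
from collections import defaultdict

# Matches lines such as "... kernel: microcode: CPU0: patch_level=0x010000af"
# and "... kernel: microcode: CPU0: new patch_level=0x010000c8".
_LINE = re.compile(r"microcode: CPU(\d+): (new )?patch_level=(0x[0-9a-fA-F]+)")


def summarize(lines):
    """Map CPU number -> {"boot": levels seen at boot, "new": levels logged after an update}."""
    seen = defaultdict(lambda: {"boot": set(), "new": set()})
    for line in lines:
        m = _LINE.search(line)
        if m:
            cpu, is_new, level = int(m.group(1)), m.group(2), int(m.group(3), 16)
            seen[cpu]["new" if is_new else "boot"].add(level)
    return dict(seen)


def was_updated(lines):
    """True if any CPU logged a 'new patch_level' line."""
    return any(entry["new"] for entry in summarize(lines).values())
```

Fed the output of "journalctl -a | grep -i microcode", was_updated() would be true for the 13.1 log and false for the 13.2 log above.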
Comment 101 Peter Kirchgeßner 2015-05-12 16:51:37 UTC
Created attachment 633995 [details]
cpuinfo_pk.txt where setpci-command gives 00400010

# setpci -s 18.3 0x188.l
00400010
# cat /proc/cpuinfo > /tmp/cpuinfo_pk.txt
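
The two setpci readings posted so far (00400000 and 00400010) can be compared bit by bit. A small Python sketch, with a made-up helper name; what the differing bit means is not stated anywhere in this thread, so the code only locates it.

```python
def differing_bits(a_hex, b_hex):
    """Return the bit positions (0 = least significant) at which two hex register dumps differ."""
    diff = int(a_hex, 16) ^ int(b_hex, 16)
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]


print(differing_bits("00400000", "00400010"))  # prints [4]
```

Note that comments #88 and #92 report 00400000 from machines that are also on the affected list, so this bit alone does not separate failing boxes from working ones.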
Comment 102 Peter Kirchgeßner 2015-05-12 18:35:51 UTC
(In reply to James Moe from comment #89)
...
> Disabling the modules ucode-intel and ucode-amd in the Software Manager was
> not sufficient to prevent a catatonic system. It is a necessary requirement
> to include "dis_ucode_ldr" in the kernel parameters at boot time (GRUB2).

I assume that uninstalling ucode-amd does not trigger an mkinitrd run.
When I install ucode-amd using YaST, it takes quite a while because mkinitrd is run for the installed kernels. Uninstalling ucode-amd, by contrast, finishes within a second, and the system still hangs after reboot when dis_ucode_ldr is not used. I have to run mkinitrd manually to be able to boot without dis_ucode_ldr.
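
One way to see whether a stale initrd still carries the microcode is to look for the early-loader file name inside it. This is a heuristic Python sketch, assuming the microcode sits in an uncompressed cpio archive at the start of the initrd under kernel/x86/microcode/AuthenticAMD.bin, so the path is visible as plain bytes. The function names and the /boot/initrd path in the note below are assumptions.

```python
EARLY_UCODE_PATH = b"kernel/x86/microcode/AuthenticAMD.bin"


def chunks_mention(chunks, needle=EARLY_UCODE_PATH):
    """Scan an iterable of byte chunks for `needle`, including matches that span two chunks."""
    overlap = len(needle) - 1
    tail = b""
    for chunk in chunks:
        window = tail + chunk
        if needle in window:
            return True
        tail = window[-overlap:]  # keep enough bytes to catch a split match
    return False


def initrd_mentions_amd_ucode(initrd_path, chunk_size=1 << 20):
    """Heuristically report whether the initrd file contains the early AMD microcode path."""
    with open(initrd_path, "rb") as fh:
        return chunks_mention(iter(lambda: fh.read(chunk_size), b""))
```

If initrd_mentions_amd_ucode("/boot/initrd") is still true after removing ucode-amd, the initrd has most likely not been regenerated.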
Comment 103 James Moe 2015-05-12 20:23:51 UTC
(In reply to Borislav Petkov from comment #93)
> 
> Can you also read out the microcode version?
> 
$ modprobe msr

My workstation:
$ rdmsr 0x8b
10000db

The main server:
$ rdmsr 0x8b
0
Comment 104 Borislav Petkov 2015-05-12 20:35:57 UTC
(In reply to James Moe from comment #103)
> (In reply to Borislav Petkov from comment #93)
> > 
> > Can you also read out the microcode version?
> > 
> $ modprobe msr

Thanks, ...

> My workstation:
> $ rdmsr 0x8b
> 10000db

I'm guessing that's the X4 640, right? Because I was missing the
microcode version from /proc/cpuinfo in comment #91...

> The main server:
> $ rdmsr 0x8b
> 0

This looks strange. So this system doesn't have *any* microcode applied,
if the 0 there is correct. How old is that BIOS, have you tried
upgrading it at some point?

Thanks.
Comment 105 Peter Kirchgeßner 2015-05-14 07:12:02 UTC
(In reply to Borislav Petkov from comment #100)
...
> 
> But, as it happens according to Murphy, only those boxes are funny. And
> I'm working on a solution, it just takes time.
> 
...

Thank you for that. I am sure those CPUs which make problems will stay in use quite a while. The performance of my machine is ok. I don't plan to update it within the next years. And I know some colleagues, who have set up some older machines, which they do no longer use, so that their grand-parents can use them with the internet or some office applications. And these machines are running Linux. So CPUs which are no longer sold are not automatically out of use.
Comment 106 Borislav Petkov 2015-06-04 09:14:45 UTC
Just a quick update: I'm working on a solution with AMD so stay tuned. Once I have the fixes, I'll most likely have test kernels for you guys to try.

Thanks for the patience.

(kill NEEDINFO while at it).
Comment 107 Borislav Petkov 2015-06-13 17:31:04 UTC
Ok guys,

test kernels are at http://beta.suse.com/private/bpetkov/. Both x86_64 and i686 flavors.

Please give them a try.

Thanks.
Comment 108 Rafael M Redondo 2015-06-14 20:56:43 UTC
Exactly the same problem. In three months I installed openSUSE 13.2 x64 at least five times, because this problem seemed unsolvable. Today, thanks to the openSUSE forums, I will not have to reinstall again, after putting "dis_ucode_ldr" on the GRUB kernel line.

# inxi -F -%
System:    Host: htpc Kernel: 3.16.7-21-desktop x86_64 (64 bit) 
           Console tty 0 Distro: openSUSE 13.2 (x86_64) VERSION = 13.2 CODENAME = Harlequin # /etc/SuSE-release is deprecated and will be removed in the future, use /etc/os-release instead
Machine:   Mobo: ASRock model: M3A785GM-LE/128M Bios: American Megatrends version: P1.10 date: 11/26/2009
CPU:       Dual core AMD Athlon II X2 4400e (-MCP-) cache: 2048 KB flags: (lm nx sse sse2 sse3 sse4a svm) 
           Clock Speeds: 1: 1500.00 MHz 2: 2700.00 MHz
Graphics:  Card: NVIDIA GT218 [GeForce 210] 
           X.org: 1.16.1 drivers: nvidia (unloaded: fbdev,nv,vesa,nouveau) tty size: 140x43 Advanced Data: N/A for root out of X
Audio:     Card-1: NVIDIA High Definition Audio Controller driver: snd_hda_intel Sound: ALSA ver: k3.16.7-21-desktop
           Card-2: Advanced Micro Devices [AMD/ATI] SBx00 Azalia (Intel HDA) driver: snd_hda_intel
Network:   Card: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller driver: r8169 
           IF: enp2s0 state: up speed: 100 Mbps duplex: full mac: 00:25:22:34:ae:71
Drives:    HDD Total Size: 1000.2GB (104.5% used) 1: /dev/sda WDC_WD10EARS 1000.2GB 
Partition: ID: / size: 15G used: 6.3G (48%) fs: btrfs ID: /tmp size: 15G used: 6.3G (48%) fs: btrfs 
           ID: /home size: 22G used: 18G (86%) fs: ext4 ID: swap-1 size: 4.29GB used: 0.00GB (0%) fs: swap 
Sensors:   Error: You do not have the sensors app installed.
Info:      Processes: 188 Uptime: 0:38 Memory: 327.0/2004.9MB Runlevel: 5 Client: Shell inxi: 1.7.24
Comment 109 Rafael M Redondo 2015-06-14 21:05:18 UTC
This info could be useful too (better than inxi output):
#cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 16
model           : 6
model name      : AMD Athlon(tm) II X2 4400e Processor
stepping        : 2
microcode       : 0x1000098
cpu MHz         : 800.000
cache size      : 1024 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 2
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt hw_pstate npt lbrv svm_lock nrip_save
bogomips        : 5422.73
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate

processor       : 1
vendor_id       : AuthenticAMD
cpu family      : 16
model           : 6
model name      : AMD Athlon(tm) II X2 4400e Processor
stepping        : 2
microcode       : 0x1000098
cpu MHz         : 1500.000
cache size      : 1024 KB
physical id     : 0
siblings        : 2
core id         : 1
cpu cores       : 2
apicid          : 1
initial apicid  : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt hw_pstate npt lbrv svm_lock nrip_save
bogomips        : 5422.73
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate
Comment 110 Borislav Petkov 2015-06-14 21:41:35 UTC
Try the kernels here please: http://beta.suse.com/private/bpetkov/
Comment 111 Peter Kirchgeßner 2015-06-16 18:34:59 UTC
I installed ucode-amd using Yast.
I installed kernel-desktop-3.16.7-125.1.gf382e20.x86_64.rpm using rpm -i.
I have no dis_ucode_ldr.
The kernel is working. But to me it seems that the AMD microcode update has not been applied.
/proc/cpuinfo says 

processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 16
model           : 6
model name      : AMD Athlon(tm) II X2 240e Processor
stepping        : 2
microcode       : 0x100009f

mkinitrd says 

*** Constructing AuthenticAMD.bin ****


dmesg | grep -i microcode  says

[    1.307748] microcode: CPU0: patch_level=0x0100009f
[    1.307845] microcode: CPU1: patch_level=0x0100009f
[    1.308013] microcode: Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba

Even after an "echo 1 > /sys/devices/system/cpu/microcode/reload"
Comment 112 Borislav Petkov 2015-06-16 18:52:45 UTC
(In reply to Peter Kirchgeßner from comment #111)
> The kernel is working. But to me it seems that the AMD-microcode
> update has not been applied.

That is correct. The microcode patch level installed by your BIOS
(0x0100009f) is final in the sense that the microcode loader in Linux
should not attempt to install a newer one and thus overwrite it.

This is what this test kernel contains - a quirk list of microcode
versions which should not be overwritten.

So thanks for testing, I'll add your Tested-by tag to the patch.
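
The idea of such a quirk list can be sketched in a few lines of Python. This is not the kernel code. Only 0x0100009f is confirmed as final in this comment. The other two entries are simply patch levels that failing machines reported in this thread, and treating them as final is an assumption, as is the "newer wins" rule for everything else.

```python
# 0x0100009F is confirmed above; the other two levels were reported by failing
# machines in this thread and are listed here only as an illustrative assumption.
FINAL_LEVELS = frozenset({0x01000098, 0x0100009F, 0x010000AF})


def should_apply(current_level, candidate_level):
    """Decide whether a loader following this policy would write `candidate_level`."""
    if current_level in FINAL_LEVELS:
        return False  # the BIOS-installed level is final, leave it alone
    return candidate_level > current_level
```

With this policy, should_apply(0x0100009F, candidate) is false whatever the candidate is, which matches the unchanged patch_level in comment #111.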
Comment 113 Borislav Petkov 2015-07-30 08:46:25 UTC
Fixes queued for the next maint. update. Closing.
Comment 114 Ashfaqur Rahman 2015-08-20 21:02:10 UTC
Hi,

I am only here because the guys at the forum said I should report this. I have never reported any bug before.

https://forums.opensuse.org/showthread.php/509261-VirtualBox-guest-OS-boot-failure-(after-update)

I could not finish reading all the comments above (only up to 25/30), sorry.

Here is my situation

Processor - Athlon X2 250
Motherboard - MSI 880GMA-E335(FX) (BIOS updated)
RAM - 8 GB

Host - openSUSE 13.2_x64
Guest - openSUSE 13.2_x86

All updates have been applied to both Host & Guest. Now Guest Hangs at

----
Loading initial ramdisk...
----

as mentioned in comment 13.

First found this problem about (probably) 6/7 days ago after pulling all updates to both Host & Guest.

Also using "dis_ucode_ldr" as mentioned in comment 20 fixes the issue. 

So far have narrowed the problem to the following pkgs being upgraded.

patterns-openSUSE-enhanced_base 
patterns-openSUSE-enhanced_base_opt 
ucode-amd 
ucode-intel 

Actually the last 2 pkgs are dependencies of the first 2 pkgs (I think).

Thanks
Emon
Comment 115 Borislav Petkov 2015-08-21 04:41:58 UTC
By host you mean virtual box?

Can you try the latest kernel of the day here:

http://kernel.opensuse.org/packages/openSUSE-13.2

Thanks.
Comment 116 Ashfaqur Rahman 2015-08-21 06:41:36 UTC
(In reply to Borislav Petkov from comment #115)
> By host you mean virtual box?
> 

By host I meant the 'Host Operating System' or the 'Physical/Real machine'

By Guest I meant the 'Virtual Operating System' or the 'Virtual Machine'


> Can you try the latest kernel of the day here:
> 
> http://kernel.opensuse.org/packages/openSUSE-13.2
> 

I will try, but where do I install the kernel?

In the Host OS or the Guest OS? or both?

Thanks
Comment 117 Borislav Petkov 2015-08-21 12:40:05 UTC
(In reply to Ashfaqur Rahman from comment #116)
> By host I meant the 'Host Operating System' or the 'Physical/Real machine'
> 
> By Guest I meant the 'Virtual Operating System' or the 'Virtual Machine'

I know what those mean. I wanted to know what kind of virtualization
you're using: kvm, virtual box, xen...?

> > Can you try the latest kernel of the day here:
> > 
> > http://kernel.opensuse.org/packages/openSUSE-13.2
> > 
> 
> I will try, but where do I install the kernel?

You replace the kernel which fails booting. You said "Guest Hangs" and
that dis_ucode_ldr fixes the issue so it seems you should install this
in the guest...

HTH.
Comment 118 Ashfaqur Rahman 2015-08-21 17:07:26 UTC
(In reply to Borislav Petkov from comment #117)

> I know what those mean. I wanted to know what kind of virtualization
> you're using: kvm, virtual box, xen...?
 

Oh, sorry about that, my bad. I am using VirtualBox (Version 4.3.30)


> You replace the kernel which fails booting. You said "Guest Hangs" and
> that dis_ucode_ldr fixes the issue so it seems you should install this
> in the guest...
 

Thanks for explaining that.

Added the repo for the kernel in YaST

Updated everything to the latest version

Now the kernel is Kernel-Desktop(3.16.7-97.gec183cc)  

Removed "dis_ucode_ldr"

Guest hangs! Just like before :(
Comment 119 Borislav Petkov 2015-08-22 09:22:44 UTC
(In reply to Ashfaqur Rahman from comment #118)
> Guest hangs! Just like before :(

Hmm, I had already tried virtual box and it did work with the fixed
kernel...

Oh well, I can try to reproduce it but it'll take a while since I'm on
vacation currently.
Comment 120 Ashfaqur Rahman 2015-08-22 15:35:15 UTC
(In reply to Borislav Petkov from comment #119)

> Oh well, I can try to reproduce it but it'll take a while since I'm on
> vacation currently.

OK, thanks.

Enjoy your time, best of luck.

Will be looking forward to your feedback.

BTW: I don't mean to pry on your personal schedule, but any approximate idea/date when you might be able to look into this?
Comment 121 Ashfaqur Rahman 2015-08-22 19:35:48 UTC
Created attachment 644667 [details]
pkg log (Guest)
Comment 122 Ashfaqur Rahman 2015-08-22 19:37:28 UTC
Created attachment 644668 [details]
pkg log (Host)
Comment 123 Ashfaqur Rahman 2015-08-22 19:48:46 UTC
Just updated both (Host+Guest) machines

Everything seems to be working fine now!!

Here is the procedure I followed...

First I Updated Host

Rebooted

Then I restored a snapshot of the Guest which was taken just before applying the kernel updates mentioned in comment 118. This snapshot did not have the following pkgs updated...

patterns-openSUSE-enhanced_base 
patterns-openSUSE-enhanced_base_opt 
ucode-amd 
ucode-intel
also no 'dis_ucode_ldr' line

Ran 'zypper update' on Guest, everything seems to be working fine.

I immediately took snippets from /var/log/zypp/history for both machines to see what got changed, and frankly I am a bit confused.

Hope you guys will be able to make better sense of it.

I accidentally attached those logs before this comment; sorry still new to bugzilla.
Comment 124 Ashfaqur Rahman 2015-08-22 20:08:01 UTC
(In reply to Ashfaqur Rahman from comment #123)


This is actually a correction 

 
> First I Updated Host
> 
> Rebooted
> 
> Then I restored a snapshot of the Guest

If you check the logs carefully you will notice that I had actually updated the Guest first.!!

Sorry, was caught up in the excitement of things.

Thanks
Comment 125 Borislav Petkov 2015-08-23 05:39:11 UTC
(In reply to Ashfaqur Rahman from comment #124)
> If you check the logs carefully you will notice that I had actually
> updated the Guest first.!!

So I guess something wasn't regenerated properly with the new kernel
during your first try. initrd needed recreation, probably (I'm only just
guessing...).
Comment 126 Ashfaqur Rahman 2015-08-23 06:37:56 UTC
(In reply to Borislav Petkov from comment #125)
 
> So I guess something wasn't regenerated properly with the new kernel
> during your first try. initrd needed recreation, probably (I'm only just
> guessing...).


I am not sure what you meant by the words "new kernel"...

If you meant "kernel of the day" in the Guest, which I had upgraded; then that kernel is not involved here cos I started the Guest machine from a snapshot, which was taken before upgrading to that kernel (or adding that repo).

Also... why is the initrd being recreated *on the Host* ???

Did the Host kernel change?? It seems only the pkg "virtualbox-guest-kmp-desktop" when upgraded (in the Host), initiated the initrd being recreated!! Is that supposed to happen??

"virtualbox-guest-kmp-desktop" is also the common pkg that got upgraded in both the Host & Guest; and it seems to have fixed all the problems (..just guessing..)

Thanks
Comment 127 Ashfaqur Rahman 2015-08-29 16:20:56 UTC
This is just an update...

I did a clean install of the guest (formatted all partitions to EXT 4).

I am back to square one now

I have learned my lesson and finally decided to 'taboo' the two CULPRIT pkgs FOREVER......., or at least until someone can fix this.

ucode-amd
ucode-intel

These pkgs are not there during the initial install; they are there only as a result of performing 'zypper up'
Comment 128 Peter Kirchgeßner 2015-11-05 20:44:11 UTC
FYI:
I just installed openSUSE Leap 42.1 and the system froze on the first boot.
So I added dis_ucode_ldr again and it worked.
I will keep that boot option for the future.
Comment 129 Borislav Petkov 2015-11-05 21:05:49 UTC
(In reply to Peter Kirchgeßner from comment #128)
> FYI:
> I just installed openSUSE Leap 42.1 and the system froze on the first boot.
> So I added dis_ucode_ldr again and it worked.
> I will keep that boot option for the future.

Most likely because Leap doesn't have the fixes yet. I'll push them there too, stay tuned.
Comment 130 Borislav Petkov 2015-11-09 12:53:01 UTC
(In reply to Peter Kirchgeßner from comment #128)
> FYI:
> I just installed openSUSE Leap 42.1 and the system froze on the first boot.
> So I added dis_ucode_ldr again and it worked.
> I will keep that boot option for the future.

Ok, try the kernel here, I've backported the required fixes over:

http://download.opensuse.org/repositories/home:/bpetkov:/leap-ucode/standard/x86_64/

Thanks.
Comment 131 Peter Kirchgeßner 2015-11-28 14:43:59 UTC
(In reply to Borislav Petkov from comment #130)
> (In reply to Peter Kirchgeßner from comment #128)
> > FYI:
> > I just installed openSUSE Leap 42.1 and the system froze on the first boot.
> > So I added dis_ucode_ldr again and it worked.
> > I will keep that boot option for the future.
> 
> Ok, try the kernel here, I've backported the required fixes over:
> 
> http://download.opensuse.org/repositories/home:/bpetkov:/leap-ucode/standard/
> x86_64/
> 
> Thanks.

Now I found the time to test it.
I used kernel-default-4.1.12-1.1.gab3d45d.x86_64.rpm.
Did the same as described in comment 111 (https://bugzilla.suse.com/show_bug.cgi?id=913996#c111). (ucode-amd installed and no dis_ucode_ldr).
Got the same results (kernel is working).

Thank you!
Comment 132 Borislav Petkov 2015-12-02 16:44:33 UTC
Ok, thanks for letting me know and testing. Fixes will be in the next Leap kernel maintenance release.
Comment 133 Swamp Workflow Management 2016-01-29 13:12:24 UTC
openSUSE-SU-2016:0280-1: An update that solves 10 vulnerabilities and has 18 fixes is now available.

Category: security (important)
Bug References: 865096,865259,913996,950178,950998,952621,954324,954532,954647,955422,956708,957152,957988,957990,958439,958463,958504,958510,958886,958951,959190,959399,960021,960710,961263,961509,962075,962597
CVE References: CVE-2015-7550,CVE-2015-8539,CVE-2015-8543,CVE-2015-8550,CVE-2015-8551,CVE-2015-8552,CVE-2015-8569,CVE-2015-8575,CVE-2015-8767,CVE-2016-0728
Sources used:
openSUSE Leap 42.1 (src):    kernel-debug-4.1.15-8.1, kernel-default-4.1.15-8.1, kernel-docs-4.1.15-8.3, kernel-ec2-4.1.15-8.1, kernel-obs-build-4.1.15-8.2, kernel-obs-qa-4.1.15-8.1, kernel-obs-qa-xen-4.1.15-8.1, kernel-pae-4.1.15-8.1, kernel-pv-4.1.15-8.1, kernel-source-4.1.15-8.1, kernel-syms-4.1.15-8.1, kernel-vanilla-4.1.15-8.1, kernel-xen-4.1.15-8.1