Bug 1127072

Summary: [amdgpu] AMD Radeon VII (Vega20) Support
Product: [openSUSE] openSUSE Tumbleweed Reporter: Filip Vaverka <dxxf>
Component: Kernel Assignee: E-mail List <kernel-maintainers>
Status: RESOLVED FIXED QA Contact: E-mail List <qa-bugs>
Severity: Normal    
Priority: P5 - None CC: dxxf, tiwai
Version: Current   
Target Milestone: ---   
Hardware: Other   
OS: Other   
Whiteboard:
Found By: --- Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---

Description Filip Vaverka 2019-02-26 19:49:47 UTC
As far as I can tell, Vega20 (I'm not sure whether this applies to the AMD Radeon VII) is supposed to work on Linux 4.20+.
Yet on my system (openSUSE Tumbleweed with either kernel 4.20.10 or 5.0.rc8), amdgpu fails to initialize the GPU with the following errors:

Feb 26 20:11:48 XXX kernel: amdgpu 0000:10:00.0: Fatal error during GPU init
Feb 26 20:11:48 XXX kernel: amdgpu 0000:10:00.0: amdgpu_device_ip_init failed
Feb 26 20:11:48 XXX kernel: [drm:amdgpu_device_init.cold.33 [amdgpu]] *ERROR* sw_init of IP block <psp> failed -2
Feb 26 20:11:48 XXX kernel: [drm:psp_sw_init [amdgpu]] *ERROR* Failed to load psp firmware!
Feb 26 20:11:48 XXX kernel: amdgpu 0000:10:00.0: psp v11.0: Failed to load firmware "amdgpu/vega20_sos.bin"
Feb 26 20:11:48 XXX kernel: amdgpu 0000:10:00.0: Direct firmware load for amdgpu/vega20_sos.bin failed with error -2

and

Feb 26 20:11:48 XXX kernel: [drm:sdma_v4_0_early_init [amdgpu]] *ERROR* Failed to load sdma firmware!
Feb 26 20:11:48 XXX kernel: [drm:sdma_v4_0_early_init [amdgpu]] *ERROR* sdma_v4_0: Failed to load firmware "amdgpu/vega20_sdma.bin"
Feb 26 20:11:48 XXX kernel: amdgpu 0000:10:00.0: Direct firmware load for amdgpu/vega20_sdma.bin failed with error -2
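
For reference, the firmware names in messages like these can be extracted and checked against the installed files with a short script. This is only a sketch: the function names are made up for illustration, and it assumes the firmware is stored uncompressed under "/lib/firmware". The error code -2 corresponds to ENOENT (no such file), which is why a check on the running system alone is not conclusive: the file may exist on disk and still be absent from the initrd.

```python
import errno
import re
from pathlib import Path

# Kernel message printed when a requested firmware file cannot be loaded.
FW_FAIL = re.compile(r"Direct firmware load for (\S+) failed with error (-?\d+)")


def missing_firmware(log_lines):
    """Return (firmware name, error code) pairs in first-seen order, without duplicates."""
    seen = {}
    for line in log_lines:
        match = FW_FAIL.search(line)
        if match:
            seen.setdefault(match.group(1), int(match.group(2)))
    return list(seen.items())


def describe(code):
    """Translate a negative kernel error code into its symbolic errno name."""
    return errno.errorcode.get(abs(code), "unknown")


def present_on_disk(names, fw_root="/lib/firmware"):
    """Map each firmware name to True if the file exists under fw_root."""
    root = Path(fw_root)
    return {name: (root / name).is_file() for name in names}


if __name__ == "__main__":
    sample = [
        "amdgpu 0000:10:00.0: Direct firmware load for amdgpu/vega20_sos.bin failed with error -2",
        "amdgpu 0000:10:00.0: Direct firmware load for amdgpu/vega20_sdma.bin failed with error -2",
    ]
    failures = missing_firmware(sample)
    on_disk = present_on_disk([name for name, _ in failures])
    for name, code in failures:
        print(name, describe(code), "on disk" if on_disk[name] else "not on disk")
```

If a file is reported as present on disk while the kernel still fails with -2 during early boot, the initrd contents are the next thing to check.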
Comment 1 Takashi Iwai 2019-02-27 16:48:42 UTC
Just to be sure: did you install the kernel-firmware package?  Try the latest one in OBS Kernel:HEAD repo.
Comment 2 Filip Vaverka 2019-02-27 18:38:49 UTC
Although I reinstalled the kernel-firmware package multiple times, it turns out I have an issue with my "mkinitrd" and/or "dracut" configuration.
For some reason, neither the "postinstall" script of the kernel-firmware package nor a direct call to "mkinitrd" runs "dracut" in a way that makes it search "/lib/firmware", so the installed firmware files are not found.
Running "dracut" manually with "--fwdir=/lib/firmware" seems to fix the issue. After that, the only missing firmware is "amdgpu/vega20_ta.bin", which doesn't seem to be a problem.
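
The manual workaround can be written down as a small helper, shown here only as a sketch. The function name is hypothetical, and adding "--force" is an assumption on my side (dracut does not overwrite an existing initramfs image without it). The helper only builds and prints the command; actually running it requires root.

```python
import shlex


def dracut_rebuild_command(fw_dir="/lib/firmware", force=True):
    """Build the dracut call that regenerates the initrd with an explicit firmware directory."""
    cmd = ["dracut"]
    if force:
        # Assumption: we want to overwrite the initrd of the running kernel.
        cmd.append("--force")
    cmd.append("--fwdir=" + fw_dir)
    return cmd


if __name__ == "__main__":
    # Dry run: print the command instead of executing it.
    print(shlex.join(dracut_rebuild_command()))
```

To execute it for real, the returned list could be passed to subprocess.run(cmd, check=True) from a root shell.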
Comment 3 Takashi Iwai 2019-02-27 18:59:21 UTC
Hrm, then it sounds like a problem of dracut.

Or wait...  Did you install amdgpu-pro stuff?  It broke dracut once due to its incorrect dracut config snippet.  It would explain the failure.
Comment 4 Filip Vaverka 2019-02-27 19:07:55 UTC
Thanks, that was probably it! I found an "amdgpu-pro-*" file in "/etc/dracut.conf.d"; if it overrode the default firmware search path used for the "amdgpu" module, then that was the issue.
Either way, running "mkinitrd" no longer complains about missing firmware files.
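
For anyone hitting the same thing, a quick way to look for such a snippet is to scan the dracut configuration directory for files that assign "fw_dir", the dracut.conf setting for firmware search directories. The sketch below makes several assumptions: the function name is made up, only "*.conf" files are examined, and the contents of the amdgpu-pro snippet are not known from this report, so a hit only means the file deserves a closer look.

```python
import re
from pathlib import Path

# Matches both "fw_dir=..." (replace) and "fw_dir+=..." (append) assignments.
FW_DIR_ASSIGN = re.compile(r"^\s*fw_dir\s*\+?=")


def find_fw_dir_overrides(conf_dir="/etc/dracut.conf.d"):
    """Return (file name, line number, line) for every fw_dir assignment found."""
    hits = []
    for conf in sorted(Path(conf_dir).glob("*.conf")):
        text = conf.read_text(errors="replace")
        for lineno, line in enumerate(text.splitlines(), 1):
            if FW_DIR_ASSIGN.match(line):
                hits.append((conf.name, lineno, line.strip()))
    return hits


if __name__ == "__main__":
    for name, lineno, line in find_fw_dir_overrides():
        print(f"{name}:{lineno}: {line}")
```

A plain "fw_dir=" assignment is the more suspicious case, since it replaces the search path rather than appending to it.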
Comment 5 Takashi Iwai 2019-03-18 15:06:34 UTC
OK, let's close the bug, then.