Bugzilla – Bug 1185513
ACPI BIOS Error after upgrade to 5.12.0-1-default
Last modified: 2021-06-17 14:36:16 UTC
I recently updated to the 5.12.0-1-default kernel with openSUSE Tumbleweed and now I see the following messages during boot. These were not there before. The system seems to be fully functioning. There are updates to the BIOS I could try (however I'm afraid to as I've been burned updating the BIOS before) however I thought I'd start here first since it was the Linux kernel change (or whatever else came in the update this morning) that prompted it. Are there any thoughts on this? Thank you for your help with this.

{{{
> dmesg | grep AE_NOT_FOUND
[ 1.149107] ACPI BIOS Error (bug): Could not resolve symbol [\_PR.PR00._CPC], AE_NOT_FOUND (20210105/psargs-330)
[ 1.149136] ACPI Error: Aborting method \_PR.PR01._CPC due to previous error (AE_NOT_FOUND) (20210105/psparse-529)
[ 1.149187] ACPI BIOS Error (bug): Could not resolve symbol [\_PR.PR00._CPC], AE_NOT_FOUND (20210105/psargs-330)
[ 1.149212] ACPI Error: Aborting method \_PR.PR02._CPC due to previous error (AE_NOT_FOUND) (20210105/psparse-529)
[ 1.149262] ACPI BIOS Error (bug): Could not resolve symbol [\_PR.PR00._CPC], AE_NOT_FOUND (20210105/psargs-330)
[ 1.149287] ACPI Error: Aborting method \_PR.PR03._CPC due to previous error (AE_NOT_FOUND) (20210105/psparse-529)
[ 1.149335] ACPI BIOS Error (bug): Could not resolve symbol [\_PR.PR00._CPC], AE_NOT_FOUND (20210105/psargs-330)
[ 1.149360] ACPI Error: Aborting method \_PR.PR04._CPC due to previous error (AE_NOT_FOUND) (20210105/psparse-529)
[ 1.149408] ACPI BIOS Error (bug): Could not resolve symbol [\_PR.PR00._CPC], AE_NOT_FOUND (20210105/psargs-330)
[ 1.149433] ACPI Error: Aborting method \_PR.PR05._CPC due to previous error (AE_NOT_FOUND) (20210105/psparse-529)
[ 1.149481] ACPI BIOS Error (bug): Could not resolve symbol [\_PR.PR00._CPC], AE_NOT_FOUND (20210105/psargs-330)
[ 1.149506] ACPI Error: Aborting method \_PR.PR06._CPC due to previous error (AE_NOT_FOUND) (20210105/psparse-529)
[ 1.149553] ACPI BIOS Error (bug): Could not resolve symbol [\_PR.PR00._CPC], AE_NOT_FOUND (20210105/psargs-330)
[ 1.149578] ACPI Error: Aborting method \_PR.PR07._CPC due to previous error (AE_NOT_FOUND) (20210105/psparse-529)
[ 1.149626] ACPI BIOS Error (bug): Could not resolve symbol [\_PR.PR00._CPC], AE_NOT_FOUND (20210105/psargs-330)
[ 1.149650] ACPI Error: Aborting method \_PR.PR08._CPC due to previous error (AE_NOT_FOUND) (20210105/psparse-529)
[ 1.149698] ACPI BIOS Error (bug): Could not resolve symbol [\_PR.PR00._CPC], AE_NOT_FOUND (20210105/psargs-330)
[ 1.149723] ACPI Error: Aborting method \_PR.PR09._CPC due to previous error (AE_NOT_FOUND) (20210105/psparse-529)
[ 1.149770] ACPI BIOS Error (bug): Could not resolve symbol [\_PR.PR00._CPC], AE_NOT_FOUND (20210105/psargs-330)
[ 1.149794] ACPI Error: Aborting method \_PR.PR10._CPC due to previous error (AE_NOT_FOUND) (20210105/psparse-529)
[ 1.149842] ACPI BIOS Error (bug): Could not resolve symbol [\_PR.PR00._CPC], AE_NOT_FOUND (20210105/psargs-330)
[ 1.149867] ACPI Error: Aborting method \_PR.PR11._CPC due to previous error (AE_NOT_FOUND) (20210105/psparse-529)
}}}
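For anyone comparing their own logs against this report, the pattern above (one unresolved symbol, then an aborted _CPC method per additional processor) can be summarized with a short script. This is only a sketch: the function name `summarize_acpi_errors` and the two regular expressions are my own and simply match the two message shapes quoted above; they are not part of any kernel tooling.

```python
import re
from collections import defaultdict

# Regexes for the two ACPICA message shapes quoted in this report.
UNRESOLVED = re.compile(
    r"Could not resolve symbol \[(?P<symbol>[^\]]+)\], (?P<status>AE_\w+)")
ABORTED = re.compile(
    r"Aborting method (?P<method>\S+) due to previous error \((?P<status>AE_\w+)\)")


def summarize_acpi_errors(dmesg_text):
    """Return ({unresolved symbol: count}, [aborted methods in log order])."""
    unresolved = defaultdict(int)
    aborted = []
    for line in dmesg_text.splitlines():
        match = UNRESOLVED.search(line)
        if match:
            unresolved[match.group("symbol")] += 1
            continue
        match = ABORTED.search(line)
        if match:
            aborted.append(match.group("method"))
    return dict(unresolved), aborted


# Two message pairs copied from the log above.
SAMPLE = r"""
[ 1.149107] ACPI BIOS Error (bug): Could not resolve symbol [\_PR.PR00._CPC], AE_NOT_FOUND (20210105/psargs-330)
[ 1.149136] ACPI Error: Aborting method \_PR.PR01._CPC due to previous error (AE_NOT_FOUND) (20210105/psparse-529)
[ 1.149187] ACPI BIOS Error (bug): Could not resolve symbol [\_PR.PR00._CPC], AE_NOT_FOUND (20210105/psargs-330)
[ 1.149212] ACPI Error: Aborting method \_PR.PR02._CPC due to previous error (AE_NOT_FOUND) (20210105/psparse-529)
"""

if __name__ == "__main__":
    symbols, methods = summarize_acpi_errors(SAMPLE)
    print("unresolved:", symbols)
    print("aborted:", methods)
```

Feeding it the full `dmesg` text instead of `SAMPLE` should show whether every aborted method points back at the same missing symbol, as it does here.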
It also seems as if my CPU is running hotter than normal.

{{{
> uptime ; cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor | uniq ; lscpu | grep MHz
12:31:54 up 0:35, 4 users, load average: 0.56, 0.57, 0.50
powersave
CPU MHz: 3700.000
CPU max MHz: 4700.0000
CPU min MHz: 800.0000
}}}

If I repeat that in quick succession, I will see it dip below 3.7 GHz; however, it stays at 3.7 GHz far more often than it dips, even though my machine has been up for 35+ minutes and I am not running anything. I have another openSUSE Tumbleweed machine that was also recently updated. It does not have the errors mentioned in this ticket and it has the following look (which looks about right with the frequency straddling the minimum). Thoughts?

{{{
> uptime ; cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor | uniq ; lscpu | grep MHz
12:34:52 up 3:38, 1 user, load average: 0.80, 0.87, 0.90
schedutil
CPU MHz: 1400.000
CPU max MHz: 4000.0000
CPU min MHz: 1400.0000
}}}
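Since a single `lscpu` sample can be misleading, a per-CPU reading straight from sysfs may be easier to compare across kernels. The following is a rough sketch, assuming the usual cpufreq files (`scaling_governor`, `scaling_cur_freq`, `cpuinfo_min_freq`, `cpuinfo_max_freq`, values in kHz) are present; the helper names `read_cpufreq` and `summarize` are made up for this example.

```python
from pathlib import Path

FREQ_FILES = ("scaling_cur_freq", "cpuinfo_min_freq", "cpuinfo_max_freq")


def read_cpufreq(base="/sys/devices/system/cpu"):
    """Collect governor and frequencies (kHz) for each cpuN/cpufreq directory."""
    result = {}
    for policy in sorted(Path(base).glob("cpu[0-9]*/cpufreq")):
        entry = {}
        governor = policy / "scaling_governor"
        if governor.is_file():
            entry["scaling_governor"] = governor.read_text().strip()
        for name in FREQ_FILES:
            path = policy / name
            if path.is_file():
                entry[name] = int(path.read_text().strip())
        result[policy.parent.name] = entry
    return result


def summarize(result):
    """Return (sorted list of governors in use, average current frequency in MHz)."""
    governors = sorted({e["scaling_governor"] for e in result.values()
                        if "scaling_governor" in e})
    current = [e["scaling_cur_freq"] for e in result.values()
               if "scaling_cur_freq" in e]
    average_mhz = sum(current) / len(current) / 1000 if current else None
    return governors, average_mhz


if __name__ == "__main__":
    governors, average_mhz = summarize(read_cpufreq())
    print("governors:", governors)
    print("average current MHz:", average_mhz)
```

Running it a few times on an idle system under each kernel would give a fairer comparison than one reading.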
I'm getting very similar errors on my laptop since the update to 5.12:

[ 1.167887] ACPI BIOS Error (bug): Could not resolve symbol [\_PR.CPU0._CPC], AE_NOT_FOUND (20210105/psargs-330)
[ 1.167918] ACPI Error: Aborting method \_PR.CPU1._CPC due to previous error (AE_NOT_FOUND) (20210105/psparse-529)
[ 1.167989] ACPI BIOS Error (bug): Could not resolve symbol [\_PR.CPU0._CPC], AE_NOT_FOUND (20210105/psargs-330)
[ 1.168012] ACPI Error: Aborting method \_PR.CPU2._CPC due to previous error (AE_NOT_FOUND) (20210105/psparse-529)
[ 1.168072] ACPI BIOS Error (bug): Could not resolve symbol [\_PR.CPU0._CPC], AE_NOT_FOUND (20210105/psargs-330)
[ 1.168094] ACPI Error: Aborting method \_PR.CPU3._CPC due to previous error (AE_NOT_FOUND) (20210105/psparse-529)
[ 1.168139] ACPI BIOS Error (bug): Could not resolve symbol [\_PR.CPU0._CPC], AE_NOT_FOUND (20210105/psargs-330)
[ 1.168161] ACPI Error: Aborting method \_PR.CPU4._CPC due to previous error (AE_NOT_FOUND) (20210105/psparse-529)
[ 1.168205] ACPI BIOS Error (bug): Could not resolve symbol [\_PR.CPU0._CPC], AE_NOT_FOUND (20210105/psargs-330)
[ 1.168227] ACPI Error: Aborting method \_PR.CPU5._CPC due to previous error (AE_NOT_FOUND) (20210105/psparse-529)
[ 1.168269] ACPI BIOS Error (bug): Could not resolve symbol [\_PR.CPU0._CPC], AE_NOT_FOUND (20210105/psargs-330)
[ 1.168291] ACPI Error: Aborting method \_PR.CPU6._CPC due to previous error (AE_NOT_FOUND) (20210105/psparse-529)
[ 1.168335] ACPI BIOS Error (bug): Could not resolve symbol [\_PR.CPU0._CPC], AE_NOT_FOUND (20210105/psargs-330)
[ 1.168357] ACPI Error: Aborting method \_PR.CPU7._CPC due to previous error (AE_NOT_FOUND) (20210105/psparse-529)
(In reply to Anthony Agelastos from comment #1)
> It also seems as if my CPU is running hotter than normal.
>
> {{{
> > uptime ; cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor | uniq ; lscpu | grep MHz
> 12:31:54 up 0:35, 4 users, load average: 0.56, 0.57, 0.50
> powersave
> CPU MHz: 3700.000
> CPU max MHz: 4700.0000
> CPU min MHz: 800.0000
> }}}
>
> If I repeat that in quick succession, I will see it dip below the 3.7 GHz,
> however it is staying here far more often than it dips and my machine has
> been up for 35+ minutes and I am not running anything. I have another
> openSUSE Tumbleweed machine that was also recently updated. It does not have
> the errors mentioned in this ticket and it has the following look (which
> looks about right with the frequency straddling the minimum). Thoughts?
>
> {{{
> > uptime ; cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor | uniq ; lscpu | grep MHz
> 12:34:52 up 3:38, 1 user, load average: 0.80, 0.87, 0.90
> schedutil
> CPU MHz: 1400.000
> CPU max MHz: 4000.0000
> CPU min MHz: 1400.0000
> }}}

I believe I am mistaken about this. I used snapshots to go back to 5.11 and it is also running at 3.7 GHz rather than 800 MHz. I could've sworn it used to default to the latter more often than the former when it wasn't taxed.
Same problem here:

[ 1.821442] ACPI BIOS Error (bug): Could not resolve symbol [\_PR.CPU0._CPC], AE_NOT_FOUND (20210105/psargs-330)
[ 1.821478] ACPI Error: Aborting method \_PR.CPU1._CPC due to previous error (AE_NOT_FOUND) (20210105/psparse-529)
[ 1.821629] ACPI BIOS Error (bug): Could not resolve symbol [\_PR.CPU0._CPC], AE_NOT_FOUND (20210105/psargs-330)
[ 1.821664] ACPI Error: Aborting method \_PR.CPU2._CPC due to previous error (AE_NOT_FOUND) (20210105/psparse-529)
[ 1.821812] ACPI BIOS Error (bug): Could not resolve symbol [\_PR.CPU0._CPC], AE_NOT_FOUND (20210105/psargs-330)
[ 1.821848] ACPI Error: Aborting method \_PR.CPU3._CPC due to previous error (AE_NOT_FOUND) (20210105/psparse-529)
[ 1.821961] ACPI BIOS Error (bug): Could not resolve symbol [\_PR.CPU0._CPC], AE_NOT_FOUND (20210105/psargs-330)
[ 1.821993] ACPI Error: Aborting method \_PR.CPU4._CPC due to previous error (AE_NOT_FOUND) (20210105/psparse-529)
[ 1.822062] ACPI BIOS Error (bug): Could not resolve symbol [\_PR.CPU0._CPC], AE_NOT_FOUND (20210105/psargs-330)
[ 1.822094] ACPI Error: Aborting method \_PR.CPU5._CPC due to previous error (AE_NOT_FOUND) (20210105/psparse-529)
[ 1.822203] ACPI BIOS Error (bug): Could not resolve symbol [\_PR.CPU0._CPC], AE_NOT_FOUND (20210105/psargs-330)
[ 1.822234] ACPI Error: Aborting method \_PR.CPU6._CPC due to previous error (AE_NOT_FOUND) (20210105/psparse-529)
[ 1.822336] ACPI BIOS Error (bug): Could not resolve symbol [\_PR.CPU0._CPC], AE_NOT_FOUND (20210105/psargs-330)
[ 1.822367] ACPI Error: Aborting method \_PR.CPU7._CPC due to previous error (AE_NOT_FOUND) (20210105/psparse-529)
Although the ACPI _CPC parse errors look like a regression in the 5.12 kernel, they are likely a red herring with respect to the other symptoms. Adding Joey to Cc for the ACPI errors.
Hi Anthony, Hans de Jong,

Could you please provide the acpidump result? Please use this command:

# acpidump > acpidump.raw

Please attach the resulting acpidump.raw file to bugzilla. I want to check the ACPI tables for _CPC.

Thanks!
(In reply to Anthony Agelastos from comment #0)
> I recently updated to the 5.12.0-1-default kernel with openSUSE Tumbleweed
> and now I see the following messages during boot. These were not there
> before.

On the other hand, do you mean that these messages did not show up with the v5.11 kernel?

Thanks
Created attachment 848990 [details] acpidump
Created attachment 848991 [details] acpidump
(In reply to Joey Lee from comment #7)
> (In reply to Anthony Agelastos from comment #0)
> > I recently updated to the 5.12.0-1-default kernel with openSUSE Tumbleweed
> > and now I see the following messages during boot. These were not there
> > before.
>
> On the other hand, do you mean that this messages did not show in v5.11
> kernel?
>
> Thanks

I do a (zypper dup) update almost every day, and because this error is displayed quite prominently (twice) during boot, I am pretty sure that v5.11 did not have this bug.
Created attachment 849003 [details] Anthony Agelastos' acpidump.raw This is my output from `acpidump`. Please note that I am booted into a snapshot from 5.11 right before 5.12 was released. If I need to run this from 5.12, then please let me know.
(In reply to Joey Lee from comment #6) > Hi Anthony, Hans de Jong, > > Could you please provide acpidump result? Please use this command: > > # acpidump > acpidump.raw > > Please attached acpidump.raw result file on bugzilla. I want to check the > acpi table for the _CPC. > > Thanks! I just uploaded mine. My output is from 5.11. If you need it from 5.12, then please let me know.
(In reply to Joey Lee from comment #7) > (In reply to Anthony Agelastos from comment #0) > > I recently updated to the 5.12.0-1-default kernel with openSUSE Tumbleweed > > and now I see the following messages during boot. These were not there > > before. > > On the other hand, do you mean that this messages did not show in v5.11 > kernel? > > Thanks These messages do not show up in the v5.11 kernel. I am booted into a 5.11 snapshot with the following kernel and it does not have these errors. $ uname -a Linux babeltumble 5.11.16-1-default #1 SMP Thu Apr 22 10:30:16 UTC 2021 (e06d321) x86_64 x86_64 x86_64 GNU/Linux
The three acpidump results show two different situations:

- Robert's and Hans's machines have a _real_ firmware bug. They have similar ApHwp tables, but neither of them has a \_PR.CPU0._CPC definition in the other SSDT tables. So it makes sense that they see "Could not resolve symbol [\_PR.CPU0._CPC]".

I used acpiexec to parse their SSDT tables and confirmed that the _CPC method is missing in CPU0. I used two different versions of ACPICA: version 20210105, corresponding to the v5.12 kernel, and version 20201113, corresponding to the v5.11 kernel. Both ACPICA versions show the same error:

- execute \_PR.PR01._CPC
Evaluating \_PR.PR01._CPC
Evaluation of \_PR.PR01._CPC failed with status AE_NOT_FOUND

So Robert's and Hans's machines have a firmware bug.
The really strange thing is in Anthony's ACPI tables. The ACPI tables of his machine have a _CPC method for processor 0 in ssdt15:

Scope (\_PR.PR00)
{
    Method (_CPC, 0, NotSerialized) // _CPC: Continuous Performance Control
    {
        If ((\_PR.CFGD & 0x01000000))
        {
            Return (CPOC) /* External reference */
        }
        Else
        {
            Return (CPC2) /* External reference */
        }
    }
}

This looks fine, so I used the two ACPICA versions to verify the execution of \_PR.PR01._CPC. I can _NOT_ reproduce the AE_NOT_FOUND error. The execution of \_PR.PR01._CPC succeeds:

- execute \_PR.PR01._CPC
Evaluating \_PR.PR01._CPC
Evaluation of \_PR.PR01._CPC returned object 0x18d6a90, external buffer length 3A8
[Package] Contains 21 Elements:
[Integer] = 0000000000000015
...

Both ACPICA versions (version 20210105 for the v5.12 kernel and version 20201113 for the v5.11 kernel) succeed.

Hi Anthony,

Could you please confirm that you still see the AE_NOT_FOUND parsing error on _CPC with the v5.12 kernel? Could you please attach the dmesg log to bugzilla?

Thanks
It's very odd, as I had not experienced these error messages until a couple of updates before. I hope this doesn't affect any important functionality as a BIOS update for my machine is very unlikely.
Created attachment 849033 [details] This contains acpidump and dmesg info for 5.11 and 5.12 kernels. This compressed tarball contains output for `acpidump` and `dmesg` for the 5.11 and 5.12 kernels.
(In reply to Joey Lee from comment #15) > The real strange thing is in Anthony's ACPI tables. ACPI tables of his > machine has _CPC method in processor 0 in ssdt15: > > Scope (\_PR.PR00) > { > Method (_CPC, 0, NotSerialized) // _CPC: Continuous Performance > Control > { > If ((\_PR.CFGD & 0x01000000)) > { > Return (CPOC) /* External reference */ > } > Else > { > Return (CPC2) /* External reference */ > } > } > } > > It looks no problem, then I used two version ACPICA to verify the execution > of \_PR.PR01._CPC. I can _NOT_ reproduce AE_NOT_FOUND error. The execution > of \_PR.PR01._CPC is success: > > - execute \_PR.PR01._CPC > Evaluating \_PR.PR01._CPC > Evaluation of \_PR.PR01._CPC returned object 0x18d6a90, external buffer > length 3A8 > [Package] Contains 21 Elements: > [Integer] = 0000000000000015 > ... > > Both of two version ACPICA (version 20210105 for v5.12 kernel and version > 20201113 for v5.11 kernel) are success. > > Hi Anthony, > > Could you please confirm that you still see the AE_NOT_FOUND parsing error > on _CPC with v5.12 kernel? Could you please help to attach dmesg log on > bugzilla? > > Thanks Hello Joey: I updated from openSUSE Tumbleweed 20210427-0 -> 20210502-0 where the former had 5.11 (5.11.16-1-default) and the new one 5.12 (5.12.0-2-default). The errors persist. I attached a tarball that contains acpidump and dmesg info from when I was in 5.11 and 5.12 to be compared. This was a fresh acpidump on 5.11 in case that helps. Thank you for your help with this ticket; your responses are very much appreciated.
Created attachment 849038 [details] more acpi dumps from kernel 5.11 and kernel 5.12 I added my acpi dumps from an ACER Aspire A517-51-5832. I see the errors with kernel 5.12 and not with kernel 5.11 too. The newest available BIOS (EFI) from 2019 is installed.
Hi Anthony,

Thanks for your dmesg log and acpidump result.

(In reply to Anthony Agelastos from comment #17)
> Created attachment 849033 [details]
> This contains acpidump and dmesg info for 5.11 and 5.12 kernels.
>
> This compressed tarball contains output for `acpidump` and `dmesg` for the
> 5.11 and 5.12 kernels.

Based on the dmesg logs, I see that the PmRef Cpu0Hwp SSDT is loaded on the v5.11 kernel but not on the v5.12 kernel. That table defines the _CPC method.

I am looking into why this Cpu0Hwp SSDT is not loaded.
(In reply to Joey Lee from comment #20)
> Hi Anthony,
>
> Thanks for your dmesg log and acpidump result.
>
> (In reply to Anthony Agelastos from comment #17)
> > Created attachment 849033 [details]
> > This contains acpidump and dmesg info for 5.11 and 5.12 kernels.
> >
> > This compressed tarball contains output for `acpidump` and `dmesg` for the
> > 5.11 and 5.12 kernels.
>
> Base on dmesg log, I saw that the PmRef Cpu0Hwp SSDT be loaded on v5.11
> kernel but not on v5.12 kernel. The table has defined _CPC method.
>
> I am looking at why is this Cpu0Hwp SSDT not loaded.

Based on Anthony's SSDT11.dsl table, the dynamic table load commands are in the GCAP method. The following logic is used to load the CPU0HWP and HWPLVT tables, but it looks like this logic did not run with the v5.12 kernel:

Method (_PDC, 1, Serialized) // _PDC: Processor Driver Capabilities
{
    Local0 = CPDC (Arg0)
    GCAP (Local0)
}

Method (GCAP, 1, Serialized)
{
    [...snip]
    If ((OSYS >= 0x07DF))
    {
        If (((CFGD & 0x00400000) && !(SDTL & 0x40)))
        {
            If ((\_SB.OSCP & 0x40))
            {
                SDTL |= 0x40
                OperationRegion (HWP0, SystemMemory, DerefOf (SSDT [0x0D]), DerefOf (SSDT [0x0E])) // 0x0E = 0x000000BA, CPU0HWP
                Load (HWP0, HW0) /* \_PR_.PR00.HW0_ */
                If ((CFGD & 0x00800000))
                {
                    OperationRegion (HWPL, SystemMemory, DerefOf (SSDT [0x13]), DerefOf (SSDT [0x14])) // 0x14 =HWPLVT
                    Load (HWPL, HW2) /* \_PR_.PR00.HW2_ */
                }
            }
    [...snip]

There are two possibilities:

- _PDC was not called by the OSPM drivers.
- _PDC -> GCAP was called, but CFGD does not have 0x00400000 set or OSCP does not have 0x40 set.

OSCP should be set by OSPM through evaluating the _OSC method, but I did not see any kernel code change in this part. 0x40 is OSC_SB_CPCV2_SUPPORT.

On the other hand, CFGD is a field in a SystemMemory region. It looks like the firmware sets its value.
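To make the gating in the GCAP excerpt easier to reason about, here is a small model of just the quoted conditions. It is a sketch under assumptions: the bit masks are copied from the ASL above, but the constant names and the function `gcap_tables_to_load` are hypothetical labels of mine, and the snipped parts of GCAP are not modeled.

```python
# Masks copied from the quoted ASL; the names are hypothetical labels.
OSYS_MIN = 0x07DF               # minimum OSYS value tested by GCAP
CFGD_HWP_BIT = 0x00400000       # CFGD bit tested before loading CPU0HWP
CFGD_HWPLVT_BIT = 0x00800000    # CFGD bit tested before loading HWPLVT
SDTL_HWP_LOADED = 0x40          # SDTL bit set once the HWP table is loaded
OSCP_CPCV2_BIT = 0x40           # OSCP bit (OSC_SB_CPCV2_SUPPORT per the comment)


def gcap_tables_to_load(osys, cfgd, sdtl, oscp):
    """Return the tables the quoted GCAP fragment would Load() for these values."""
    tables = []
    if osys >= OSYS_MIN:
        if (cfgd & CFGD_HWP_BIT) and not (sdtl & SDTL_HWP_LOADED):
            if oscp & OSCP_CPCV2_BIT:
                tables.append("CPU0HWP")
                if cfgd & CFGD_HWPLVT_BIT:
                    tables.append("HWPLVT")
    return tables


if __name__ == "__main__":
    # With the OSCP bit set, both tables load; without it, nothing loads.
    print(gcap_tables_to_load(0x07DF, 0x00C00000, 0x00, 0x40))
    print(gcap_tables_to_load(0x07DF, 0x00C00000, 0x00, 0x00))
```

The second call illustrates the second possibility listed above: if OSCP lacks 0x40 when GCAP runs, neither table is loaded and _CPC never appears in the namespace.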
Hi Anthony,

Could you please add the following kernel parameters and reboot to capture a more detailed ACPI debug log?

log_buf_len=10M acpi.debug_layer=0x20000010 acpi.debug_level=0x0000021f

This will dump a lot of ACPI debug output to dmesg. Please attach it to bugzilla. It may cause a long boot time, so please wait for it to finish. If the log is really too large, I will try another way.

Thanks
This is just a guess, it might be that the above-mentioned issue was introduced with kernel 5.12.rc6 and commit b3041510f0fca598e0311a9df82337f811799d6b:

Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Date: Tue Mar 23 20:26:52 2021 +0100

ACPI: tables: x86: Reserve memory occupied by ACPI tables

commit 1a1c130ab7575498eed5bcf7220037ae09cd1f8a upstream.

The following problem has been reported by George Kennedy:

Since commit 7fef431be9c9 ("mm/page_alloc: place pages to tail in __free_pages_core()") the following use after free occurs intermittently when ACPI tables are accessed.

BUG: KASAN: use-after-free in ibft_init+0x134/0xc49
Read of size 4 at addr ffff8880be453004 by task swapper/0/1
CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.12.0-rc1-7a7fd0d #1
Call Trace:
dump_stack+0xf6/0x158
print_address_description.constprop.9+0x41/0x60
kasan_report.cold.14+0x7b/0xd4
__asan_report_load_n_noabort+0xf/0x20
ibft_init+0x134/0xc49
do_one_initcall+0xc4/0x3e0
kernel_init_freeable+0x5af/0x66b
kernel_init+0x16/0x1d0
ret_from_fork+0x22/0x30

ACPI tables mapped via kmap() do not have their mapped pages reserved and the pages can be "stolen" by the buddy allocator.

Apparently, on the affected system, the ACPI table in question is not located in "reserved" memory, like ACPI NVS or ACPI Data, that will not be used by the buddy allocator, so the memory occupied by that table has to be explicitly reserved to prevent the buddy allocator from using it.

In order to address this problem, rearrange the initialization of the ACPI tables on x86 to locate the initial tables earlier and reserve the memory occupied by them.

The other architectures using ACPI should not be affected by this change.

Link: https://lore.kernel.org/linux-acpi/1614802160-29362-1-git-send-email-george.kennedy@oracle.com/
Reported-by: George Kennedy <george.kennedy@oracle.com>
Tested-by: George Kennedy <george.kennedy@oracle.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: 5.10+ <stable@vger.kernel.org> # 5.10+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
--

Does kernel 5.12.3 from kernel:stable still show these messages?
(In reply to Joey Lee from comment #21) > (In reply to Joey Lee from comment #20) > > Hi Anthony, > > > > Thanks for your dmesg log and acpidump result. > > > > (In reply to Anthony Agelastos from comment #17) > > > Created attachment 849033 [details] > > > This contains acpidump and dmesg info for 5.11 and 5.12 kernels. > > > > > > This compressed tarball contains output for `acpidump` and `dmesg` for the > > > 5.11 and 5.12 kernels. > > > > Base on dmesg log, I saw that the PmRef Cpu0Hwp SSDT be loaded on v5.11 > > kernel but not on v5.12 kernel. The table has defined _CPC method. > > > > I am looking at why is this Cpu0Hwp SSDT not loaded. > > Base on Anthony's SSDT11.dsl table, the dynamic load table command is in > GCAP method. The following logic be used to load CPU0HWP and HWPLVT tables. > But looks that the logic did not run with v5.12 kernel: > > Method (_PDC, 1, Serialized) // _PDC: Processor Driver Capabilities > { > Local0 = CPDC (Arg0) > GCAP (Local0) > } > > Method (GCAP, 1, Serialized) > { > [...snip] > If ((OSYS >= 0x07DF)) > { > If (((CFGD & 0x00400000) && !(SDTL & 0x40))) > { > If ((\_SB.OSCP & 0x40)) > { > SDTL |= 0x40 > OperationRegion (HWP0, SystemMemory, DerefOf (SSDT [0x0D]), DerefOf > (SSDT [0x0E])) // 0x0E = 0x000000BA, CPU0HWP > Load (HWP0, HW0) /* \_PR_.PR00.HW0_ */ > If ((CFGD & 0x00800000)) > { > OperationRegion (HWPL, SystemMemory, DerefOf (SSDT [0x13]), > DerefOf (SSDT [0x14])) // 0x14 =HWPLVT > Load (HWPL, HW2) /* \_PR_.PR00.HW2_ */ > } > } > [...snip] > > There have two possibilities: > > - The _PDC did not be called by OSPM drivers. > - The _PDC -> GCAP be called, but CFGD doesn't have 0x00400000 or OSCP > doesn't have 0x40 > > The OSCP should be set by OSPM through evaluating _OSC method. But I didn't > see any kernel code change in this part. The 0x40 is OSC_SB_CPCV2_SUPPORT. > > On the other hand, the CFGD is a field in a SystemMemory region. Looks that > firmware set the value. I have checked ACPI dump from Robert and Hans. 
Both of their tables have the same ACPI code for loading the tables that define _CPC. I believe that they have the same problem, so those tables are not loaded.
(In reply to Joey Lee from comment #22) > Hi Anthony, > > Could you please help to add the following kernel parameter and reboot for > capturing more acpi debug log? > > log_buf_len=10M acpi.debug_layer=0x20000010 acpi.debug_level=0x0000021f > > It will dump many acpi debug log to dmesg. Please help to attach it on > bugzilla. > It may causes a long boot time. Please try to wait it. If the log is really > too many, then I will try other way. > > Thanks Greetings: I think I was able to do what you wanted. I don't often add kernel parameters so I edited the GRUB entry and then booted it. The following may indicate I did it correctly (or not). I will attach the updated `dmesg` shortly. Thank you for your help with this. If you need me to do it a different way, then please let me know how. {{{ > cat /proc/cmdline BOOT_IMAGE=/boot/vmlinuz-5.12.0-2-default root=UUID=78e7058a-1a03-4d45-9981-15adfad3c601 splash=silent mitigations=auto quiet log_buf_len=10M acpi.debug_layer=0x20000010 acpi.debug_level=0x0000021f }}}
Created attachment 849344 [details] dmesg with heightened acpi logging This is the output from dmesg that provides additional ACPI logging.
(In reply to Frank Krüger from comment #23) > This is just a guess, it might be that the above-mentioned issue was > introduced with kernel 5.12.rc6 and commit > b3041510f0fca598e0311a9df82337f811799d6b: > > <snip> > -- > > Does kernel 5.12.3 from kernel:stable still show these messages? I just updated to 5.12.3-1-default through the normal means and the messages are still present.
On my laptop the error messages are still present on kernel 5.12.3-1-default.
JFYI: Others are also affected, even with kernel version 5.12.4: https://bugzilla.kernel.org/show_bug.cgi?id=213023
Created attachment 849427 [details] dsdt.aml
Created attachment 849428 [details] ssdt11.aml Add debug log to \_PR.PR00._PDC, \_PR.PR00._OSC and GCAP.
Hi Anthony,

I have attached updated DSDT and SSDT11 tables. Could you please do the following steps for debugging? I have added some debug logging to the above two tables.

- Put the updated tables into a new initrd:

mkdir -p kernel/firmware/acpi
cp dsdt.aml kernel/firmware/acpi
cp ssdt11.aml kernel/firmware/acpi
find kernel | cpio -H newc --create > /boot/instrumented_initrd-5.12.0-2-default
cat /boot/initrd-4.12.14-94.41-default >>/boot/instrumented_initrd-5.12.0-2-default

- Modify /boot/grub2/grub.cfg

Add the following kernel parameters (please remove the old ones):
acpi.debug_level=0x2 acpi.debug_layer=0xFFFFFFFF

Change the booting initrd:
< initrdefi /boot/initrd-5.12.0-2-default
change to
> initrdefi /boot/instrumented_initrd-5.12.0-2-default

Then please reboot and capture the dmesg log. You should see some "[ACPI Debug]" entries in dmesg. Please attach the dmesg log to bugzilla.

Thanks!
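The last two shell commands in the first step amount to writing the small cpio archive first and then appending the regular initrd to it. As an illustration only, the same concatenation could look like this in Python; the function name `build_instrumented_initrd` and its arguments are hypothetical, the cpio archive itself is still assumed to come from `find kernel | cpio -H newc --create`, and the original initrd is assumed to be the one for the kernel actually being booted.

```python
import shutil
from pathlib import Path


def build_instrumented_initrd(early_cpio, original_initrd, output):
    """Write early_cpio followed by original_initrd into output.

    The original initrd is only read, never modified, so it stays
    available as a fallback if the instrumented one fails to boot.
    """
    early_cpio = Path(early_cpio)
    original_initrd = Path(original_initrd)
    output = Path(output)
    # Check before opening the output so the original is never truncated.
    if output.resolve() == original_initrd.resolve():
        raise ValueError("refusing to overwrite the original initrd")
    with output.open("wb") as dst:
        for part in (early_cpio, original_initrd):
            with part.open("rb") as src:
                shutil.copyfileobj(src, dst)
    return output
```

A call might look like `build_instrumented_initrd("acpi_tables.cpio", "/boot/initrd-<version>", "/boot/instrumented_initrd-<version>")`, where `acpi_tables.cpio` and `<version>` are placeholders rather than names taken from this report.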
(In reply to Joey Lee from comment #32) > Hi Anthony, > > I have attached updated DSDT and SSDT11 tables. Could you please help to do The updated DSDT and SSDT tables are on comment#30 and comment#31.
(In reply to Joey Lee from comment #32) > Hi Anthony, > > I have attached updated DSDT and SSDT11 tables. Could you please help to do > the following steps for debugging? I have added some debug log to the above > two tables. > > - Put updated tables to new initrd: > > mkdir -p kernel/firmware/acpi > cp dsdt.aml kernel/firmware/acpi > cp ssdt11.aml kernel/firmware/acpi > find kernel | cpio -H newc --create > > /boot/instrumented_initrd-5.12.0-2-default > cat /boot/initrd-4.12.14-94.41-default > >>/boot/instrumented_initrd-5.12.0-2-default > > - Modify /boot/grub2/grub.cfg > > Add the following kernel parameters (please remove old): > acpi.debug_level=0x2 acpi.debug_layer=0xFFFFFFFF > > Change the booting initrd: > < initrdefi /boot/initrd-5.12.0-2-default > change to > > initrdefi /boot/instrumented_initrd-5.12.0-2-default > > Then please reboot and capture dmesg log. You should see some "[ACPI Debug]" > log in dmesg. Please attach dmesg log on bugzilla. > > Thanks! I currently am running a different kernel than what you're showing (see below). Do I modify your instructions for the 5.12.3-1-default kernel? Thank you for your clarification and wonderful help with this ticket. > cat /etc/os-release | grep VERSION_ID VERSION_ID="20210515" > uname -r 5.12.3-1-default
(In reply to Anthony Agelastos from comment #34)
> (In reply to Joey Lee from comment #32)
> > Hi Anthony,
> >
> > I have attached updated DSDT and SSDT11 tables. Could you please help to do
> > the following steps for debugging? I have added some debug log to the above
> > two tables.
> >
> > - Put updated tables to new initrd:
> >
> > mkdir -p kernel/firmware/acpi
> > cp dsdt.aml kernel/firmware/acpi
> > cp ssdt11.aml kernel/firmware/acpi
> > find kernel | cpio -H newc --create >
> > /boot/instrumented_initrd-5.12.0-2-default
> > cat /boot/initrd-4.12.14-94.41-default
> > >>/boot/instrumented_initrd-5.12.0-2-default
> >
> > - Modify /boot/grub2/grub.cfg
> >
> > Add the following kernel parameters (please remove old):
> > acpi.debug_level=0x2 acpi.debug_layer=0xFFFFFFFF
> >
> > Change the booting initrd:
> > < initrdefi /boot/initrd-5.12.0-2-default
> > change to
> > > initrdefi /boot/instrumented_initrd-5.12.0-2-default
> >
> > Then please reboot and capture dmesg log. You should see some "[ACPI Debug]"
> > log in dmesg. Please attach dmesg log on bugzilla.
> >
> > Thanks!
>
> I currently am running a different kernel than what you're showing (see
> below). Do I modify your instructions for the 5.12.3-1-default kernel? Thank
> you for your clarification and wonderful help with this ticket.
>
> > cat /etc/os-release | grep VERSION_ID
> VERSION_ID="20210515"
> > uname -r
> 5.12.3-1-default

Yes, you can modify my commands to use "5.12.3-1-default", e.g. /boot/initrd-5.12.3-1-default.

Please keep the original initrd file in the /boot folder in case my modified tables have a problem, because I do not have a machine to test them on. If the instrumented_initrd-5.12.0-2-default has a problem that causes booting to fail, you just need to use the grub2 UI to change the initrd back to the original initrd-5.12.3-1-default for booting.

Thanks
Created attachment 849581 [details] This dmesg output contains ACPI Debug statements. This dmesg output contains ACPI Debug statements.
(In reply to Joey Lee from comment #35) > (In reply to Anthony Agelastos from comment #34) > > (In reply to Joey Lee from comment #32) > > > Hi Anthony, > > > > > > I have attached updated DSDT and SSDT11 tables. Could you please help to do > > > the following steps for debugging? I have added some debug log to the above > > > two tables. > > > > > > - Put updated tables to new initrd: > > > > > > mkdir -p kernel/firmware/acpi > > > cp dsdt.aml kernel/firmware/acpi > > > cp ssdt11.aml kernel/firmware/acpi > > > find kernel | cpio -H newc --create > > > > /boot/instrumented_initrd-5.12.0-2-default > > > cat /boot/initrd-4.12.14-94.41-default > > > >>/boot/instrumented_initrd-5.12.0-2-default > > > > > > - Modify /boot/grub2/grub.cfg > > > > > > Add the following kernel parameters (please remove old): > > > acpi.debug_level=0x2 acpi.debug_layer=0xFFFFFFFF > > > > > > Change the booting initrd: > > > < initrdefi /boot/initrd-5.12.0-2-default > > > change to > > > > initrdefi /boot/instrumented_initrd-5.12.0-2-default > > > > > > Then please reboot and capture dmesg log. You should see some "[ACPI Debug]" > > > log in dmesg. Please attach dmesg log on bugzilla. > > > > > > Thanks! > > > > I currently am running a different kernel than what you're showing (see > > below). Do I modify your instructions for the 5.12.3-1-default kernel? Thank > > you for your clarification and wonderful help with this ticket. > > > > > cat /etc/os-release | grep VERSION_ID > > VERSION_ID="20210515" > > > uname -r > > 5.12.3-1-default > > Yes, you can modify my command to use "5.12.3-1-default". e.g. > /boot/initrd-5.12.3-1-default > > Please still keep the original initrd file in /boot folder in case my > modified tables has problem because I do not have machine to test it. If the > instrumented_initrd-5.12.0-2-default has problem that it causes booting > failed. Then you just need to use grub2 UI to modify initrd back to original > initrd-5.12.3-1-default for booting. 
> > Thanks

Greetings:

I added the dmesg output within comment#36. Note that I made some mods to your commands (e.g., I changed `cat /boot/initrd-4.12.14-94.41-default >>/boot/instrumented_initrd-5.12.3-1-default` to `cat /boot/initrd-5.12.3-1 >>/boot/instrumented_initrd-5.12.3-1-default`). The debug statements you mention are below. Please let me know if you have any questions, comments, or concerns.

{{{
> dmesg | grep 'ACPI Debug'
[ 0.170352] ACPI Debug: "_PR.PR00._OSC"
[ 0.170730] ACPI Debug: "Method (GCAP, 1, Serialized)"
[ 0.171927] ACPI Debug: "If ((OSYS >= 0x07DF))"
[ 0.171953] ACPI Debug: "If (((CFGD & 0x00400000) && !(SDTL & 0x40)))"
[ 0.172999] ACPI Debug: "_PR.PR00._PDC"
[ 0.173091] ACPI Debug: "Method (GCAP, 1, Serialized)"
[ 0.174092] ACPI Debug: "If ((OSYS >= 0x07DF))"
[ 0.174116] ACPI Debug: "If (((CFGD & 0x00400000) && !(SDTL & 0x40)))"
}}}

Kind regards,
Anthony
Created attachment 849589 [details] dsdt-new.aml Add debug log to _SB._OSC and 0811b06e-4a27-44f9-8d60-3cbbc22e7b48 UUID code path.
Hi Anthony,

Thanks for your debug log.

(In reply to Anthony Agelastos from comment #37)
[...snip]
> mention are below. Please let me know if you have any questions, comments,
> or concerns.
>
> {{{
> > dmesg | grep 'ACPI Debug'
> [ 0.170352] ACPI Debug: "_PR.PR00._OSC"
> [ 0.170730] ACPI Debug: "Method (GCAP, 1, Serialized)"
> [ 0.171927] ACPI Debug: "If ((OSYS >= 0x07DF))"
> [ 0.171953] ACPI Debug: "If (((CFGD & 0x00400000) && !(SDTL & 0x40)))"
> [ 0.172999] ACPI Debug: "_PR.PR00._PDC"
> [ 0.173091] ACPI Debug: "Method (GCAP, 1, Serialized)"
> [ 0.174092] ACPI Debug: "If ((OSYS >= 0x07DF))"
> [ 0.174116] ACPI Debug: "If (((CFGD & 0x00400000) && !(SDTL & 0x40)))"
> }}}
>

The "If ((\_SB.OSCP & 0x40))" debug log was not printed. This confirms that the CPPC 2 Support bit of the _OSC capabilities was not set.

_SB._OSC is the only method that sets OSCP. I had added debug logging to the DSDT table, but I attached the wrong AML code in comment#30.

Hi Anthony,

I have attached a new dsdt-new.aml table in comment#38. Could you please put it into the initrd and test it? I have added more debug logging to the DSDT. Please use dsdt-new.aml together with ssdt11.aml for testing, and please attach the dmesg log.

Thanks a lot!
Joey Lee
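The reasoning above (the deepest marker that never shows up tells us which condition stopped the table load) can be checked mechanically. This is a sketch, assuming the instrumented tables print each condition as a quoted "ACPI Debug" string exactly as in the output quoted above; the helper names and the `GCAP_CHAIN` list are mine, and the last marker is taken from the description in this comment rather than from an actual log line.

```python
import re

# Matches lines such as: [ 0.170730] ACPI Debug: "Method (GCAP, 1, Serialized)"
DEBUG_LINE = re.compile(r'ACPI Debug:\s+"(?P<msg>.*)"\s*$')

# Markers in nesting order; the last one is the marker reported missing above.
GCAP_CHAIN = [
    "Method (GCAP, 1, Serialized)",
    "If ((OSYS >= 0x07DF))",
    "If (((CFGD & 0x00400000) && !(SDTL & 0x40)))",
    r"If ((\_SB.OSCP & 0x40))",
]


def debug_messages(dmesg_text):
    """Extract the quoted message from every ACPI Debug line, in log order."""
    messages = []
    for line in dmesg_text.splitlines():
        match = DEBUG_LINE.search(line)
        if match:
            messages.append(match.group("msg"))
    return messages


def first_missing(messages, expected=GCAP_CHAIN):
    """Return the first expected marker that never appears, or None."""
    seen = set(messages)
    for marker in expected:
        if marker not in seen:
            return marker
    return None


SAMPLE = r"""
[ 0.172999] ACPI Debug: "_PR.PR00._PDC"
[ 0.173091] ACPI Debug: "Method (GCAP, 1, Serialized)"
[ 0.174092] ACPI Debug: "If ((OSYS >= 0x07DF))"
[ 0.174116] ACPI Debug: "If (((CFGD & 0x00400000) && !(SDTL & 0x40)))"
"""

if __name__ == "__main__":
    print(first_missing(debug_messages(SAMPLE)))
```

For the sample lines, the first missing marker is the OSCP check, which matches the conclusion drawn above.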
Created attachment 849601 [details] This contains dmesg output with the updated ACPI Debug statements.
(In reply to Joey Lee from comment #39)
[...snip]
> I have attached a new dsdt-new.aml table on comment#38. Could you please
> help to put it to initrd and test it? I have add more debug log to dsdt.
> Please use the dsdt-new.aml with ssdt11.sml for testing. And please attach
> dmesg log.
Hello Joey: I have repeated the steps with your updated AML file (note I am on kernel 5.12.4-1-default right now). I attached the file within comment#40. The quick `grep` of output is below. Please let me know if you need anything further.
{{{
> dmesg | grep 'ACPI Debug'
[ 0.171855] ACPI Debug: "_PR.PR00._OSC"
[ 0.172231] ACPI Debug: "Method (GCAP, 1, Serialized)"
[ 0.173426] ACPI Debug: "If ((OSYS >= 0x07DF))"
[ 0.173452] ACPI Debug: "If (((CFGD & 0x00400000) && !(SDTL & 0x40)))"
[ 0.174382] ACPI Debug: "_SB._OSC"
[ 0.174401] ACPI Debug: "If ((Arg0 == ToUUID (0811b06e-4a27-44f9-8d60-3cbbc22e7b48)"
[ 0.174406] ACPI Debug: "If ((Arg1 == One))"
[ 0.174415] ACPI Debug: "If (OSCP & 0x40)"
[ 0.174461] ACPI Debug: "_SB._OSC"
[ 0.174478] ACPI Debug: "If ((Arg0 == ToUUID (0811b06e-4a27-44f9-8d60-3cbbc22e7b48)"
[ 0.174483] ACPI Debug: "If ((Arg1 == One))"
[ 0.174520] ACPI Debug: "_PR.PR00._PDC"
[ 0.174609] ACPI Debug: "Method (GCAP, 1, Serialized)"
[ 0.175617] ACPI Debug: "If ((OSYS >= 0x07DF))"
[ 0.175641] ACPI Debug: "If (((CFGD & 0x00400000) && !(SDTL & 0x40)))"
}}}
Kind regards,
Anthony
Hi Anthony,
Thanks for your dmesg! It's useful!
(In reply to Anthony Agelastos from comment #41)
[...snip]
> {{{
> > dmesg | grep 'ACPI Debug'
> [ 0.171855] ACPI Debug: "_PR.PR00._OSC"
> [ 0.172231] ACPI Debug: "Method (GCAP, 1, Serialized)"
> [ 0.173426] ACPI Debug: "If ((OSYS >= 0x07DF))"
> [ 0.173452] ACPI Debug: "If (((CFGD & 0x00400000) && !(SDTL & 0x40)))"
> [ 0.174382] ACPI Debug: "_SB._OSC"
> [ 0.174401] ACPI Debug: "If ((Arg0 == ToUUID (0811b06e-4a27-44f9-8d60-3cbbc22e7b48)"
> [ 0.174406] ACPI Debug: "If ((Arg1 == One))"
> [ 0.174415] ACPI Debug: "If (OSCP & 0x40)" <-- first time _OSC has 0x40
> [ 0.174461] ACPI Debug: "_SB._OSC"
> [ 0.174478] ACPI Debug: "If ((Arg0 == ToUUID (0811b06e-4a27-44f9-8d60-3cbbc22e7b48)"
> [ 0.174483] ACPI Debug: "If ((Arg1 == One))" <-- second time _OSC doesn't have 0x40
> [ 0.174520] ACPI Debug: "_PR.PR00._PDC"
> [ 0.174609] ACPI Debug: "Method (GCAP, 1, Serialized)"
> [ 0.175617] ACPI Debug: "If ((OSYS >= 0x07DF))"
> [ 0.175641] ACPI Debug: "If (((CFGD & 0x00400000) && !(SDTL & 0x40)))"
> }}}
The _SB._OSC method is called twice. The 0x40 (CPPC 2 Support) bit is set on the first call, but the second call does not set it. That is why SSDT11 cannot be loaded in _PR.PR00._PDC.GCAP.
Calling _OSC twice is new Linux OSPM behavior, introduced by kernel patch 719e1f561afb in v5.12-rc1. I believe that 719e1f561afb causes this issue.
Based on the ACPI 6.4 spec, this is a firmware issue. Section 6.2.11 (_OSC) recommends that OSPM first call _OSC in query mode, so that the platform returns the capabilities it supports, and then call _OSC again, based on the platform's return values, to set the capabilities. In the DSDT from your machine, the firmware returns 0x3B as the second dword of _SB._OSC, which means that the firmware cleared the 0x40 CPPC 2 Support bit. OSPM then uses the firmware's return value to call _OSC again to set the capabilities.
Because the firmware clears the CPPC2 support bit, SSDT11 is not dynamically loaded, so \_PR.PR00._CPC does not exist. The _CPC methods in PR01-PR11 still call PR00._CPC, which is why we see the ACPI errors in your bug description.
Before the v5.12 kernel, OSPM did not apply the two-step behavior suggested by the ACPI 6.4 spec. In v5.11, OSPM calls _OSC only once to set the capabilities and does not use the query mode of _OSC to ask for the platform's capabilities. That is why we did not see the ACPI error with the v5.11 kernel.
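The two-step handshake described in this comment can be illustrated with a small shell sketch. It is only a toy model of the behavior reported here, not kernel or firmware code: the 0x40 bit and the 0x3B reply come from the thread, while the function names and the 0x7B capability value offered by the OS are hypothetical.

```sh
#!/bin/sh
# Toy model of the _SB._OSC negotiation described above (not real ACPI code).

CPPC2_BIT=$((0x40))      # CPPC 2 Support bit, as named in the thread
FW_REPLY_MASK=$((0x3B))  # second dword returned by the firmware, per the thread

# Hypothetical helper: what the firmware hands back when queried with
# the capability bits in $1.
firmware_query() {
    echo $(( $1 & FW_REPLY_MASK ))
}

# Succeeds (exit status 0) when the capability bits in $1 include CPPC2.
has_cppc2() {
    [ $(( $1 & CPPC2_BIT )) -ne 0 ]
}

# Hypothetical capability set offered by the OS; only bit 0x40 matters here.
os_caps=$((0x7B))

# v5.11-style: a single call, the OS passes its own capabilities.
if has_cppc2 "$os_caps"; then
    echo "single call: CPPC2 bit set"
fi

# v5.12-style: query first, then call again with what the firmware returned.
granted=$(firmware_query "$os_caps")
if ! has_cppc2 "$granted"; then
    echo "second call: CPPC2 bit cleared"
fi
```

In this model the second call no longer carries the 0x40 bit, which matches the missing "If (OSCP & 0x40)" debug line on the second _SB._OSC invocation in the log above.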
(In reply to Joey Lee from comment #42)
[...snip]
> That's why the SSDT11 can not be loaded in _PR.PR00._PDC.GCAP .
                 ^^^^^^^ SSDT16
[...snip]
> Firmware clears CPPC2 support bit, so the SSDT11 is not dynamic loaded.
                                            ^^^^^^ SSDT16
Sorry for my typo.
(In reply to Joey Lee from comment #43)
> (In reply to Joey Lee from comment #42)
[...snip]
> > That's why the SSDT11 can not be loaded in _PR.PR00._PDC.GCAP .
>                  ^^^^^^^ SSDT16
                           ^^^^^^ SSDT15
Sorry for the confusion and for so many typos.
Hi Anthony, and anyone whose system shows the same symptom: I have reverted the 719e1f561afbe patch and built a kernel RPM in my home branch on OBS. Could you please try it? https://build.opensuse.org/package/binaries/home:joeyli:branches:openSUSE:Factory:bsc1185513/kernel-default/standard Please install this kernel RPM, but keep the old kernel in case there is any unknown problem. Please also note that this is not a real solution. I will raise this issue upstream once we have confirmed that the 719e1f561afbe patch is the root cause. Reverting the 719e1f561afbe patch may cost us some USB4 functionality. On the other hand, kernel upstream may treat this problem as a firmware issue. We did not see this ACPI error message before only because the v5.11 kernel does not follow the ACPI spec's recommended behavior.
(In reply to Joey Lee from comment #45) [...snip] Hello Joey: Thank you for your help with this. I'm glad you were able to drill down to the root cause. Could you (or anyone familiar with this) walk me through the steps for how to install your kernel to test and then to revert the changes afterwards? One method could be manual whilst another could leverage Snapper and do it through snapshots. I am new enough to openSUSE that I don't have enough experience with either of those methods to know precisely how to. So, any guidance would be appreciated. Thank you for your help with this. Kind regards, Anthony
(In reply to Anthony Agelastos from comment #46) [...snip] Hello Joey: Also, do you have an idea as to what functionality is impacted with these errors I'm getting? If upstream decides it's a bug in my firmware and that they don't want to change anything, then I need to start evaluating my options. One option is that I can update my firmware to see if that fixes the problem (but I updated firmware on a laptop before and afterwards Linux was unable to access the hard drive thanks to changes in firmware which I could not roll back so I am very hesitant to update firmware in general). However, if these ACPI errors aren't really impacting anything (i.e., if they're cosmetic), then I can also just ignore them. So, any guidance you have regarding what is not working would help me make this decision (assuming the upstream doesn't address this issue... if they do address the issue then this is moot). Thank you for any clarification you can provide here. Kind regards, Anthony
(In reply to Anthony Agelastos from comment #46)
[...snip]
> Could you (or anyone familiar with this) walk me through the
> steps for how to install your kernel to test and then to revert the changes
> afterwards?
[...snip]
Normally I just download the kernel-default RPM and install it on my machine with the 'rpm -ivh' command, e.g.
# rpm -ivh kernel-default-5.12.6-1.1.g15ed7b8.x86_64.rpm
Then reboot the system. You should see a new boot entry in the grub2 menu, e.g.
openSUSE Tumbleweed, with Linux 5.12.6-1.g15ed7b8-default
Select it for the boot test.
Let's test and confirm that the 719e1f561afbe patch is the root cause. Then we can ping the upstream experts.
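After booting the test kernel, one way to check the result is to count the AE_NOT_FOUND lines in the kernel log, as the original report did with `dmesg | grep AE_NOT_FOUND`. The following is a minimal sketch; the function name `count_ae_not_found` is made up for illustration.

```sh
#!/bin/sh
# Count the ACPI "AE_NOT_FOUND" lines in a kernel log read from stdin.
count_ae_not_found() {
    # grep -c prints 0 but exits non-zero when nothing matches; ignore that
    # so the function also works under "set -e".
    grep -c 'AE_NOT_FOUND' || true
}

# Typical use after rebooting into the test kernel (reading the kernel log
# may require root):
#   uname -r
#   dmesg | count_ae_not_found
```

A count of 0 on the test kernel, compared with a non-zero count on the regular 5.12 kernel, would support the suspicion about the reverted patch.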
Created attachment 849671 [details] This contains dmesg output with the patched 5.12.6 kernel provided by Joey Lee.
(In reply to Joey Lee from comment #48)
[...snip]
> Let's test and confirm the 719e1f561afbe patch is the root cause. Then we
> ping upstream experts.
Greetings: I installed the kernel and attached its dmesg output to this ticket. I was not able to boot the system graphically (it stopped at the console login). I didn't see any of the errors we saw before (but please double check me). This seems to indicate that you have correctly identified the problem. Please let me know if you have any questions, comments, or concerns. Additionally, please let me know how to proceed once you discuss things with upstream. Thank you for your help with this. Kind regards, Anthony
(In reply to Anthony Agelastos from comment #50) [...snip] I forgot to mention... the updated dmesg is within comment#49.
Hi Anthony,
Thanks for your testing.
(In reply to Anthony Agelastos from comment #50)
[...snip]
> I installed the kernel and attached its dmesg output to this ticket. I was
> not able to boot the system graphically (it stopped at the console login). I
Hm... There is some other problem with graphics. Please switch back to your old kernel.
> didn't see any of the errors we saw before (but please double check me).
> This seems to indicate that you have correctly identified the problem.
> Please let me know if you have any questions, comments, or concerns.
> Additionally, please let me know how to proceed once you discuss things with
> upstream. Thank you for your help with this.
Yes, your testing confirmed that the Cpu0Hwp and HwpLvt tables are loaded after reverting the 719e1f561a patch. The "Dynamic OEM Table Load" logs of the modified v5.12 kernel are now the same as with v5.11:
[ 0.170109] ACPI: Dynamic OEM Table Load:
[ 0.170115] ACPI: SSDT 0xFFFF8C50DC8E1800 00077A (v02 PmRef Cpu0Ist 00003000 INTL 20160422)
[ 0.171104] ACPI: \_PR_.PR00: _OSC native thermal LVT Acked
[ 0.172348] ACPI: Dynamic OEM Table Load:
[ 0.172352] ACPI: SSDT 0xFFFF8C50DC8EF400 0003FF (v02 PmRef Cpu0Cst 00003001 INTL 20160422)
[ 0.173248] ACPI: Dynamic OEM Table Load:
[ 0.173251] ACPI: SSDT 0xFFFF8C4A014043C0 0000BA (v02 PmRef Cpu0Hwp 00003000 INTL 20160422) <-- loaded after reverting the 719e1f561a patch
[ 0.174075] ACPI: Dynamic OEM Table Load:
[ 0.174078] ACPI: SSDT 0xFFFF8C50DC8E5800 000628 (v02 PmRef HwpLvt 00003000 INTL 20160422) <-- loaded after reverting the 719e1f561a patch
[ 0.175200] ACPI: Dynamic OEM Table Load:
[ 0.175206] ACPI: SSDT 0xFFFF8C4A00D20000 000D14 (v02 PmRef ApIst 00003000 INTL 20160422)
[ 0.176671] ACPI: Dynamic OEM Table Load:
[ 0.176674] ACPI: SSDT 0xFFFF8C50DC8E8000 000317 (v02 PmRef ApHwp 00003000 INTL 20160422)
[ 0.177585] ACPI: Dynamic OEM Table Load:
[ 0.177589] ACPI: SSDT 0xFFFF8C50DC8EE000 00030A (v02 PmRef ApCst 00003000 INTL 20160422)
I will look into the impact of _CPC and also ping the kernel upstream experts.
Thanks
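To compare which SSDTs were loaded dynamically under two kernels, the table IDs can be pulled out of the "Dynamic OEM Table Load" lines. This is a rough sketch that assumes the log layout shown in this comment (an "ACPI: SSDT ... (vNN OEMID TABLEID ...)" line following each "Dynamic OEM Table Load:" line); the function name `list_dynamic_ssdts` is hypothetical.

```sh
#!/bin/sh
# Print the table ID of every SSDT that follows a "Dynamic OEM Table Load:"
# line in a kernel log read from stdin.
list_dynamic_ssdts() {
    awk '
        /Dynamic OEM Table Load:/ { want = 1; next }
        want && /ACPI: SSDT/ {
            s = $0
            sub(/.*\(/, "", s)   # drop everything up to the last "("
            sub(/\).*/, "", s)   # drop the closing ")" and anything after it
            split(s, f, " ")     # f[1]=revision, f[2]=OEM ID, f[3]=table ID
            print f[3]
            want = 0
        }
    '
}

# Typical use, once per kernel, then compare the two outputs by eye or with diff:
#   dmesg | list_dynamic_ssdts
```

On the log quoted above this would list Cpu0Ist, Cpu0Cst, Cpu0Hwp, HwpLvt, ApIst, ApHwp and ApCst; a kernel affected by the bug would be expected to lack Cpu0Hwp and HwpLvt.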
Hi Anthony, and anyone who has the same problem. On x86 platforms, _CPC affects the intel_pstate driver. Based on the source code, the CPU priority based on ITMT (Intel Turbo Boost Max Technology) will not be set when _CPC is broken. Using the no_hwp parameter can disable the CPPC part of the intel_pstate driver, which may work around the issue. Sorry, I am not familiar with the intel_pstate driver, and I have not found a machine with _CPC yet, so I do not know the real effect on intel_pstate and CPU performance. I will raise this situation upstream.
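For anyone trying the no_hwp workaround, the sketch below checks whether a kernel command line carries the option. It assumes the option is passed at boot as `intel_pstate=no_hwp` (the form listed in the kernel's boot-parameter documentation rather than a loadable-module option); the function name is made up for illustration, and whether the workaround actually helps on an affected machine is untested here.

```sh
#!/bin/sh
# Succeeds when the kernel command line given as $1 contains the exact
# token "intel_pstate=no_hwp".
cmdline_has_no_hwp() {
    case " $1 " in
        *" intel_pstate=no_hwp "*) return 0 ;;
        *) return 1 ;;
    esac
}

# On a running Linux system:
#   cmdline_has_no_hwp "$(cat /proc/cmdline)" && echo "booted with no_hwp"
```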
(In reply to Joey Lee from comment #53) > Hi Anthony, and anyone got the same problem. > > On x86 platform, the _CPC affects intel_pstate drivers. Base on source code, > the CPU priority based on ITMI (Intel Turbo Boost Max Technology) will not > be set when _CPC is broken. Using no_hwp module parameter can disable the > CPPC part in intel_pstate drivers, it may workaround issue. > > Sorry for I am not familiar with intel_pstate driver, and I didn't find a > _CPC machine yet. So I don't know what's the really affect to intel_pstate > and CPU performance. > > I will raise this situation on upstream. JFYI https://bugzilla.kernel.org/show_bug.cgi?id=213023#c25
(In reply to Frank Krüger from comment #54) > (In reply to Joey Lee from comment #53) > > Hi Anthony, and anyone got the same problem. > > > > On x86 platform, the _CPC affects intel_pstate drivers. Base on source code, > > the CPU priority based on ITMI (Intel Turbo Boost Max Technology) will not > > be set when _CPC is broken. Using no_hwp module parameter can disable the > > CPPC part in intel_pstate drivers, it may workaround issue. > > > > Sorry for I am not familiar with intel_pstate driver, and I didn't find a > > _CPC machine yet. So I don't know what's the really affect to intel_pstate > > and CPU performance. > > > > I will raise this situation on upstream. > > JFYI https://bugzilla.kernel.org/show_bug.cgi?id=213023#c25 Yes, I will backport Hans's patch to openSUSE TW kernel for testing.
Hi Anthony, and anyone who has the same problem, I have backported Hans's patch from upstream bko#213023 to the openSUSE TW kernel for testing. The kernel RPM is in my home branch on OBS: https://build.opensuse.org/package/binaries/home:joeyli:branches:openSUSE:Factory:bsc1185513/kernel-default/standard Could you please install kernel-default-5.12.9-1.1.g7c88540.x86_64.rpm and check whether the issue is gone? Thanks!
I installed kernel 5.12.9-1.g7c88540-default on top of Tumbleweed version 20210606 and the issue is NOT gone. The error messages are still there.
(In reply to Hendrik Woltersdorf from comment #57) > I installed kernel 5.12.9-1.g7c88540-default on top of Tumbleweed version > 20210606 and the issue is NOT gone. The error messages are still there. There seems to be a new upstream patch: https://bugzilla.kernel.org/show_bug.cgi?id=213023#c33
(In reply to Hendrik Woltersdorf from comment #57) > I installed kernel 5.12.9-1.g7c88540-default on top of Tumbleweed version > 20210606 and the issue is NOT gone. The error messages are still there. I, too, installed this kernel atop 20210606 and the issue remained as well.
(In reply to Hendrik Woltersdorf from comment #57)
> I installed kernel 5.12.9-1.g7c88540-default on top of Tumbleweed version
> 20210606 and the issue is NOT gone. The error messages are still there.
(In reply to Anthony Agelastos from comment #59)
[...snip]
> I, too, installed this kernel atop 20210606 and the issue remained as well.
Thanks for your testing, which confirms that Hans's patch does not fix our issue.
Mika Westerberg sent a new patch upstream:
ACPI: Pass the same capabilities to the _OSC regardless of the query flag
https://patchwork.kernel.org/project/linux-acpi/patch/20210608163810.18071-1-mika.westerberg@linux.intel.com/
Hans then modified Mika's patch and posted it in the upstream bug, as Frank mentioned:
https://bugzilla.kernel.org/attachment.cgi?id=297241&action=diff
I will backport the patch to the openSUSE TW kernel and build a new kernel RPM for testing.
(In reply to Joey Lee from comment #60)
[...snip]
> I will backport the patch to openSUSE TW kernel and generate a new kernel
> RPM for testing.
I have backported Hans's new patch for testing: https://build.opensuse.org/package/binaries/home:joeyli:branches:openSUSE:Factory:bsc1185513/kernel-default/standard
Hi Anthony, could you please test again? Thanks
With kernel 5.12.9-1.g354694a-default the error messages are gone. Works for me :)
(In reply to Joey Lee from comment #61)
[...snip]
> Hi Anthony,
>
> Could you please help to test again?
The errors were not present. It looks like this patch worked. Does it seem like they are going to put it in the upstream kernel?
(In reply to Anthony Agelastos from comment #63)
[...snip]
> The errors were not present. It looks like this patch worked. Does it seem
> like they are going to put it in the upstream kernel?
According to https://bugzilla.kernel.org/show_bug.cgi?id=213023#c35 the patch will show up soon in one of the next 5.13-rc# and later on in 5.12.y.
(In reply to Frank Krüger from comment #64)
[...snip]
> According to https://bugzilla.kernel.org/show_bug.cgi?id=213023#c35 the
> patch will show up soon in one of the next 5.13-rc# and later on in 5.12.y.
Yes, and thanks to Hendrik and Anthony for testing and confirming. The patch will show up in a 5.13-rc kernel. I will backport it to the TW kernel after it has been merged into mainline.
The patch is part of kernel 5.13-rc6 (https://lore.kernel.org/linux-acpi/CAJZ5v0gyZhwcODnLhOfNc=Lxkr9kqt4UrsYsQUggp2dtGRWWSg@mail.gmail.com/T/).
(In reply to Frank Krüger from comment #66)
> The patch is part of kernel 5.13-rc6
> (https://lore.kernel.org/linux-acpi/CAJZ5v0gyZhwcODnLhOfNc=Lxkr9kqt4UrsYsQUggp2dtGRWWSg@mail.gmail.com/T/).
I have backported this patch to the openSUSE TW v5.12 kernel and am waiting for it to be merged.
FYI: The patch is now part of the upstream stable kernel 5.12.11.
Patch 159d8c274fd has been merged into the openSUSE kernel stable branch for openSUSE TW. Setting this issue to fixed.
(In reply to Joey Lee from comment #69) > Patch 159d8c274fd be merged to openSUSE kernel stable branch for openSUSE > TW. > > Set this issue to fixed. THANK YOU EVERYONE for your hard work on this ticket; it is much appreciated! I look forward to the upcoming update that resolves this. When >=5.12.11 is installed, if the errors persist I will come back to this ticket. If there is no reply, then please assume this works for me. Thanks again!
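A quick way to tell whether an installed kernel is at or past 5.12.11 is to compare version strings. This is a minimal sketch that assumes a `sort` supporting `-V` (as in GNU coreutils) and release strings of the form seen in this thread, such as 5.12.9-1.g7c88540-default; the function name `kernel_at_least` is hypothetical.

```sh
#!/bin/sh
# Succeeds when kernel release $2 is at least version $1.
# Only the part of $2 before the first "-" is compared.
kernel_at_least() {
    want=$1
    have=${2%%-*}    # e.g. "5.12.11-1-default" -> "5.12.11"
    lowest=$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)
    [ "$lowest" = "$want" ]
}

# On a running system:
#   kernel_at_least 5.12.11 "$(uname -r)" && echo "fix should be included"
```

A distribution kernel may also carry the fix as a backport under an older version number, so the version check is only a rough indicator; the dmesg output remains the real test.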