Bug 1168776 - Message spam about framebuffer compression in system log
Message spam about framebuffer compression in system log
Status: RESOLVED FIXED
Classification: openSUSE
Product: openSUSE Tumbleweed
Classification: openSUSE
Component: Kernel
Current
x86-64 Other
: P5 - None : Major (vote)
: ---
Assigned To: E-mail List
E-mail List
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2020-04-06 18:36 UTC by Achim Gratz
Modified: 2020-05-05 19:19 UTC (History)
4 users (show)

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Achim Gratz 2020-04-06 18:36:55 UTC
This update produces reams of these messages on my system:

Apr 05 09:17:36 Gertrud kernel: [drm] Reducing the compressed framebuffer size. This may lead to less power savings than a non-reduced-size. Try to increase stolen memory size if available in BIOS.

There is no BIOS setting I could change on my Dell T20 unfortunaltely.

I have already tried to set the kernel parameter i915.enable_fbc=0 (and
a parameter file in /etc/modprobe.d with the same settings), which does
result in a tainted kernel (so the setting should be active), but these
messages still continue at a high rate and the frambuffer compression
seems to be enabled anyway.

The 5.5.* kernels do not have this problem, this message never occurs
there.
Comment 1 Takashi Iwai 2020-04-07 08:59:00 UTC
So the message appears repeatedly?

Interesting, a patch to make the message to DRM_INFO_ONCE() was submitted in more than year ago, but it didn't get merged.
  https://lore.kernel.org/patchwork/patch/960241/

And you've hit this while it didn't happen on 5.5.  So the call pattern has changed between 5.5 and 5.6 and starts triggering it...
Comment 2 Jiri Slaby 2020-04-07 09:09:04 UTC
What do you have in
/sys/kernel/debug/dri/0/i915_gem_stolen
in both kernels?
Comment 3 Achim Gratz 2020-04-07 17:35:34 UTC
(In reply to Jiri Slaby from comment #2)
> What do you have in
> /sys/kernel/debug/dri/0/i915_gem_stolen
> in both kernels?

That sysfs file does not exist under the 5.5.13 I'm currently running.  I'll have to reboot into the 5.6.0 kernel later.
Comment 4 Achim Gratz 2020-04-07 18:24:20 UTC
(In reply to Takashi Iwai from comment #1)
> So the message appears repeatedly?

Yes, several times a second actually, in bursts of a few seconds apart.

> Interesting, a patch to make the message to DRM_INFO_ONCE() was submitted in
> more than year ago, but it didn't get merged.
>   https://lore.kernel.org/patchwork/patch/960241/

Yes I'd seen that, but it'

> And you've hit this while it didn't happen on 5.5.  So the call pattern has
> changed between 5.5 and 5.6 and starts triggering it...

Or the kernel decides to use the reported BIOS features differently.  THe other thing that irks me is that I am apparently unable to switch off the framebuffer compression even though the kernel parameter was recognized.  What's up with that?
Comment 5 Achim Gratz 2020-04-07 19:03:02 UTC
(In reply to Achim Gratz from comment #3)
> (In reply to Jiri Slaby from comment #2)
> > What do you have in
> > /sys/kernel/debug/dri/0/i915_gem_stolen
> > in both kernels?
> 
> That sysfs file does not exist under the 5.5.13 I'm currently running.  I'll
> have to reboot into the 5.6.0 kernel later.

The new kernel doesn't have that either... maybe this is what you were looking for?

# cat /sys/kernel/debug/dri/0/i915_gem_objects 
391 shrinkable [0 free] objects, 594219008 bytes
system: total:0x00000001eb41d000, available:0x00000001eb41d000 bytes
stolen: total:0x0000000002000000, available:0x0000000002000000 bytes
Comment 6 Achim Gratz 2020-04-07 19:22:54 UTC
(In reply to Achim Gratz from comment #5)
> (In reply to Achim Gratz from comment #3)
> > (In reply to Jiri Slaby from comment #2)
> > > What do you have in
> > > /sys/kernel/debug/dri/0/i915_gem_stolen
> > > in both kernels?
> > 
> > That sysfs file does not exist under the 5.5.13 I'm currently running.  I'll
> > have to reboot into the 5.6.0 kernel later.
> 
> The new kernel doesn't have that either... maybe this is what you were
> looking for?
> 
> # cat /sys/kernel/debug/dri/0/i915_gem_objects 
> 391 shrinkable [0 free] objects, 594219008 bytes
> system: total:0x00000001eb41d000, available:0x00000001eb41d000 bytes
> stolen: total:0x0000000002000000, available:0x0000000002000000 bytes

the same output from the 5.5.13 kernel is just:

# cat /sys/kernel/debug/dri/0/i915_gem_objects 
367 shrinkable [0 free] objects, 591519744 bytes

OK, so I dove into /sys/module/i915/parameters and see that fbc_enable is reading "0".  So indeed the 5.5.13 kernel did _not_ switch on fb compression.  Switching it on manually yields this:

[  617.221643] Setting dangerous option enable_fbc - tainting kernel
[  617.267790] [drm] Reducing the compressed framebuffer size. This may lead to less power savings than a non-reduced-size. Try to increase stolen memory size if available in BIOS.

So far that has been the only message.  So I start to think that the bug with 5.6.0 is actually that the fb compression gets set to on unconditionally (as evidenced by the fact that the kernel parameter was read as "0" on boot, but the compression stayed on anyway).  I might try to manually disable it via sysfs on 5.6.0 later, but I have to shut down for today.
Comment 7 Achim Gratz 2020-04-08 16:35:46 UTC
I updated my other Tumbleweed machine to 5.6.0 today (it's also a Haswell, but only an i3-4130 and FullHD display).  Here the FBC is not on by default:

# cat /sys/kernel/debug/dri/0/i915_gem_objects 
255 shrinkable [0 free] objects, 139313152 bytes
system: total:0x00000001e85dd000, available:0x00000001e85dd000 bytes
stolen: total:0x0000000002000000, available:0x0000000002000000 bytes

# cat /sys/kernel/debug/dri/0/i915_fbc_status 
FBC disabled: disabled at runtime per module param

I can manually switch it on via module parameter in sysfs interface:

# cat /sys/kernel/debug/dri/0/i915_fbc_status 
FBC enabled
Compressing: yes

No messages about "reducing compressed framebuffer size".

# lspci -vv -s 00:02.0
00:02.0 VGA compatible controller: Intel Corporation 4th Generation Core Processor Family Integrated Graphics Controller (rev 06) (prog-if 00 [VGA controller])
        DeviceName:  Onboard IGD
        Subsystem: ASUSTeK Computer Inc. Device 8534
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 29
        Region 0: Memory at f7800000 (64-bit, non-prefetchable) [size=4M]
        Region 2: Memory at e0000000 (64-bit, prefetchable) [size=256M]
        Region 4: I/O ports at f000 [size=64]
        [virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
        Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
                Address: fee04004  Data: 4026
        Capabilities: [d0] Power Management version 2
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [a4] PCI Advanced Features
                AFCap: TP+ FLR+
                AFCtrl: FLR-
                AFStatus: TP-
        Kernel driver in use: i915
        Kernel modules: i915
Comment 8 Achim Gratz 2020-04-08 16:47:56 UTC
The plot thickens... back on the original system (Xeon E3-1225v3 and a 4k display); FBC is actually _not_ on:

# cat /sys/kernel/debug/dri/0/i915_fbc_status 
FBC disabled: disabled at runtime per module param

That likely explains why trying to switch it off just taints the kernel and doesn't seem to do aynthing else.

# lspci -vv -s 00:02.0
00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v3 Processor Integrated Graphics Controller (rev 06) (prog-if 00 [VGA controller])
        DeviceName:  Onboard IGD
        Subsystem: Dell Device 0620
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 30
        Region 0: Memory at f7800000 (64-bit, non-prefetchable) [size=4M]
        Region 2: Memory at e0000000 (64-bit, prefetchable) [size=256M]
        Region 4: I/O ports at f000 [size=64]
        [virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
        Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
                Address: fee00018  Data: 0000
        Capabilities: [d0] Power Management version 2
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [a4] PCI Advanced Features
                AFCap: TP+ FLR+
                AFCtrl: FLR-
                AFStatus: TP-
        Kernel driver in use: i915
        Kernel modules: i915

Trying to explicitly switch on FBC gets me this nugget:

# echo 1 > /sys/module/i915/parameters/enable_fbc
# cat /sys/kernel/debug/dri/0/i915_fbc_status 
FBC disabled: pixel rate is too big

The messages continue to be spewed out, but the actual error that trigger them is probably unrelated to framebuffer compression and a bug somewhere else in the driver.
Comment 9 Takashi Iwai 2020-04-08 17:40:44 UTC
Could you report this to upstream?  gitlab.freedesktop.org issues.
Comment 10 Achim Gratz 2020-04-08 18:14:30 UTC
(In reply to Takashi Iwai from comment #9)
> Could you report this to upstream?  gitlab.freedesktop.org issues.

Oh, please not another one of these stupid bugtrackers that require registration…
Comment 11 Takashi Iwai 2020-04-08 18:56:07 UTC
Well, basically there is no chance to fix such a bug in the distro side for TW.
This must be handled in the upstream, and they need the proper bug report.
Comment 12 Achim Gratz 2020-04-18 06:15:31 UTC
Upstream bugreport:
https://gitlab.freedesktop.org/drm/intel/-/issues/1745
Comment 13 Achim Gratz 2020-04-18 06:37:27 UTC
(In reply to Takashi Iwai from comment #11)
> Well, basically there is no chance to fix such a bug in the distro side for
> TW.
> This must be handled in the upstream, and they need the proper bug report.

Well, one would have thought that somebody has figured out a way to "upstream" distro bugreports by now, as that's not exactly an uncommon occurence.
Comment 14 Takashi Iwai 2020-04-18 07:35:24 UTC
(In reply to Achim Gratz from comment #13)
> (In reply to Takashi Iwai from comment #11)
> > Well, basically there is no chance to fix such a bug in the distro side for
> > TW.
> > This must be handled in the upstream, and they need the proper bug report.
> 
> Well, one would have thought that somebody has figured out a way to
> "upstream" distro bugreports by now, as that's not exactly an uncommon
> occurence.

Such a hardware-related bug report needs to be done by the original reporter, since the distro has no hardware access.  If it were only on Leap, it's distro's responsibility as it's a backport problem.  But if it's about TW, it's certainly an upstream bug, hence the hardware owner needs to cooperate with upstream devs for tracking down the problem.
Comment 15 Achim Gratz 2020-04-28 17:08:03 UTC
So upstreams solution it to apply the almost two year old patch that makes the message appear only once per boot.  Can that be backported to 5.6?
Comment 16 Takashi Iwai 2020-04-29 06:59:43 UTC
Thanks for the update.  Now backported to stable branch.
It looks merely like a workaround rather than a proper fix, at least the annoyance would be gone by this, so let's close now.

Feel free to reopen if you still encounter the problem even with the patch.
Comment 17 Achim Gratz 2020-05-05 19:19:40 UTC
(In reply to Takashi Iwai from comment #16)
> Thanks for the update.  Now backported to stable branch.
> It looks merely like a workaround rather than a proper fix, at least the
> annoyance would be gone by this, so let's close now.

Yes, it seems to be papering over the issue of why we get that message in the first place.  FB compression is already off for this display size, so the message must get triggered by some other error.  I haven't got the round tuit to go look by which path it snakes there.

> Feel free to reopen if you still encounter the problem even with the patch.

I've been running that for two days now (not too thoroughly stressed) and aside from the missing journal entries everything still seems to work as it should.