Bug 1084438 - nouveau backtrace in nouveau_bo_move_ntfy+0xc9/0xd0
Status: RESOLVED UPSTREAM
Classification: openSUSE
Product: openSUSE Tumbleweed
Component: Kernel
Version: Current
Hardware: Other Other
Priority: P5 - None
Severity: Normal
Assigned To: E-mail List
QA Contact: E-mail List
URL: https://bugs.freedesktop.org/show_bug...
Reported: 2018-03-08 08:41 UTC by Markos Chandras
Modified: 2018-06-15 09:17 UTC (History)

Description Markos Chandras 2018-03-08 08:41:11 UTC
In the latest Tumbleweed (TW) snapshots I am getting the following traceback, after which graphics simply stop working (i.e. only the mouse pointer moves; the rest of the display is frozen). My kernel is 4.15.7-1-default and I am on the 20180305 snapshot. My graphics card is the following:

03:00.0 VGA compatible controller: NVIDIA Corporation GM107GL [Quadro K620] (rev a2)

Here is the backtrace. Let me know if you need more information.

[43526.545896] [TTM] Buffer eviction failed
[43541.649634] WARNING: CPU: 5 PID: 3861 at ../drivers/gpu/drm/nouveau/nouveau_bo.c:1289 nouveau_bo_move_ntfy+0xc9/0xd0 [nouveau]
[43541.649641] Modules linked in: iscsi_ibft iscsi_boot_sysfs fuse af_packet tun devlink xfrm_user xfrm_algo ip_set nfnetlink br_netfilter bridge stp llc overlay vboxpci(O) x_tables msr xfs libcrc32c dm_crypt algif_skcipher af_alg snd_hda_codec_hdmi joydev hid_generic intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp hid_microsoft coretemp kvm_intel kvm irqbypass intel_wmi_thunderbolt crct10dif_pclmul crc32_pclmul iTCO_wdt mei_wdt crc32c_intel wmi_bmof iTCO_vendor_support ghash_clmulni_intel pcbc dell_smbios_wmi dell_smbios dell_wmi_descriptor dcdbas dell_smm_hwmon aesni_intel aes_x86_64 crypto_simd snd_hda_codec_realtek glue_helper pcspkr usbhid cryptd i2c_i801 snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd e1000e mei_me ptp soundcore mei pps_core
[43541.649709]  lpc_ich shpchp nouveau video mxm_wmi i2c_algo_bit xhci_pci drm_kms_helper ehci_pci syscopyarea sysfillrect xhci_hcd sysimgblt ehci_hcd fb_sys_fops ttm sr_mod usbcore ata_generic cdrom drm wmi button sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua vboxnetflt(O) vboxnetadp(O) vboxdrv(O) [last unloaded: ip_tables]
[43541.649747] CPU: 5 PID: 3861 Comm: X Tainted: G           O     4.15.7-1-default #1
[43541.649750] Hardware name: Dell Inc. Precision Tower 5810/0HHV7N, BIOS A25 02/02/2018
[43541.649810] RIP: 0010:nouveau_bo_move_ntfy+0xc9/0xd0 [nouveau]
[43541.649814] RSP: 0018:ffffb4b08629b898 EFLAGS: 00010286
[43541.649818] RAX: 00000000fffffff0 RBX: ffff8ab2c7fb3a40 RCX: 0000000000000000
[43541.649821] RDX: ffff8ab23b1b0128 RSI: 0000000000000286 RDI: 0000000000000286
[43541.649824] RBP: ffff8ab2c7fd7000 R08: 0000000000000000 R09: 00000000000014dd
[43541.649827] R10: 0000000000000000 R11: 00000000003d0900 R12: ffff8ab2c7fd72f8
[43541.649830] R13: ffff8ab23a91f680 R14: ffff8ab2d917e518 R15: ffffb4b08629b988
[43541.649834] FS:  00007f242e6e3ec0(0000) GS:ffff8ab2efd40000(0000) knlGS:0000000000000000
[43541.649837] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[43541.649840] CR2: 00007fdc857e8fd8 CR3: 000000041dd6e001 CR4: 00000000001626e0
[43541.649843] Call Trace:
[43541.649861]  ttm_bo_handle_move_mem+0x269/0x610 [ttm]
[43541.649870]  ttm_bo_evict+0x123/0x2f0 [ttm]
[43541.649930]  ? nvc0_fence_sync32+0x163/0x190 [nouveau]
[43541.649939]  ttm_mem_evict_first+0x155/0x1b0 [ttm]
[43541.649948]  ttm_bo_mem_space+0x344/0x4c0 [ttm]
[43541.649957]  ttm_bo_validate+0xaa/0x130 [ttm]
[43541.649990]  ? drm_vma_offset_add+0x41/0x60 [drm]
[43541.649998]  ttm_bo_init_reserved+0x38f/0x430 [ttm]
[43541.650007]  ttm_bo_init+0x2f/0x90 [ttm]
[43541.650064]  ? nouveau_bo_invalidate_caches+0x10/0x10 [nouveau]
[43541.650116]  nouveau_bo_new+0x416/0x590 [nouveau]
[43541.650166]  ? nouveau_bo_invalidate_caches+0x10/0x10 [nouveau]
[43541.650215]  ? nouveau_gem_new+0x100/0x100 [nouveau]
[43541.650262]  nouveau_gem_new+0x49/0x100 [nouveau]
[43541.650309]  nouveau_gem_ioctl_new+0x41/0xc0 [nouveau]
[43541.650329]  drm_ioctl_kernel+0x5b/0xb0 [drm]
[43541.650348]  drm_ioctl+0x2ad/0x350 [drm]
[43541.650396]  ? nouveau_gem_new+0x100/0x100 [nouveau]
[43541.650403]  ? sock_sendmsg+0x36/0x40
[43541.650410]  ? aa_file_perm+0x196/0x310
[43541.650458]  nouveau_drm_ioctl+0x64/0xc0 [nouveau]
[43541.650466]  do_vfs_ioctl+0x90/0x5f0
[43541.650472]  ? __fget+0x6e/0xb0
[43541.650477]  SyS_ioctl+0x74/0x80
[43541.650484]  do_syscall_64+0x76/0x140
[43541.650491]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
[43541.650495] RIP: 0033:0x7f242bfe0967
[43541.650498] RSP: 002b:00007ffe2b084488 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[43541.650502] RAX: ffffffffffffffda RBX: 0000563a98cc4270 RCX: 00007f242bfe0967
[43541.650505] RDX: 00007ffe2b0844e0 RSI: 00000000c0306480 RDI: 000000000000000c
[43541.650508] RBP: 00007ffe2b0844e0 R08: 0000000000000000 R09: 00007f242c2a8d20
[43541.650511] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000c0306480
[43541.650514] R13: 000000000000000c R14: 0000563a98854a70 R15: 0000563a98124060
[43541.650518] Code: f0 49 39 c4 75 db e9 7a ff ff ff 48 3d 70 c3 5a c0 0f 85 6e ff ff ff 48 8b 87 f8 02 00 00 4c 8d a7 f8 02 00 00 48 8d 58 f0 eb d6 <0f> 0b eb c2 0f 1f 00 0f 1f 44 00 00 41 57 41 56 49 89 ce 41 55 
[43541.650574] ---[ end trace 7c8e097f09016ce7 ]---
[43556.754204] [TTM] Buffer eviction failed
[43571.858405] [TTM] Buffer eviction failed
Comment 1 Markos Chandras 2018-03-08 08:42:57 UTC
Also, a few lines earlier in the log I see this:

[42596.244912] nouveau 0000:03:00.0: fifo: read fault at 0000012000 engine 07 [HOST0] client 06 [HOST] reason 00 [PDE] on channel 2 [007fadb000 systemd-logind[1490]]
[42596.244920] nouveau 0000:03:00.0: fifo: channel 2: killed
[42596.244922] nouveau 0000:03:00.0: fifo: runlist 0: scheduled for recovery


This problem normally happens after I resume the computer from hibernation.
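
To check whether the fifo read fault consistently precedes the eviction failures after a resume, the saved kernel log could be scanned for both messages and their timestamps compared. The following is only a minimal sketch, not part of the original report: the function names `scan` and `gap_seconds` and the `SAMPLE` list are hypothetical, and the marker strings are simply copied from the log excerpts quoted in this bug.

```python
import re

# Matches the "[seconds.micros] message" prefix used in the dmesg excerpts.
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")

# Substrings of interest, copied from the log lines in this report.
MARKERS = {
    "fifo_fault": "fifo: read fault",
    "evict_failed": "[TTM] Buffer eviction failed",
    "warning": "WARNING:",
}


def scan(lines):
    """Return (timestamp, marker_name) pairs in the order they appear."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if not m:
            continue  # skip lines without a dmesg-style timestamp
        ts, msg = float(m.group(1)), m.group(2)
        for name, needle in MARKERS.items():
            if needle in msg:
                events.append((ts, name))
                break
    return events


def gap_seconds(events, earlier, later):
    """Seconds from the first `earlier` event to the first `later` event
    at or after it, or None if either event is missing."""
    start = next((ts for ts, name in events if name == earlier), None)
    if start is None:
        return None
    end = next((ts for ts, name in events if name == later and ts >= start), None)
    return None if end is None else end - start


# Hypothetical sample built from three lines quoted in this report.
SAMPLE = [
    "[42596.244912] nouveau 0000:03:00.0: fifo: read fault at 0000012000 "
    "engine 07 [HOST0] client 06 [HOST] reason 00 [PDE] on channel 2 "
    "[007fadb000 systemd-logind[1490]]",
    "[43526.545896] [TTM] Buffer eviction failed",
    "[43541.649634] WARNING: CPU: 5 PID: 3861 at "
    "../drivers/gpu/drm/nouveau/nouveau_bo.c:1289 "
    "nouveau_bo_move_ntfy+0xc9/0xd0 [nouveau]",
]

events = scan(SAMPLE)
gap = gap_seconds(events, "fifo_fault", "evict_failed")
```

With the lines quoted here, the first eviction failure follows the fifo fault by roughly 930 seconds. The same functions could be applied to a full saved log by passing an open file object to `scan`.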
Comment 2 Takashi Iwai 2018-03-14 14:26:03 UTC
So this is a regression caused by a kernel update?  Could you confirm it by testing with an older kernel?
Comment 3 Markos Chandras 2018-03-14 15:58:44 UTC
(In reply to Takashi Iwai from comment #2)
> So this is a regression caused by a kernel update?  Could you confirm it by
> testing with an older kernel?

Yeah, it's a regression. As far as I remember, it was working fine with all 4.14 kernels, but I am not sure about any of the 4.15 kernels as I don't update very often.
Comment 4 Takashi Iwai 2018-03-14 16:27:57 UTC
OK, then could you report this upstream?  Ideally at bugzilla.freedesktop.org, e.g. with component DRI/Nouveau.
Feel free to put me (tiwai@suse.de) in Cc in case you need assistance with openSUSE kernels.
Comment 5 Markos Chandras 2018-05-03 09:25:47 UTC
Sorry for the late reply.

The problem looks similar to this upstream report:

https://bugs.freedesktop.org/show_bug.cgi?id=103689
Comment 6 Jiri Slaby 2018-06-15 09:17:53 UTC
.