Bug 1072358

Summary: 4.15.0-rc3 crash on boot on ppc64 (BE) power7
Product: [openSUSE] openSUSE Tumbleweed
Reporter: Ruediger Oertel <ro>
Component: Kernel
Assignee: E-mail List <kernel-maintainers>
Status: RESOLVED NORESPONSE
QA Contact: E-mail List <qa-bugs>
Severity: Normal
Priority: P5 - None
CC: agraf, hare, jslaby, msuchanek, ro, tiwai
Version: Current
Flags: jslaby: needinfo? (ro)
Target Milestone: ---
Hardware: PowerPC-64
OS: Other
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---

Description Ruediger Oertel 2017-12-12 10:33:23 UTC
Unable to handle kernel paging request for data at address 0x00000080
Faulting instruction address: 0xd00000000a620b58
Oops: Kernel access of bad area, sig: 11 [#1]
BE SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: dm_service_time dm_multipath fuse squashfs zstd_decompress xxhash loop dm_mod brd af_packet sr_mod cdrom sd_mod ibmvscsi scsi_transport_srp scsi_mod ibmveth
CPU: 0 PID: 1079 Comm: systemd-udevd Not tainted 4.15.0-rc3-7-default #1
NIP:  d00000000a620b58 LR: d00000000a347028 CTR: d00000000a620ac0
REGS: 00000000b30778d9 TRAP: 0300   Not tainted  (4.15.0-rc3-7-default)
MSR:  8000000000009032 <SF,EE,ME,IR,DR,RI>  CR: 44002888  XER: 00000000
CFAR: c000000000008768 DAR: 0000000000000080 DSISR: 40000000 SOFTE: 0 
GPR00: d00000000a347028 c00000000cb974e0 d00000000a62ee30 0000000000000000 
GPR04: 0000000000000000 c00000000c8fbbe0 0000000000000000 0000000000000000 
GPR08: 00000000000001b1 0000000000000000 0000000000000000 0000000000000001 
GPR12: c0000000006271c0 c00000000e950000 c00000076db89898 0000000000000000 
GPR16: c00000076db89800 0000000000020000 c000000757214f50 c000000757214f58 
GPR20: 0000000000000001 000000000009ffff 0000000000000001 00000000014212c0 
GPR24: c00000000cb97930 0000000004ffff80 c00000000c9342c8 c00000000c8fbbe0 
GPR28: d00000000a680080 c0000000861e1530 0000000000000000 c00000000c9d3b00 
NIP [d00000000a620b58] .multipath_busy+0x98/0x180 [dm_multipath]
LR [d00000000a347028] .dm_old_request_fn+0x88/0x2b0 [dm_mod]
Call Trace:
[c00000000cb974e0] [c00000000062729c] .blk_peek_request+0xdc/0x3d0 (unreliable)
[c00000000cb97570] [d00000000a347028] .dm_old_request_fn+0x88/0x2b0 [dm_mod]
[c00000000cb97630] [c000000000620ca8] .__blk_run_queue+0x68/0xb0
[c00000000cb976b0] [c00000000062138c] .queue_unplugged+0x11c/0x160
[c00000000cb97750] [c0000000006285d8] .blk_flush_plug_list+0x2c8/0x380
[c00000000cb97820] [c000000000628d28] .blk_finish_plug+0x48/0x80
[c00000000cb978a0] [c00000000031250c] .__do_page_cache_readahead+0x25c/0x3a0
[c00000000cb979d0] [c000000000312be0] .force_page_cache_readahead+0xe0/0x1a0
[c00000000cb97a70] [c0000000002f9fb4] .generic_file_buffered_read+0x464/0xa20
[c00000000cb97b80] [c00000000043c550] .blkdev_read_iter+0x60/0x90
[c00000000cb97c00] [c0000000003dd98c] .__vfs_read+0x14c/0x1d0
[c00000000cb97cf0] [c0000000003ddac4] .vfs_read+0xb4/0x1a0
[c00000000cb97d90] [c0000000003de284] .SyS_read+0x64/0x110
[c00000000cb97e30] [c00000000000b810] system_call+0x58/0x6c
Instruction dump:
39400000 48000010 ebff0000 7fbfe840 419e00c8 893f0098 71290080 4182ffec 
e93f0030 2fa90000 419e0024 e9290000 <e9290080> e86903e8 480049c1 e8410028 
---[ end trace 9ef45eb021c815f8 ]---
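
Note on the faulting instruction: the word marked <e9290080> in the instruction dump above can be decoded by hand. The sketch below is not part of the original report. It assumes the standard Power ISA DS-form layout (primary opcode 58 with extended opcode 0 is ld), and the helper name decode_ds_form is hypothetical.

```python
def decode_ds_form(word):
    """Split a 32-bit PowerPC DS-form instruction word into its fields."""
    opcode = (word >> 26) & 0x3F   # primary opcode, top 6 bits
    rt = (word >> 21) & 0x1F       # target register
    ra = (word >> 16) & 0x1F       # base register
    disp = word & 0xFFFC           # DS field shifted left by 2 (byte displacement)
    if disp & 0x8000:              # sign-extend the 16-bit displacement
        disp -= 0x10000
    xo = word & 0x3                # extended opcode, low 2 bits
    return opcode, rt, ra, disp, xo


# The word shown in angle brackets in the oops instruction dump.
opcode, rt, ra, disp, xo = decode_ds_form(0xE9290080)

# Only the one encoding needed here is named; anything else is left unknown.
mnemonic = "ld" if (opcode, xo) == (58, 0) else "unknown"
text = f"{mnemonic} r{rt},{disp}(r{ra})"
print(text)  # ld r9,128(r9)

# DAR from the oops is 0x80; subtracting the displacement gives the base value.
implied_base = 0x80 - disp
```

Under these assumptions the faulting instruction is a load from offset 0x80 off r9. With DAR at 0x80 the implied base is 0, which matches the GPR09 value in the register dump and suggests a NULL pointer dereference in multipath_busy.
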


watchdog: BUG: soft lockup - CPU#9 stuck for 22s! [multipathd:1061]
Modules linked in: dm_service_time dm_multipath fuse squashfs zstd_decompress xxhash loop dm_mod brd af_packet sr_mod cdrom sd_mod ibmvscsi scsi_transport_srp scsi_mod ibmveth
CPU: 9 PID: 1061 Comm: multipathd Tainted: G      D          4.15.0-rc3-7-default #1
NIP:  c0000000000d597c LR: c000000000093e04 CTR: 0000000000000000
REGS: 0000000063989018 TRAP: 0901   Tainted: G      D           (4.15.0-rc3-7-default)
MSR:  8000000002009032 <SF,VEC,EE,ME,IR,DR,RI>  CR: 24024282  XER: 00000000
CFAR: 0000000000000c00 SOFTE: 1 
GPR00: 0000000048024282 c00000000cf1b640 c00000000168ea00 0000000000000000 
GPR04: 0000000000000000 0000000000003ab7 c000000000bbcc20 c0000000017307b8 
GPR08: 0000000080000000 c00000000e950000 c00000000e950000 d00000000a349a58 
GPR12: 0000000088022282 c00000000e955e80 
NIP [c0000000000d597c] .plpar_hcall_norets+0x14/0x20
LR [c000000000093e04] .__spin_yield+0x94/0xa0
Call Trace:
[c00000000cf1b6c0] [c000000000b0c9c8] ._raw_spin_lock_irqsave+0x148/0x150
[c00000000cf1b750] [d00000000a337dbc] .dm_table_run_md_queue_async+0x8c/0xc0 [dm_mod]
[c00000000cf1b7d0] [d00000000a622e50] .queue_if_no_path+0x140/0x180 [dm_multipath]
[c00000000cf1b880] [d00000000a623200] .multipath_message+0x330/0x4f0 [dm_multipath]
[c00000000cf1b950] [d00000000a33d5a8] .target_message+0x298/0x400 [dm_mod]
[c00000000cf1ba10] [d00000000a33ee90] .ctl_ioctl+0x200/0x580 [dm_mod]
[c00000000cf1bc10] [d00000000a33f240] .dm_ctl_ioctl+0x30/0x40 [dm_mod]
[c00000000cf1bca0] [c0000000003fb308] .do_vfs_ioctl+0xd8/0x880
[c00000000cf1bd90] [c0000000003fbb74] .SyS_ioctl+0xc4/0x130
[c00000000cf1be30] [c00000000000b810] system_call+0x58/0x6c
Instruction dump:
f95f0010 f93f0018 382100d0 e8010010 ebe1fff8 7c0803a6 4e800020 7c421378 
7c000026 90010008 60000000 44000022 <80010008> 7c0ff120 4e800020 7c0802a6 
BUG: workqueue lockup - pool cpus=9 node=0 flags=0x0 nice=0 stuck for 35s!
Showing busy workqueues and worker pools:
workqueue events: flags=0x0
  pwq 18: cpus=9 node=0 flags=0x0 nice=0 active=1/256
    pending: .cache_reap
systemd-udevd[411]: seq 2816 '/devices/virtual/block/dm-0' is taking a long time
 sda: sda1
INFO: rcu_sched self-detected stall on CPU
        9-....: (5999 ticks this GP) idle=126/140000000000001/0 softirq=228/228 fqs=2998
         (t=6000 jiffies g=102 c=101 q=53377)
NMI backtrace for cpu 9
CPU: 9 PID: 1061 Comm: multipathd Tainted: G      D      L   4.15.0-rc3-7-default #1
...
Comment 1 Jiri Slaby 2018-02-08 15:32:40 UTC
Does this still happen with 4.15-final?
Comment 2 Jiri Slaby 2018-06-16 12:15:20 UTC
Closing due to lack of response.