Bug 904023 - WARNING: CPU: 2 PID: 31066 at ../fs/btrfs/extent-tree.c:3799 btrfs_free_reserved_data_space+0xee/0x100 [btrfs]()
Status: RESOLVED FIXED
Classification: openSUSE
Product: openSUSE Tumbleweed
Component: Kernel
NO 13.2 BUGS!!
Hardware: x86-64 openSUSE 13.2
Priority: P5 - None, Severity: Normal
Assigned To: E-mail List
Reported: 2014-11-05 13:10 UTC by Martin Pluskal
Modified: 2016-10-19 10:23 UTC (History)

Attachments
dmesg (127.08 KB, text/plain)
2014-11-05 13:10 UTC, Martin Pluskal

Description Martin Pluskal 2014-11-05 13:10:01 UTC
Created attachment 612492 [details]
dmesg

With recent Factory and the kernel from kernel:stable (3.17.2-1.g1afb260-desktop), the following warning occurs:
[ 6809.337302] ------------[ cut here ]------------
[ 6809.337388] WARNING: CPU: 2 PID: 31066 at ../fs/btrfs/extent-tree.c:3799 btrfs_free_reserved_data_space+0xee/0x100 [btrfs]()
[ 6809.337390] Modules linked in: hidp cmtp kernelcapi fuse bnep rfcomm can_bcm l2tp_ppp l2tp_netlink l2tp_core udp_tunnel lp parport_pc ppdev parport joydev st phonet af_key caif_socket caif llc2 rose bluetooth vmw_vsock_vmci_transport vmw_vmci netrom vsock pppoe pppox ppp_generic slhc af_rxrpc af_alg scsi_transport_iscsi md5 can_raw can xfrm_user xfrm_algo sctp libcrc32c nfnetlink nfc rfkill af_802154 ieee802154 irda crc_ccitt rds x25 atm appletalk ipx p8023 p8022 psnap llc ax25 iscsi_ibft iscsi_boot_sysfs af_packet nls_iso8859_1 nls_cp437 vfat fat iTCO_wdt iTCO_vendor_support coretemp ipmi_devintf kvm_intel kvm lpc_ich rtc_efi pcspkr i2c_i801 serio_raw i7core_edac cdc_ether usbnet mii mfd_core ioatdma ipmi_si bnx2 edac_core dca button tpm_tis tpm ipmi_msghandler processor shpchp dm_mod efivarfs btrfs
[ 6809.337436]  xor raid6_pq sr_mod cdrom crc32c_intel ata_generic ata_piix megaraid_sas sg [last unloaded: parport_pc]
[ 6809.337446] CPU: 2 PID: 31066 Comm: kworker/u32:8 Tainted: G        W      3.17.2-1.g1afb260-desktop #1
[ 6809.337448] Hardware name: IBM System x3550 M3 -[7944K1G]-/69Y4438     , BIOS -[D6E158AUS-1.16]- 11/26/2012
[ 6809.337456] Workqueue: writeback bdi_writeback_workfn (flush-btrfs-1)
[ 6809.337458]  0000000000000009 ffffffff8162cd5b 0000000000000000 ffffffff8105e157
[ 6809.337461]  ffff88007eb84000 0000000008000000 ffff88007eb84000 0000000008000000
[ 6809.337463]  ffff88003f952200 ffffffffa009b7fe ffff88016f7aa5b8 ffff880170706e10
[ 6809.337466]  ffff88007eb84000 0000000008000000 ffff88016f7aa43c ffffffffa00b68eb
[ 6809.337469] Call Trace:
[ 6809.337484]  [<ffffffff8100518e>] dump_trace+0x8e/0x350
[ 6809.337487]  [<ffffffff810054f6>] show_stack_log_lvl+0xa6/0x190
[ 6809.337491]  [<ffffffff81006bf1>] show_stack+0x21/0x50
[ 6809.337497]  [<ffffffff8162cd5b>] dump_stack+0x49/0x6a
[ 6809.337504]  [<ffffffff8105e157>] warn_slowpath_common+0x77/0x90
[ 6809.337517]  [<ffffffffa009b7fe>] btrfs_free_reserved_data_space+0xee/0x100 [btrfs]
[ 6809.337536]  [<ffffffffa00b68eb>] btrfs_clear_bit_hook+0x23b/0x300 [btrfs]
[ 6809.337556]  [<ffffffffa00ce6b1>] clear_state_bit+0x51/0x190 [btrfs]
[ 6809.337573]  [<ffffffffa00cefa2>] clear_extent_bit+0x252/0x3e0 [btrfs]
[ 6809.337591]  [<ffffffffa00cfed6>] extent_clear_unlock_delalloc+0x56/0x1d0 [btrfs]
[ 6809.337607]  [<ffffffffa00b9967>] cow_file_range+0x287/0x420 [btrfs]
[ 6809.337622]  [<ffffffffa00ba97a>] run_delalloc_range+0x32a/0x360 [btrfs]
[ 6809.337641]  [<ffffffffa00d073e>] writepage_delalloc.isra.34+0xfe/0x160 [btrfs]
[ 6809.337659]  [<ffffffffa00d1381>] __extent_writepage+0xc1/0x2e0 [btrfs]
[ 6809.337676]  [<ffffffffa00d184f>] extent_write_cache_pages.isra.29.constprop.48+0x2af/0x360 [btrfs]
[ 6809.337694]  [<ffffffffa00d35ed>] extent_writepages+0x4d/0x60 [btrfs]
[ 6809.337698]  [<ffffffff811e61fd>] __writeback_single_inode+0x3d/0x2a0
[ 6809.337702]  [<ffffffff811e7030>] writeback_sb_inodes+0x220/0x3c0
[ 6809.337706]  [<ffffffff811e7266>] __writeback_inodes_wb+0x96/0xc0
[ 6809.337711]  [<ffffffff811e74ab>] wb_writeback+0x21b/0x330
[ 6809.337715]  [<ffffffff811e96e8>] bdi_writeback_workfn+0x108/0x470
[ 6809.337720]  [<ffffffff81074d43>] process_one_work+0x143/0x400
[ 6809.337724]  [<ffffffff81075114>] worker_thread+0x114/0x470
[ 6809.337728]  [<ffffffff81079f3d>] kthread+0xbd/0xe0
[ 6809.337733]  [<ffffffff8163397c>] ret_from_fork+0x7c/0xb0
[ 6809.337737] ---[ end trace 32c18e26f677bdd8 ]---
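When comparing traces like the one above across several occurrences, it can help to reduce each one to its list of symbols. The following is a minimal sketch, assuming the frame format shown in this report (`[<address>] symbol+0xoffset/0xsize [module]`); the names `FRAME_RE`, `parse_frames` and `first_module_frame` are made up for illustration and are not part of any kernel tooling.

```python
import re

# Matches frame lines such as:
#   [ 6809.337517]  [<ffffffffa009b7fe>] btrfs_free_reserved_data_space+0xee/0x100 [btrfs]
# The trailing "[module]" part is optional (absent for core kernel symbols).
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(dmesg_text):
    """Return (symbol, offset, module) tuples for every stack frame line."""
    frames = []
    for line in dmesg_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append(
                (
                    match.group("symbol"),
                    int(match.group("offset"), 16),
                    match.group("module"),  # None for core kernel symbols
                )
            )
    return frames


def first_module_frame(frames, module):
    """Return the first frame that belongs to the given module, or None."""
    for frame in frames:
        if frame[2] == module:
            return frame
    return None
```

Applied to the trace above, `first_module_frame(parse_frames(text), "btrfs")` would pick out the `btrfs_free_reserved_data_space` frame, skipping the generic stack-dump and `warn_slowpath_common` frames that precede it.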


# btrfs filesystem df /
Data, single: total=140.88GiB, used=23.92GiB
System, DUP: total=32.00MiB, used=16.00KiB
Metadata, DUP: total=1.50GiB, used=650.47MiB
GlobalReserve, single: total=224.00MiB, used=272.00KiB
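The `btrfs filesystem df` figures above mix GiB, MiB and KiB, which makes them awkward to compare at a glance. The sketch below converts each line to bytes and computes a used/total ratio. It assumes the exact line layout shown in this report; `DF_RE`, `UNIT_BYTES` and `parse_btrfs_df` are illustrative names, not part of btrfs-progs.

```python
import re

# Binary unit multipliers as printed by "btrfs filesystem df".
UNIT_BYTES = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}

# Matches lines such as: "Data, single: total=140.88GiB, used=23.92GiB"
DF_RE = re.compile(
    r"^(?P<kind>\w+), (?P<profile>\w+): "
    r"total=(?P<total>[\d.]+)(?P<total_unit>[KMGT]iB|B), "
    r"used=(?P<used>[\d.]+)(?P<used_unit>[KMGT]iB|B)$"
)


def parse_btrfs_df(text):
    """Map each block group type to (total_bytes, used_bytes, used_ratio)."""
    usage = {}
    for line in text.splitlines():
        match = DF_RE.match(line.strip())
        if not match:
            continue  # skip the command line itself and anything unexpected
        total = float(match.group("total")) * UNIT_BYTES[match.group("total_unit")]
        used = float(match.group("used")) * UNIT_BYTES[match.group("used_unit")]
        usage[match.group("kind")] = (total, used, used / total if total else 0.0)
    return usage
```

For the output in this report, the `Data` entry comes out at roughly 17% used and `Metadata` at roughly 42% used.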
Comment 1 Martin Pluskal 2014-11-06 07:17:55 UTC
I have managed to trigger the issue again; the vmcore from the crash dump is here: http://w3.suse.de/~mpluskal/904023/vmcore
Comment 2 Mark Fasheh 2014-11-06 22:56:11 UTC
(In reply to Martin Pluskal from comment #1)
> I have managed to trigger issue again, vmcore from crashdump is here:
> http://w3.suse.de/~mpluskal/904023/vmcore

Cool, thanks. I downloaded the vmcore and am trying to get hold of a set of 3.17.2-1.g1afb260-desktop rpms.
Comment 3 Martin Pluskal 2014-11-07 22:51:29 UTC
The issue is also reproducible with 3.18.0-rc3-4.g07807b9-desktop.
Comment 4 Mark Fasheh 2015-12-16 23:56:09 UTC
I found a promising conversation about this bug which led me to the following patch:

https://patchwork.kernel.org/patch/6623001/

I'm going to give reproducing this with and without that fix a try.
Comment 5 Mark Fasheh 2015-12-23 20:40:20 UTC
Hey Martin, have you seen this with any kernel beyond 3.18? I've got 4.3 going and haven't seen it yet, though that may change as I'm running a full xfstest run on it at the moment.

Also if you happen to recall what actions might be setting off the bug please let me know (I understand if you don't know).

Thanks.
Comment 6 Filipe Manana 2015-12-24 10:48:56 UTC
(In reply to Mark Fasheh from comment #5)
> Hey Martin, have you seen this with any kernel beyond 3.18? I've got 4.3
> going and haven't seen it yet, though that may change as I'm running a full
> xfstest run on it at the moment.
> 
> Also if you happen to recall what actions might be setting off the bug
> please let me know (I understand if you don't know).
> 
> Thanks.

Mark, if you backport that fix, please also backport the fix to that fix:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=50745b0a7f46f68574cd2b9ae24566bf026e7ebd

There are also several other direct IO fixes on top of that, some already upstream and others targeted for the 4.5 merge window.
Comment 7 Martin Pluskal 2015-12-24 10:57:34 UTC
(In reply to Mark Fasheh from comment #5)
> Hey Martin, have you seen this with any kernel beyond 3.18? I've got 4.3
> going and haven't seen it yet, though that may change as I'm running a full
> xfstest run on it at the moment.
Actually, I haven't seen this issue in a while; I'm not sure with which kernel it stopped occurring.
> 
> Also if you happen to recall what actions might be setting off the bug
> please let me know (I understand if you don't know).
If I had been aware of anything that would make the issue reproducible, I would have mentioned it in the original report. If I recall correctly, this occurred randomly on a system with no significant load or activity (at least none that I was aware of).
Comment 8 Filipe Manana 2015-12-24 11:49:57 UTC
Mark, also note that that fix only addresses a bug introduced by some performance patches that aren't in 3.17:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=3e05bde8c3c2dd761da4d52944a087907955a53c
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=3266789f9d08b27275bae5ab1dcd27d1bbf15e79

They landed in kernel 4.0.
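As a rough illustration of the reasoning above, the check below decides from a kernel release string whether the kernel is new enough to contain the performance patches (and so could need the follow-up fix). It is only a heuristic sketch: it looks at the upstream major.minor version and knows nothing about distribution backports. The function names are hypothetical.

```python
import re


def kernel_major_minor(release):
    """Extract (major, minor) from strings like '3.17.2-1.g1afb260-desktop'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if not match:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def may_contain_performance_patches(release, introduced=(4, 0)):
    """True if the upstream base version is at or after the given version.

    Assumption: the default (4, 0) reflects the statement in this thread
    that the patches landed in kernel 4.0; backports are not considered.
    """
    return kernel_major_minor(release) >= introduced
```

With the two kernels named in this report, `3.17.2-1.g1afb260-desktop` and `3.18.0-rc3-4.g07807b9-desktop`, the check returns False, which matches the observation that the fix targets code those kernels do not have.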
Comment 10 Jiri Slaby 2016-10-19 10:23:26 UTC
(In reply to Martin Pluskal from comment #7)
> Actually I haven't seen this issue in a while, not sure with which kernel it
> stopped occurring.

2015-12-24... maybe Jezisek fixed that. I think we can close now?