Commit 036a934

fdmanana authored and masoncl committed
Btrfs: fix race between cleaner kthread and space cache writeout
When a block group becomes unused and the cleaner kthread is currently
running, we can end up getting the current transaction aborted with error
-ENOENT when we try to commit the transaction, leading to the following
trace:

  [59779.258768] WARNING: CPU: 3 PID: 5990 at fs/btrfs/extent-tree.c:3740 btrfs_write_dirty_block_groups+0x17c/0x214 [btrfs]()
  [59779.272594] BTRFS: Transaction aborted (error -2)
  (...)
  [59779.291137] Call Trace:
  [59779.291621]  [<ffffffff812566f4>] dump_stack+0x4e/0x79
  [59779.292543]  [<ffffffff8104d0a6>] warn_slowpath_common+0x9f/0xb8
  [59779.293435]  [<ffffffffa04cb81f>] ? btrfs_write_dirty_block_groups+0x17c/0x214 [btrfs]
  [59779.295000]  [<ffffffff8104d107>] warn_slowpath_fmt+0x48/0x50
  [59779.296138]  [<ffffffffa04c2721>] ? write_one_cache_group.isra.32+0x77/0x82 [btrfs]
  [59779.297663]  [<ffffffffa04cb81f>] btrfs_write_dirty_block_groups+0x17c/0x214 [btrfs]
  [59779.299141]  [<ffffffffa0549b0d>] commit_cowonly_roots+0x1de/0x261 [btrfs]
  [59779.300359]  [<ffffffffa04dd5b6>] btrfs_commit_transaction+0x4c4/0x99c [btrfs]
  [59779.301805]  [<ffffffffa04b5df4>] btrfs_sync_fs+0x145/0x1ad [btrfs]
  [59779.302893]  [<ffffffff81196634>] sync_filesystem+0x7f/0x93
  (...)
  [59779.318186] ---[ end trace 577e2daff90da33a ]---

The following diagram illustrates a sequence of steps leading to this
problem:

        CPU 1                              CPU 2

  <at transaction N>

  adds bg A to list
  fs_info->unused_bgs

  adds bg B to list
  fs_info->unused_bgs

  <transaction kthread
   commits transaction N
   and wakes up the
   cleaner kthread>

                                     cleaner kthread
                                       delete_unused_bgs()

                                         sees bg A in list
                                         fs_info->unused_bgs

                                         btrfs_start_transaction()

  <transaction N + 1 starts>

                                         deletes bg A

  update_block_group(bg C)

    --> adds bg C to list
        fs_info->unused_bgs

                                         deletes bg B

                                         sees bg C in the list
                                         fs_info->unused_bgs

                                         btrfs_remove_chunk(bg C)
                                           btrfs_remove_block_group(bg C)

                                             --> checks if the block group
                                                 is in a dirty list, and
                                                 because it isn't now, it
                                                 does nothing

                                             --> the block group item is
                                                 deleted from the extent
                                                 tree

    --> adds bg C to list
        transaction->dirty_bgs

  some task calls
  btrfs_commit_transaction(t N + 1)
    commit_cowonly_roots()
      btrfs_write_dirty_block_groups()
        --> sees bg C in cur_trans->dirty_bgs
        --> calls write_one_cache_group()
            which returns -ENOENT because
            it did not find the block group
            item in the extent tree
        --> transaction aborted with
            -ENOENT because write_one_cache_group()
            returned that error

So fix this by adding a block group to the list of dirty block groups
before adding it to the list of unused block groups.

This happened on a stress test using fsstress plus concurrent calls to
fallocate 20G and truncate (releasing part of the space allocated with
fallocate).

Signed-off-by: Filipe Manana <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
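
The ordering problem described above can be replayed in a small, single-threaded model. This is only a sketch: `struct model_bg`, `model_cleaner_remove()`, `model_commit()` and `model_run()` are hypothetical names, not kernel code, and the model assumes (as the commit message implies) that removing a block group also takes it off the dirty list when it is already there.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a block group; not the kernel struct. */
struct model_bg {
	bool on_unused_list;	/* queued on fs_info->unused_bgs */
	bool on_dirty_list;	/* queued on transaction->dirty_bgs */
	bool item_in_tree;	/* block group item still in the extent tree */
};

/* Models the cleaner kthread removing a block group found on unused_bgs. */
static void model_cleaner_remove(struct model_bg *bg)
{
	assert(bg != NULL);
	if (!bg->on_unused_list)
		return;			/* the cleaner never sees this group */
	bg->on_unused_list = false;
	bg->on_dirty_list = false;	/* assumed: removal also undirties it */
	bg->item_in_tree = false;	/* item deleted from the extent tree */
}

/* Models commit-time writeout: a dirty group with no item gives -ENOENT. */
static int model_commit(const struct model_bg *bg)
{
	if (bg->on_dirty_list && !bg->item_in_tree)
		return -ENOENT;
	return 0;
}

/*
 * Replays the interleaving in which the cleaner runs between the two list
 * insertions. unused_first selects the ordering used before this commit.
 */
static int model_run(bool unused_first)
{
	struct model_bg bg = { false, false, true };

	if (unused_first) {
		bg.on_unused_list = true;	/* old order: unused list first */
		model_cleaner_remove(&bg);	/* cleaner wins the race */
		bg.on_dirty_list = true;	/* dirtied after the item is gone */
	} else {
		bg.on_dirty_list = true;	/* new order: dirty list first */
		model_cleaner_remove(&bg);	/* not queued yet, nothing happens */
		bg.on_unused_list = true;
		model_cleaner_remove(&bg);	/* sees it dirty, cleans up both */
	}
	return model_commit(&bg);
}
```

In this model, `model_run(true)` returns `-ENOENT` and `model_run(false)` returns 0, which mirrors the abort and the fix. It does not model the locking or real concurrency.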
1 parent 758f2df commit 036a934

1 file changed, +16 -13 lines changed


fs/btrfs/extent-tree.c

Lines changed: 16 additions & 13 deletions
@@ -5915,19 +5915,6 @@ static int update_block_group(struct btrfs_trans_handle *trans,
 			set_extent_dirty(info->pinned_extents,
 					 bytenr, bytenr + num_bytes - 1,
 					 GFP_NOFS | __GFP_NOFAIL);
-			/*
-			 * No longer have used bytes in this block group, queue
-			 * it for deletion.
-			 */
-			if (old_val == 0) {
-				spin_lock(&info->unused_bgs_lock);
-				if (list_empty(&cache->bg_list)) {
-					btrfs_get_block_group(cache);
-					list_add_tail(&cache->bg_list,
-						      &info->unused_bgs);
-				}
-				spin_unlock(&info->unused_bgs_lock);
-			}
 		}

 		spin_lock(&trans->transaction->dirty_bgs_lock);
@@ -5939,6 +5926,22 @@ static int update_block_group(struct btrfs_trans_handle *trans,
 		}
 		spin_unlock(&trans->transaction->dirty_bgs_lock);

+		/*
+		 * No longer have used bytes in this block group, queue it for
+		 * deletion. We do this after adding the block group to the
+		 * dirty list to avoid races between cleaner kthread and space
+		 * cache writeout.
+		 */
+		if (!alloc && old_val == 0) {
+			spin_lock(&info->unused_bgs_lock);
+			if (list_empty(&cache->bg_list)) {
+				btrfs_get_block_group(cache);
+				list_add_tail(&cache->bg_list,
+					      &info->unused_bgs);
+			}
+			spin_unlock(&info->unused_bgs_lock);
+		}
+
 		btrfs_put_block_group(cache);
 		total -= num_bytes;
 		bytenr += num_bytes;
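
The moved block queues a block group only if its list node is not already linked, and takes a reference when it does. A minimal user-space sketch of that idiom follows. The names `struct node`, `struct demo_bg` and `demo_queue_unused()` are hypothetical stand-ins for the kernel's `struct list_head`, the block group cache and the queued code path, and the spinlock is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal circular list node, modelled on struct list_head. */
struct node {
	struct node *prev, *next;
};

/* An initialised, unlinked node points at itself. */
static void node_init(struct node *n)
{
	n->prev = n;
	n->next = n;
}

/* Plays the role of list_empty() on an entry's own node. */
static int node_unlinked(const struct node *n)
{
	return n->next == n;
}

/* Plays the role of list_add_tail(): link n just before head. */
static void node_add_tail(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Hypothetical block group: a list link plus a reference count. */
struct demo_bg {
	struct node link;
	int refs;
};

/*
 * Queue bg on the unused list once. Returns 1 if it was queued now and
 * 0 if it was already on a list. A real implementation would hold a lock
 * around the check and the insertion.
 */
static int demo_queue_unused(struct demo_bg *bg, struct node *unused)
{
	if (!node_unlinked(&bg->link))
		return 0;
	bg->refs++;	/* the list holds its own reference */
	node_add_tail(&bg->link, unused);
	return 1;
}
```

Calling `demo_queue_unused()` twice on the same group links it once and bumps the reference count once, so a later consumer of the list can drop exactly one reference per entry.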
