
Commit 75bcffb

Dave Chinner authored and Chandan Babu R committed
xfs: shrink failure needs to hold AGI buffer
Chandan reported an AGI/AGF lock order hang on xfs/168 during recent testing. The cause of the problem was the task running xfs_growfs to shrink the filesystem. A failure occurred trying to remove the free space from the btrees that the shrink would make disappear, and that meant it ran the error handling for a partial failure.

This error path involves restoring the per-ag block reservations, and that requires calculating the amount of space needed to be reserved for the free inode btree. The growfs operation hung here:

    [18679.536829] down+0x71/0xa0
    [18679.537657] xfs_buf_lock+0xa4/0x290 [xfs]
    [18679.538731] xfs_buf_find_lock+0xf7/0x4d0 [xfs]
    [18679.539920] xfs_buf_lookup.constprop.0+0x289/0x500 [xfs]
    [18679.542628] xfs_buf_get_map+0x2b3/0xe40 [xfs]
    [18679.547076] xfs_buf_read_map+0xbb/0x900 [xfs]
    [18679.562616] xfs_trans_read_buf_map+0x449/0xb10 [xfs]
    [18679.569778] xfs_read_agi+0x1cd/0x500 [xfs]
    [18679.573126] xfs_ialloc_read_agi+0xc2/0x5b0 [xfs]
    [18679.578708] xfs_finobt_calc_reserves+0xe7/0x4d0 [xfs]
    [18679.582480] xfs_ag_resv_init+0x2c5/0x490 [xfs]
    [18679.586023] xfs_ag_shrink_space+0x736/0xd30 [xfs]
    [18679.590730] xfs_growfs_data_private.isra.0+0x55e/0x990 [xfs]
    [18679.599764] xfs_growfs_data+0x2f1/0x410 [xfs]
    [18679.602212] xfs_file_ioctl+0xd1e/0x1370 [xfs]

trying to get the AGI lock. The AGI lock was held by a fstress task trying to do an inode allocation, and it was waiting on the AGF lock to allocate a new inode chunk on disk. Hence deadlock.

The fix for this is for the growfs code to hold the AGI over the transaction roll it does in the error path. It already holds the AGF locked across this, and that is what causes the lock order inversion in the xfs_ag_resv_init() call.

Reported-by: Chandan Babu R <[email protected]>
Fixes: 46141dc ("xfs: introduce xfs_ag_shrink_space()")
Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Gao Xiang <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Chandan Babu R <[email protected]>
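
The inversion described in the commit message can be illustrated with a small userspace sketch. This is not kernel code: the `toy_*` names and the numeric ranks are hypothetical and only encode the order the message implies, where the inode allocation path takes the AGI first and the AGF second. The checker flags any attempt to take a lock while a later-ranked one is already held, which is what the pre-fix error path did when it re-read the AGI with the AGF still locked.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical ranks for this sketch: a lower rank must be taken first. */
enum toy_lock { TOY_AGI = 0, TOY_AGF = 1, TOY_NLOCKS };

struct toy_task {
	bool held[TOY_NLOCKS];
};

/*
 * Return true if taking 'want' respects the AGI-before-AGF order,
 * i.e. the task holds no lock that is ranked after 'want'.
 */
static bool toy_order_ok(const struct toy_task *t, enum toy_lock want)
{
	assert(want < TOY_NLOCKS);
	for (int l = want + 1; l < TOY_NLOCKS; l++)
		if (t->held[l])
			return false;
	return true;
}

/* Inode allocation: AGI first, then AGF. Both steps respect the order. */
static bool toy_inode_alloc_order_ok(void)
{
	struct toy_task t = { { false, false } };
	bool ok = toy_order_ok(&t, TOY_AGI);

	t.held[TOY_AGI] = true;
	return ok && toy_order_ok(&t, TOY_AGF);
}

/* Shrink error path before the fix: AGF kept locked, AGI taken afterwards. */
static bool toy_shrink_error_order_ok(void)
{
	struct toy_task t = { { false, false } };

	t.held[TOY_AGF] = true;
	return toy_order_ok(&t, TOY_AGI);
}
```

In this model `toy_inode_alloc_order_ok()` reports a valid order while `toy_shrink_error_order_ok()` reports a violation. Two tasks following those two paths on the same AG can each end up waiting on the lock the other holds.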
1 parent 8d4dd9d commit 75bcffb

1 file changed, +10 -1 lines changed

fs/xfs/libxfs/xfs_ag.c

Lines changed: 10 additions & 1 deletion
@@ -975,14 +975,23 @@ xfs_ag_shrink_space(
 
 	if (error) {
 		/*
-		 * if extent allocation fails, need to roll the transaction to
+		 * If extent allocation fails, need to roll the transaction to
 		 * ensure that the AGFL fixup has been committed anyway.
+		 *
+		 * We need to hold the AGF across the roll to ensure nothing can
+		 * access the AG for allocation until the shrink is fully
+		 * cleaned up. And due to the resetting of the AG block
+		 * reservation space needing to lock the AGI, we also have to
+		 * hold that so we don't get AGI/AGF lock order inversions in
+		 * the error handling path.
 		 */
 		xfs_trans_bhold(*tpp, agfbp);
+		xfs_trans_bhold(*tpp, agibp);
 		err2 = xfs_trans_roll(tpp);
 		if (err2)
 			return err2;
 		xfs_trans_bjoin(*tpp, agfbp);
+		xfs_trans_bjoin(*tpp, agibp);
 		goto resv_init_out;
 	}
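
The hold/roll/join pattern in this hunk can be modelled in a few lines of userspace C. This is a simplified sketch, not the kernel implementation: every `toy_*` name is hypothetical, and the model assumes only that a roll unlocks each attached buffer unless it was marked held, that the hold applies to a single roll, and that held buffers must be joined again afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { TOY_MAX_BUFS = 4 };

struct toy_buf {
	bool locked;	/* lock owned by the task running the transaction */
	bool hold;	/* stay locked across the next roll */
};

struct toy_trans {
	struct toy_buf *bufs[TOY_MAX_BUFS];
	size_t nbufs;
};

/* Attach an already locked buffer to the transaction. */
static void toy_bjoin(struct toy_trans *tp, struct toy_buf *bp)
{
	assert(bp->locked && tp->nbufs < TOY_MAX_BUFS);
	tp->bufs[tp->nbufs++] = bp;
}

/* Lock a buffer and attach it, as a read inside a transaction would. */
static void toy_read(struct toy_trans *tp, struct toy_buf *bp)
{
	bp->locked = true;
	toy_bjoin(tp, bp);
}

/* Ask for the buffer to stay locked across the next roll. */
static void toy_bhold(struct toy_buf *bp)
{
	bp->hold = true;
}

/* Roll: detach every buffer and unlock those not marked held. */
static void toy_roll(struct toy_trans *tp)
{
	for (size_t i = 0; i < tp->nbufs; i++) {
		struct toy_buf *bp = tp->bufs[i];

		if (bp->hold)
			bp->hold = false;	/* the hold is one-shot */
		else
			bp->locked = false;
	}
	tp->nbufs = 0;
}

/* Return true if the AGI stayed locked across the error-path roll. */
static bool toy_agi_locked_after_roll(bool hold_agi)
{
	struct toy_buf agi = { false, false }, agf = { false, false };
	struct toy_trans tp = { { NULL }, 0 };

	toy_read(&tp, &agi);	/* AGI first, then AGF */
	toy_read(&tp, &agf);

	toy_bhold(&agf);
	if (hold_agi)
		toy_bhold(&agi);
	toy_roll(&tp);

	toy_bjoin(&tp, &agf);
	if (hold_agi)
		toy_bjoin(&tp, &agi);
	return agi.locked;
}
```

With `hold_agi` false the model drops the AGI at the roll, so a later attempt to lock it would happen while the AGF is still held. With `hold_agi` true both buffers stay locked and no re-acquisition is needed.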
