Commit ed8ad83

Jan Kara authored and tytso committed
ext4: fix bh->b_state corruption
ext4 can update bh->b_state non-atomically in _ext4_get_block() and ext4_da_get_block_prep(). Usually this is fine since bh is just a temporary storage for mapping information on stack but in some cases it can be fully living bh attached to a page. In such case non-atomic update of bh->b_state can race with an atomic update which then gets lost.

Usually when we are mapping bh and thus updating bh->b_state non-atomically, nobody else touches the bh and so things work out fine but there is one case to especially worry about: ext4_finish_bio() uses BH_Uptodate_Lock on the first bh in the page to synchronize handling of PageWriteback state. So when blocksize < pagesize, we can be atomically modifying bh->b_state of a buffer that actually isn't under IO and thus can race e.g. with delalloc trying to map that buffer. The result is that we can mistakenly set / clear BH_Uptodate_Lock bit resulting in the corruption of PageWriteback state or missed unlock of BH_Uptodate_Lock.

Fix the problem by always updating bh->b_state bits atomically.

CC: [email protected]
Reported-by: Nikolay Borisov <[email protected]>
Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>

1 parent c906f38 commit ed8ad83
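
The lost update described in the commit message can be replayed deterministically in userspace. The following is a minimal sketch, not kernel code: the `DEMO_*` bit values and the `demo_lost_update()` helper are hypothetical stand-ins chosen for illustration, and the three steps of the race run in order on one thread instead of on real concurrent CPUs.

```c
#include <assert.h>

/* Hypothetical bit layout for illustration only; the kernel's values differ. */
#define DEMO_MAP_FLAGS     0x0fUL /* stand-in for EXT4_MAP_FLAGS */
#define DEMO_UPTODATE_LOCK 0x10UL /* stand-in for the BH_Uptodate_Lock bit */

/*
 * Replay the racy interleaving step by step and return the final state word.
 * "Mapper" plays the role of the non-atomic b_state update; "completion"
 * plays the role of the atomic bit operation done on I/O completion.
 */
unsigned long demo_lost_update(unsigned long initial, unsigned long map_flags)
{
	unsigned long state = initial;
	unsigned long snapshot;

	/* The scenario starts with the lock bit clear. */
	assert((initial & DEMO_UPTODATE_LOCK) == 0);

	/* Mapper: load the state (first half of the read-modify-write). */
	snapshot = state;

	/* Completion: sets the lock bit after the mapper's load. */
	state |= DEMO_UPTODATE_LOCK;

	/* Mapper: store a value computed from the stale snapshot. */
	state = (snapshot & ~DEMO_MAP_FLAGS) | (map_flags & DEMO_MAP_FLAGS);

	return state;
}
```

Calling `demo_lost_update(0x0UL, 0x3UL)` returns a value with the lock bit clear, even though the completion step set it: the mapper's store overwrote it. That is the same kind of update the commit message says "gets lost".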

1 file changed: +30 -2 lines changed

fs/ext4/inode.c

Lines changed: 30 additions & 2 deletions
@@ -686,6 +686,34 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode,
 	return retval;
 }
 
+/*
+ * Update EXT4_MAP_FLAGS in bh->b_state. For buffer heads attached to pages
+ * we have to be careful as someone else may be manipulating b_state as well.
+ */
+static void ext4_update_bh_state(struct buffer_head *bh, unsigned long flags)
+{
+	unsigned long old_state;
+	unsigned long new_state;
+
+	flags &= EXT4_MAP_FLAGS;
+
+	/* Dummy buffer_head? Set non-atomically. */
+	if (!bh->b_page) {
+		bh->b_state = (bh->b_state & ~EXT4_MAP_FLAGS) | flags;
+		return;
+	}
+	/*
+	 * Someone else may be modifying b_state. Be careful! This is ugly but
+	 * once we get rid of using bh as a container for mapping information
+	 * to pass to / from get_block functions, this can go away.
+	 */
+	do {
+		old_state = READ_ONCE(bh->b_state);
+		new_state = (old_state & ~EXT4_MAP_FLAGS) | flags;
+	} while (unlikely(
+		 cmpxchg(&bh->b_state, old_state, new_state) != old_state));
+}
+
 /* Maximum number of blocks we map for direct IO at once. */
 #define DIO_MAX_BLOCKS 4096
 
@@ -722,7 +750,7 @@ static int _ext4_get_block(struct inode *inode, sector_t iblock,
 		ext4_io_end_t *io_end = ext4_inode_aio(inode);
 
 		map_bh(bh, inode->i_sb, map.m_pblk);
-		bh->b_state = (bh->b_state & ~EXT4_MAP_FLAGS) | map.m_flags;
+		ext4_update_bh_state(bh, map.m_flags);
 		if (io_end && io_end->flag & EXT4_IO_END_UNWRITTEN)
 			set_buffer_defer_completion(bh);
 		bh->b_size = inode->i_sb->s_blocksize * map.m_len;
@@ -1685,7 +1713,7 @@ int ext4_da_get_block_prep(struct inode *inode, sector_t iblock,
 		return ret;
 
 	map_bh(bh, inode->i_sb, map.m_pblk);
-	bh->b_state = (bh->b_state & ~EXT4_MAP_FLAGS) | map.m_flags;
+	ext4_update_bh_state(bh, map.m_flags);
 
 	if (buffer_unwritten(bh)) {
 		/* A delayed write to unwritten bh should be marked
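
The compare-and-swap retry loop in ext4_update_bh_state() can be approximated in userspace with C11 atomics. This is a minimal sketch under stated assumptions, not the kernel implementation: `demo_update_state()` and `DEMO_MAP_FLAGS` are hypothetical names, and `atomic_compare_exchange_weak()` stands in for the kernel's `cmpxchg()`, which has a different calling convention.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for EXT4_MAP_FLAGS; the kernel's value differs. */
#define DEMO_MAP_FLAGS 0x0fUL

/*
 * Replace only the bits covered by DEMO_MAP_FLAGS in *state, leaving every
 * other bit exactly as a concurrent writer last set it.
 */
void demo_update_state(_Atomic unsigned long *state, unsigned long flags)
{
	unsigned long old_state;
	unsigned long new_state;

	assert(state != NULL);
	flags &= DEMO_MAP_FLAGS;

	old_state = atomic_load(state);
	do {
		/* Recompute from the freshest value we have observed. */
		new_state = (old_state & ~DEMO_MAP_FLAGS) | flags;
		/*
		 * If *state changed since old_state was read, the exchange
		 * fails, old_state is refreshed with the current value, and
		 * the loop retries.
		 */
	} while (!atomic_compare_exchange_weak(state, &old_state, new_state));
}
```

The store only succeeds when no other writer changed the word between the load and the exchange, so a bit outside the mask that is set concurrently is carried into `new_state` on the retry and is not overwritten.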
