Commit 46a6e10

fdmanana authored and kdave committed
btrfs: send: allow cloning non-aligned extent if it ends at i_size
If we find that an extent is shared but its end offset is not sector size
aligned, then we don't clone it and issue write operations instead. This
is because the reflink (remap_file_range) operation does not allow cloning
unaligned ranges, except if the end offset of the range matches the i_size
of the source and destination files (and the start offset is sector size
aligned).

While this is not incorrect because send can only guarantee that a file
has the same data in the source and destination snapshots, it's not
optimal and generates confusion and surprising behaviour for users.

For example, running this test:

   $ cat test.sh
   #!/bin/bash

   DEV=/dev/sdi
   MNT=/mnt/sdi

   mkfs.btrfs -f $DEV
   mount $DEV $MNT

   # Use a file size not aligned to any possible sector size.
   file_size=$((1 * 1024 * 1024 + 5)) # 1MB + 5 bytes
   dd if=/dev/random of=$MNT/foo bs=$file_size count=1
   cp --reflink=always $MNT/foo $MNT/bar

   btrfs subvolume snapshot -r $MNT/ $MNT/snap
   rm -f /tmp/send-test
   btrfs send -f /tmp/send-test $MNT/snap

   umount $MNT
   mkfs.btrfs -f $DEV
   mount $DEV $MNT

   btrfs receive -vv -f /tmp/send-test $MNT

   xfs_io -r -c "fiemap -v" $MNT/snap/bar

   umount $MNT

Gives the following result:

   (...)
   mkfile o258-7-0
   rename o258-7-0 -> bar
   write bar - offset=0 length=49152
   write bar - offset=49152 length=49152
   write bar - offset=98304 length=49152
   write bar - offset=147456 length=49152
   write bar - offset=196608 length=49152
   write bar - offset=245760 length=49152
   write bar - offset=294912 length=49152
   write bar - offset=344064 length=49152
   write bar - offset=393216 length=49152
   write bar - offset=442368 length=49152
   write bar - offset=491520 length=49152
   write bar - offset=540672 length=49152
   write bar - offset=589824 length=49152
   write bar - offset=638976 length=49152
   write bar - offset=688128 length=49152
   write bar - offset=737280 length=49152
   write bar - offset=786432 length=49152
   write bar - offset=835584 length=49152
   write bar - offset=884736 length=49152
   write bar - offset=933888 length=49152
   write bar - offset=983040 length=49152
   write bar - offset=1032192 length=16389
   chown bar - uid=0, gid=0
   chmod bar - mode=0644
   utimes bar
   utimes
   BTRFS_IOC_SET_RECEIVED_SUBVOL uuid=06d640da-9ca1-604c-b87c-3375175a8eb3, stransid=7
   /mnt/sdi/snap/bar:
    EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
      0: [0..2055]:       26624..28679      2056   0x1

There's no clone operation to clone extents from the file foo into file
bar and fiemap confirms there's no shared flag (0x2000).

So update send_write_or_clone() so that it proceeds with cloning if the
source and destination ranges end at the i_size of the respective files.

After this change the result of the test is:

   (...)
   mkfile o258-7-0
   rename o258-7-0 -> bar
   clone bar - source=foo source offset=0 offset=0 length=1048581
   chown bar - uid=0, gid=0
   chmod bar - mode=0644
   utimes bar
   utimes
   BTRFS_IOC_SET_RECEIVED_SUBVOL uuid=582420f3-ea7d-564e-bbe5-ce440d622190, stransid=7
   /mnt/sdi/snap/bar:
    EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
      0: [0..2055]:       26624..28679      2056   0x2001

A test case for fstests will also follow up soon.

Link: kdave/btrfs-progs#572 (comment)
CC: [email protected] # 5.10+
Reviewed-by: Qu Wenruo <[email protected]>
Signed-off-by: Filipe Manana <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
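
The numbers in the two receive logs can be checked by hand. The following is a minimal sketch, not part of the patch, that reproduces the arithmetic in plain C. The helper names (write_count, last_write_len, fiemap_blocks, is_shared) are hypothetical; the 49152-byte chunk is simply the write length observed in the log above, and the 4096-byte sector size is an assumption that happens to match the fiemap total. The two flag values are the ones defined in <linux/fiemap.h>, repeated here so the block stays self-contained.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the test script and the receive log above. */
#define FILE_SIZE    (1 * 1024 * 1024 + 5)  /* 1MB + 5 bytes */
#define WRITE_CHUNK  49152                  /* write length seen in the log */
#define SECTOR_SIZE  4096                   /* assumed sector size */

/* Same values as in <linux/fiemap.h>. */
#define FIEMAP_EXTENT_LAST    0x00000001u
#define FIEMAP_EXTENT_SHARED  0x00002000u

/* Number of write commands needed to cover 'size' bytes. */
static inline uint64_t write_count(uint64_t size, uint64_t chunk)
{
	return (size + chunk - 1) / chunk;
}

/* Length of the final, possibly short, write command. */
static inline uint64_t last_write_len(uint64_t size, uint64_t chunk)
{
	uint64_t rem = size % chunk;

	return rem ? rem : chunk;
}

/* Extent length in 512-byte units, as fiemap -v reports it in TOTAL. */
static inline uint64_t fiemap_blocks(uint64_t size, uint64_t sectorsize)
{
	uint64_t rounded = (size + sectorsize - 1) / sectorsize * sectorsize;

	return rounded / 512;
}

/* True if the fiemap flags mark the extent as shared (reflinked). */
static inline bool is_shared(uint32_t flags)
{
	return (flags & FIEMAP_EXTENT_SHARED) != 0;
}
```

With these helpers, write_count(FILE_SIZE, WRITE_CHUNK) accounts for the 22 write commands in the first log, last_write_len() for the final length=16389, and fiemap_blocks(FILE_SIZE, SECTOR_SIZE) for the TOTAL of 2056. The flags 0x1 and 0x2001 differ only in the shared bit, which is what is_shared() tests.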
1 parent ae1e766 commit 46a6e10

1 file changed

+39
-13
lines changed

fs/btrfs/send.c

Lines changed: 39 additions & 13 deletions
@@ -6157,25 +6157,51 @@ static int send_write_or_clone(struct send_ctx *sctx,
 	u64 offset = key->offset;
 	u64 end;
 	u64 bs = sctx->send_root->fs_info->sectorsize;
+	struct btrfs_file_extent_item *ei;
+	u64 disk_byte;
+	u64 data_offset;
+	u64 num_bytes;
+	struct btrfs_inode_info info = { 0 };
 
 	end = min_t(u64, btrfs_file_extent_end(path), sctx->cur_inode_size);
 	if (offset >= end)
 		return 0;
 
-	if (clone_root && IS_ALIGNED(end, bs)) {
-		struct btrfs_file_extent_item *ei;
-		u64 disk_byte;
-		u64 data_offset;
+	num_bytes = end - offset;
 
-		ei = btrfs_item_ptr(path->nodes[0], path->slots[0],
-				    struct btrfs_file_extent_item);
-		disk_byte = btrfs_file_extent_disk_bytenr(path->nodes[0], ei);
-		data_offset = btrfs_file_extent_offset(path->nodes[0], ei);
-		ret = clone_range(sctx, path, clone_root, disk_byte,
-				  data_offset, offset, end - offset);
-	} else {
-		ret = send_extent_data(sctx, path, offset, end - offset);
-	}
+	if (!clone_root)
+		goto write_data;
+
+	if (IS_ALIGNED(end, bs))
+		goto clone_data;
+
+	/*
+	 * If the extent end is not aligned, we can clone if the extent ends at
+	 * the i_size of the inode and the clone range ends at the i_size of the
+	 * source inode, otherwise the clone operation fails with -EINVAL.
+	 */
+	if (end != sctx->cur_inode_size)
+		goto write_data;
+
+	ret = get_inode_info(clone_root->root, clone_root->ino, &info);
+	if (ret < 0)
+		return ret;
+
+	if (clone_root->offset + num_bytes == info.size)
+		goto clone_data;
+
+write_data:
+	ret = send_extent_data(sctx, path, offset, num_bytes);
+	sctx->cur_inode_next_write_offset = end;
+	return ret;
+
+clone_data:
+	ei = btrfs_item_ptr(path->nodes[0], path->slots[0],
+			    struct btrfs_file_extent_item);
+	disk_byte = btrfs_file_extent_disk_bytenr(path->nodes[0], ei);
+	data_offset = btrfs_file_extent_offset(path->nodes[0], ei);
+	ret = clone_range(sctx, path, clone_root, disk_byte, data_offset, offset,
+			  num_bytes);
 	sctx->cur_inode_next_write_offset = end;
 	return ret;
 }
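
The decision the new code makes between the write_data and clone_data labels can be read as a pure function of a few sizes and offsets. The sketch below is a userspace restatement of that control flow for illustration only, not kernel code: struct clone_src, enum send_action and choose_action() are hypothetical names, the source i_size is passed in directly instead of being looked up with get_inode_info(), and the caller is assumed to have already clamped end to the destination i_size and checked offset < end, as the function in the diff does.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

enum send_action { SEND_WRITE, SEND_CLONE };

/* Hypothetical stand-in for the fields of clone_root used by the check. */
struct clone_src {
	uint64_t offset;  /* start of the candidate range in the source file */
	uint64_t i_size;  /* i_size of the source inode */
};

/* Alignment test for a power-of-two 'a', like the kernel's IS_ALIGNED(). */
static inline bool is_aligned(uint64_t x, uint64_t a)
{
	return (x & (a - 1)) == 0;
}

/*
 * Mirror of the branch structure in the diff: clone when the end is
 * sector aligned, or when both the destination range and the source
 * range end exactly at the i_size of their respective files.
 */
static inline enum send_action
choose_action(const struct clone_src *src, uint64_t offset, uint64_t end,
	      uint64_t dst_i_size, uint64_t sectorsize)
{
	uint64_t num_bytes = end - offset;

	if (!src)
		return SEND_WRITE;
	if (is_aligned(end, sectorsize))
		return SEND_CLONE;
	if (end != dst_i_size)
		return SEND_WRITE;
	if (src->offset + num_bytes == src->i_size)
		return SEND_CLONE;
	return SEND_WRITE;
}
```

For the test in the commit message (offset 0, end 1048581, both files 1048581 bytes long) this returns SEND_CLONE, matching the single clone command in the second log. If the source file were longer than the range being cloned, the last comparison would fail and the function would fall back to SEND_WRITE, which corresponds to the -EINVAL case described in the code comment.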
