Commit f044b31

josefbacik authored and kdave committed
btrfs: handle the ro->rw transition for mounting different subvolumes

This is a special case that we've carried around since 0723a04 ("btrfs: allow mounting btrfs subvolumes with different ro/rw options"), where we flip the file system to RW under the covers if you're mixing and matching ro/rw options with different subvol mounts. The first mount determines how the super block gets set up, so we handled this by remounting the super block as rw under the covers to facilitate this behavior.

With the new mount API we can't really allow this, because user space has the ability to specify the super block settings and the mount settings separately. So if the user explicitly sets the super block as read only and then tries to mount a rw mount with that super block, we'll reject this. However, the old API was less descriptive and thus we allowed this kind of behavior. This patch preserves this behavior for the old API calls.

This is inspired by Christian's work [1], and includes his comment in btrfs_get_tree_super() explaining the history and how it all works in the old and new APIs.

Link: https://lore.kernel.org/all/[email protected]/
Reviewed-by: Christian Brauner <[email protected]>
Reviewed-by: Johannes Thumshirn <[email protected]>
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>

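
For illustration, the two old-API mounts that the message above describes can be sketched from user space roughly as follows. This is only a sketch under stated assumptions: the helper names (`build_subvol_mount`, `do_subvol_mount`) and the `struct subvol_mount` type are hypothetical and not part of the patch, and actually calling mount(2) needs root and a btrfs device.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>

/* Hypothetical helper type (not from the patch): arguments for one mount(2) call. */
struct subvol_mount {
	unsigned long flags;	/* MS_RDONLY or 0 */
	char data[128];		/* filesystem options, e.g. "subvol=foo" */
};

/* Build the flags and option string for mounting one subvolume ro or rw. */
struct subvol_mount build_subvol_mount(const char *subvol, int read_only)
{
	struct subvol_mount m;

	assert(subvol != NULL);
	memset(&m, 0, sizeof(m));
	/*
	 * With the old API, "ro" becomes MS_RDONLY. There is no separate
	 * way to say "ro mount" versus "ro super block".
	 */
	m.flags = read_only ? MS_RDONLY : 0;
	snprintf(m.data, sizeof(m.data), "subvol=%s", subvol);
	return m;
}

/* Issue the mount(2) call. Returns 0 on success, -1 with errno set. */
int do_subvol_mount(const char *dev, const char *target,
		    const struct subvol_mount *m)
{
	return mount(dev, target, "btrfs", m->flags, m->data);
}
```

Under these assumptions, mounting `subvol=foo` read-only first and `subvol=bar` read-write second on the same device is the sequence that ends up in the ro->rw transition this patch preserves for old-API callers.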
1 parent 3bb17a2 commit f044b31

1 file changed: +128 -1 lines changed

fs/btrfs/super.c

@@ -2477,13 +2477,15 @@ static int btrfs_reconfigure(struct fs_context *fc)
 	struct btrfs_fs_context *ctx = fc->fs_private;
 	struct btrfs_fs_context old_ctx;
 	int ret = 0;
+	bool mount_reconfigure = (fc->s_fs_info != NULL);
 
 	btrfs_info_to_ctx(fs_info, &old_ctx);
 
 	sync_filesystem(sb);
 	set_bit(BTRFS_FS_STATE_REMOUNTING, &fs_info->fs_state);
 
-	if (!check_options(fs_info, &ctx->mount_opt, fc->sb_flags))
+	if (!mount_reconfigure &&
+	    !check_options(fs_info, &ctx->mount_opt, fc->sb_flags))
 		return -EINVAL;
 
 	ret = btrfs_check_features(fs_info, !(fc->sb_flags & SB_RDONLY));
@@ -2885,6 +2887,129 @@ static int btrfs_get_tree_super(struct fs_context *fc)
 	return ret;
 }
 
+/*
+ * Ever since commit 0723a0473fb4 ("btrfs: allow mounting btrfs subvolumes
+ * with different ro/rw options") the following works:
+ *
+ *        (i) mount /dev/sda3 -o subvol=foo,ro /mnt/foo
+ *       (ii) mount /dev/sda3 -o subvol=bar,rw /mnt/bar
+ *
+ * which looks nice and innocent but is actually pretty intricate and deserves
+ * a long comment.
+ *
+ * On another filesystem a subvolume mount is close to something like:
+ *
+ *	(iii) # create rw superblock + initial mount
+ *	      mount -t xfs /dev/sdb /opt/
+ *
+ *	      # create ro bind mount
+ *	      mount --bind -o ro /opt/foo /mnt/foo
+ *
+ *	      # unmount initial mount
+ *	      umount /opt
+ *
+ * Of course, there's some special subvolume sauce and there's the fact that the
+ * sb->s_root dentry is really swapped after mount_subtree(). But conceptually
+ * it's very close and will help us understand the issue.
+ *
+ * The old mount API didn't cleanly distinguish between a mount being made ro
+ * and a superblock being made ro.  The only way to change the ro state of
+ * either object was by passing MS_RDONLY. If a new mount was created via
+ * mount(2) such as:
+ *
+ *      mount("/dev/sdb", "/mnt", "xfs", MS_RDONLY, NULL);
+ *
+ * the MS_RDONLY flag being specified had two effects:
+ *
+ * (1) MNT_READONLY was raised -> the resulting mount got
+ *     @mnt->mnt_flags |= MNT_READONLY raised.
+ *
+ * (2) MS_RDONLY was passed to the filesystem's mount method and the filesystem
+ *     made the superblock ro. Note how SB_RDONLY has the same value as
+ *     MS_RDONLY and is raised whenever MS_RDONLY is passed through mount(2).
+ *
+ * Creating a subtree mount via (iii) ends up leaving a rw superblock with a
+ * subtree mounted ro.
+ *
+ * But consider the effect of the old mount API on btrfs subvolume mounting
+ * which combines the distinct steps in (iii) into a single step.
+ *
+ * By issuing (i) both the mount and the superblock are turned ro. Now when (ii)
+ * is issued the superblock is ro and thus even if the mount created for (ii) is
+ * rw it wouldn't help. Hence, btrfs needed to transition the superblock from ro
+ * to rw for (ii) which it did using an internal remount call.
+ *
+ * IOW, subvolume mounting was inherently complicated due to the ambiguity of
+ * MS_RDONLY in mount(2). Note, this ambiguity has mount(8) always translate
+ * "ro" to MS_RDONLY. IOW, in both (i) and (ii) "ro" becomes MS_RDONLY when
+ * passed by mount(8) to mount(2).
+ *
+ * Enter the new mount API. The new mount API disambiguates making a mount ro
+ * and making a superblock ro.
+ *
+ * (3) To turn a mount ro the MOUNT_ATTR_RDONLY flag can be used with either
+ *     fsmount() or mount_setattr(). This is a pure VFS level change for a
+ *     specific mount or mount tree that is never seen by the filesystem itself.
+ *
+ * (4) To turn a superblock ro the "ro" flag must be used with
+ *     fsconfig(FSCONFIG_SET_FLAG, "ro"). This option is seen by the filesystem
+ *     in fc->sb_flags.
+ *
+ * This disambiguation has rather positive consequences.  Mounting a subvolume
+ * ro will not also turn the superblock ro. Only the mount for the subvolume
+ * will become ro.
+ *
+ * So, if the superblock creation request comes from the new mount API the
+ * caller must have explicitly done:
+ *
+ *      fsconfig(FSCONFIG_SET_FLAG, "ro")
+ *      fsmount/mount_setattr(MOUNT_ATTR_RDONLY)
+ *
+ * IOW, at some point the caller must have explicitly turned the whole
+ * superblock ro and we shouldn't just undo it like we did for the old mount
+ * API. In any case, it lets us avoid the hack in the new mount API.
+ *
+ * Consequently, the remounting hack must only be used for requests originating
+ * from the old mount API and should be marked for full deprecation so it can be
+ * turned off in a couple of years.
+ *
+ * The new mount API has no reason to support this hack.
+ */
+static struct vfsmount *btrfs_reconfigure_for_mount(struct fs_context *fc)
+{
+	struct vfsmount *mnt;
+	int ret;
+	const bool ro2rw = !(fc->sb_flags & SB_RDONLY);
+
+	/*
+	 * We got an EBUSY because our SB_RDONLY flag didn't match the existing
+	 * super block, so invert our setting here and retry the mount so we
+	 * can get our vfsmount.
+	 */
+	if (ro2rw)
+		fc->sb_flags |= SB_RDONLY;
+	else
+		fc->sb_flags &= ~SB_RDONLY;
+
+	mnt = fc_mount(fc);
+	if (IS_ERR(mnt))
+		return mnt;
+
+	if (!fc->oldapi || !ro2rw)
+		return mnt;
+
+	/* We need to convert to rw, call reconfigure. */
+	fc->sb_flags &= ~SB_RDONLY;
+	down_write(&mnt->mnt_sb->s_umount);
+	ret = btrfs_reconfigure(fc);
+	up_write(&mnt->mnt_sb->s_umount);
+	if (ret) {
+		mntput(mnt);
+		return ERR_PTR(ret);
+	}
+	return mnt;
+}
+
 static int btrfs_get_tree_subvol(struct fs_context *fc)
 {
 	struct btrfs_fs_info *fs_info = NULL;
@@ -2934,6 +3059,8 @@ static int btrfs_get_tree_subvol(struct fs_context *fc)
 	fc->security = NULL;
 
 	mnt = fc_mount(dup_fc);
+	if (PTR_ERR_OR_ZERO(mnt) == -EBUSY)
+		mnt = btrfs_reconfigure_for_mount(dup_fc);
 	put_fs_context(dup_fc);
 	if (IS_ERR(mnt))
 		return PTR_ERR(mnt);
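
The branch structure of btrfs_reconfigure_for_mount() in the diff above can be summarized with a small user-space model. This is a sketch for reading the patch, not kernel code: `MODEL_SB_RDONLY`, `struct retry_plan`, `plan_retry()`, and `check_plan_examples()` are hypothetical names introduced here, and the model only tracks the SB_RDONLY bit and the `oldapi` flag, assuming the first fc_mount() already failed with -EBUSY.

```c
#include <assert.h>
#include <stdbool.h>

/* Model constant for this sketch; the kernel's SB_RDONLY is also bit 0. */
#define MODEL_SB_RDONLY 0x1u

/* Hypothetical summary of what the retry path decides. */
struct retry_plan {
	unsigned int retry_sb_flags;	/* flags used for the second fc_mount() */
	bool reconfigure_to_rw;		/* is the super block then remounted rw? */
};

struct retry_plan plan_retry(unsigned int sb_flags, bool oldapi)
{
	struct retry_plan plan;
	/* The caller asked for rw, so the existing super block must be ro. */
	const bool ro2rw = !(sb_flags & MODEL_SB_RDONLY);

	/* Invert the ro bit so the retry matches the existing super block. */
	if (ro2rw)
		plan.retry_sb_flags = sb_flags | MODEL_SB_RDONLY;
	else
		plan.retry_sb_flags = sb_flags & ~MODEL_SB_RDONLY;

	/* Only old-API callers that asked for rw get the internal remount. */
	plan.reconfigure_to_rw = oldapi && ro2rw;
	return plan;
}

/* Usage example: the three cases the function distinguishes. */
void check_plan_examples(void)
{
	/* Old API, rw requested on a ro super block: retry ro, then flip to rw. */
	assert(plan_retry(0, true).reconfigure_to_rw);
	/* New API, same request: retry ro, leave the super block alone. */
	assert(!plan_retry(0, false).reconfigure_to_rw);
	/* ro requested on a rw super block: retry rw, nothing to convert. */
	assert(!plan_retry(MODEL_SB_RDONLY, true).reconfigure_to_rw);
}
```

Modeling the decision as a pure function keeps the two inputs that matter visible: which direction the ro/rw mismatch goes, and whether the request came through the old mount(2) path.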
