
Commit 4802310

Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull overlayfs updates from Miklos Szeredi:
 "In addition to bug fixes and cleanups there are two new features from
  Amir:

   - Consistent inode number support for the case when layers are not
     all on the same filesystem (feature is dubbed "xino").

   - Optimize overlayfs file handle decoding. This one touches the
     exportfs interface to allow detecting the disconnected directory
     case"

* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: update documentation w.r.t "xino" feature
  ovl: add support for "xino" mount and config options
  ovl: consistent d_ino for non-samefs with xino
  ovl: consistent i_ino for non-samefs with xino
  ovl: constant st_ino for non-samefs with xino
  ovl: allocate anon bdev per unique lower fs
  ovl: factor out ovl_map_dev_ino() helper
  ovl: cleanup ovl_update_time()
  ovl: add WARN_ON() for non-dir redirect cases
  ovl: cleanup setting OVL_INDEX
  ovl: set d->is_dir and d->opaque for last path element
  ovl: Do not check for redirect if this is last layer
  ovl: lookup in inode cache first when decoding lower file handle
  ovl: do not try to reconnect a disconnected origin dentry
  ovl: disambiguate ovl_encode_fh()
  ovl: set lower layer st_dev only if setting lower st_ino
  ovl: fix lookup with middle layer opaque dir and absolute path redirects
  ovl: Set d->last properly during lookup
  ovl: set i_ino to the value of st_ino for NFS export
2 parents ba2b137 + 1614901 commit 4802310

12 files changed, 510 additions and 172 deletions
Documentation/filesystems/overlayfs.txt

Lines changed: 33 additions & 6 deletions
@@ -14,9 +14,13 @@ The result will inevitably fail to look exactly like a normal
 filesystem for various technical reasons. The expectation is that
 many use cases will be able to ignore these differences.

-This approach is 'hybrid' because the objects that appear in the
-filesystem do not all appear to belong to that filesystem. In many
-cases an object accessed in the union will be indistinguishable
+
+Overlay objects
+---------------
+
+The overlay filesystem approach is 'hybrid', because the objects that
+appear in the filesystem do not always appear to belong to that filesystem.
+In many cases, an object accessed in the union will be indistinguishable
 from accessing the corresponding object from the original filesystem.
 This is most obvious from the 'st_dev' field returned by stat(2).

@@ -34,6 +38,19 @@ make the overlay mount more compliant with filesystem scanners and
 overlay objects will be distinguishable from the corresponding
 objects in the original filesystem.

+On 64bit systems, even if all overlay layers are not on the same
+underlying filesystem, the same compliant behavior could be achieved
+with the "xino" feature. The "xino" feature composes a unique object
+identifier from the real object st_ino and an underlying fsid index.
+If all underlying filesystems support NFS file handles and export file
+handles with 32bit inode number encoding (e.g. ext4), overlay filesystem
+will use the high inode number bits for fsid. Even when the underlying
+filesystem uses 64bit inode numbers, users can still enable the "xino"
+feature with the "-o xino=on" overlay mount option. That is useful for the
+case of underlying filesystems like xfs and tmpfs, which use 64bit inode
+numbers, but are very unlikely to use the high inode number bit.
+
+
 Upper and Lower
 ---------------
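
The "xino" paragraph added in the hunk above describes composing one object identifier from the real st_ino and an fsid index stored in the high inode number bits. A minimal userspace sketch of that idea follows. The helper name xino_compose, its parameters, and the exact bit layout are assumptions made for illustration; this is not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (illustration only, not kernel code): pack an fsid
 * index into the top `xinobits` bits of a 64-bit inode number.
 *
 * Returns 0 and stores the composed value in *out on success.
 * Returns -1 if the real inode number already uses the high bits, if the
 * fsid does not fit into `xinobits` bits, or if `xinobits` is out of range.
 */
int xino_compose(uint64_t real_ino, unsigned int fsid,
		 unsigned int xinobits, uint64_t *out)
{
	unsigned int shift;

	assert(out != NULL);

	if (xinobits == 0 || xinobits >= 64)
		return -1;

	shift = 64 - xinobits;

	if (real_ino >> shift)			/* high bits already in use */
		return -1;
	if ((uint64_t)fsid >> xinobits)		/* fsid needs more bits */
		return -1;

	*out = real_ino | ((uint64_t)fsid << shift);
	return 0;
}
```

With 32 bits reserved for the fsid, an inode number such as 42 on layer 1 maps to (1 << 32) | 42, while an inode number that already occupies the reserved bits is rejected. A single reserved bit still fits the xfs and tmpfs case mentioned in the text, since two layers can be told apart by the top bit alone.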

@@ -290,10 +307,19 @@ Non-standard behavior
 ---------------------

 The copy_up operation essentially creates a new, identical file and
-moves it over to the old name. The new file may be on a different
-filesystem, so both st_dev and st_ino of the file may change.
+moves it over to the old name. Any open files referring to this inode
+will access the old data.
+
+The new file may be on a different filesystem, so both st_dev and st_ino
+of the real file may change. The values of st_dev and st_ino returned by
+stat(2) on an overlay object are often not the same as the real file
+stat(2) values to prevent the values from changing on copy_up.

-Any open files referring to this inode will access the old data.
+Unless "xino" feature is enabled, when overlay layers are not all on the
+same underlying filesystem, the value of st_dev may be different for two
+non-directory objects in the same overlay filesystem and the value of
+st_ino for directory objects may be non persistent and could change even
+while the overlay filesystem is still mounted.

 Unless "inode index" feature is enabled, if a file with multiple hard
 links is copied up, then this will "break" the link. Changes will not be
@@ -302,6 +328,7 @@ propagated to other names referring to the same inode.
 Unless "redirect_dir" feature is enabled, rename(2) on a lower or merged
 directory will fail with EXDEV.

+
 Changes to underlying filesystems
 ---------------------------------
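
The "Non-standard behavior" text above matters to programs that identify a file by its (st_dev, st_ino) pair as returned by stat(2). The sketch below shows that common check in userspace C. The helper name same_object is hypothetical and the code makes no overlayfs-specific calls; it only illustrates the comparison whose result the documentation says may be unreliable when layers are on different filesystems and "xino" is not enabled.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: compare two paths by the (st_dev, st_ino) pair.
 * Returns 1 if both paths resolve to the same object, 0 if they differ,
 * and -1 if either stat(2) call fails.
 */
int same_object(const char *a, const char *b)
{
	struct stat sa, sb;

	assert(a != NULL && b != NULL);

	if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
		return -1;

	return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

A scanner that caches the pair and compares it again later relies on both values staying stable for the lifetime of the mount, which is the property the documentation describes as not guaranteed in the non-samefs case without "xino".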

fs/exportfs/expfs.c

Lines changed: 9 additions & 0 deletions
@@ -435,6 +435,15 @@ struct dentry *exportfs_decode_fh(struct vfsmount *mnt, struct fid *fid,
 	if (IS_ERR_OR_NULL(result))
 		return ERR_PTR(-ESTALE);

+	/*
+	 * If no acceptance criteria was specified by caller, a disconnected
+	 * dentry is also acceptable. Callers may use this mode to query if
+	 * file handle is stale or to get a reference to an inode without
+	 * risking the high overhead caused by directory reconnect.
+	 */
+	if (!acceptable)
+		return result;
+
 	if (d_is_dir(result)) {
 		/*
 		 * This request is for a directory.
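
The hunk above lets a caller pass no acceptance callback and get the decoded object back as-is, possibly disconnected, without paying for a directory reconnect. The following toy model in plain C illustrates that control flow only. Every name in it (struct obj, decode, reconnect, accept_all, reconnect_calls) is a hypothetical stand-in, and the real exportfs_decode_fh() does considerably more work than this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a decoded object; not a kernel type. */
struct obj {
	int is_dir;
	int connected;
};

typedef int (*accept_fn)(void *context, struct obj *o);

/* Counts how often the (pretend) expensive reconnect step runs. */
int reconnect_calls;

static void reconnect(struct obj *o)
{
	reconnect_calls++;
	o->connected = 1;
}

/* Example acceptance callback that accepts every object. */
int accept_all(void *context, struct obj *o)
{
	(void)context;
	(void)o;
	return 1;
}

/*
 * Toy decode step: with no callback, return whatever was found, even if
 * it is a disconnected directory. With a callback, reconnect directories
 * first and then ask the callback whether the object is acceptable.
 */
struct obj *decode(struct obj *found, accept_fn acceptable, void *context)
{
	if (found == NULL)
		return NULL;		/* models a stale handle */

	if (acceptable == NULL)
		return found;		/* early return, no reconnect */

	if (found->is_dir && !found->connected)
		reconnect(found);

	return acceptable(context, found) ? found : NULL;
}
```

Calling decode() with a NULL callback on a disconnected directory returns it untouched, while the same call with accept_all triggers exactly one reconnect. That difference is the overhead the new comment says callers can now avoid.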

fs/overlayfs/Kconfig

Lines changed: 17 additions & 0 deletions
@@ -86,3 +86,20 @@ config OVERLAY_FS_NFS_EXPORT
 	  case basis with the "nfs_export=on" mount option.

 	  Say N unless you fully understand the consequences.
+
+config OVERLAY_FS_XINO_AUTO
+	bool "Overlayfs: auto enable inode number mapping"
+	default n
+	depends on OVERLAY_FS
+	help
+	  If this config option is enabled then overlay filesystems will use
+	  unused high bits in underlying filesystem inode numbers to map all
+	  inodes to a unified address space. The mapped 64bit inode numbers
+	  might not be compatible with applications that expect 32bit inodes.
+
+	  If compatibility with applications that expect 32bit inodes is not an
+	  issue, then it is safe and recommended to say Y here.
+
+	  For more information, see Documentation/filesystems/overlayfs.txt
+
+	  If unsure, say N.

fs/overlayfs/copy_up.c

Lines changed: 3 additions & 3 deletions
@@ -232,7 +232,7 @@ int ovl_set_attr(struct dentry *upperdentry, struct kstat *stat)
 	return err;
 }

-struct ovl_fh *ovl_encode_fh(struct dentry *real, bool is_upper)
+struct ovl_fh *ovl_encode_real_fh(struct dentry *real, bool is_upper)
 {
 	struct ovl_fh *fh;
 	int fh_type, fh_len, dwords;
@@ -300,7 +300,7 @@ int ovl_set_origin(struct dentry *dentry, struct dentry *lower,
 	 * up and a pure upper inode.
 	 */
 	if (ovl_can_decode_fh(lower->d_sb)) {
-		fh = ovl_encode_fh(lower, false);
+		fh = ovl_encode_real_fh(lower, false);
 		if (IS_ERR(fh))
 			return PTR_ERR(fh);
 	}
@@ -321,7 +321,7 @@ static int ovl_set_upper_fh(struct dentry *upper, struct dentry *index)
 	const struct ovl_fh *fh;
 	int err;

-	fh = ovl_encode_fh(upper, true);
+	fh = ovl_encode_real_fh(upper, true);
 	if (IS_ERR(fh))
 		return PTR_ERR(fh);

fs/overlayfs/export.c

Lines changed: 40 additions & 35 deletions
@@ -228,8 +228,8 @@ static int ovl_d_to_fh(struct dentry *dentry, char *buf, int buflen)
 		goto fail;

 	/* Encode an upper or lower file handle */
-	fh = ovl_encode_fh(enc_lower ? ovl_dentry_lower(dentry) :
-			   ovl_dentry_upper(dentry), !enc_lower);
+	fh = ovl_encode_real_fh(enc_lower ? ovl_dentry_lower(dentry) :
+				ovl_dentry_upper(dentry), !enc_lower);
 	err = PTR_ERR(fh);
 	if (IS_ERR(fh))
 		goto fail;
@@ -267,8 +267,8 @@ static int ovl_dentry_to_fh(struct dentry *dentry, u32 *fid, int *max_len)
 	return OVL_FILEID;
 }

-static int ovl_encode_inode_fh(struct inode *inode, u32 *fid, int *max_len,
-			       struct inode *parent)
+static int ovl_encode_fh(struct inode *inode, u32 *fid, int *max_len,
+			 struct inode *parent)
 {
 	struct dentry *dentry;
 	int type;
@@ -305,15 +305,12 @@ static struct dentry *ovl_obtain_alias(struct super_block *sb,
 	if (d_is_dir(upper ?: lower))
 		return ERR_PTR(-EIO);

-	inode = ovl_get_inode(sb, dget(upper), lower, index, !!lower);
+	inode = ovl_get_inode(sb, dget(upper), lowerpath, index, !!lower);
 	if (IS_ERR(inode)) {
 		dput(upper);
 		return ERR_CAST(inode);
 	}

-	if (index)
-		ovl_set_flag(OVL_INDEX, inode);
-
 	dentry = d_find_any_alias(inode);
 	if (!dentry) {
 		dentry = d_alloc_anon(inode->i_sb);
@@ -685,7 +682,7 @@ static struct dentry *ovl_upper_fh_to_d(struct super_block *sb,
 	if (!ofs->upper_mnt)
 		return ERR_PTR(-EACCES);

-	upper = ovl_decode_fh(fh, ofs->upper_mnt);
+	upper = ovl_decode_real_fh(fh, ofs->upper_mnt, true);
 	if (IS_ERR_OR_NULL(upper))
 		return upper;

@@ -703,25 +700,39 @@ static struct dentry *ovl_lower_fh_to_d(struct super_block *sb,
 	struct ovl_path *stack = &origin;
 	struct dentry *dentry = NULL;
 	struct dentry *index = NULL;
-	struct inode *inode = NULL;
-	bool is_deleted = false;
+	struct inode *inode;
 	int err;

-	/* First lookup indexed upper by fh */
+	/* First lookup overlay inode in inode cache by origin fh */
+	err = ovl_check_origin_fh(ofs, fh, false, NULL, &stack);
+	if (err)
+		return ERR_PTR(err);
+
+	if (!d_is_dir(origin.dentry) ||
+	    !(origin.dentry->d_flags & DCACHE_DISCONNECTED)) {
+		inode = ovl_lookup_inode(sb, origin.dentry, false);
+		err = PTR_ERR(inode);
+		if (IS_ERR(inode))
+			goto out_err;
+		if (inode) {
+			dentry = d_find_any_alias(inode);
+			iput(inode);
+			if (dentry)
+				goto out;
+		}
+	}
+
+	/* Then lookup indexed upper/whiteout by origin fh */
 	if (ofs->indexdir) {
 		index = ovl_get_index_fh(ofs, fh);
 		err = PTR_ERR(index);
 		if (IS_ERR(index)) {
-			if (err != -ESTALE)
-				return ERR_PTR(err);
-
-			/* Found a whiteout index - treat as deleted inode */
-			is_deleted = true;
 			index = NULL;
+			goto out_err;
 		}
 	}

-	/* Then try to get upper dir by index */
+	/* Then try to get a connected upper dir by index */
 	if (index && d_is_dir(index)) {
 		struct dentry *upper = ovl_index_upper(ofs, index);

@@ -734,32 +745,26 @@ static struct dentry *ovl_lower_fh_to_d(struct super_block *sb,
 		goto out;
 	}

-	/* Then lookup origin by fh */
-	err = ovl_check_origin_fh(ofs, fh, NULL, &stack);
-	if (err) {
-		goto out_err;
-	} else if (index) {
-		err = ovl_verify_origin(index, origin.dentry, false);
+	/* Otherwise, get a connected non-upper dir or disconnected non-dir */
+	if (d_is_dir(origin.dentry) &&
+	    (origin.dentry->d_flags & DCACHE_DISCONNECTED)) {
+		dput(origin.dentry);
+		origin.dentry = NULL;
+		err = ovl_check_origin_fh(ofs, fh, true, NULL, &stack);
 		if (err)
 			goto out_err;
-	} else if (is_deleted) {
-		/* Lookup deleted non-dir by origin inode */
-		if (!d_is_dir(origin.dentry))
-			inode = ovl_lookup_inode(sb, origin.dentry, false);
-		err = -ESTALE;
-		if (!inode || atomic_read(&inode->i_count) == 1)
+	}
+	if (index) {
+		err = ovl_verify_origin(index, origin.dentry, false);
+		if (err)
 			goto out_err;
-
-		/* Deleted but still open? */
-		index = dget(ovl_i_dentry_upper(inode));
 	}

 	dentry = ovl_get_dentry(sb, NULL, &origin, index);

 out:
 	dput(origin.dentry);
 	dput(index);
-	iput(inode);
 	return dentry;

 out_err:
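
The two new conditions in ovl_lower_fh_to_d() split decoded origins into two groups: those that can be looked up in the inode cache right away (non-directories, or directories that are already connected) and those that need a second, connecting decode (disconnected directories). The sketch below restates just those two predicates in standalone C to show that they are complementary. The function names and the boolean parameters are hypothetical; the kernel code tests d_is_dir() and DCACHE_DISCONNECTED on the origin dentry instead.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors: !d_is_dir(origin) || !(d_flags & DCACHE_DISCONNECTED) */
bool try_inode_cache(bool is_dir, bool disconnected)
{
	return !is_dir || !disconnected;
}

/* Mirrors: d_is_dir(origin) && (d_flags & DCACHE_DISCONNECTED) */
bool needs_connected_decode(bool is_dir, bool disconnected)
{
	return is_dir && disconnected;
}

/* Sanity check: for every combination exactly one of the paths applies. */
void check_paths_are_exclusive(void)
{
	for (int d = 0; d <= 1; d++)
		for (int c = 0; c <= 1; c++)
			assert(try_inode_cache(d, c) !=
			       needs_connected_decode(d, c));
}
```

Because the second predicate is the logical negation of the first, an origin that skipped the cache lookup is always the one that gets decoded again with the connect flag set to true.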
@@ -829,7 +834,7 @@ static struct dentry *ovl_get_parent(struct dentry *dentry)
 }

 const struct export_operations ovl_export_operations = {
-	.encode_fh	= ovl_encode_inode_fh,
+	.encode_fh	= ovl_encode_fh,
 	.fh_to_dentry	= ovl_fh_to_dentry,
 	.fh_to_parent	= ovl_fh_to_parent,
 	.get_name	= ovl_get_name,
