Commit 82386b4

Merge branch 'en/present-despite-skipped'
In sparse-checkouts, files mis-marked as missing from the working tree
could lead to later problems. Such files were hard to discover, and
harder to correct. Automatically detecting and correcting the marking
of such files has been added to avoid these problems.

* en/present-despite-skipped:
  repo_read_index: add config to expect files outside sparse patterns
  Accelerate clear_skip_worktree_from_present_files() by caching
  Update documentation related to sparsity and the skip-worktree bit
  repo_read_index: clear SKIP_WORKTREE bit from files present in worktree
  unpack-trees: fix accidental loss of user changes
  t1011: add testcase demonstrating accidental loss of user modifications
2 parents c216290 + ecc7c88 commit 82386b4

19 files changed, with 311 additions and 128 deletions

Documentation/config.txt

Lines changed: 2 additions & 0 deletions
@@ -503,6 +503,8 @@ include::config/sequencer.txt[]
 
 include::config/showbranch.txt[]
 
+include::config/sparse.txt[]
+
 include::config/splitindex.txt[]
 
 include::config/ssh.txt[]

Documentation/config/sparse.txt

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+sparse.expectFilesOutsideOfPatterns::
+Typically with sparse checkouts, files not matching any
+sparsity patterns are marked with a SKIP_WORKTREE bit in the
+index and are missing from the working tree. Accordingly, Git
+will ordinarily check whether files with the SKIP_WORKTREE bit
+are in fact present in the working tree contrary to
+expectations. If Git finds any, it marks those paths as
+present by clearing the relevant SKIP_WORKTREE bits. This
+option can be used to tell Git that such
+present-despite-skipped files are expected and to stop
+checking for them.
++
+The default is `false`, which allows Git to automatically recover
+from the list of files in the index and working tree falling out of
+sync.
++
+Set this to `true` if you are in a setup where some external factor
+relieves Git of the responsibility for maintaining the consistency
+between the presence of working tree files and sparsity patterns. For
+example, if you have a Git-aware virtual file system that has a robust
+mechanism for keeping the working tree and the sparsity patterns up to
+date based on access patterns.
++
+Regardless of this setting, Git does not check for
+present-despite-skipped files unless sparse checkout is enabled, so
+this config option has no effect unless `core.sparseCheckout` is
+`true`.
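
The conditions described in this new documentation reduce to a two-input predicate. The following is a minimal sketch, not Git's code: `struct sparse_settings` and `should_check_present_files` are hypothetical names introduced here, standing in for the `core.sparseCheckout` and `sparse.expectFilesOutsideOfPatterns` settings.

```c
#include <stdbool.h>

/* Hypothetical container for the two settings discussed above. */
struct sparse_settings {
	bool sparse_checkout;                  /* core.sparseCheckout */
	bool expect_files_outside_of_patterns; /* sparse.expectFilesOutsideOfPatterns */
};

/*
 * Returns true when the present-despite-skipped check would run:
 * sparse checkout must be enabled, and the user must not have
 * declared such files to be expected.
 */
static bool should_check_present_files(const struct sparse_settings *s)
{
	return s->sparse_checkout && !s->expect_files_outside_of_patterns;
}
```

With sparse checkout disabled the predicate is false whatever the second setting says, which matches the final paragraph of the documentation.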

Documentation/git-read-tree.txt

Lines changed: 9 additions & 3 deletions
@@ -375,17 +375,23 @@ have finished your work-in-progress), attempt the merge again.
 SPARSE CHECKOUT
 ---------------
 
+Note: The `update-index` and `read-tree` primitives for supporting the
+skip-worktree bit predated the introduction of
+linkgit:git-sparse-checkout[1]. Users are encouraged to use
+`sparse-checkout` in preference to these low-level primitives.
+
 "Sparse checkout" allows populating the working directory sparsely.
-It uses the skip-worktree bit (see linkgit:git-update-index[1]) to tell
-Git whether a file in the working directory is worth looking at.
+It uses the skip-worktree bit (see linkgit:git-update-index[1]) to
+tell Git whether a file in the working directory is worth looking at.
 
 'git read-tree' and other merge-based commands ('git merge', 'git
 checkout'...) can help maintaining the skip-worktree bitmap and working
 directory update. `$GIT_DIR/info/sparse-checkout` is used to
 define the skip-worktree reference bitmap. When 'git read-tree' needs
 to update the working directory, it resets the skip-worktree bit in the index
 based on this file, which uses the same syntax as .gitignore files.
-If an entry matches a pattern in this file, skip-worktree will not be
+If an entry matches a pattern in this file, or the entry corresponds to
+a file present in the working tree, then skip-worktree will not be
 set on that entry. Otherwise, skip-worktree will be set.
 
 Then it compares the new skip-worktree value with the previous one. If

Documentation/git-sparse-checkout.txt

Lines changed: 46 additions & 30 deletions
@@ -3,9 +3,7 @@ git-sparse-checkout(1)
 
 NAME
 ----
-git-sparse-checkout - Initialize and modify the sparse-checkout
-configuration, which reduces the checkout to a set of paths
-given by a list of patterns.
+git-sparse-checkout - Reduce your working tree to a subset of tracked files
 
 
 SYNOPSIS
SYNOPSIS
@@ -17,8 +15,20 @@ SYNOPSIS
 DESCRIPTION
 -----------
 
-Initialize and modify the sparse-checkout configuration, which reduces
-the checkout to a set of paths given by a list of patterns.
+This command is used to create sparse checkouts, which means that it
+changes the working tree from having all tracked files present, to only
+have a subset of them. It can also switch which subset of files are
+present, or undo and go back to having all tracked files present in the
+working copy.
+
+The subset of files is chosen by providing a list of directories in
+cone mode (which is recommended), or by providing a list of patterns
+in non-cone mode.
+
+When in a sparse-checkout, other Git commands behave a bit differently.
+For example, switching branches will not update paths outside the
+sparse-checkout directories/patterns, and `git commit -a` will not record
+paths outside the sparse-checkout directories/patterns as deleted.
 
 THIS COMMAND IS EXPERIMENTAL. ITS BEHAVIOR, AND THE BEHAVIOR OF OTHER
 COMMANDS IN THE PRESENCE OF SPARSE-CHECKOUTS, WILL LIKELY CHANGE IN
@@ -28,7 +38,7 @@ THE FUTURE.
 COMMANDS
 --------
 'list'::
-Describe the patterns in the sparse-checkout file.
+Describe the directories or patterns in the sparse-checkout file.
 
 'set'::
 Enable the necessary sparse-checkout config settings
@@ -46,20 +56,26 @@ the 'set' subcommand are stored in the worktree-specific sparse-checkout
 file. See linkgit:git-worktree[1] and the documentation of
 `extensions.worktreeConfig` in linkgit:git-config[1] for more details.
 +
-When the `--stdin` option is provided, the patterns are read from
-standard in as a newline-delimited list instead of from the arguments.
+When the `--stdin` option is provided, the directories or patterns are
+read from standard in as a newline-delimited list instead of from the
+arguments.
 +
 When `--cone` is passed or `core.sparseCheckoutCone` is enabled, the
-input list is considered a list of directories instead of
-sparse-checkout patterns. This allows for better performance with a
-limited set of patterns (see 'CONE PATTERN SET' below). Note that the
-set command will write patterns to the sparse-checkout file to include
-all files contained in those directories (recursively) as well as
-files that are siblings of ancestor directories. The input format
-matches the output of `git ls-tree --name-only`. This includes
-interpreting pathnames that begin with a double quote (") as C-style
-quoted strings. This may become the default in the future; --no-cone
-can be passed to request non-cone mode.
+input list is considered a list of directories. This allows for
+better performance with a limited set of patterns (see 'CONE PATTERN
+SET' below). The input format matches the output of `git ls-tree
+--name-only`. This includes interpreting pathnames that begin with a
+double quote (") as C-style quoted strings. Note that the set command
+will write patterns to the sparse-checkout file to include all files
+contained in those directories (recursively) as well as files that are
+siblings of ancestor directories. This may become the default in the
+future; --no-cone can be passed to request non-cone mode.
++
+When `--no-cone` is passed or `core.sparseCheckoutCone` is not enabled,
+the input list is considered a list of patterns. This mode is harder
+to use and less performant, and is thus not recommended. See the
+"Sparse Checkout" section of linkgit:git-read-tree[1] and the "Pattern
+Set" sections below for more details.
 +
 Use the `--[no-]sparse-index` option to use a sparse index (the
 default is to not use it). A sparse index reduces the size of the
@@ -77,11 +93,10 @@ understand the sparse directory entries index extension and may fail to
 interact with your repository until it is disabled.
 
 'add'::
-Update the sparse-checkout file to include additional patterns.
-By default, these patterns are read from the command-line arguments,
-but they can be read from stdin using the `--stdin` option. When
-`core.sparseCheckoutCone` is enabled, the given patterns are interpreted
-as directory names as in the 'set' subcommand.
+Update the sparse-checkout file to include additional directories
+(in cone mode) or patterns (in non-cone mode). By default, these
+directories or patterns are read from the command-line arguments,
+but they can be read from stdin using the `--stdin` option.
 
 'reapply'::
 Reapply the sparsity pattern rules to paths in the working tree.
Reapply the sparsity pattern rules to paths in the working tree.
@@ -125,13 +140,14 @@ decreased in utility.
 SPARSE CHECKOUT
 ---------------
 
-"Sparse checkout" allows populating the working directory sparsely.
-It uses the skip-worktree bit (see linkgit:git-update-index[1]) to tell
-Git whether a file in the working directory is worth looking at. If
-the skip-worktree bit is set, then the file is ignored in the working
-directory. Git will avoid populating the contents of those files, which
-makes a sparse checkout helpful when working in a repository with many
-files, but only a few are important to the current user.
+"Sparse checkout" allows populating the working directory sparsely. It
+uses the skip-worktree bit (see linkgit:git-update-index[1]) to tell Git
+whether a file in the working directory is worth looking at. If the
+skip-worktree bit is set, and the file is not present in the working tree,
+then its absence is ignored. Git will avoid populating the contents of
+those files, which makes a sparse checkout helpful when working in a
+repository with many files, but only a few are important to the current
+user.
 
 The `$GIT_DIR/info/sparse-checkout` file is used to define the
 skip-worktree reference bitmap. When Git updates the working

Documentation/git-update-index.txt

Lines changed: 43 additions & 14 deletions
@@ -351,6 +351,10 @@ unchanged". Note that "assume unchanged" bit is *not* set if
 the index (use `git update-index --really-refresh` if you want
 to mark them as "assume unchanged").
 
+Sometimes users confuse the assume-unchanged bit with the
+skip-worktree bit. See the final paragraph in the "Skip-worktree bit"
+section below for an explanation of the differences.
+
 
 EXAMPLES
 --------
@@ -392,22 +396,47 @@ M foo.c
 SKIP-WORKTREE BIT
 -----------------
 
-Skip-worktree bit can be defined in one (long) sentence: When reading
-an entry, if it is marked as skip-worktree, then Git pretends its
-working directory version is up to date and read the index version
-instead.
-
-To elaborate, "reading" means checking for file existence, reading
-file attributes or file content. The working directory version may be
-present or absent. If present, its content may match against the index
-version or not. Writing is not affected by this bit, content safety
-is still first priority. Note that Git _can_ update working directory
-file, that is marked skip-worktree, if it is safe to do so (i.e.
-working directory version matches index version)
+Skip-worktree bit can be defined in one (long) sentence: Tell git to
+avoid writing the file to the working directory when reasonably
+possible, and treat the file as unchanged when it is not
+present in the working directory.
+
+Note that not all git commands will pay attention to this bit, and
+some only partially support it.
+
+The update-index flags and the read-tree capabilities relating to the
+skip-worktree bit predated the introduction of the
+linkgit:git-sparse-checkout[1] command, which provides a much easier
+way to configure and handle the skip-worktree bits. If you want to
+reduce your working tree to only deal with a subset of the files in
+the repository, we strongly encourage the use of
+linkgit:git-sparse-checkout[1] in preference to the low-level
+update-index and read-tree primitives.
+
+The primary purpose of the skip-worktree bit is to enable sparse
+checkouts, i.e. to have working directories with only a subset of
+paths present. When the skip-worktree bit is set, Git commands (such
+as `switch`, `pull`, `merge`) will avoid writing these files.
+However, these commands will sometimes write these files anyway in
+important cases such as conflicts during a merge or rebase. Git
+commands will also avoid treating the lack of such files as an
+intentional deletion; for example `git add -u` will not stage a
+deletion for these files and `git commit -a` will not make a commit
+deleting them either.
 
 Although this bit looks similar to assume-unchanged bit, its goal is
-different from assume-unchanged bit's. Skip-worktree also takes
-precedence over assume-unchanged bit when both are set.
+different. The assume-unchanged bit is for leaving the file in the
+working tree but having Git omit checking it for changes and presuming
+that the file has not been changed (though if it can determine without
+stat'ing the file that it has changed, it is free to record the
+changes). skip-worktree tells Git to ignore the absence of the file,
+avoid updating it when possible with commands that normally update
+much of the working directory (e.g. `checkout`, `switch`, `pull`,
+etc.), and not have its absence be recorded in commits. Note that in
+sparse checkouts (set up by `git sparse-checkout` or by configuring
+core.sparseCheckout to true), if a file is marked as skip-worktree in
+the index but is found in the working tree, Git will clear the
+skip-worktree bit for that file.
 
 SPLIT INDEX
 -----------

cache.h

Lines changed: 1 addition & 0 deletions
@@ -1003,6 +1003,7 @@ extern const char *core_fsmonitor;
 
 extern int core_apply_sparse_checkout;
 extern int core_sparse_checkout_cone;
+extern int sparse_expect_files_outside_of_patterns;
 
 /*
  * Returns the boolean value of $GIT_OPTIONAL_LOCKS (or the default value).

config.c

Lines changed: 14 additions & 0 deletions
@@ -1654,6 +1654,17 @@ static int git_default_core_config(const char *var, const char *value, void *cb)
 	return platform_core_config(var, value, cb);
 }
 
+static int git_default_sparse_config(const char *var, const char *value)
+{
+	if (!strcmp(var, "sparse.expectfilesoutsideofpatterns")) {
+		sparse_expect_files_outside_of_patterns = git_config_bool(var, value);
+		return 0;
+	}
+
+	/* Add other config variables here and to Documentation/config/sparse.txt. */
+	return 0;
+}
+
 static int git_default_i18n_config(const char *var, const char *value)
 {
 	if (!strcmp(var, "i18n.commitencoding"))
@@ -1785,6 +1796,9 @@ int git_default_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (starts_with(var, "sparse."))
+		return git_default_sparse_config(var, value);
+
 	/* Add other config variables here and to Documentation/config.txt. */
 	return 0;
 }
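
The dispatch pattern in this file (route every `sparse.*` key to a dedicated handler, then match the exact key) can be tried in isolation. The sketch below is a stand-in under stated assumptions: `handle_config`, `handle_sparse_config`, and `parse_bool` are hypothetical names, and `parse_bool` is a deliberately simplified boolean parser, not Git's `git_config_bool()`. The key is compared in lowercase, as in the diff.

```c
#include <string.h>

/* Stand-in for the global that the real handler assigns. */
static int expect_files_outside_of_patterns;

/*
 * Hypothetical, simplified boolean parser. It accepts only a few
 * lowercase spellings; Git's own parser accepts more forms.
 */
static int parse_bool(const char *value)
{
	if (!value)
		return 1; /* assumption: a key given without a value means true */
	return !strcmp(value, "true") || !strcmp(value, "yes") ||
	       !strcmp(value, "on") || !strcmp(value, "1");
}

/* Handles keys in the "sparse." section; unknown keys are ignored. */
static int handle_sparse_config(const char *var, const char *value)
{
	if (!strcmp(var, "sparse.expectfilesoutsideofpatterns"))
		expect_files_outside_of_patterns = parse_bool(value);
	return 0;
}

/* Top-level callback: dispatch on the section prefix. */
static int handle_config(const char *var, const char *value)
{
	if (!strncmp(var, "sparse.", strlen("sparse.")))
		return handle_sparse_config(var, value);
	return 0; /* other sections are not handled in this sketch */
}
```

Checking the prefix first keeps the top-level callback short and gives future `sparse.*` settings an obvious home, which is what the comment inside the new handler invites.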

environment.c

Lines changed: 1 addition & 0 deletions
@@ -70,6 +70,7 @@ char *notes_ref_name;
 int grafts_replace_parents = 1;
 int core_apply_sparse_checkout;
 int core_sparse_checkout_cone;
+int sparse_expect_files_outside_of_patterns;
 int merge_log_config = -1;
 int precomposed_unicode = -1; /* see probe_utf8_pathname_composition() */
 unsigned long pack_size_limit_cfg;

repository.c

Lines changed: 7 additions & 0 deletions
@@ -301,6 +301,13 @@ int repo_read_index(struct repository *repo)
 	if (repo->settings.command_requires_full_index)
 		ensure_full_index(repo->index);
 
+	/*
+	 * If sparse checkouts are in use, check whether paths with the
+	 * SKIP_WORKTREE attribute are missing from the worktree; if not,
+	 * clear that attribute for that path.
+	 */
+	clear_skip_worktree_from_present_files(repo->index);
+
 	return res;
 }
 
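
The check added to `repo_read_index()` amounts to a loop over index entries that clears a flag whenever the path exists on disk. A minimal sketch of that idea follows. It is not Git's implementation: `struct entry`, `ENTRY_SKIP_WORKTREE` (including its bit value), and `clear_skip_from_present` are hypothetical, and the sketch leaves out the config gate, sparse-directory expansion, and the directory caching that the commit adds.

```c
#define _POSIX_C_SOURCE 200809L /* for lstat() */
#include <stddef.h>
#include <sys/stat.h>

#define ENTRY_SKIP_WORKTREE (1u << 0) /* hypothetical flag value */

/* Hypothetical, pared-down stand-in for an index entry. */
struct entry {
	const char *name;
	unsigned int flags;
};

/*
 * Clear the skip flag on every flagged entry whose path is present on
 * disk. Returns the number of entries that were changed.
 */
static size_t clear_skip_from_present(struct entry *entries, size_t nr)
{
	struct stat st;
	size_t cleared = 0;

	for (size_t i = 0; i < nr; i++) {
		if ((entries[i].flags & ENTRY_SKIP_WORKTREE) &&
		    !lstat(entries[i].name, &st)) {
			entries[i].flags &= ~ENTRY_SKIP_WORKTREE;
			cleared++;
		}
	}
	return cleared;
}
```

Entries without the flag are never passed to `lstat()`, so the cost of the loop scales with the number of skipped entries rather than with the size of the index.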

sparse-index.c

Lines changed: 74 additions & 0 deletions
@@ -337,6 +337,80 @@ void ensure_correct_sparsity(struct index_state *istate)
 		ensure_full_index(istate);
 }
 
+static int path_found(const char *path, const char **dirname, size_t *dir_len,
+		      int *dir_found)
+{
+	struct stat st;
+	char *newdir;
+	char *tmp;
+
+	/*
+	 * If dirname corresponds to a directory that doesn't exist, and this
+	 * path starts with dirname, then path can't exist.
+	 */
+	if (!*dir_found && !memcmp(path, *dirname, *dir_len))
+		return 0;
+
+	/*
+	 * If path itself exists, return 1.
+	 */
+	if (!lstat(path, &st))
+		return 1;
+
+	/*
+	 * Otherwise, path does not exist so we'll return 0...but we'll first
+	 * determine some info about its parent directory so we can avoid
+	 * lstat calls for future cache entries.
+	 */
+	newdir = strrchr(path, '/');
+	if (!newdir)
+		return 0; /* Didn't find a parent dir; just return 0 now. */
+
+	/*
+	 * If path starts with directory (which we already lstat'ed and found),
+	 * then no need to lstat parent directory again.
+	 */
+	if (*dir_found && *dirname && memcmp(path, *dirname, *dir_len))
+		return 0;
+
+	/* Cache path's dirname; it points into path, so nothing is freed */
+	*dirname = path;
+	*dir_len = newdir - path + 1;
+
+	tmp = xstrndup(path, *dir_len);
+	*dir_found = !lstat(tmp, &st);
+	free(tmp);
+
+	return 0;
+}
+
+void clear_skip_worktree_from_present_files(struct index_state *istate)
+{
+	const char *last_dirname = NULL;
+	size_t dir_len = 0;
+	int dir_found = 1;
+
+	int i;
+
+	if (!core_apply_sparse_checkout ||
+	    sparse_expect_files_outside_of_patterns)
+		return;
+
+restart:
+	for (i = 0; i < istate->cache_nr; i++) {
+		struct cache_entry *ce = istate->cache[i];
+
+		if (ce_skip_worktree(ce) &&
+		    path_found(ce->name, &last_dirname, &dir_len, &dir_found)) {
+			if (S_ISSPARSEDIR(ce->ce_mode)) {
+				ensure_full_index(istate);
+				goto restart;
+			}
+			ce->ce_flags &= ~CE_SKIP_WORKTREE;
+		}
+	}
+}
+
 /*
  * This static global helps avoid infinite recursion between
  * expand_to_path() and index_file_exists().
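
The speed-up in `path_found()` comes from remembering what is known about a parent directory, so that later paths under a missing directory need no `lstat()` call at all. The sketch below shows a simplified variant of that idea and is not the commit's algorithm: it remembers only the most recent directory known to be missing, and it owns a copy of that directory name instead of pointing into the caller's path. `struct dir_cache` and `path_exists_cached` are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L /* for lstat() */
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical cache: the last directory prefix found to be missing. */
struct dir_cache {
	char *missing_dir; /* ends with '/', heap-allocated, or NULL */
	size_t len;        /* strlen(missing_dir) */
};

/* Returns 1 if path exists, 0 otherwise, updating the cache on a miss. */
static int path_exists_cached(const char *path, struct dir_cache *cache)
{
	struct stat st;
	const char *slash;
	char *dir;
	size_t dir_len;

	/* A path under a directory known to be missing cannot exist. */
	if (cache->missing_dir &&
	    !strncmp(path, cache->missing_dir, cache->len))
		return 0;

	if (!lstat(path, &st))
		return 1;

	/* The path is missing; find out whether its parent is missing too. */
	slash = strrchr(path, '/');
	if (!slash)
		return 0; /* no parent directory component to cache */

	dir_len = (size_t)(slash - path) + 1; /* keep the trailing '/' */
	dir = malloc(dir_len + 1);
	if (!dir)
		return 0; /* skip caching when out of memory */
	memcpy(dir, path, dir_len);
	dir[dir_len] = '\0';

	if (lstat(dir, &st)) {
		/* Parent is missing: remember it for later lookups. */
		free(cache->missing_dir);
		cache->missing_dir = dir;
		cache->len = dir_len;
	} else {
		free(dir);
	}
	return 0;
}
```

A caller would start with `struct dir_cache c = { NULL, 0 };`, query paths in sorted order so that entries sharing a directory are adjacent, and call `free(c.missing_dir)` when done. Sorted input is what makes a single-entry cache worthwhile, and index entries are already stored in sorted order.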
