
Commit 6aaf956

peff authored and gitster committed
is_hfs_dotgit: loosen over-eager match of \u{..47}
Our is_hfs_dotgit function relies on the hackily-implemented next_hfs_char to give us the next character that an HFS+ filename comparison would look at. It's hacky because it doesn't implement the full case-folding table of HFS+; it gives us just enough to see if the path matches ".git".

At the end of next_hfs_char, we use tolower() to convert our 32-bit code point to lowercase. Our tolower() implementation only takes an 8-bit char, though; it throws away the upper 24 bits. This means we can't have any false negatives for is_hfs_dotgit. We only care about matching 7-bit ASCII characters in ".git", and we will correctly process 'G' or 'g'.

However, we _can_ have false positives. Because we throw away the upper bits, code point \u{0147} (for example) will look like 'G' and get downcased to 'g'. It's not known whether a sequence of code points whose truncation ends up as ".git" is meaningful in any language, but it does not hurt to be more accurate here.

We can just pass out the full 32-bit code point, and compare it manually to the upper and lowercase characters we care about.

Signed-off-by: Jeff King <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
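The false positive described above can be reproduced in isolation. The following is a minimal sketch, not Git's code: `code_point` is a hypothetical stand-in for `ucs_char_t`, `fold_truncating` is a hypothetical helper, and the standard library `tolower()` is assumed to behave like Git's own 8-bit implementation for ASCII input in the default "C" locale.

```c
#include <ctype.h>
#include <stdint.h>

/* Hypothetical stand-in for Git's ucs_char_t: a 32-bit code point. */
typedef uint32_t code_point;

/*
 * Mimics the old behavior described in the commit message: the code
 * point is narrowed to 8 bits before being lowercased, so the upper
 * 24 bits are silently discarded.
 */
static code_point fold_truncating(code_point c)
{
	return (code_point)tolower((unsigned char)c);
}
```

With this helper, both `'G'` (0x47) and U+0147 fold to `'g'`, because 0x0147 truncated to 8 bits is 0x47.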
1 parent d08c13b commit 6aaf956

2 files changed: 35 additions, 12 deletions


t/t1450-fsck.sh

15 additions, 0 deletions

@@ -273,4 +273,19 @@ dot-backslash-case .\\\\.GIT\\\\foobar
 dotgit-case-backslash .git\\\\foobar
 EOF
 
+test_expect_success 'fsck allows .Ňit' '
+	(
+		git init not-dotgit &&
+		cd not-dotgit &&
+		echo content >file &&
+		git add file &&
+		git commit -m base &&
+		blob=$(git rev-parse :file) &&
+		printf "100644 blob $blob\t.\\305\\207it" >tree &&
+		tree=$(git mktree <tree) &&
+		git fsck 2>err &&
+		test_line_count = 0 err
+	)
+'
+
 test_done
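The octal escapes `\305\207` in the test's printf line are the two-byte UTF-8 encoding of U+0147 (Ň), the code point named in the commit message. As a rough sketch of that arithmetic (the helper name `encode_utf8_two_byte` is hypothetical and not part of Git):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Encode a code point in the range U+0080..U+07FF as two UTF-8 bytes.
 * Returns 2 on success, or 0 if the code point is outside that range.
 */
static size_t encode_utf8_two_byte(uint32_t cp, unsigned char out[2])
{
	if (cp < 0x80 || cp > 0x7ff)
		return 0;
	out[0] = (unsigned char)(0xc0 | (cp >> 6));   /* 110xxxxx: top 5 bits */
	out[1] = (unsigned char)(0x80 | (cp & 0x3f)); /* 10xxxxxx: low 6 bits */
	return 2;
}
```

For U+0147 this yields 0xC5 0x87, which is 305 207 in octal.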

utf8.c

20 additions, 12 deletions

@@ -630,8 +630,8 @@ int mbs_chrlen(const char **text, size_t *remainder_p, const char *encoding)
 }
 
 /*
- * Pick the next char from the stream, folding as an HFS+ filename comparison
- * would. Note that this is _not_ complete by any means. It's just enough
+ * Pick the next char from the stream, ignoring codepoints an HFS+ would.
+ * Note that this is _not_ complete by any means. It's just enough
  * to make is_hfs_dotgit() work, and should not be used otherwise.
  */
 static ucs_char_t next_hfs_char(const char **in)
@@ -668,23 +668,31 @@ static ucs_char_t next_hfs_char(const char **in)
 			continue;
 		}
 
-		/*
-		 * there's a great deal of other case-folding that occurs,
-		 * but this is enough to catch anything that will convert
-		 * to ".git"
-		 */
-		return tolower(out);
+		return out;
 	}
 }
 
 int is_hfs_dotgit(const char *path)
 {
 	ucs_char_t c;
 
-	if (next_hfs_char(&path) != '.' ||
-	    next_hfs_char(&path) != 'g' ||
-	    next_hfs_char(&path) != 'i' ||
-	    next_hfs_char(&path) != 't')
+	c = next_hfs_char(&path);
+	if (c != '.')
+		return 0;
+	c = next_hfs_char(&path);
+
+	/*
+	 * there's a great deal of other case-folding that occurs
+	 * in HFS+, but this is enough to catch anything that will
+	 * convert to ".git"
+	 */
+	if (c != 'g' && c != 'G')
+		return 0;
+	c = next_hfs_char(&path);
+	if (c != 'i' && c != 'I')
+		return 0;
+	c = next_hfs_char(&path);
+	if (c != 't' && c != 'T')
 		return 0;
 	c = next_hfs_char(&path);
 	if (c && !is_dir_sep(c))
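The effect of comparing full code points can be tried outside of Git with a simplified sketch. This is not the function from the diff: `code_point` and `is_dotgit_code_points` are hypothetical names, the input is assumed to be an already-decoded, 0-terminated array of code points, ignorable code points are not skipped, and only '/' is treated as a directory separator. It also uses a loop over two small tables where the commit uses explicit comparisons.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Git's ucs_char_t. */
typedef uint32_t code_point;

/*
 * Returns 1 if the 0-terminated sequence is ".git" in any ASCII case,
 * either alone or followed by '/', and 0 otherwise.
 */
static int is_dotgit_code_points(const code_point *cps)
{
	static const char lower[] = ".git";
	static const char upper[] = ".GIT";
	size_t i;

	for (i = 0; i < 4; i++) {
		/* Compare the full 32-bit value; nothing is truncated. */
		if (cps[i] != (code_point)lower[i] &&
		    cps[i] != (code_point)upper[i])
			return 0;
	}
	/* Anything but end-of-string or a separator means a longer name. */
	return cps[4] == 0 || cps[4] == '/';
}
```

Under these assumptions, the sequence '.', 'G', 'i', 'T' matches, while '.', U+0147, 'i', 't' does not, since 0x0147 is neither 0x67 nor 0x47.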
