Skip to content

Commit acf9de4

Browse files
avargitster
authored andcommitted
mktag: use fsck instead of custom verify_tag()
Change the validation logic in "mktag" to use fsck's fsck_tag() instead of its own custom parser. Curiously the logic for both dates back to the same commit[1]. Let's unify them so we're not maintaining two sets functions to verify that a tag is OK. The behavior of fsck_tag() and the old "mktag" code being removed here is different in few aspects. I think it makes sense to remove some of those checks, namely: A. fsck only cares that the timezone matches [-+][0-9]{4}. The mktag code disallowed values larger than 1400. Yes there's currently no timezone with a greater offset[2], but since we allow any number of non-offical timezones (e.g. +1234) passing this through seems fine. Git also won't break in the future if e.g. French Polynesia decides it needs to outdo the Line Islands when it comes to timezone extravagance. B. fsck allows missing author names such as "tagger <email>", mktag wouldn't, but would allow e.g. "tagger [2 spaces] <email>" (but not "tagger [1 space] <email>"). Now we allow all of these. C. Like B, but "mktag" disallowed spaces in the <email> part, fsck allows it. In some ways fsck_tag() is stricter than "mktag" was, namely: D. fsck disallows zero-padded dates, but mktag didn't care. So e.g. the timestamp "0000000000 +0000" produces an error now. A test in "t1006-cat-file.sh" relied on this, it's been changed to use "hash-object" (without fsck) instead. There was one check I deemed worth keeping by porting it over to fsck_tag(): E. "mktag" did not allow any custom headers, and by extension (as an empty commit is allowed) also forbade an extra stray trailing newline after the headers it knew about. Add a new check in the "ignore" category to fsck and use it. This somewhat abuses the facility added in efaba7c (fsck: optionally ignore specific fsck issues completely, 2015-06-22). This is somewhat of hack, but probably the least invasive change we can make here. The fsck command will shuffle these categories around, e.g. under --strict the "info" becomes a "warn" and "warn" becomes "error". Existing users of fsck's (and others, e.g. index-pack) --strict option rely on this. So we need to put something into a category that'll be ignored by all existing users of the API. Pretending that fsck.extraHeaderEntry=error ("ignore" by default) was set serves to do this for us. 1. ec4465a (Add "tag" objects that can be used to sign other objects., 2005-04-25) 2. https://en.wikipedia.org/wiki/List_of_UTC_time_offsets Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
1 parent 40ef015 commit acf9de4

File tree

6 files changed

+127
-184
lines changed

6 files changed

+127
-184
lines changed

Documentation/git-mktag.txt

Lines changed: 8 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -25,7 +25,14 @@ write a tag found in `my-tag`:
2525
git hash-object -t tag -w --stdin <my-tag
2626

2727
The difference is that mktag will die before writing the tag if the
28-
tag doesn't pass a sanity check.
28+
tag doesn't pass a linkgit:git-fsck[1] check.
29+
30+
The "fsck" check done mktag is stricter than what linkgit:git-fsck[1]
31+
would run by default in that all `fsck.<msg-id>` messages are promoted
32+
from warnings to errors (so e.g. a missing "tagger" line is an error).
33+
34+
Extra headers in the object are also an error under mktag, but ignored
35+
by linkgit:git-fsck[1]
2936

3037
Tag Format
3138
----------

builtin/mktag.c

Lines changed: 50 additions & 146 deletions
Original file line numberDiff line numberDiff line change
@@ -2,160 +2,60 @@
22
#include "tag.h"
33
#include "replace-object.h"
44
#include "object-store.h"
5+
#include "fsck.h"
56

6-
/*
7-
* A signature file has a very simple fixed format: four lines
8-
* of "object <sha1>" + "type <typename>" + "tag <tagname>" +
9-
* "tagger <committer>", followed by a blank line, a free-form tag
10-
* message and a signature block that git itself doesn't care about,
11-
* but that can be verified with gpg or similar.
12-
*
13-
* The first four lines are guaranteed to be at least 83 bytes:
14-
* "object <sha1>\n" is 48 bytes, "type tag\n" at 9 bytes is the
15-
* shortest possible type-line, "tag .\n" at 6 bytes is the shortest
16-
* single-character-tag line, and "tagger . <> 0 +0000\n" at 20 bytes is
17-
* the shortest possible tagger-line.
18-
*/
19-
20-
/*
21-
* We refuse to tag something we can't verify. Just because.
22-
*/
23-
static int verify_object(const struct object_id *oid, const char *expected_type)
7+
static int mktag_fsck_error_func(struct fsck_options *o,
8+
const struct object_id *oid,
9+
enum object_type object_type,
10+
int msg_type, const char *message)
2411
{
25-
int ret = -1;
26-
enum object_type type;
27-
unsigned long size;
28-
void *buffer = read_object_file(oid, &type, &size);
29-
const struct object_id *repl = lookup_replace_object(the_repository, oid);
30-
31-
if (buffer) {
32-
if (type == type_from_string(expected_type)) {
33-
ret = check_object_signature(the_repository, repl,
34-
buffer, size,
35-
expected_type);
36-
}
37-
free(buffer);
12+
switch (msg_type) {
13+
case FSCK_WARN:
14+
case FSCK_ERROR:
15+
/*
16+
* We treat both warnings and errors as errors, things
17+
* like missing "tagger" lines are "only" warnings
18+
* under fsck, we've always considered them an error.
19+
*/
20+
fprintf_ln(stderr, "error: tag input does not pass fsck: %s", message);
21+
return 1;
22+
default:
23+
BUG("%d (FSCK_IGNORE?) should never trigger this callback",
24+
msg_type);
3825
}
39-
return ret;
4026
}
4127

42-
static int verify_tag(char *buffer, unsigned long size)
28+
static int verify_object_in_tag(struct object_id *tagged_oid, int *tagged_type)
4329
{
44-
int typelen;
45-
char type[20];
46-
struct object_id oid;
47-
const char *object, *type_line, *tag_line, *tagger_line, *lb, *rb, *p;
48-
size_t len;
49-
50-
if (size < 84)
51-
return error("wanna fool me ? you obviously got the size wrong !");
52-
53-
buffer[size] = 0;
54-
55-
/* Verify object line */
56-
object = buffer;
57-
if (memcmp(object, "object ", 7))
58-
return error("char%d: does not start with \"object \"", 0);
59-
60-
if (parse_oid_hex(object + 7, &oid, &p))
61-
return error("char%d: could not get SHA1 hash", 7);
62-
63-
/* Verify type line */
64-
type_line = p + 1;
65-
if (memcmp(type_line - 1, "\ntype ", 6))
66-
return error("char%d: could not find \"\\ntype \"", 47);
67-
68-
/* Verify tag-line */
69-
tag_line = strchr(type_line, '\n');
70-
if (!tag_line)
71-
return error("char%"PRIuMAX": could not find next \"\\n\"",
72-
(uintmax_t) (type_line - buffer));
73-
tag_line++;
74-
if (memcmp(tag_line, "tag ", 4) || tag_line[4] == '\n')
75-
return error("char%"PRIuMAX": no \"tag \" found",
76-
(uintmax_t) (tag_line - buffer));
77-
78-
/* Get the actual type */
79-
typelen = tag_line - type_line - strlen("type \n");
80-
if (typelen >= sizeof(type))
81-
return error("char%"PRIuMAX": type too long",
82-
(uintmax_t) (type_line+5 - buffer));
83-
84-
memcpy(type, type_line+5, typelen);
85-
type[typelen] = 0;
86-
87-
/* Verify that the object matches */
88-
if (verify_object(&oid, type))
89-
return error("char%d: could not verify object %s", 7, oid_to_hex(&oid));
90-
91-
/* Verify the tag-name: we don't allow control characters or spaces in it */
92-
tag_line += 4;
93-
for (;;) {
94-
unsigned char c = *tag_line++;
95-
if (c == '\n')
96-
break;
97-
if (c > ' ')
98-
continue;
99-
return error("char%"PRIuMAX": could not verify tag name",
100-
(uintmax_t) (tag_line - buffer));
101-
}
102-
103-
/* Verify the tagger line */
104-
tagger_line = tag_line;
105-
106-
if (memcmp(tagger_line, "tagger ", 7))
107-
return error("char%"PRIuMAX": could not find \"tagger \"",
108-
(uintmax_t) (tagger_line - buffer));
109-
110-
/*
111-
* Check for correct form for name and email
112-
* i.e. " <" followed by "> " on _this_ line
113-
* No angle brackets within the name or email address fields.
114-
* No spaces within the email address field.
115-
*/
116-
tagger_line += 7;
117-
if (!(lb = strstr(tagger_line, " <")) || !(rb = strstr(lb+2, "> ")) ||
118-
strpbrk(tagger_line, "<>\n") != lb+1 ||
119-
strpbrk(lb+2, "><\n ") != rb)
120-
return error("char%"PRIuMAX": malformed tagger field",
121-
(uintmax_t) (tagger_line - buffer));
122-
123-
/* Check for author name, at least one character, space is acceptable */
124-
if (lb == tagger_line)
125-
return error("char%"PRIuMAX": missing tagger name",
126-
(uintmax_t) (tagger_line - buffer));
127-
128-
/* timestamp, 1 or more digits followed by space */
129-
tagger_line = rb + 2;
130-
if (!(len = strspn(tagger_line, "0123456789")))
131-
return error("char%"PRIuMAX": missing tag timestamp",
132-
(uintmax_t) (tagger_line - buffer));
133-
tagger_line += len;
134-
if (*tagger_line != ' ')
135-
return error("char%"PRIuMAX": malformed tag timestamp",
136-
(uintmax_t) (tagger_line - buffer));
137-
tagger_line++;
138-
139-
/* timezone, 5 digits [+-]hhmm, max. 1400 */
140-
if (!((tagger_line[0] == '+' || tagger_line[0] == '-') &&
141-
strspn(tagger_line+1, "0123456789") == 4 &&
142-
tagger_line[5] == '\n' && atoi(tagger_line+1) <= 1400))
143-
return error("char%"PRIuMAX": malformed tag timezone",
144-
(uintmax_t) (tagger_line - buffer));
145-
tagger_line += 6;
146-
147-
/* Verify the blank line separating the header from the body */
148-
if (*tagger_line != '\n')
149-
return error("char%"PRIuMAX": trailing garbage in tag header",
150-
(uintmax_t) (tagger_line - buffer));
30+
int ret;
31+
enum object_type type;
32+
unsigned long size;
33+
void *buffer;
34+
const struct object_id *repl;
35+
36+
buffer = read_object_file(tagged_oid, &type, &size);
37+
if (!buffer)
38+
die("could not read tagged object '%s'",
39+
oid_to_hex(tagged_oid));
40+
if (type != *tagged_type)
41+
die("object '%s' tagged as '%s', but is a '%s' type",
42+
oid_to_hex(tagged_oid),
43+
type_name(*tagged_type), type_name(type));
44+
45+
repl = lookup_replace_object(the_repository, tagged_oid);
46+
ret = check_object_signature(the_repository, repl,
47+
buffer, size, type_name(*tagged_type));
48+
free(buffer);
15149

152-
/* The actual stuff afterwards we don't care about.. */
153-
return 0;
50+
return ret;
15451
}
15552

15653
int cmd_mktag(int argc, const char **argv, const char *prefix)
15754
{
15855
struct strbuf buf = STRBUF_INIT;
56+
struct fsck_options fsck_options = FSCK_OPTIONS_STRICT;
57+
struct object_id tagged_oid;
58+
int tagged_type;
15959
struct object_id result;
16060

16161
if (argc != 1)
@@ -164,10 +64,14 @@ int cmd_mktag(int argc, const char **argv, const char *prefix)
16464
if (strbuf_read(&buf, 0, 0) < 0)
16565
die_errno("could not read from stdin");
16666

167-
/* Verify it for some basic sanity: it needs to start with
168-
"object <sha1>\ntype\ntagger " */
169-
if (verify_tag(buf.buf, buf.len) < 0)
170-
die("invalid tag signature file");
67+
fsck_options.error_func = mktag_fsck_error_func;
68+
fsck_set_msg_type(&fsck_options, "extraheaderentry", "warn");
69+
if (fsck_tag_standalone(NULL, buf.buf, buf.len, &fsck_options,
70+
&tagged_oid, &tagged_type))
71+
die("tag on stdin did not pass our strict fsck check");
72+
73+
if (verify_object_in_tag(&tagged_oid, &tagged_type))
74+
die("tag on stdin did not refer to a valid object");
17175

17276
if (write_object_file(buf.buf, buf.len, tag_type, &result) < 0)
17377
die("unable to write tag file");

fsck.c

Lines changed: 29 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -80,7 +80,9 @@ static struct oidset gitmodules_done = OIDSET_INIT;
8080
/* infos (reported as warnings, but ignored by default) */ \
8181
FUNC(GITMODULES_PARSE, INFO) \
8282
FUNC(BAD_TAG_NAME, INFO) \
83-
FUNC(MISSING_TAGGER_ENTRY, INFO)
83+
FUNC(MISSING_TAGGER_ENTRY, INFO) \
84+
/* ignored (elevated when requested) */ \
85+
FUNC(EXTRA_HEADER_ENTRY, IGNORE)
8486

8587
#define MSG_ID(id, msg_type) FSCK_MSG_##id,
8688
enum fsck_msg_id {
@@ -911,6 +913,16 @@ static int fsck_tag(const struct object_id *oid, const char *buffer,
911913
unsigned long size, struct fsck_options *options)
912914
{
913915
struct object_id tagged_oid;
916+
int tagged_type;
917+
return fsck_tag_standalone(oid, buffer, size, options, &tagged_oid,
918+
&tagged_type);
919+
}
920+
921+
int fsck_tag_standalone(const struct object_id *oid, const char *buffer,
922+
unsigned long size, struct fsck_options *options,
923+
struct object_id *tagged_oid,
924+
int *tagged_type)
925+
{
914926
int ret = 0;
915927
char *eol;
916928
struct strbuf sb = STRBUF_INIT;
@@ -924,7 +936,7 @@ static int fsck_tag(const struct object_id *oid, const char *buffer,
924936
ret = report(options, oid, OBJ_TAG, FSCK_MSG_MISSING_OBJECT, "invalid format - expected 'object' line");
925937
goto done;
926938
}
927-
if (parse_oid_hex(buffer, &tagged_oid, &p) || *p != '\n') {
939+
if (parse_oid_hex(buffer, tagged_oid, &p) || *p != '\n') {
928940
ret = report(options, oid, OBJ_TAG, FSCK_MSG_BAD_OBJECT_SHA1, "invalid 'object' line format - bad sha1");
929941
if (ret)
930942
goto done;
@@ -940,7 +952,8 @@ static int fsck_tag(const struct object_id *oid, const char *buffer,
940952
ret = report(options, oid, OBJ_TAG, FSCK_MSG_MISSING_TYPE, "invalid format - unexpected end after 'type' line");
941953
goto done;
942954
}
943-
if (type_from_string_gently(buffer, eol - buffer, 1) < 0)
955+
*tagged_type = type_from_string_gently(buffer, eol - buffer, 1);
956+
if (*tagged_type < 0)
944957
ret = report(options, oid, OBJ_TAG, FSCK_MSG_BAD_TYPE, "invalid 'type' value");
945958
if (ret)
946959
goto done;
@@ -975,6 +988,19 @@ static int fsck_tag(const struct object_id *oid, const char *buffer,
975988
else
976989
ret = fsck_ident(&buffer, oid, OBJ_TAG, options);
977990

991+
if (!starts_with(buffer, "\n")) {
992+
/*
993+
* The verify_headers() check will allow
994+
* e.g. "[...]tagger <tagger>\nsome
995+
* garbage\n\nmessage" to pass, thinking "some
996+
* garbage" could be a custom header. E.g. "mktag"
997+
* doesn't want any unknown headers.
998+
*/
999+
ret = report(options, oid, OBJ_TAG, FSCK_MSG_EXTRA_HEADER_ENTRY, "invalid format - extra header(s) after 'tagger'");
1000+
if (ret)
1001+
goto done;
1002+
}
1003+
9781004
done:
9791005
strbuf_release(&sb);
9801006
return ret;

fsck.h

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -62,6 +62,15 @@ int fsck_walk(struct object *obj, void *data, struct fsck_options *options);
6262
int fsck_object(struct object *obj, void *data, unsigned long size,
6363
struct fsck_options *options);
6464

65+
/*
66+
* fsck a tag, and pass info about it back to the caller. This is
67+
* exposed fsck_object() internals for git-mktag(1).
68+
*/
69+
int fsck_tag_standalone(const struct object_id *oid, const char *buffer,
70+
unsigned long size, struct fsck_options *options,
71+
struct object_id *tagged_oid,
72+
int *tag_type);
73+
6574
/*
6675
* Some fsck checks are context-dependent, and may end up queued; run this
6776
* after completing all fsck_object() calls in order to resolve any remaining

t/t1006-cat-file.sh

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -166,7 +166,7 @@ tag_content="$tag_header_without_timestamp 0000000000 +0000
166166
167167
$tag_description"
168168

169-
tag_sha1=$(echo_without_newline "$tag_content" | git mktag)
169+
tag_sha1=$(echo_without_newline "$tag_content" | git hash-object -t tag --stdin -w)
170170
tag_size=$(strlen "$tag_content")
171171

172172
run_tests 'tag' $tag_sha1 $tag_size "$tag_content" "$tag_content" 1

0 commit comments

Comments
 (0)