Skip to content

Commit a351cc0

Browse files
nasamuffingitster
authored andcommitted
doc: propose hooks managed by the config
Begin a design document for config-based hooks, managed via git-hook. Focus on an overview of the implementation and motivation for design decisions. Briefly discuss the alternatives considered before this point. Also, attempt to redefine terms to fit into a multihook world. Signed-off-by: Emily Shaffer <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
1 parent ae92ac8 commit a351cc0

File tree

2 files changed

+321
-0
lines changed

2 files changed

+321
-0
lines changed

Documentation/Makefile

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -80,6 +80,7 @@ SP_ARTICLES += $(API_DOCS)
8080
TECH_DOCS += MyFirstContribution
8181
TECH_DOCS += MyFirstObjectWalk
8282
TECH_DOCS += SubmittingPatches
83+
TECH_DOCS += technical/config-based-hooks
8384
TECH_DOCS += technical/hash-function-transition
8485
TECH_DOCS += technical/http-protocol
8586
TECH_DOCS += technical/index-format
Lines changed: 320 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,320 @@
1+
Configuration-based hook management
2+
===================================
3+
4+
== Motivation
5+
6+
Treat hooks as a first-class citizen by replacing the .git/hook/hookname path as
7+
the only source of hooks to execute, in a way which is friendly to users with
8+
multiple repos which have similar needs.
9+
10+
Redefine "hook" as an event rather than a single script, allowing users to
11+
perform unrelated actions on a single event.
12+
13+
Take a step closer to safety when copying zipped Git repositories from untrusted
14+
users.
15+
16+
Make it easier for users to discover Git's hook feature and automate their
17+
workflows.
18+
19+
== User interfaces
20+
21+
=== Config schema
22+
23+
Hooks can be introduced by editing the configuration manually. There are two new
24+
sections added, `hook` and `hookcmd`.
25+
26+
==== `hook`
27+
28+
Primarily contains subsections for each hook event. These subsections define
29+
hook command execution order; hook commands can be specified by passing the
30+
command directly if no additional configuration is needed, or by passing the
31+
name of a `hookcmd`. If Git does not find a `hookcmd` whose subsection matches
32+
the value of the given command string, Git will try to execute the string
33+
directly. Hooks are executed by passing the resolved command string to the
34+
shell. Hook event subsections can also contain per-hook-event settings.
35+
36+
Also contains top-level hook execution settings, for example,
37+
`hook.warnHookDir`, `hook.runHookDir`, or `hook.disableAll`.
38+
39+
----
40+
[hook "pre-commit"]
41+
command = perl-linter
42+
command = /usr/bin/git-secrets --pre-commit
43+
44+
[hook "pre-applypatch"]
45+
command = perl-linter
46+
error = ignore
47+
48+
[hook]
49+
runHookDir = interactive
50+
----
51+
52+
==== `hookcmd`
53+
54+
Defines a hook command and its attributes, which will be used when a hook event
55+
occurs. Unqualified attributes are assumed to apply to this hook during all hook
56+
events, but event-specific attributes can also be supplied. The example runs
57+
`/usr/bin/lint-it --language=perl <args passed by Git>`, but for repos which
58+
include this config, the hook command will be skipped for all events to which
59+
it's normally subscribed _except_ `pre-commit`.
60+
61+
----
62+
[hookcmd "perl-linter"]
63+
command = /usr/bin/lint-it --language=perl
64+
skip = true
65+
pre-commit-skip = false
66+
----
67+
68+
=== Command-line API
69+
70+
Users should be able to view, reorder, and create hook commands via the command
71+
line. External tools should be able to view a list of hooks in the correct order
72+
to run.
73+
74+
*`git hook list <hook-event>`*
75+
76+
*`git hook list (--system|--global|--local|--worktree)`*
77+
78+
*`git hook edit <hook-event>`*
79+
80+
*`git hook add <hook-command> <hook-event> <options...>`*
81+
82+
=== Hook editor
83+
84+
The tool which is presented by `git hook edit <hook-command>`. Ideally, this
85+
tool should be easier to use than manually editing the config, and then produce
86+
a concise config afterwards. It may take a form similar to `git rebase
87+
--interactive`.
88+
89+
== Implementation
90+
91+
=== Library
92+
93+
`hook.c` and `hook.h` are responsible for interacting with the config files. In
94+
the case when the code generating a hook event doesn't have special concerns
95+
about how to run the hooks, the hook library will provide a basic API to call
96+
all hooks in config order with an `argv_array` provided by the code which
97+
generates the hook event:
98+
99+
*`int run_hooks(const char *hookname, struct argv_array *args)`*
100+
101+
This call includes the hook command provided by `run-command.h:find_hook()`;
102+
eventually, this legacy hook will be gated by a config `hook.runHookDir`. The
103+
config is checked against a number of cases:
104+
105+
- "no": the legacy hook will not be run
106+
- "interactive": Git will prompt the user before running the legacy hook
107+
- "warn": Git will print a warning to stderr before running the legacy hook
108+
- "yes" (default): Git will silently run the legacy hook
109+
110+
In case this list is expanded in the future, if a value for `hook.runHookDir` is
111+
given which Git does not recognize, Git should discard that config entry. For
112+
example, if "warn" was specified at system level and "junk" was specified at
113+
global level, Git would resolve the value to "warn"; if the only time the config
114+
was set was to "junk", Git would use the default value of "yes".
115+
116+
If the caller wants to do something more complicated, the hook library can also
117+
provide a callback API:
118+
119+
*`int for_each_hookcmd(const char *hookname, hookcmd_function *cb)`*
120+
121+
Finally, to facilitate the builtin, the library will also provide the following
122+
APIs to interact with the config:
123+
124+
----
125+
int set_hook_commands(const char *hookname, struct string_list *commands,
126+
enum config_scope scope);
127+
int set_hookcmd(const char *hookcmd, struct hookcmd options);
128+
129+
int list_hook_commands(const char *hookname, struct string_list *commands);
130+
int list_hooks_in_scope(enum config_scope scope, struct string_list *commands);
131+
----
132+
133+
`struct hookcmd` is expected to grow in size over time as more functionality is
134+
added to hooks; so that other parts of the code don't need to understand the
135+
config schema, `struct hookcmd` should contain logical values instead of string
136+
pairs.
137+
138+
----
139+
struct hookcmd {
140+
const char *name;
141+
const char *command;
142+
143+
/* for illustration only; not planned at present */
144+
int parallelizable;
145+
const char *hookcmd_before;
146+
const char *hookcmd_after;
147+
enum recovery_action on_fail;
148+
}
149+
----
150+
151+
=== Builtin
152+
153+
`builtin/hook.c` is responsible for providing the frontend. It's responsible for
154+
formatting user-provided data and then calling the library API to set the
155+
configs as appropriate. The builtin frontend is not responsible for calling the
156+
config directly, so that other areas of Git can rely on the hook library to
157+
understand the most recent config schema for hooks.
158+
159+
=== Migration path
160+
161+
==== Stage 0
162+
163+
Hooks are called by running `run-command.h:find_hook()` with the hookname and
164+
executing the result. The hook library and builtin do not exist. Hooks only
165+
exist as specially named scripts within `.git/hooks/`.
166+
167+
==== Stage 1
168+
169+
`git hook list --porcelain <hook-event>` is implemented. Users can replace their
170+
`.git/hooks/<hook-event>` scripts with a trampoline based on `git hook list`'s
171+
output. Modifier commands like `git hook add` and `git hook edit` can be
172+
implemented around this time as well.
173+
174+
==== Stage 2
175+
176+
`hook.h:run_hooks()` is taught to include `run-command.h:find_hook()` at the
177+
end; calls to `find_hook()` are replaced with calls to `run_hooks()`. Users can
178+
opt-in to config-based hooks simply by creating some in their config; otherwise
179+
users should remain unaffected by the change.
180+
181+
==== Stage 3
182+
183+
The call to `find_hook()` inside of `run_hooks()` learns to check for a config,
184+
`hook.runHookDir`. Users can opt into managing their hooks completely via the
185+
config this way.
186+
187+
==== Stage 4
188+
189+
`.git/hooks` is removed from the template and the hook directory is considered
190+
deprecated. To avoid breaking older repos, the default of `hook.runHookDir` is
191+
not changed, and `find_hook()` is not removed.
192+
193+
== Caveats
194+
195+
=== Security and repo config
196+
197+
Part of the motivation behind this refactor is to mitigate hooks as an attack
198+
vector;footnote:[https://lore.kernel.org/git/[email protected]/]
199+
however, as the design stands, users can still provide hooks in the repo-level
200+
config, which is included when a repo is zipped and sent elsewhere. The
201+
security of the repo-level config is still under discussion; this design
202+
generally assumes the repo-level config is secure, which is not true yet. The
203+
goal is to avoid an overcomplicated design to work around a problem which has
204+
ceased to exist.
205+
206+
=== Ease of use
207+
208+
The config schema is nontrivial; that's why it's important for the `git hook`
209+
modifier commands to be usable. Contributors with UX expertise are encouraged to
210+
share their suggestions.
211+
212+
== Alternative approaches
213+
214+
A previous summary of alternatives exists in the
215+
archives.footnote:[https://lore.kernel.org/git/[email protected]]
216+
217+
=== Status quo
218+
219+
Today users can implement multihooks themselves by using a "trampoline script"
220+
as their hook, and pointing that script to a directory or list of other scripts
221+
they wish to run.
222+
223+
=== Hook directories
224+
225+
Other contributors have suggested Git learn about the existence of a directory
226+
such as `.git/hooks/<hookname>.d` and execute those hooks in alphabetical order.
227+
228+
=== Comparison table
229+
230+
.Comparison of alternatives
231+
|===
232+
|Feature |Config-based hooks |Hook directories |Status quo
233+
234+
|Supports multiple hooks
235+
|Natively
236+
|Natively
237+
|With user effort
238+
239+
|Safer for zipped repos
240+
|A little
241+
|No
242+
|No
243+
244+
|Previous hooks just work
245+
|If configured
246+
|Yes
247+
|Yes
248+
249+
|Can install one hook to many repos
250+
|Yes
251+
|No
252+
|No
253+
254+
|Discoverability
255+
|Better (in `git help git`)
256+
|Same as before
257+
|Same as before
258+
259+
|Hard to run unexpected hook
260+
|If configured
261+
|No
262+
|No
263+
|===
264+
265+
== Future work
266+
267+
=== Execution ordering
268+
269+
We may find that config order is insufficient for some users; for example,
270+
config order makes it difficult to add a new hook to the system or global config
271+
which runs at the end of the hook list. A new ordering schema should be:
272+
273+
1) Specified by a `hook.order` config, so that users will not unexpectedly see
274+
their order change;
275+
276+
2) Either dependency or numerically based.
277+
278+
Dependency-based ordering is prone to classic linked-list problems, like a
279+
cycles and handling of missing dependencies. But, it paves the way for enabling
280+
parallelization if some tasks truly depend on others.
281+
282+
Numerical ordering makes it tricky for Git to generate suggested ordering
283+
numbers for each command, but is easy to determine a definitive order.
284+
285+
=== Parallelization
286+
287+
Users with many hooks might want to run them simultaneously, if the hooks don't
288+
modify state; if one hook depends on another's output, then users will want to
289+
specify those dependencies. If we decide to solve this problem, we may want to
290+
look to modern build systems for inspiration on how to manage dependencies and
291+
parallel tasks.
292+
293+
=== Securing hookdir hooks
294+
295+
With the design as written in this doc, it's still possible for a malicious user
296+
to modify `.git/config` to include `hook.pre-receive.command = rm -rf /`, then
297+
zip their repo and send it to another user. It may be necessary to teach Git to
298+
only allow one-line hooks like this if they were configured outside of the local
299+
scope; or another approach, like a list of safe projects, might be useful. It
300+
may also be sufficient (or at least useful) to teach a `hook.disableAll` config
301+
or similar flag to the Git executable.
302+
303+
=== Submodule inheritance
304+
305+
It's possible some submodules may want to run the identical set of hooks that
306+
their superrepo runs. While a globally-configured hook set is helpful, it's not
307+
a great solution for users who have multiple repos-with-submodules under the
308+
same user. It would be useful for submodules to learn how to run hooks from
309+
their superrepo's config, or inherit that hook setting.
310+
311+
== Glossary
312+
313+
*hook event*
314+
315+
A point during Git's execution where user scripts may be run, for example,
316+
_prepare-commit-msg_ or _pre-push_.
317+
318+
*hook command*
319+
320+
A user script or executable which will be run on one or more hook events.

0 commit comments

Comments
 (0)