|
| 1 | +Configuration-based hook management |
| 2 | +=================================== |
| 3 | + |
| 4 | +== Motivation |
| 5 | + |
| 6 | +Treat hooks as a first-class citizen by replacing the .git/hook/hookname path as |
| 7 | +the only source of hooks to execute, in a way which is friendly to users with |
| 8 | +multiple repos which have similar needs. |
| 9 | + |
| 10 | +Redefine "hook" as an event rather than a single script, allowing users to |
| 11 | +perform unrelated actions on a single event. |
| 12 | + |
| 13 | +Take a step closer to safety when copying zipped Git repositories from untrusted |
| 14 | +users. |
| 15 | + |
| 16 | +Make it easier for users to discover Git's hook feature and automate their |
| 17 | +workflows. |
| 18 | + |
| 19 | +== User interfaces |
| 20 | + |
| 21 | +=== Config schema |
| 22 | + |
| 23 | +Hooks can be introduced by editing the configuration manually. There are two new |
| 24 | +sections added, `hook` and `hookcmd`. |
| 25 | + |
| 26 | +==== `hook` |
| 27 | + |
| 28 | +Primarily contains subsections for each hook event. These subsections define |
| 29 | +hook command execution order; hook commands can be specified by passing the |
| 30 | +command directly if no additional configuration is needed, or by passing the |
| 31 | +name of a `hookcmd`. If Git does not find a `hookcmd` whose subsection matches |
| 32 | +the value of the given command string, Git will try to execute the string |
| 33 | +directly. Hooks are executed by passing the resolved command string to the |
| 34 | +shell. Hook event subsections can also contain per-hook-event settings. |
| 35 | + |
| 36 | +Also contains top-level hook execution settings, for example, |
| 37 | +`hook.warnHookDir`, `hook.runHookDir`, or `hook.disableAll`. |
| 38 | + |
| 39 | +---- |
| 40 | +[hook "pre-commit"] |
| 41 | + command = perl-linter |
| 42 | + command = /usr/bin/git-secrets --pre-commit |
| 43 | + |
| 44 | +[hook "pre-applypatch"] |
| 45 | + command = perl-linter |
| 46 | + error = ignore |
| 47 | + |
| 48 | +[hook] |
| 49 | + runHookDir = interactive |
| 50 | +---- |
| 51 | + |
| 52 | +==== `hookcmd` |
| 53 | + |
| 54 | +Defines a hook command and its attributes, which will be used when a hook event |
| 55 | +occurs. Unqualified attributes are assumed to apply to this hook during all hook |
| 56 | +events, but event-specific attributes can also be supplied. The example runs |
| 57 | +`/usr/bin/lint-it --language=perl <args passed by Git>`, but for repos which |
| 58 | +include this config, the hook command will be skipped for all events to which |
| 59 | +it's normally subscribed _except_ `pre-commit`. |
| 60 | + |
| 61 | +---- |
| 62 | +[hookcmd "perl-linter"] |
| 63 | + command = /usr/bin/lint-it --language=perl |
| 64 | + skip = true |
| 65 | + pre-commit-skip = false |
| 66 | +---- |
| 67 | + |
| 68 | +=== Command-line API |
| 69 | + |
| 70 | +Users should be able to view, reorder, and create hook commands via the command |
| 71 | +line. External tools should be able to view a list of hooks in the correct order |
| 72 | +to run. |
| 73 | + |
| 74 | +*`git hook list <hook-event>`* |
| 75 | + |
| 76 | +*`git hook list (--system|--global|--local|--worktree)`* |
| 77 | + |
| 78 | +*`git hook edit <hook-event>`* |
| 79 | + |
| 80 | +*`git hook add <hook-command> <hook-event> <options...>`* |
| 81 | + |
| 82 | +=== Hook editor |
| 83 | + |
| 84 | +The tool which is presented by `git hook edit <hook-command>`. Ideally, this |
| 85 | +tool should be easier to use than manually editing the config, and then produce |
| 86 | +a concise config afterwards. It may take a form similar to `git rebase |
| 87 | +--interactive`. |
| 88 | + |
| 89 | +== Implementation |
| 90 | + |
| 91 | +=== Library |
| 92 | + |
| 93 | +`hook.c` and `hook.h` are responsible for interacting with the config files. In |
| 94 | +the case when the code generating a hook event doesn't have special concerns |
| 95 | +about how to run the hooks, the hook library will provide a basic API to call |
| 96 | +all hooks in config order with an `argv_array` provided by the code which |
| 97 | +generates the hook event: |
| 98 | + |
| 99 | +*`int run_hooks(const char *hookname, struct argv_array *args)`* |
| 100 | + |
| 101 | +This call includes the hook command provided by `run-command.h:find_hook()`; |
| 102 | +eventually, this legacy hook will be gated by a config `hook.runHookDir`. The |
| 103 | +config is checked against a number of cases: |
| 104 | + |
| 105 | +- "no": the legacy hook will not be run |
| 106 | +- "interactive": Git will prompt the user before running the legacy hook |
| 107 | +- "warn": Git will print a warning to stderr before running the legacy hook |
| 108 | +- "yes" (default): Git will silently run the legacy hook |
| 109 | + |
| 110 | +In case this list is expanded in the future, if a value for `hook.runHookDir` is |
| 111 | +given which Git does not recognize, Git should discard that config entry. For |
| 112 | +example, if "warn" was specified at system level and "junk" was specified at |
| 113 | +global level, Git would resolve the value to "warn"; if the only time the config |
| 114 | +was set was to "junk", Git would use the default value of "yes". |
| 115 | + |
| 116 | +If the caller wants to do something more complicated, the hook library can also |
| 117 | +provide a callback API: |
| 118 | + |
| 119 | +*`int for_each_hookcmd(const char *hookname, hookcmd_function *cb)`* |
| 120 | + |
| 121 | +Finally, to facilitate the builtin, the library will also provide the following |
| 122 | +APIs to interact with the config: |
| 123 | + |
| 124 | +---- |
| 125 | +int set_hook_commands(const char *hookname, struct string_list *commands, |
| 126 | + enum config_scope scope); |
| 127 | +int set_hookcmd(const char *hookcmd, struct hookcmd options); |
| 128 | + |
| 129 | +int list_hook_commands(const char *hookname, struct string_list *commands); |
| 130 | +int list_hooks_in_scope(enum config_scope scope, struct string_list *commands); |
| 131 | +---- |
| 132 | + |
| 133 | +`struct hookcmd` is expected to grow in size over time as more functionality is |
| 134 | +added to hooks; so that other parts of the code don't need to understand the |
| 135 | +config schema, `struct hookcmd` should contain logical values instead of string |
| 136 | +pairs. |
| 137 | + |
| 138 | +---- |
| 139 | +struct hookcmd { |
| 140 | + const char *name; |
| 141 | + const char *command; |
| 142 | + |
| 143 | + /* for illustration only; not planned at present */ |
| 144 | + int parallelizable; |
| 145 | + const char *hookcmd_before; |
| 146 | + const char *hookcmd_after; |
| 147 | + enum recovery_action on_fail; |
| 148 | +} |
| 149 | +---- |
| 150 | + |
| 151 | +=== Builtin |
| 152 | + |
| 153 | +`builtin/hook.c` is responsible for providing the frontend. It's responsible for |
| 154 | +formatting user-provided data and then calling the library API to set the |
| 155 | +configs as appropriate. The builtin frontend is not responsible for calling the |
| 156 | +config directly, so that other areas of Git can rely on the hook library to |
| 157 | +understand the most recent config schema for hooks. |
| 158 | + |
| 159 | +=== Migration path |
| 160 | + |
| 161 | +==== Stage 0 |
| 162 | + |
| 163 | +Hooks are called by running `run-command.h:find_hook()` with the hookname and |
| 164 | +executing the result. The hook library and builtin do not exist. Hooks only |
| 165 | +exist as specially named scripts within `.git/hooks/`. |
| 166 | + |
| 167 | +==== Stage 1 |
| 168 | + |
| 169 | +`git hook list --porcelain <hook-event>` is implemented. Users can replace their |
| 170 | +`.git/hooks/<hook-event>` scripts with a trampoline based on `git hook list`'s |
| 171 | +output. Modifier commands like `git hook add` and `git hook edit` can be |
| 172 | +implemented around this time as well. |
| 173 | + |
| 174 | +==== Stage 2 |
| 175 | + |
| 176 | +`hook.h:run_hooks()` is taught to include `run-command.h:find_hook()` at the |
| 177 | +end; calls to `find_hook()` are replaced with calls to `run_hooks()`. Users can |
| 178 | +opt-in to config-based hooks simply by creating some in their config; otherwise |
| 179 | +users should remain unaffected by the change. |
| 180 | + |
| 181 | +==== Stage 3 |
| 182 | + |
| 183 | +The call to `find_hook()` inside of `run_hooks()` learns to check for a config, |
| 184 | +`hook.runHookDir`. Users can opt into managing their hooks completely via the |
| 185 | +config this way. |
| 186 | + |
| 187 | +==== Stage 4 |
| 188 | + |
| 189 | +`.git/hooks` is removed from the template and the hook directory is considered |
| 190 | +deprecated. To avoid breaking older repos, the default of `hook.runHookDir` is |
| 191 | +not changed, and `find_hook()` is not removed. |
| 192 | + |
| 193 | +== Caveats |
| 194 | + |
| 195 | +=== Security and repo config |
| 196 | + |
| 197 | +Part of the motivation behind this refactor is to mitigate hooks as an attack |
| 198 | +vector;footnote:[https://lore.kernel.org/git/ [email protected]/] |
| 199 | +however, as the design stands, users can still provide hooks in the repo-level |
| 200 | +config, which is included when a repo is zipped and sent elsewhere. The |
| 201 | +security of the repo-level config is still under discussion; this design |
| 202 | +generally assumes the repo-level config is secure, which is not true yet. The |
| 203 | +goal is to avoid an overcomplicated design to work around a problem which has |
| 204 | +ceased to exist. |
| 205 | + |
| 206 | +=== Ease of use |
| 207 | + |
| 208 | +The config schema is nontrivial; that's why it's important for the `git hook` |
| 209 | +modifier commands to be usable. Contributors with UX expertise are encouraged to |
| 210 | +share their suggestions. |
| 211 | + |
| 212 | +== Alternative approaches |
| 213 | + |
| 214 | +A previous summary of alternatives exists in the |
| 215 | +archives.footnote:[https://lore.kernel.org/git/ [email protected]] |
| 216 | + |
| 217 | +=== Status quo |
| 218 | + |
| 219 | +Today users can implement multihooks themselves by using a "trampoline script" |
| 220 | +as their hook, and pointing that script to a directory or list of other scripts |
| 221 | +they wish to run. |
| 222 | + |
| 223 | +=== Hook directories |
| 224 | + |
| 225 | +Other contributors have suggested Git learn about the existence of a directory |
| 226 | +such as `.git/hooks/<hookname>.d` and execute those hooks in alphabetical order. |
| 227 | + |
| 228 | +=== Comparison table |
| 229 | + |
| 230 | +.Comparison of alternatives |
| 231 | +|=== |
| 232 | +|Feature |Config-based hooks |Hook directories |Status quo |
| 233 | + |
| 234 | +|Supports multiple hooks |
| 235 | +|Natively |
| 236 | +|Natively |
| 237 | +|With user effort |
| 238 | + |
| 239 | +|Safer for zipped repos |
| 240 | +|A little |
| 241 | +|No |
| 242 | +|No |
| 243 | + |
| 244 | +|Previous hooks just work |
| 245 | +|If configured |
| 246 | +|Yes |
| 247 | +|Yes |
| 248 | + |
| 249 | +|Can install one hook to many repos |
| 250 | +|Yes |
| 251 | +|No |
| 252 | +|No |
| 253 | + |
| 254 | +|Discoverability |
| 255 | +|Better (in `git help git`) |
| 256 | +|Same as before |
| 257 | +|Same as before |
| 258 | + |
| 259 | +|Hard to run unexpected hook |
| 260 | +|If configured |
| 261 | +|No |
| 262 | +|No |
| 263 | +|=== |
| 264 | + |
| 265 | +== Future work |
| 266 | + |
| 267 | +=== Execution ordering |
| 268 | + |
| 269 | +We may find that config order is insufficient for some users; for example, |
| 270 | +config order makes it difficult to add a new hook to the system or global config |
| 271 | +which runs at the end of the hook list. A new ordering schema should be: |
| 272 | + |
| 273 | +1) Specified by a `hook.order` config, so that users will not unexpectedly see |
| 274 | +their order change; |
| 275 | + |
| 276 | +2) Either dependency or numerically based. |
| 277 | + |
| 278 | +Dependency-based ordering is prone to classic linked-list problems, like a |
| 279 | +cycles and handling of missing dependencies. But, it paves the way for enabling |
| 280 | +parallelization if some tasks truly depend on others. |
| 281 | + |
| 282 | +Numerical ordering makes it tricky for Git to generate suggested ordering |
| 283 | +numbers for each command, but is easy to determine a definitive order. |
| 284 | + |
| 285 | +=== Parallelization |
| 286 | + |
| 287 | +Users with many hooks might want to run them simultaneously, if the hooks don't |
| 288 | +modify state; if one hook depends on another's output, then users will want to |
| 289 | +specify those dependencies. If we decide to solve this problem, we may want to |
| 290 | +look to modern build systems for inspiration on how to manage dependencies and |
| 291 | +parallel tasks. |
| 292 | + |
| 293 | +=== Securing hookdir hooks |
| 294 | + |
| 295 | +With the design as written in this doc, it's still possible for a malicious user |
| 296 | +to modify `.git/config` to include `hook.pre-receive.command = rm -rf /`, then |
| 297 | +zip their repo and send it to another user. It may be necessary to teach Git to |
| 298 | +only allow one-line hooks like this if they were configured outside of the local |
| 299 | +scope; or another approach, like a list of safe projects, might be useful. It |
| 300 | +may also be sufficient (or at least useful) to teach a `hook.disableAll` config |
| 301 | +or similar flag to the Git executable. |
| 302 | + |
| 303 | +=== Submodule inheritance |
| 304 | + |
| 305 | +It's possible some submodules may want to run the identical set of hooks that |
| 306 | +their superrepo runs. While a globally-configured hook set is helpful, it's not |
| 307 | +a great solution for users who have multiple repos-with-submodules under the |
| 308 | +same user. It would be useful for submodules to learn how to run hooks from |
| 309 | +their superrepo's config, or inherit that hook setting. |
| 310 | + |
| 311 | +== Glossary |
| 312 | + |
| 313 | +*hook event* |
| 314 | + |
| 315 | +A point during Git's execution where user scripts may be run, for example, |
| 316 | +_prepare-commit-msg_ or _pre-push_. |
| 317 | + |
| 318 | +*hook command* |
| 319 | + |
| 320 | +A user script or executable which will be run on one or more hook events. |
0 commit comments