Thoughts on Pre-commit Hooks

Introduction

Git provides hooks as a means to customize Git and automate certain processes. Each hook is a script that is invoked before or after certain events happen. The pre-commit hook is probably the most commonly used one. As its name suggests, the pre-commit hook is run before a commit is created. For example, we can run the code formatter in the pre-commit hook to format the code before a commit. A pre-commit hook can also block a commit by returning a non-zero exit code. I call such a hook a blocking pre-commit hook. For example, we can run the linter or the test suite in the pre-commit hook and stop the commit if a test fails.
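
To make the blocking behaviour concrete, here is a minimal sketch of a hook written in Python. It assumes the script is saved as an executable .git/hooks/pre-commit file; the function name `run_checks` and whatever commands are passed to it are hypothetical placeholders, not part of Git.

```python
import subprocess
import sys


def run_checks(checks):
    """Run each command in order; return the first non-zero exit code, or 0."""
    for command in checks:
        result = subprocess.run(command)
        if result.returncode != 0:
            # A non-zero exit code from the hook makes Git abort the commit.
            return result.returncode
    return 0
```

The hook file itself would end with something like `sys.exit(run_checks([["npm", "test"]]))`, where the test command is only an example. Returning 0 lets the commit proceed; any other value blocks it.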

Typically, a pre-commit hook has to be installed manually – it does not come for free after a git clone. In the Node.js ecosystem, a common pattern is to install the pre-commit hooks as part of the prepare script, which runs after the project dependencies are installed with npm install (or the equivalent command of an alternative package manager). This effectively makes pre-commit hooks opt-out in those projects.
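
Manual installation usually amounts to copying a script into the repository's hooks directory and marking it executable. The sketch below assumes the default location, .git/hooks, which means `core.hooksPath` is not set and .git is a regular directory; `install_hook` is a hypothetical helper name.

```python
import shutil
import stat
from pathlib import Path


def install_hook(repo_root, hook_source):
    """Copy hook_source to <repo_root>/.git/hooks/pre-commit and make it executable."""
    hooks_dir = Path(repo_root) / ".git" / "hooks"
    # The hooks directory may be missing; .git itself must already exist.
    hooks_dir.mkdir(exist_ok=True)
    target = hooks_dir / "pre-commit"
    shutil.copyfile(hook_source, target)
    # Git only runs hook files that are executable.
    mode = target.stat().st_mode
    target.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return target
```

A prepare script that calls a helper like this on every dependency install is what turns the hook from opt-in into opt-out.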

TL;DR: While pre-commit hooks are very common, I believe that they should be used very sparingly.

What are pre-commit hooks used for?

Ultimately, the purpose of a pre-commit hook is to enforce certain invariants at every commit. A pre-commit hook that runs the code formatter enforces that the code is formatted at every commit, while a pre-commit hook that runs the test suite enforces that the test suite is passing at every commit.

Each organization and project has certain things that it cares about. A pre-commit hook is a way to enforce them. But I argue that:

  1. The pre-commit hook is not a very good tool to do that.
  2. The cost is probably not worth paying.

Pre-commit hooks are not sufficient

While Git is a distributed version control system, which implies that there are no client-server relationships among the participants, most organizations and projects do rely on a central forge, such as GitHub or GitLab, acting as the source of truth. The forge plays the role of the server, while the individual clones of the project are the clients.

In such a model, a pre-commit hook functions as client-side validation. As we know, client-side validation cannot be relied on for security purposes. This is true for a pre-commit hook too. For example, the pre-commit hook can be skipped very easily by passing the --no-verify flag to git commit.

Thus, to enforce the important things, such as a passing test suite, we have to rely on server-side validation. In the Git world, this means the checks that the continuous integration (CI) system runs on a pull request (PR) (or merge request, diff, changelist, etc.). A PR can be merged only if CI gives a passing result, which usually involves running the test suite and enforcing the style guide.

Pre-commit hooks as a feedback mechanism

A common argument for pre-commit hooks is that they can provide regular feedback to the engineer without having to wait for CI builds. While I agree with the intention, I don’t think pre-commit hooks are a good tool to deliver feedback.

Instead, most modern text editors and IDEs might be a better option. For example, modern text editors can typically run the code formatter before saving. They can also surface lint warnings inline. As for running tests, many test runners can watch for file system changes and re-run the tests automatically. These alternatives provide a much tighter feedback loop and deliver the feedback much more seamlessly. At the same time, they do not dictate when an engineer should act on that feedback, unlike pre-commit hooks, which force the engineer to act on it right when it is delivered.
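
The watch-and-re-run idea can be sketched with simple polling. This is only an illustration: real test runners often rely on file system notifications from the operating system instead, and the names `snapshot` and `changed_paths` are hypothetical.

```python
import os


def snapshot(root, suffix=".py"):
    """Map each matching file under root to its last-modified time."""
    mtimes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.endswith(suffix):
                path = os.path.join(dirpath, filename)
                mtimes[path] = os.stat(path).st_mtime_ns
    return mtimes


def changed_paths(before, after):
    """Return paths that were added, removed, or modified between two snapshots."""
    return sorted(
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )
```

A watcher would take a snapshot every second or so and re-run the tests whenever `changed_paths` returns a non-empty list. The engineer sees the results continuously but is never stopped from committing.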

But no harm, right?

This is where we should consider the cost of pre-commit hooks. To recap, a pre-commit hook is used to enforce invariants at every commit. My problem is with the “at every commit” part. It requires that the code is sparkly ✨ before a commit can happen. Of course, we want our code to be formatted and well-tested, but do we need every commit, even those that are unpublished, to be perfect? In enforcing certain standards, the pre-commit hooks are also indirectly enforcing how an engineer is doing their work.

For example, while working on an ambiguous project where the path to completion is unclear, I tend to write a lot of half-working drafts, commit regularly, and write my findings in the commit message. Once I have gained enough clarity, I will then interactively rebase my branch to polish up my commits. Pre-commit hooks will severely hamper my workflow because, in the first few iterations, it is unlikely that each of my commits will even compile, let alone pass the test suite.

The root of the problem is that pre-commit hooks are enforcing the standards too often. Most projects care only that PRs, not their individual commits, are in good shape.

For projects that do care about individual commits, such as those following Phabricator’s stacked diffs model, a commit can still be refined over time before it is ready to be mainlined, much like in my rebase workflow. Again, we don’t need the code to be perfect every time we run git commit.

So no pre-commit hooks?

Well, no. It is clear, with the popularity of projects such as husky, that a lot of people love their pre-commit hooks. After all, only a Sith deals in absolutes. A blocking pre-commit hook is still useful for when we really do not want certain changes, such as API keys and other sensitive information, committed.
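
As a rough sketch of such a check, the function below scans the added lines of a unified diff for strings that look like secrets. The two patterns are illustrative only and far from exhaustive, and the names `SECRET_PATTERNS` and `find_secrets` are hypothetical.

```python
import re

# Illustrative patterns only; a real project would tune these to its own secrets.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # shape of an AWS access key ID
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private key header
]


def find_secrets(diff_text):
    """Return the added lines of a unified diff that match a secret pattern."""
    hits = []
    for line in diff_text.splitlines():
        # Only inspect added lines; skip the "+++ b/file" header lines.
        if not line.startswith("+") or line.startswith("+++ "):
            continue
        if any(pattern.search(line) for pattern in SECRET_PATTERNS):
            hits.append(line[1:])
    return hits
```

A hook could feed the output of `git diff --cached` into `find_secrets` and exit with a non-zero code when the result is not empty. This is a case where blocking is the point, since a secret that reaches the forge is hard to take back.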

A couple of suggestions:

  1. Rely on other tools such as the text editor or IDE. As mentioned earlier, text editors or IDEs might be a better option than pre-commit hooks for many use cases. It might be worthwhile to provide configuration files that can be picked up by these tools. For example, projects such as Prettier and ESLint provide extensions for many popular editors and IDEs that do not need extra configuration beyond the standard ones.
  2. Make them opt-in or optional. Instead of silently installing the pre-commit hooks, ask if the engineer wants them. It can be a simple prompt in the existing prepare script that says “We have prepared a pre-commit hook that will do A, B, and C before you commit your changes. Do you want to install it now? [Y/n]”
  3. Make them a normal script. Expose whatever the pre-commit hook is doing in a plain script that the engineer can execute at any time. This gives the engineer greater autonomy to decide how they want to do their work.
  4. Make them non-blocking. Instead of blocking the commit, a pre-commit hook can simply print the results of its checks. The engineer can then decide what to do with that information.
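
Suggestions 2 to 4 can be sketched together. The checks live in a plain function that can be run at any time, the hook only reports what it finds, and a small helper interprets the answer to the opt-in prompt. All names here (`report_checks`, `non_blocking_hook`, `wants_hook`) are hypothetical, as are the commands a project would pass in.

```python
import subprocess
import sys


def report_checks(checks):
    """Run every check, print a summary, and return the names of failed checks."""
    failed = []
    for name, command in checks.items():
        result = subprocess.run(command, capture_output=True, text=True)
        status = "ok" if result.returncode == 0 else "FAILED"
        print(f"[{status}] {name}")
        if result.returncode != 0:
            failed.append(name)
    return failed


def non_blocking_hook(checks):
    """Report failures but always return 0 so the commit proceeds."""
    failed = report_checks(checks)
    if failed:
        print("Some checks failed: " + ", ".join(failed), file=sys.stderr)
    return 0


def wants_hook(answer):
    """Interpret a reply to a "[Y/n]" prompt; an empty reply means yes."""
    return answer.strip().lower() in ("", "y", "yes")
```

A hook built this way would end with `sys.exit(non_blocking_hook(checks))`, while `report_checks` stays available as a normal script for whenever the engineer wants the feedback.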

Conclusion

Git pre-commit hooks are increasingly popular. However, I believe that they should be used sparingly, especially if they are opt-out and blocking. While they can help to enforce coding standards and provide tighter feedback loops, they are not the best tools to achieve that. At the same time, they also indirectly dictate how engineers work, effectively taking autonomy away from them. I believe that this is a huge cost to pay for what is usually very little upside.