Tuesday, November 25, 2008

Version Control Hooks for a Professional Development Shop

Managing a professional development shop can be challenging. As an Applications Architect, I must be aware at all times of what is going on in the codebase. I need to ensure that each change is properly documented, does not break existing code, is optimized and follows the company's coding standards, among other things.

If you are using a version control system that supports hooks, such as Subversion, these tasks can be mostly automated (if you're not using a version control system, STOP READING and go get one now, I'll be here waiting). At the time of this post, I have set up the following hooks in our Subversion repository to manage a PHP shop (it's worth mentioning that these hooks are useful for any codebase, not just for PHP):

Pre-commit:
  • Requiring a log message for each committed changeset
  • Running lint checks on the committed code
  • Ensuring the committed code follows the company's coding standards
Post-commit:
  • Sending an email to the development group with the commit details

Requiring the developer to add a log message makes it easier track changes when viewing the change history of the codebase. This message is also used in the email that is sent post-commit (more on that later).

Lint checking, or automated code analysis, can help avoid potential bugs from being committed. At the very least, your lint hook should ensure that no syntax errors get committed. There is bound to be at least one code analyzer for your programming language if the language is widely used (check this list). If using PHP, the PHP CLI itself can be used as a lint checker by passing the -l option.

Opinions vary on forcing committed code to follow coding standards. Some say this should be done as a post-commit hook rather than pre-commit, sending an email out to the development group for review. Others say this is an unnecessary check and review of the code against coding standards belongs to a separate tool, as an outside process. Based on my experience, this hook is necessary, and should be done pre-commit. As a perfectionist, I can assure you that no matter how much you stress the importance of following standards, your developers will not care about this to the same degree you do and will commit immediately after testing their changes, no matter how their code looks like in its working state. Asking the developers to go back and fix their code so it conforms to the standard is a tiring process and rarely works. When there are tight deadlines, the developers' concerns are to implement all the needed features to get the project out the door quickly. The last thing on their minds is to "make the code look pretty". Before you know it, there will be no uniformity in your codebase and maintenance will become a nightmare.

A pre-commit hook will avoid all the issues above (for PHP, look at the PHP_CodeSniffer PEAR package). Even though the developers will hate you at first, this will force them to care about your coding standard, if only to avoid wasting time having all their changes be constantly rejected. The coding standard hook should not only error out and reject the commit, but it should also print out the lines in the code that break the standard and what specific rules were violated. This will help them immediately resolve the issues instead of them having them re-read the coding standards document. As they keep seeing these messages, they will memorize and apply the rules of the standard, which was your goal in the first place. Some might argue that if your developers don't find that conforming to the coding standard is important, that you have a personnel problem, and not a management one. To that, I say good luck finding developers that hold this in high regard, and by the time you find one, be prepared to pay them a higher salary, as developers who do care about this usually care about other important things, such as code design and performance, which means they're good and experienced.

Finally, sending an email post-commit helps all in the development group be aware of what's going on in the codebase. The email sent out should include the name of the developer which committed the change, the date of the commit, the log message the developer entered, and the diff output of the changes. This is not only useful for the other developers, but also for the person assigned to do code reviews (usually the architect). Unoptimized or badly-implemented code will be caught quickly in development, before it's too late and the change ends up in production.