
Sunday, 01 November 2009

Code Reviews

NOTE: This is a repost of an old post, as I am moving to the Blogger platform

Code reviews are a proven, effective way to minimize defects, improve code quality and keep code more maintainable.  They encourage team collaboration and assist with mentoring developers. Yet not many projects employ code reviews as a regular part of their development process. Why is this?  Programmer egos and the hassle of packaging source code for review are often cited as reasons for not doing code reviews.

I felt that code reviews should form part of a good code quality control process that includes:

  • Visual inspection/QA via peer code reviews
  • Automated code standards check (I prefer FxCop)
  • Automated code metrics check (I prefer NDepend)
  • Automated code coverage statistics (I prefer NCover and NCoverExplorer)

In this post I will spend some time looking at code reviews.  I'll start by considering different styles of code review and some code review metrics. I'll then move on to some thoughts on review frequencies and best practices for peer code review.  I'll finish off by considering some tools that can assist with the code review process itself.

References

I used the following books and articles to organize my thoughts on the topic:

  1. Smart Bear Software's free, excellent book Best Kept Secrets of Peer Code Review. [Cohen01]
  2. Smart Bear Software's whitepaper on Best Practices for Peer Code Review. [Cohen02]
  3. Robert Bogue's article on Effective Code Reviews Without the Pain. [Bogue]
  4. Wikipedia article on Code Review. [Wikipedia01]
  5. Wikipedia article on Hawthorne Effect. [Wikipedia02]

Review Types

Code review practices often fall into two main categories: formal code review and lightweight code review.

Formal Code Review

Formal code review, such as a Fagan inspection, involves a careful and detailed process with multiple participants and multiple phases. Software developers attend a series of meetings and review code line by line, usually using printed copies of the material. An Introductory meeting is held where the goals and rules for the review are explained and the review material is handed out. Reviewers inspect the code privately and feedback is given in a subsequent Inspection meeting. The author fixes any defects identified and the reviewers verify that the defects are fixed in a further Verification meeting.

Formal inspections are extremely thorough and effective at finding defects in code but take a long time to complete.

Lightweight Code Review

Lightweight code review typically requires less overhead than formal code inspections. However, if done properly, it can be equally effective.  Lightweight reviews are often conducted as part of the normal software development process and attempt to improve the cost-benefit factor – providing improved code quality without incurring the overhead of traditional meetings-based inspections [Wikipedia01] [Cohen01]:

  • Over-the-shoulder review – One developer stands at the author’s workstation while the author walks the reviewer through a set of code changes. Simple to execute, these kinds of reviews lend themselves to learning and sharing between developers and get people interacting with each other instead of hiding behind impersonal e-mail and instant messages. The informality and simplicity unfortunately also lead to some shortcomings: the process is not enforceable; there are no code review process metrics; it does not work for distributed teams; the reviewer can be led too hastily through the code; and the reviewer does not verify that defects were fixed properly.
  • Email pass-around – The author or the source code management system packages the code changes and sends an e-mail to the reviewers. Relatively simple to execute, these kinds of reviews have the added benefit of working equally well with distributed teams. Other people, like domain experts, can also be brought in to review certain areas, or a reviewer may even defer to another reviewer. Some shortcomings include the difficulty of tracking the various threads of conversation or code changes, the lack of code review process metrics, and the reviewers not being able to verify that defects were fixed properly.
  • Tool-assisted reviews – Specialized tools are used in all aspects of the review such as collecting files, transmitting and displaying files, commentary, collecting metrics and controlling the code review process workflow. The major drawback of any tool-assisted review is the cost of buying a commercial offering or the development cost associated with developing an in-house tool.
  • Pair Programming – Two developers write code together at a single workstation using continuous free-form discussion and review, as is common in Extreme Programming. Some people argue in favour of the deep insight the reviewer has into the problem at hand, whilst others argue that this closeness is exactly what you do not want from a reviewer, as you want a fresh, unbiased opinion. Some people suggest doing both pair programming and a follow-up standard review with a fresh pair of eyes.

In a current project that I am working on, we are using the over-the-shoulder kind of review process.  Rather than doing no review at all, we opted for this approach as the best way to balance the cost-benefit factor of our reviews.  Every submission into the repository needs to be reviewed by a fellow developer.  The details of the reviewer are added to the check-in note to provide the necessary traceability and visibility.  Authors rotate the developers doing the reviews based on the area of the system changed and also to do some cross-skilling.
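As an illustration of the traceability step, a server-side check could reject any check-in note that does not name a reviewer. This is only a sketch under my own assumptions: the "Reviewed-by:" convention and the function below are hypothetical, not part of any particular source control tool.

```python
import re

# Hypothetical convention: every check-in note must carry a
# "Reviewed-by: <name>" line identifying the fellow developer
# who performed the over-the-shoulder review.
REVIEWER_PATTERN = re.compile(r"^Reviewed-by:\s*\S+", re.MULTILINE)

def check_in_note_is_valid(note):
    """Return True if the check-in note names a reviewer."""
    return bool(REVIEWER_PATTERN.search(note))

print(check_in_note_is_valid("Fix null ref in billing\nReviewed-by: J. Smith"))  # True
print(check_in_note_is_valid("Quick fix, no review"))                            # False
```

A check like this could be wired into a pre-commit or server-side hook, which would stop the process from degrading into a mere formality under pressure.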

From our experience, the informality of the process is also its biggest downfall.  As pressure starts to build, less time is spent on reviews and the process sometimes gets perilously close to becoming a mere formality.

Review Metrics

The following metrics are typically used to measure the effectiveness of a code review process:

  • Inspection Rate: How fast are we able to review code? Normally measured in kLOC (thousand Lines Of Code) per man-hour.
  • Defect Rate: How fast are we able to find defects? Normally measured in number of defects found per man-hour.
  • Defect Density: How many defects do we find in a given amount of code? Normally measured in number of defects found per kLOC.  The higher the defect density, the more defects are being identified which usually implies a more effective review process.
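The three metrics boil down to simple ratios. A minimal sketch (the function names are my own, not from any of the referenced sources):

```python
def inspection_rate(loc_reviewed, hours):
    """kLOC (thousand lines of code) reviewed per man-hour."""
    return (loc_reviewed / 1000.0) / hours

def defect_rate(defects_found, hours):
    """Defects found per man-hour."""
    return defects_found / hours

def defect_density(defects_found, loc_reviewed):
    """Defects found per kLOC."""
    return defects_found / (loc_reviewed / 1000.0)

# A review of 400 lines taking 1 man-hour and finding 2 defects:
print(inspection_rate(400, 1))   # 0.4 kLOC/hour
print(defect_rate(2, 1))         # 2.0 defects/hour
print(defect_density(2, 400))    # 5.0 defects/kLOC
```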

This raises the question: what are good values for these metrics?  [Cohen01] uses two examples to illustrate the difficulties with evaluating these metrics:

In Example A, a few lines of code added to a mission-critical module are evaluated by 3 reviewers to absolutely ensure that no defects are introduced.  The review results in a high defect density (say 4 defects in a few lines), a slow inspection rate and a defect rate of 1 defect per hour.  In Example B, changes made to a GUI dialog box result in 120 lines of changed code, some of which was generated by the GUI designer.  One reviewer is assigned to verify the changes and chooses to ignore the designer-generated code, evaluating only the code behind the GUI elements added.  This results in a low defect density (say 1 bug in 120 lines), a fast inspection rate (the whole review takes, say, 30 minutes) and a defect rate of 2 defects per hour.
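As a back-of-the-envelope check, Example B's figures can be plugged into the metric definitions above (treating the 30-minute review by one reviewer as half a man-hour):

```python
# Example B: one reviewer, 30 minutes, 120 changed lines, 1 bug found.
loc_reviewed = 120
hours = 0.5
defects_found = 1

print(round((loc_reviewed / 1000.0) / hours, 2))          # inspection rate: 0.24 kLOC/hour
print(round(defects_found / hours, 1))                    # defect rate: 2.0 defects/hour
print(round(defects_found / (loc_reviewed / 1000.0), 1))  # defect density: 8.3 defects/kLOC
```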

As is evident, the metrics are quite different. Should the high defect density of Example A reflect badly on the developer? [Cohen01] argues probably not, as the high defect density is the result of multiple reviewers scrutinizing every line of code to make absolutely sure no defects are introduced.  Should the low defect density of Example B reflect badly on the reviewer?  [Cohen01] argues probably not, as it is difficult to review designer-generated code.  As the code is also not mission critical like in Example A, the reviewer was perhaps intentionally tasked to spend less time reviewing the changes.

Review Frequency

There are no hard and fast rules for determining the frequency of code reviews. The frequency of code reviews can be influenced by quite a few factors and has to be determined contextually. Some of the factors to consider include:

  • Review type: Formal Code Review requires more time and effort to complete than Lightweight Code Review.
  • Review purpose: Code reviews used to mentor and improve team collaboration might need to be executed more frequently.
  • Frequency and scope of code changes: Frequent, small check-ins versus irregular, big check-ins.
  • Nature of the development effort: Open Source; Onshore; Offshore.
  • Business need for quality: Mission critical applications need to ensure a high level of quality.

Having said all this, doing infrequent code reviews, or waiting until code is complete before reviewing, seriously undermines the benefit and integrity of the code review process. Reviewing code frequently motivates developers to ensure the quality of the code being delivered – better known as the Hawthorne Effect [Wikipedia02].

Best Practices for Peer Code Review

[Cohen02] and [Bogue] present several techniques to ensure that your code reviews improve your code without wasting your developers’ time:

  • Verify that defects are actually fixed!
  • Remember the purpose: Code reviews often start off on the wrong foot because they are perceived as an unnecessary step that has been forced upon the developers or, in some cases, as evidence that management doesn’t trust the developers. Code reviews are a proven, effective way to minimize defects and they are, at their core, an industry best practice.
  • A matter of approach: Prevent code reviews from becoming mental jousting matches where people take shots at each other. Consider the review process as a forum to collaborate and learn from one another. Reviewers should ask questions rather than making statements; remember to praise and also be mindful of the fact that there is often more than one way to approach a solution. Authors should try to hear past attacking comments and try to focus on the learning that they can get out of the process.
  • Review fewer than 200-400 lines of code at a time and aim for an inspection rate of less than 300-500 LOC/hour: Take your time when doing a review - faster is not better.
  • Never review for more than 60-90 minutes at a time. 
  • Authors should consider how to best explain their changes before the review begins: [Cohen02] refers to this process as “annotating the source code”. As the author has to re-think and explain the changes during the annotation process, the author will uncover many defects before the review even begins.
  • Checklists substantially improve results for both authors and reviewers: Omissions are the hardest defects to find. A checklist is a great way to avoid this problem as it reminds the reviewer or author to take the time to look for something that might be missing. Publish the checklists on a wiki. As each person typically makes the same 15-20 mistakes, also consider creating personal checklists.
  • Management must foster a good code review culture in which finding defects is viewed positively: A negative attitude towards defects found can sour a whole team and sabotage the bug-finding process.
  • Beware of the “Big Brother” effect: Code review metrics should never be used to single out developers, particularly not in front of their peers. Metrics should be used to measure the efficiency or the effect of the process.
  • The Ego Effect: Do at least some code review, even if you don’t have time to review it all. The Ego effect drives developers to review their own work and write better code because they know others will be looking at their code.


Tools

I found the following tools to support a lightweight, tool-assisted peer code review process:

  1. Code Collaborator - Commercial offering from Smart Bear Software that supports online reviews with inline source commenting, threaded contextual chat, asynchronous reviews for when participants are separated by many time zones, review metrics and reports, version control integration, customisable workflow and much more.
  2. Crucible - Commercial offering from Atlassian that supports online reviews with inline source commenting, threaded comments, review metrics, version control integration, workflow and much more.
  3. Codestriker - Open source project that supports online reviews with version control integration and review metrics.

Conclusion

Well that about covers my thoughts on the topic.  I'm interested to know how many people are actually doing some form of peer code review.  What type of reviews are you doing?  How often do you do reviews? Do you use any tools to assist you with your reviews?  What code review practices work best in your environment?
