Sunday 01 November 2009

Code Metrics

NOTE: This is a repost of an old post as I am moving onto the Blogger platform

I've been wanting to do a post about code metrics for quite a while now - mostly to organize my thoughts on the topic as it is something that I want to introduce at work, but also to get some feedback from other people as to if and how they are using metrics to assist them in crafting quality code. After reading Jeremy Miller's post on the topic, I thought I might as well take the plunge. I'll start by musing over which metrics I found to be useful, continue by looking at tool support for generating these metrics, and finish off by considering when to use them.

Metrics

I am not going to cover all the different metrics in detail but will instead highlight what seem to me to be the most useful ones and refer to articles where other people have done an excellent job of covering them in detail. Here are the reference articles that I used:

  1. Robert C. Martin's article on OO design quality metrics [1]
  2. Wikipedia's summary of these package metrics [2]
  3. Kirk Knoernschild's excellent introductory article on metrics with sample refactorings included [3]
  4. Patrick Smacchia's (developer of NDepend software) excellent coverage on all the types of metrics supported by NDepend [4]
  5. Write up on the software metrics supported by the Software Design Metrics software [5]

Size metrics

Size metrics are consistently good indicators of fault-proneness: large methods/classes/packages contain more faults [5].

  • Source Lines Of Code (SLOC) measures the number of lines of code. To be really useful, comment lines need to be excluded and statements that have been broken across multiple lines need to be counted once. Some people refer to this as the logical LOC vs. the physical LOC.

"2 significant advantages of logical LOC over physical LOC are:

  • Coding style doesn’t interfere with logical LOC. For example the LOC won’t change because a method call is spawn on several lines because of a high number of argument.
  • logical LOC is independent from the language. Values obtained from assemblies written with different languages are comparable and can be summed." [4]
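
To make the distinction concrete, here is a minimal sketch that counts physical lines and a rough logical count for C#-style source. The function name and the "a statement ends with a semicolon" heuristic are my own simplifications for illustration; NDepend derives its logical LOC from compiled assemblies rather than from source text, so treat this only as an approximation of the idea.

```python
def count_lines(source):
    """Return (physical, logical) line counts for C#-style source text.

    Assumption: a logical line is a non-comment line that ends a statement
    with ';'. Block comments and constructs such as 'for (;;)' are ignored
    to keep the sketch short.
    """
    physical = 0
    logical = 0
    for raw_line in source.splitlines():
        physical += 1
        line = raw_line.strip()
        if not line or line.startswith("//"):
            continue  # blank lines and line comments carry no statements
        if line.endswith(";"):
            logical += 1  # a wrapped call only counts once, on its last line
    return physical, logical


SAMPLE = """// compute total
int total = Add(
    first,
    second);

return total;
"""

physical, logical = count_lines(SAMPLE)
print(physical, logical)  # 6 physical lines, 2 logical lines
```

The method call wrapped over three lines contributes three physical lines but only one logical line, which is exactly the coding-style independence the quote describes.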

Complexity metrics

There is a direct correlation between complexity and the defect rate of software, so keeping code simple is a solid first step toward lowering the defect rate of software [3].

  • Cyclomatic Complexity (CC, also abbreviated CCN) measures code complexity by counting the number of linearly independent paths through code. Complex conditionals and boolean operators increase the number of independent paths, resulting in a higher CCN. Methods with a CCN of five or higher are good refactoring candidates to help ensure code remains easy to understand. [3]
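
As a rough illustration of how such a count can be obtained, the sketch below walks a Python syntax tree and adds one for every decision point. It uses Python's standard `ast` module purely for convenience; the set of node types counted is my own simplification (comprehensions and `match` statements are ignored, for example), and real tools such as NDepend work on compiled .NET code instead.

```python
import ast

# Node types treated as decision points in this simplified sketch.
DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def cyclomatic_complexity(source):
    """Approximate cyclomatic complexity: 1 + number of decision points."""
    tree = ast.parse(source)
    complexity = 1  # a single straight-line path to start with
    for node in ast.walk(tree):
        if isinstance(node, DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' adds two extra paths, one per operator
            complexity += len(node.values) - 1
    return complexity


SAMPLE = '''
def classify(order):
    if order.total > 100 and order.customer.is_vip:
        return "priority"
    for item in order.items:
        if item.backordered:
            return "delayed"
    return "standard"
'''

print(cyclomatic_complexity(SAMPLE))  # 5
```

The sample function scores 5 (one base path, two `if` statements, one `and`, one `for`), so under the guideline above it would already be a refactoring candidate.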

Package metrics

Coupling/Dependency metrics

Excessive dependencies between packages compromise architecture and design. Complex dependencies inhibit the testability of your system and present numerous other challenges, as described in [1] and [3].

  • Afferent Coupling (Ca) measures the number of types outside a package that depend on types within the package (incoming dependencies). High afferent coupling indicates that the concerned packages have many responsibilities. [1]
  • Efferent Coupling (Ce) measures the number of types inside a package that depend on types outside of the package (outgoing dependencies). High efferent coupling indicates that the concerned package is dependent. [1]

It stands to reason that:

"...afferent and efferent coupling allows you to more effectively evaluate the cost of change and the likelihood of reuse. For instance, maintaining a module with many incoming dependencies is more costly and risky since there is greater risk of impacting other modules, requiring more thorough integration testing. Conversely, a module with many outgoing dependencies is more difficult to test and reuse since all dependent modules are required ... Concrete modules with high afferent coupling will be difficult to change because of the high number of incoming dependencies. Modules with many abstractions are typically more extensible, so long as the dependencies are on the abstract portion of a module." [3]

  • Instability (I) measures the ratio of efferent coupling (Ce) to total coupling. I = Ce / (Ce + Ca). This metric is an indicator of the package's resilience to change. The range for this metric is 0 to 1, with I=0 indicating a completely stable package and I=1 indicating a completely instable package. [1]
  • Abstractness (A) measures the ratio of the number of internal abstract types (i.e abstract classes and interfaces) to the number of internal types. The range for this metric is 0 to 1, with A=0 indicating a completely concrete package and A=1 indicating a completely abstract package. [1]
  • Distance from main sequence (D) measures the perpendicular normalized distance of a package from the idealized line A + I = 1 (called main sequence). This metric is an indicator of the package's balance between abstractness and stability. A package squarely on the main sequence is optimally balanced with respect to its abstractness and stability. Ideal packages are either completely abstract and stable (I=0, A=1) or completely concrete and instable (I=1, A=0). The range for this metric is 0 to 1. [1][4]

"A value approaching zero indicates a module is abstract in relation to its incoming dependencies. As distance approaches one, a module is either concrete with many incoming dependencies or abstract with many outgoing dependencies. The first case represents a lack of design integrity, while the second is useless design." [3]
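
The three package metrics above are simple ratios, so a small worked example may help. This is a sketch only: the `PackageCounts` type and the function names are hypothetical, the normalized distance is taken as D = |A + I - 1|, and returning 0.0 for a package with no couplings or no types is my own convention rather than something the references prescribe.

```python
from dataclasses import dataclass


@dataclass
class PackageCounts:
    afferent: int        # Ca: incoming dependencies
    efferent: int        # Ce: outgoing dependencies
    abstract_types: int  # abstract classes and interfaces in the package
    total_types: int     # all types in the package


def instability(p):
    """I = Ce / (Ce + Ca); 0.0 when the package has no couplings (assumption)."""
    total = p.afferent + p.efferent
    return p.efferent / total if total else 0.0


def abstractness(p):
    """A = abstract types / total types; 0.0 for an empty package (assumption)."""
    return p.abstract_types / p.total_types if p.total_types else 0.0


def distance_from_main_sequence(p):
    """D = |A + I - 1|, the normalized distance from the line A + I = 1."""
    return abs(abstractness(p) + instability(p) - 1)


# Hypothetical package: 6 incoming and 2 outgoing dependencies, 1 abstract type out of 4.
package = PackageCounts(afferent=6, efferent=2, abstract_types=1, total_types=4)
print(instability(package))                  # 0.25
print(abstractness(package))                 # 0.25
print(distance_from_main_sequence(package))  # 0.5
```

This hypothetical package is fairly stable (I = 0.25) yet mostly concrete (A = 0.25), which puts it halfway between the main sequence and the "concrete with many incoming dependencies" corner described in the quote.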

Cohesion metrics

"A low cohesive design element has been assigned many unrelated responsibilities. Consequently, the design element is more difficult to understand and therefore also harder to maintain and reuse. Design elements with low cohesion should be considered for refactoring, for instance, by extracting parts of the functionality to separate classes with clearly defined responsibilities." [5]

  • Relational Cohesion (H) measures the average number of internal relationships per type. Let R be the number of type relationships that are internal to this package (i.e. that do not connect to types outside the package). Let N be the number of types within the package. H = (R + 1)/ N. The extra 1 in the formula prevents H=0 when N=1. The relational cohesion represents the relationship that this package has to all its types. As classes inside a package should be strongly related, the cohesion should be high. On the other hand, too high values may indicate over-coupling. A good range for RelationalCohesion is 1.5 to 4.0. Packages where RelationalCohesion < 1.5 or RelationalCohesion > 4.0 might be problematic. [4]
  • Lack of Cohesion of Methods (LCOM) measures how poorly the methods of a class relate to one another. The single responsibility principle states that a class should not have more than one reason to change; such a class is said to be cohesive. A high LCOM value generally pinpoints a poorly cohesive class. [4]
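
The relational cohesion formula can be sketched as follows. The dependency map, the type names and the helper names are all hypothetical; the only parts taken from the text are H = (R + 1) / N and the suggested 1.5 to 4.0 range.

```python
def count_internal_relationships(uses):
    """Count relationships whose source and target both live in the package.

    'uses' maps each type in the package to the set of types it references.
    """
    members = set(uses)
    return sum(len(targets & members) for targets in uses.values())


def relational_cohesion(internal_relationships, type_count):
    """H = (R + 1) / N."""
    if type_count <= 0:
        raise ValueError("a package needs at least one type")
    return (internal_relationships + 1) / type_count


def in_suggested_range(h, low=1.5, high=4.0):
    """True when H falls inside the 1.5 to 4.0 range quoted above."""
    return low <= h <= high


# Hypothetical package with four types; 'Logger' lives outside the package.
USES = {
    "Order": {"OrderLine", "Customer", "Logger"},
    "OrderLine": {"Product"},
    "Customer": set(),
    "Product": set(),
}

r = count_internal_relationships(USES)
h = relational_cohesion(r, len(USES))
print(r, h, in_suggested_range(h))  # 3 1.0 False
```

With only three internal relationships across four types, this package scores H = 1.0 and falls below the suggested range, hinting that its types may not belong together.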

Inheritance metrics

"Deep inheritance structures are hypothesized to be more fault-prone. The information needed to fully understand a class situated deep in the inheritance tree is spread over several ancestor classes, thus more difficult to overview.  Similar to high export coupling, a modification to a design element with a large number of descendents can have a large effect on the system." [5]

  • Depth of Inheritance Tree (DIT) measures the number of base classes for a class or structure. Types where DIT is higher than 6 might be hard to maintain. However, this is not a hard rule, since sometimes your classes might inherit from third-party classes which already have a high depth of inheritance. [4]
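
For illustration, the depth can be computed by walking up the base classes. The sketch below does this for Python classes, with hypothetical class names; it assumes the root type (`object` here, playing the role of System.Object) has depth 0, so a class deriving directly from it has depth 1.

```python
def depth_of_inheritance(cls):
    """Number of base classes on the longest path from cls up to object."""
    if cls is object:
        return 0
    return 1 + max(depth_of_inheritance(base) for base in cls.__bases__)


# Hypothetical three-level hierarchy.
class Control:
    pass


class Button(Control):
    pass


class ImageButton(Button):
    pass


print(depth_of_inheritance(ImageButton))  # 3
```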

Tools

When it comes to tools, the Mercedes Benz of .NET code metrics tools from my point of view has to be NDepend 2.0. NDepend 2 provides more than 60 metrics (including all of the metrics listed above) and includes integration into your automated build via support for MSBuild, NAnt and CruiseControl.NET. Browse to here for a sample report and here for a demo on how to integrate it into your build.

There is a GUI (VisualNDepend) that allows you to browse your code structure and evaluate the metrics, as well as a console application with which you can generate the metrics. Patrick has also created CQL (Code Query Language), which allows NDepend to treat your code as a database that you can query to check assertions. CQL is similar to SQL and supports the typical SELECT TOP FROM WHERE ORDER BY patterns. Here is an example of a CQL query:

WARN IF Count > 0 IN SELECT METHODS WHERE NbILInstructions > 200 ORDER BY NbILInstructions DESC
// METHODS WHERE NbILInstructions > 200 are extremely complex and
// should be split in smaller methods.

How cool is this! To quote:

"CQL constraints are customisable and typically tied with a particular application. For example, they can allow the specification of customized encapsulation constraints, such as, I want to ensure that this layer will never use this other layer or I want to ensure that this class will never be instantiated outside this particular namespace."

VisualNDepend also provides a CQL editor which supports intellisense and verbose compile error descriptions to make writing CQL queries a lot easier.  Enough said!  Browse to here for a complete overview of the NDepend 2 features.

Other tools that you can have a look at include SourceMonitor, DevMetrics, Software Design Metrics and vil, to name a few. vil does not support .NET 2.0 and does not seem to be under active development. DevMetrics, after being open-sourced, seems to have stagnated, with no visible activity on SourceForge. SourceMonitor is under active development and supports a variety of programming languages. However, it supports only a small subset of the metrics mentioned and does not cover important metrics such as efferent and afferent coupling. Software Design Metrics takes a novel approach in that it measures complexity based on the UML models for the software. This has the advantage of being language independent, but you obviously need to have UML models to run the analysis.

When to use

When should one use these metrics? I agree with Jeremy Miller in his post that the metrics should not replace the visual inspection/QA process but should be used in addition to it. It would be nice to have these metrics at hand to assist in the QA process though. I also agree with Frank Kelly in his post that a working system with no severity 1/2 errors and happy end users is more important than getting the right balance of Ca/Ce or whatever metric you are interested in.

I think I will stick with an approach of identifying a subset of useful metrics and using these as part of an overall process of static code analysis on a regular basis. When I say regular basis I feel it should be part of your continuous build process to prevent people from committing code into your repository that does not satisfy your constraints. With a tool like NDepend you can create your own custom level of acceptance criteria by which the build will fail/succeed and exclude metrics that you feel should not apply to your code base.
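
The idea of failing the build on metric violations can be sketched independently of any particular tool. Everything below is hypothetical (the metric names, the limits, the measurements and the function names); it is not how NDepend implements its constraints, only an illustration of acceptance criteria that turn into a build exit code.

```python
# Hypothetical limits: the highest value each metric may reach before the build fails.
LIMITS = {"cyclomatic_complexity": 4, "depth_of_inheritance": 6}


def find_violations(measurements, limits):
    """Return one message per measured value that exceeds its limit."""
    violations = []
    for element, metrics in sorted(measurements.items()):
        for name, limit in sorted(limits.items()):
            value = metrics.get(name)
            if value is not None and value > limit:
                violations.append(f"{element}: {name} = {value} exceeds limit {limit}")
    return violations


def build_exit_code(violations):
    """Non-zero exit codes conventionally make a build step fail."""
    return 1 if violations else 0


# Hypothetical measurements, as a metrics tool might report them.
MEASUREMENTS = {
    "OrderService.Submit": {"cyclomatic_complexity": 7},
    "OrderService.Cancel": {"cyclomatic_complexity": 2},
    "ImageButton": {"depth_of_inheritance": 3},
}

violations = find_violations(MEASUREMENTS, LIMITS)
for message in violations:
    print(message)
print(build_exit_code(violations))  # 1
```

Excluding a metric that should not apply to your code base then simply means leaving it out of the limits.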

As mentioned, the code metrics should form part of a bigger quality process that includes:

  • Visual inspection/QA via peer code reviews (as mentioned having the metrics via a tool like VisualNDepend can greatly assist in the QA)
  • Automated code standards check (I prefer FxCop)
  • Automated code metric check (NDepend seems like the tool to use here)
  • Automated code coverage statistics (I prefer NCover and NCoverExplorer)

Well, that basically covers my thoughts on the topic for now. I'm interested to know how many people are actively using metrics and what metrics they are using. I'd also like to know what processes or tools people are using to generate these metrics on their code. Let me know what's working for you in your environment.

Code Reviews

NOTE: This is a repost of an old post as I am moving onto the Blogger platform

Code reviews are a proven, effective way to minimize defects, improve code quality and keep code more maintainable. They encourage team collaboration and assist with mentoring developers. Yet, not many projects employ code reviews as a regular part of their development process. Why is this? Programmer egos and the hassles of packaging source code for review are often cited as reasons for not doing code reviews.

I felt that code reviews should form part of a good code quality control process that includes:

  • Visual inspection/QA via peer code reviews
  • Automated code standards check (I prefer FxCop)
  • Automated code metrics check (I prefer NDepend)
  • Automated code coverage statistics (I prefer NCover and NCoverExplorer)

In this post I will spend some time looking at code reviews.  I'll start by considering different styles of code review and some code review metrics. I'll then move on to some thoughts on review frequencies and best practices for peer code review.  I'll finish off by considering some tools that can assist with the code review process itself.

References

I used the following books and articles to organize my thoughts on the topic:

  1. Smart Bear Software's free, excellent book Best Kept Secrets of Peer Code Review. [Cohen01]
  2. Smart Bear Software's whitepaper on Best Practices for Peer Code Review. [Cohen02]
  3. Robert Bogue's article on Effective Code Reviews Without the Pain. [Bogue]
  4. Wikipedia article on Code Review. [Wikipedia01]
  5. Wikipedia article on Hawthorne Effect. [Wikipedia02]

Review Types

Code review practices often fall into two main categories: formal code review and lightweight code review.

Formal Code Review

Formal code review, such as a Fagan inspection, involves a careful and detailed process with multiple participants and multiple phases. Software developers attend a series of meetings and review code line by line, usually using printed copies of the material. An Introductory meeting is held where the goals and rules for the review are explained and the review material is handed out. Reviewers inspect the code privately and feedback is given in a subsequent Inspection meeting. The author fixes any defects identified and the reviewers verify that the defects are fixed in a further Verification meeting.

Formal inspections are extremely thorough and effective at finding defects in code but take a long time to complete.

Lightweight Code Review

Lightweight code review typically requires less overhead than formal code inspections. However, if done properly, it can be equally effective.  Lightweight reviews are often conducted as part of the normal software development process and attempt to improve the cost-benefit factor – providing improved code quality without incurring the overhead of traditional meetings-based inspections [Wikipedia01] [Cohen01]:

  • Over-the-shoulder review – One developer stands at the author’s workstation while the author walks the reviewer through a set of code changes. Simple to execute, these kinds of reviews lend themselves to learning and sharing between developers and get people interacting with each other instead of hiding behind impersonal e-mail and instant messages. The informality and simplicity unfortunately also lead to some shortcomings: the process is not enforceable; there are no code review process metrics; it does not work for distributed teams; the reviewer may be led too hastily through the code; and the reviewer may not verify that defects were fixed properly.
  • Email pass-around – The author or source code management system packages the code changes and sends an e-mail to the reviewers. Relatively simple to execute, these kinds of reviews have the added benefit of working equally well with distributed teams. Other people, like domain experts, can also be brought in to review certain areas or a reviewer may even defer to another reviewer. Some shortcomings include the difficulty of tracking the various threads of conversation or code changes, the lack of code review process metrics, and the reviewers not being able to verify that defects were fixed properly.
  • Tool-assisted reviews – Specialized tools are used in all aspects of the review such as collecting files, transmitting and displaying files, commentary, collecting metrics and controlling the code review process workflow. The major drawback of any tool-assisted review is the cost of buying a commercial offering or the development cost associated with developing an in-house tool.
  • Pair Programming – Two developers write code together at a single workstation using continuous free-form discussion and review, as is common in Extreme Programming. Some people argue in favour of the deep insight the reviewer has into the problem at hand, whilst others argue that the closeness is exactly what you do not want from a reviewer as you want a fresh, unbiased opinion. Some people suggest doing both pair programming and a follow-up standard review using a fresh pair of eyes.

In a current project that I am working on, we are using the over-the-shoulder kind of review process. Rather than doing no review, we opted for this approach as the best way to balance the cost-benefit factor of our reviews. Every submission into the repository needs to be reviewed by a fellow developer. The details of the reviewer are added to the check-in note to provide the necessary traceability and visibility. Authors rotate the developers doing the reviews based on the area of the system changed and also to do some cross-skilling.

From our experience, we also find the informality of the process to be its biggest downfall. As pressure starts to build, less time is spent on doing reviews and the process sometimes gets perilously close to becoming a mere formality.

Review Metrics

The following metrics are typically used to measure the effectiveness of a code review process:

  • Inspection Rate: How fast are we able to review code? Normally measured in kLOC (thousand Lines Of Code) per man-hour.
  • Defect Rate: How fast are we able to find defects? Normally measured in number of defects found per man-hour.
  • Defect Density: How many defects do we find in a given amount of code? Normally measured in number of defects found per kLOC.  The higher the defect density, the more defects are being identified which usually implies a more effective review process.

This raises the question: what are good values for these metrics? [Cohen01] uses two examples to illustrate the difficulties with evaluating these metrics:

In Example A, a few lines of code in a mission critical module are evaluated by 3 reviewers to absolutely ensure that no defects are introduced. The review results in a high defect density (say 4 defects in a few lines), a slow inspection rate and a defect rate of 1 defect per hour. In Example B, changes made to a GUI dialog box result in 120 lines of changed code, some of which was generated by the GUI designer. One reviewer is assigned to verify the changes and the reviewer chooses to ignore the designer generated code and only evaluates the code behind the GUI elements added. This results in a low defect density (say 1 bug in 120 lines), a fast inspection rate (say 30 minutes) and a defect rate of 2 defects per hour.

As evident, the metrics are quite different. Should the high defect density rate of Example A reflect badly on the developer? [Cohen01] argues probably not as the high defect density is the result of the multiple reviewers scrutinizing every line of code to make absolutely sure no defects are introduced.  Should the low defect density rate of Example B reflect badly on the reviewer?  [Cohen01] argues probably not as it is difficult to review designer generated code.  As the code is also not mission critical like in Example A, the reviewer was perhaps intentionally tasked to spend less time reviewing the changes.
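
The arithmetic behind Example B can be sketched as follows. The function and key names are hypothetical; the inputs (120 changed lines, 1 defect, one reviewer spending 30 minutes) are the figures from the example, with all 120 lines counted as reviewed.

```python
def review_metrics(lines_reviewed, defects_found, reviewer_hours):
    """Compute the three review metrics from raw review data."""
    kloc = lines_reviewed / 1000
    return {
        "inspection_rate_kloc_per_hour": kloc / reviewer_hours,
        "defect_rate_per_hour": defects_found / reviewer_hours,
        "defect_density_per_kloc": defects_found / kloc,
    }


# Example B: 120 changed lines, 1 defect found, one reviewer for half an hour.
example_b = review_metrics(lines_reviewed=120, defects_found=1, reviewer_hours=0.5)
for name, value in example_b.items():
    print(f"{name}: {value:.2f}")
```

This gives an inspection rate of 0.24 kLOC per hour, a defect rate of 2 defects per hour and a defect density of roughly 8.3 defects per kLOC. Example A cannot be computed in the same way because the text only says "a few lines", which is part of why the two reviews are hard to compare.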

Review Frequency

There are no hard and fast rules for determining the frequency of code reviews. The frequency of code reviews can be influenced by quite a few factors and has to be determined contextually. Some of the factors to consider include:

  • Review type: Formal Code Review requires more time and effort to complete than Lightweight Code Review.
  • Review purpose: Code reviews used to mentor and improve team collaboration might need to be executed more frequently.
  • Frequency and scope of code changes: Frequent, small check-ins versus irregular, big check-ins.
  • Nature of the development effort: Open Source; Onshore; Offshore.
  • Business need for quality: Mission critical applications need to ensure a high level of quality.

Having said all this, doing infrequent code reviews or waiting until code complete to do reviews seriously undermines the benefit and integrity of the code review process. Reviewing code frequently motivates the developers to ensure the quality of the code being delivered – better known as the Hawthorne Effect [Wikipedia02].

Best Practices for Peer Code Review

[Cohen02] and [Bogue] present several techniques to ensure that your code reviews improve your code without wasting your developer’s time:

  • Verify that defects are actually fixed!
  • Remember the purpose: Code reviews often start off on the wrong foot because they are perceived as an unnecessary step that has been forced upon the developers or, in some cases, as evidence that management doesn’t trust the developers. Code reviews are a proven, effective way to minimize defects and they are, at their core, an industry best practice.
  • A matter of approach: Prevent code reviews from becoming mental jousting matches where people take shots at each other. Consider the review process as a forum to collaborate and learn from one another. Reviewers should ask questions rather than making statements; remember to praise and also be mindful of the fact that there is often more than one way to approach a solution. Authors should try to hear past attacking comments and try to focus on the learning that they can get out of the process.
  • Review fewer than 200-400 lines of code at a time and aim for an inspection rate of less than 300-500 LOC/hour: Take your time when doing a review - faster is not better.
  • Never review for more than 60-90 minutes at a time. 
  • Authors should consider how to best explain their changes before the review begins: [Cohen02] refers to this process as “annotating the source code”. As the author has to re-think and explain the changes during the annotation process, the author will uncover many defects before the review even begins.
  • Checklists substantially improve results for both authors and reviewers: Omissions are the hardest defects to find. A checklist is a great way to avoid this problem as it reminds the reviewer or author to take the time to look for something that might be missing. Publish the checklists on a wiki. As each person typically makes the same 15-20 mistakes, also consider creating personal checklists.
  • Management must foster a good code review culture in which finding defects is viewed positively: A negative attitude towards defects found can sour a whole team and sabotage the bug-finding process.
  • Beware of the “Big Brother” effect: Code review metrics should never be used to single out developers, particularly not in front of their peers. Metrics should be used to measure the efficiency or the effect of the process.
  • The Ego Effect: Do at least some code review, even if you don’t have time to review it all. The Ego effect drives developers to review their own work and write better code because they know others will be looking at their code.

Tools

I found the following tools to support a lightweight, tool-assisted peer code review process:

  1. Code Collaborator - Commercial offering from Smart Bear Software that supports online reviews with inline source commenting, threaded contextual chat, asynchronous reviews for when participants are separated by many timezones, review metrics and reports, version control integration, customisable workflow and much more.
  2. Crucible - Commercial offering from Atlassian that supports online reviews with inline source commenting, threaded comments, review metrics, version control integration, workflow and much more.
  3. Codestriker - Open source project that supports online reviews with version control integration and review metrics.

Conclusion

Well that about covers my thoughts on the topic.  I'm interested to know how many people are actually doing some form of peer code review.  What type of reviews are you doing?  How often do you do reviews? Do you use any tools to assist you with your reviews?  What code review practices work best in your environment?

FxCop

NOTE: This is a repost of an old post as I am moving onto the Blogger platform

In previous posts about Code Metrics and Code Reviews, I explored some metrics and techniques that I felt should form part of any good software quality control process. One of the tools that I mentioned is FxCop.

In this post I take a closer look at FxCop.  I start by looking at how FxCop works and how you can fit it into your development process.  I then consider the different rule sets to use and look at how you can utilise FxCop to guide your VS 2003/2005/2008 development efforts.  I finish off the article by linking to articles that show you how to develop your own custom FxCop rules.

Introduction

FxCop analyses .NET assemblies for potential code compliance problems and forms part of static code analysis. With static code analysis the compiled code is checked for compliance to identify possible defects before executing the code as illustrated in the following diagram.

[Diagram: static code analysis of compiled code before execution]

FxCop employs assembly analysis of generated assemblies using an introspection engine. Since the analysis is performed on the generated intermediate language (MSIL) code, FxCop is not dependent on any particular .NET implementation language. However, it is important to remember that assembly analysis cannot analyse certain aspects of the original source code, such as code comments, since these are not carried over in the compilation process. You may also run into scenarios where the FxCop violations differ between Debug and Release configurations due to the compiler optimisations being applied in Release mode.

FxCop ships with a large rule set analysing library design, globalization, interoperability, maintainability, mobility, naming, performance, portability, reliability, security and usage aspects of the assembly. The rule set can be extended to include new custom rules and existing rules can be switched off if necessary.

Process

FxCop will typically be used by the following people:

  • Developers use FxCop continuously during development, because it helps them familiarise themselves with .NET coding best practices. As they progress they should find that they break fewer rules and can consequently rely on the tool less often. However, they should still continue to evaluate their code at predefined intervals in the SDLC. By actively using this tool, developers raise the standard of code going into code review sessions. I advocate using the tool before every commit.
  • Code reviewers use the tool to verify that developers are indeed complying with the best practices as defined in the rule set. They should also check that they agree with any "excluded" defects. Use of FxCop should rapidly improve both the speed and breadth of code review sessions by instantly highlighting the most obvious problems.

I strongly recommend using FxCop together with a refactoring tool like ReSharper which makes fixing the rule violations a lot easier and less error prone.

FxCop Backlogs

The following articles present some ideas on how to introduce FxCop into your SDLC:

  1. FxCop and the big-bad-backlog
  2. FxCop backlogs: Some rules for rule activation

Rule sets

The rule sets provided with FxCop are quite extensive. Some rule sets (e.g. the Globalization and Mobility rule sets) also include rules that only apply to applications that target certain platforms, cultures or languages. Microsoft itself uses a sub-set of the complete FxCop rules to guide its own internal development efforts. From experience, I have created two rule sets to guide my own development efforts.

Base Rule Set

The Base Rule Set is the rule set that I use for all new .NET development efforts. This rule set excludes the following rules:

  • Globalization Rules: CA1300: Specify MessageBoxOptions
  • Globalization Rules: CA1301: Avoid duplicate accelerators
  • Globalization Rules: CA1302: Do not hardcode locale specific strings
  • Globalization Rules: CA1303+: Do not pass literals as localized parameters
  • Naming Rules: CA1701: Resource string compound words should be cased correctly
  • Naming Rules: CA1702: Compound words should be cased correctly
  • Naming Rules: CA1703: Resource strings should be spelled correctly
  • Naming Rules: CA1704: Identifiers should be spelled correctly
  • Naming Rules: CA1726: Use preferred terms
  • Usage Rules: CA2204+: Literals should be spelled correctly
  • Usage Rules: CA2243: Attribute string literals should parse correctly

+ Not available in VS 2008

Minimum Rule Set

The Minimum Rule Set is the rule set that I use for all existing .NET development efforts. The idea is that these projects will most likely hit more rule violations on their existing code base and I therefore want to exclude some rules that add a lot of noise without providing IMO a lot of benefit. This rule set adds the following additional exclusions to those already excluded within the Base Rule Set:

  • Design Rules: CA1002: Do not expose generic lists
  • Design Rules: CA1003: Use generic event handler instances
  • Design Rules: CA1004: Generic methods should provide type parameter
  • Design Rules: CA1005: Avoid excessive parameters on generic types
  • Design Rules: CA1007: Use generics where appropriate
  • Design Rules: CA1010: Collections should implement generic interface
  • Design Rules: CA1020: Avoid namespaces with few types
  • Design Rules: CA1024: Use properties where appropriate
  • Design Rules: CA1031: Do not catch general exception types
  • Design Rules: CA1064: Exceptions should be public
  • Globalization Rules: CA1304: Specify CultureInfo
  • Globalization Rules: CA1305: Specify IFormatProvider
  • Globalization Rules: CA1306: Set locale for data types
  • Naming Rules: CA1705+: Long acronyms should be pascal-cased
  • Naming Rules: CA1706+: Short acronyms should be uppercase
  • Naming Rules: CA1707: Identifiers should not contain underscores
  • Naming Rules: CA1713: Events should not have before or after prefix
  • Naming Rules: CA1714: Flags enums should have plural names
  • Naming Rules: CA1717: Only FlagsAttribute enums should have plural names
  • Mobility Rules: CA1600: Do not use idle process priority
  • Mobility Rules: CA1601: Do not use timers that prevent power state changes
  • Performance Rules: CA1800: Do not cast unnecessarily
  • Performance Rules: CA1802: Use literals where appropriate
  • Performance Rules: CA1805: Do not initialize unnecessarily
  • Performance Rules: CA1822: Mark members as static
  • Usage Rules: CA2201: Do not raise reserved exception types
  • Usage Rules: CA2205: Use managed equivalents of Win32 API
  • Usage Rules: CA2225: Operator overloads have named alternates
  • Usage Rules: CA2242: Test for NaN correctly

+ Not available in VS 2008 

Implementation

There are differences between using FxCop for Visual Studio 2003 and Visual Studio 2005/2008 development. For projects that have been upgraded from VS 2003 to VS 2005, read the following article for tips on how to migrate your existing VS 2003 FxCop exclusions file over to VS 2005 FxCop code exclusions.

VS 2003

Obtain FxCop from here. Analyses can be performed either from the FxCop GUI directly or from the FxCop command-line utility, which allows you to integrate FxCop into a continuous integration process.
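
As a rough illustration of the command-line route, here is a minimal Python sketch that assembles an FxCopCmd invocation a build script could run. The `/file:`, `/out:` and `/summary` switches are standard FxCopCmd options; the function name, the paths and the assumption that `FxCopCmd.exe` is on the PATH are mine, for illustration only.

```python
from typing import List


def fxcop_command(assembly: str, report: str,
                  fxcop_cmd: str = "FxCopCmd.exe") -> List[str]:
    """Build the argument list for a standalone FxCop analysis run."""
    return [
        fxcop_cmd,             # assumed to be on the PATH of the build machine
        f"/file:{assembly}",   # assembly to analyse
        f"/out:{report}",      # XML report for the build server to pick up
        "/summary",            # print a short summary after the analysis
    ]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    cmd = fxcop_command(r"bin\Release\MyApp.dll", r"reports\fxcop.xml")
    print(" ".join(cmd))
```

On a machine with FxCop installed, the returned list could be handed to `subprocess.run(cmd)` from whatever script drives the build.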

VS 2005

VS 2005 adds static code analysis by integrating FxCop directly into the IDE. A developer can choose to include static code analysis as part of the compilation process, which runs FxCop and lists the resulting errors/warnings in the Error List pane; any errors reported prevent the code from compiling successfully. A Code Analysis Policy for VS Team System can also be created to make code analysis a compulsory step within the compilation process, which prevents developers from skipping it.

VS 2008

VS 2008 expands and improves on the code analysis provided in the VS 2005 IDE. Some of the improvements made are:

Also be sure to check out the custom FxCop rule to support the multi-targeting feature of VS 2008.

Continuous Integration

FxCop can easily be integrated into a continuous integration process by using the standalone FxCop command-line utility for VS 2003 or by using VS 2005/2008 Code Analysis.  As already mentioned, development teams using Visual Studio Team System can include Code Analysis Policy to enforce the continuous use of FxCop in their development environments.
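
For the VS 2005/2008 route, one way a build server can switch Code Analysis on is to pass the `RunCodeAnalysis` property to MSBuild. The sketch below only assembles that command line. The function name and solution path are hypothetical, and it assumes a Visual Studio edition that includes Code Analysis is installed on the build machine.

```python
from typing import List


def code_analysis_build_command(solution: str,
                                configuration: str = "Release") -> List[str]:
    """Build an MSBuild command line that runs Code Analysis with the build."""
    return [
        "msbuild",                            # assumed to be on the PATH
        solution,
        "/t:Rebuild",
        f"/p:Configuration={configuration}",
        "/p:RunCodeAnalysis=true",            # turns on VS Code Analysis for the build
    ]


if __name__ == "__main__":
    # Hypothetical solution name, for illustration only.
    print(" ".join(code_analysis_build_command("MyApp.sln")))
```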

Custom rule development

It is possible to write your own rules to integrate with FxCop running standalone or integrated into the VS IDE. Here are some articles on the topic:

Further reading

Continuous Integration: From Theory to Practice

NOTE: This is a repost of an old post as I am moving onto the Blogger platform

Continuous Integration (CI) is a popular incremental integration process whereby each change made to the system is integrated into the latest build. These integration points can occur continuously on every commit to the repository or at regular intervals, such as every 30 minutes.  They should, however, be no more than a day apart, i.e. you need to integrate at least once daily.

In this article I take a closer look at CI.  The article is divided into two main sections: theory and practice.  In the theory section, I consider some CI best practices, look at the benefits of integrating frequently and reflect on some recommendations for introducing CI into your environment.  The practice section provides an in-depth example of how to implement a CI process using .NET development tooling.  I conclude the article by providing some additional, recommended reading.

References

I used the following references in creating the article:

Practices

Martin Fowler presents the following set of practices for CI [Fowler]:

  • Maintain a single source repository - Use a decent source code management system and make sure its location is well known as the place where everyone can go to get source code.  Also ensure that everything is put into the repository (test scripts, database schema, third party libraries etc.)
  • Automate the build - Make sure you can build and launch your system using MSBuild/NAnt scripts in a single command.  Include everything in your build.  As a simple rule of thumb: anyone should be able to bring in a clean machine, check the sources out of the repository and issue a single command to have a running system on their machine.
  • Make your build self-testing - Include automated tests in your build process. 
  • Everyone commits every day - "A commit a day keeps the integration woes away" [Duvall].  Frequent commits encourage developers to break down their work into small chunks of a few hours each. Before committing their code, they need to update their working copy with the mainline, resolve any conflicts and ensure that everything still works fine.  Jeremy Miller refers to this process as the "Check In Dance" [Miller].
  • Every commit should build the mainline on an integration machine - Use a CI server like CruiseControl.NET or Team Foundation Build.  If the mainline build fails, it needs to be fixed right away.
  • Keep the build fast - If the build takes too long to execute, try to create a staged build/build pipeline where multiple builds are done in a sequence.  The first build (aka the "commit build") executes quickly and gives other people confidence that they can work with the code in the repository.  Further builds can run additional, slower-running unit tests, create code metrics, check the code against coding standards, create documentation etc.
  • Test in a clone of the production environment - Try to set up your test environment to be as exact a mimic of your production environment as possible.  Use the same versions of third party software, the same operating system version etc.   Consider using virtualization to make it easy to put together test environments.
  • Make it easy for anyone to get the latest executable - Make sure that there is a well known place where people can find the latest executable.
  • Everyone can see what's happening - Make the state of the mainline build as visible as possible. Use various feedback mechanisms to relate build status information [Duvall].  Some fun examples include using lava lamps, a big screen LCD and an ambient orb.
  • Automate deployment - Have scripts that allow you to automatically deploy into different environments, including production.  For web applications, consider deploying a trial build to a subset of users before deploying to the full user base.
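
The staged build described under "Keep the build fast" can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular CI server implements it: the `Stage` type, the stage names and the lambdas standing in for real build steps are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Stage:
    name: str
    run: Callable[[], bool]  # returns True when the stage succeeds


def run_pipeline(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run stages in order and stop at the first failure.

    Returns (overall success, names of the stages that completed).
    """
    completed: List[str] = []
    for stage in stages:
        if not stage.run():
            return False, completed
        completed.append(stage.name)
    return True, completed


if __name__ == "__main__":
    # The lambdas stand in for real build steps.
    pipeline = [
        Stage("commit build (compile + fast unit tests)", lambda: True),
        Stage("slow tests", lambda: True),
        Stage("code metrics and coding standards", lambda: True),
        Stage("documentation", lambda: True),
    ]
    print(run_pipeline(pipeline))
```

Because the commit build comes first, a failure there is reported quickly and the slower stages are never started.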

Jeremy Miller adds the following advice [Miller]:

  • Check in as often as you can - Try breaking down your workload into meaningful chunks of code and integrate these pieces of code when you have a collection of code in a consistent state. Checking in once a week seriously compromises the effectiveness of a CI process.
  • Don't leave the build broken overnight - Developers need to be immediately notified upon a build breakage and make it a top priority to fix a broken build.
  • Don't ever check into a busted build.
  • If you are working on fixing the build, let the rest of the team know.
  • Every developer needs to know how to execute a build locally and troubleshoot a broken build.
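
Put together, the check-in advice above amounts to a small gate in front of every commit. The following Python sketch is my own reading of those steps, not Jeremy Miller's exact procedure; the four callables are hypothetical hooks that a team would wire up to its source control client and build script.

```python
from typing import Callable


def check_in_dance(mainline_is_green: Callable[[], bool],
                   update_working_copy: Callable[[], None],
                   build_and_test: Callable[[], bool],
                   commit: Callable[[], None]) -> str:
    """Walk through the pre-commit steps and report what happened."""
    # Never check into a busted build.
    if not mainline_is_green():
        return "blocked: mainline build is broken"
    # Merge the latest mainline changes into the working copy first.
    update_working_copy()
    # Make sure everything still works after the update.
    if not build_and_test():
        return "blocked: local build or tests failed"
    commit()
    return "committed"


if __name__ == "__main__":
    # Stand-in hooks, for illustration only.
    print(check_in_dance(lambda: True, lambda: None, lambda: True, lambda: None))
```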

Paul Duvall also warns against some additional CI anti-patterns that tend to produce adverse effects [Duvall]. The ones that have not been covered above are:

  • The cold shoulder of spam feedback - Team members sometimes quickly become inundated with build status e-mails (success and failure and everything in between) to the point where they start to ignore messages. Try to make the feedback succinctly targeted so that people don't receive irrelevant information.
  • Don't delay feedback with a slow machine - Get a build machine that has optimal disk speed, processor, and RAM resources for speedy builds.
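
One possible way to keep build mail targeted is to stay silent while the build remains green, notify only the committers when it breaks, and tell the whole team when it is fixed. The Python sketch below shows that policy. It is an assumption of mine rather than something Duvall prescribes, and the function and parameter names are hypothetical.

```python
from typing import Iterable, Set


def build_mail_recipients(previous_ok: bool, current_ok: bool,
                          committers: Iterable[str],
                          team: Iterable[str]) -> Set[str]:
    """Decide who should hear about the latest build result."""
    if previous_ok and current_ok:
        return set()             # still green: nothing worth an e-mail
    if not current_ok:
        return set(committers)   # broken or still broken: those who can fix it
    return set(team)             # fixed: the mainline is safe to use again


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(build_mail_recipients(True, False, ["ann"], ["ann", "bob"]))
```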

Benefits

Numerous benefits result from integrating continuously [McConnell]:

  • Errors are easier to locate - New problems can be narrowed down to the small part that was recently integrated.
  • Improved team morale - Programmers see early results from their work.
  • Better customer relations - Customers like signs of progress and incremental builds provide signs of progress frequently.
  • More reliable schedule estimates & more accurate status reporting - Management gets a better sense of progress than the typical "coding is 99% complete" message.
  • Units of the system are tested more fully - As integration starts early in the project the code is exercised as part of the overall system more frequently.
  • Work that sometimes surfaces unexpectedly at the end of a project is exposed early on

Where do I start?

Here are some steps to consider for introducing a CI process [Fowler]:

  • Get the build automated - Get everything into source control and make sure you can build the whole system with a single command.
  • Introduce automated testing in the build - Identify major areas where things go wrong and start adding automated tests to expose these failures.
  • Speed up the build - Aim for a build that runs to completion within ten minutes.  Constantly monitor your build and take action as soon as you start going slower than the ten-minute rule.
  • Start all new projects with CI from the beginning
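
Monitoring the ten-minute rule can be as simple as comparing recorded build durations against the limit. Here is a minimal Python sketch, assuming the durations have already been collected from the build server; the function name and build labels are hypothetical.

```python
from datetime import timedelta
from typing import Dict, List

TEN_MINUTES = timedelta(minutes=10)


def builds_over_limit(durations: Dict[str, timedelta],
                      limit: timedelta = TEN_MINUTES) -> List[str]:
    """Return the labels of builds that took longer than the limit."""
    return [label for label, took in durations.items() if took > limit]


if __name__ == "__main__":
    # Hypothetical build labels and timings, for illustration only.
    recent = {
        "build-101": timedelta(minutes=8, seconds=30),
        "build-102": timedelta(minutes=11),
    }
    print(builds_over_limit(recent))
```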

.NET Implementation & Tools

I have written a set of articles that describe a complete setup of a .NET CI solution.  The CI process uses a wide variety of tools including CruiseControl.NET, Subversion, MSBuild, Visual Studio 2008, NUnit, FxCop, NCover, WiX and SandCastle.  You can download the series as a fully searchable PDF with PDF bookmarks and PDF links.

Team Foundation Server users can check out the new and improved out-of-the-box support for continuous integration provided by Team Foundation Server 2008.  This, coupled with the recent announcement on the TFS Licensing Change for TFS 2008, makes using Team Foundation Server less expensive and a more viable option to consider.

Recommended Reading

  1. Continuous Integration: Improving Software Quality and Reducing Risk by Paul Duvall.
  2. Software Configuration Management Patterns by Steve Berczuk and Brad Appleton.
  3. Refactoring Databases by Scott Ambler and Pramodkumar Sadalage.
  4. Visual Studio Team System: Better Software Development for Agile Teams by Will Stott and James Newkirk.

Continuous Integration: From Theory to Practice 2nd Edition

NOTE: This is a repost of an old post as I am moving onto the Blogger platform

During the last year I created a guide on implementing Continuous Integration (CI) for a .NET 2.0 development environment.  The guide illustrates how to create a complete CI setup using VS 2005 and MSBuild (no NAnt) together with tools like FxCop, NCover, TypeMock, NUnit, Subversion, InstallShield, QTP, NDepend, Sandcastle and CruiseControl.NET.

The good news is that I spent some time during the last 2 weeks greatly improving the setup for use on a new VS 2008 project and I have decided to release a 2nd Edition of the guide covering the much improved setup.  Instead of creating another series of blog posts to cover the content, I'm releasing the 2nd edition only as a downloadable PDF guide together with all the associated code and build artefacts.  This will allow new teams to get up and running with CI a lot quicker.

For readers of the first edition of the guide, the most notable differences between the second edition and the first edition of the guide are:

  1. Updated to use VS 2008, .NET 3.5 and MSBuild 3.5 (including new MSBuild features like parallel builds and multi-targeting).
  2. All tools (NUnit, NDepend, NCover etc.) are now stored in a separate Tools folder and kept under source control. The only development tools a developer needs to install are VS 2008, SQL Server 2005 and Subversion. The rest of the tools are retrieved from the mainline along with the latest version of the source code.
  3. Added the CruiseControl.NET configuration (custom style sheets, server setup etc.) to source control and created a single step setup process for the build server. This greatly simplifies the process of setting up a new build server.
  4. Changed from using InstallShield to Windows Installer XML (WiX) for creating a Windows installer (msi).
  5. Added support for running MbUnit tests in addition to the NUnit tests.
  6. Added support for running standalone FxCop in addition to running VS 2008 Managed Code Analysis.
  7. Added targets to test the install and uninstall of the Windows installer created.
  8. Consolidated the CodeDocumentationBuild to become part of the DeploymentBuild.
  9. Removed the QTP integration as this was not a requirement for the new project. If you want to integrate QTP, please refer to the QtpBuild of the first edition of the guide.
  10. Used the latest version of all the tools available.  The tools used in the guide are VS 2008, Subversion, CruiseControl.NET, MSBuild, MSBuild.Community.Tasks, NUnit/MbUnit, FxCop, TypeMock/Rhino.Mocks, WiX, Sandcastle, NCover, NCoverExplorer and NDepend.

I hope you find it to be a useful resource for assisting you with creating your own CI process by harnessing the power of MSBuild!  If you have any questions, additional remarks or any suggestions, feel free to drop me a comment.

Download

Here are the links:

  1. PDF Guide
  2. Code and Build artifacts

Saturday 31 October 2009

Serious Software Developer Reading List

As software developers we constantly need to keep up with the latest technology to keep ourselves marketable.  Unfortunately we are sometimes so busy keeping up with the new stuff coming out of Redmond that I fear we are not spending enough time learning the craft of software development.  Here is a list of non-technology books that I think every serious developer should at some stage read to become better at writing quality software.  As software development is a team discipline the list includes some books that highlight how to create and be part of effective software development teams:

  1. Agile Principles, Patterns and Practices in C#. Robert C. Martin: If you can only read one book out of the list, this is my number one.
  2. Patterns of Enterprise Application Architecture. Martin Fowler: Great read for getting into enterprise application architecture.
  3. Domain Driven Design: Tackling complexity in the heart of software. Eric Evans.  Also see the shortened e-book provided by InfoQ.
  4. Applying Domain Driven Design and Patterns: With Examples in C# and .NET. Jimmy Nilsson.
  5. Applying UML and Patterns: An Introduction to Object-Oriented Analysis and Design and Iterative Development, 3rd Edition. Craig Larman
  6. Refactoring. Martin Fowler
  7. Refactoring to Patterns. Joshua Kerievsky: A better way to learn OO patterns may be to start with existing code smells and refactor these using OO patterns.
  8. Code Complete, 2nd Edition. Steve McConnell:  Everything on crafting quality software.
  9. Framework Design Guidelines. Krzysztof Cwalina, Brad Abrams: The book may be specific to the .NET framework but still contains a lot of excellent guidelines for any framework type of development.  Also see the 2nd Edition.
  10. Pragmatic unit testing in C# using NUnit, 2nd Edition. Andy Hunt:  The nice thing about all the Pragmatic Programmers books is that they are short and to the point. Also see the excellent Art of Unit Testing by Roy Osherove.
  11. Peopleware: Productive Projects and Teams, 2nd Edition. Tom DeMarco and Timothy Lister: 20 years old but oh still so relevant.
  12. The Mythical Man-Month: Essays on Software Engineering, 20th Anniversary Edition.  Frederick P. Brooks
  13. Designing Interfaces. Jenifer Tidwell: My favourite book on designing user interfaces
  14. Working Effectively with Legacy Code.  Michael C. Feathers.
  15. xUnit Test Patterns: Refactoring Test Code. Gerard Meszaros.
  16. The Art of Agile Development. James Shore.
  17. Software Configuration Management Patterns. Steve Berczuk and Brad Appleton.
  18. Release It! Design and Deploy Production-Ready Software. Michael T. Nygard: Another Pragmatic Bookshelf gem.

Ultimate .NET development tools

I've been wanting to make a list for my own reference of all the best-of-breed tools that I prefer to use when doing .NET development.  I specifically decided to not include any third party control/report libraries.  I focus instead on the tools that assist me in crafting high-quality code quickly and effectively.

Categories

  • IDE = Develop/generate/refactor code within the VS IDE or separate IDE
  • SCM = Software Configuration Management (Source Control etc.)
  • TDD = Test Driven Development
  • DBMS = Database Management Systems
  • CI = Continuous Integration
  • FR = Frameworks (Persistence, AOP, Inversion of Control, Logging etc.)
  • UT = Utility Tools
  • CA = Code Analysis (Static + Dynamic)
  • TC = Team Collaboration (Bug tracking, Project management etc.)
  • MD = Modelling
  • QA = Testing Tools
  • DP = Deployment (Installations etc.)

Tools

* = free/open source

  1. [IDE] Visual Studio 2008 Team Edition for Software Developers
  2. [IDE] ReSharper for refactoring and to "develop with pleasure"
  3. [IDE] CodeSmith for generating code.  Also consider T4 with Clarius’s Visual T4 Editor.  
  4. [IDE]* GhostDoc for inserting xml code comments
  5. [IDE] Altova Xml Suite for any xml related work.  XmlPad is the best, free alternative I know of.
  6. [DBMS] SqlServer 2008 for DBMS
  7. [DBMS] Visual Studio 2008 Team Edition for Database Professionals for managing databases as code artifacts
  8. [SCM]* Subversion for source control
  9. [SCM]* TortoiseSVN as windows shell extension for Subversion
  10. [SCM] VisualSVN for integration of TortoiseSVN into VS.  Ankh is the best, free alternative I know of.
  11. [SCM]* KDiff3 for merging
  12. [TDD]* NUnit as preferred xUnit testing framework
  13. [TDD] TestDriven.NET as test runner for "zero-friction unit testing"! 
  14. [TDD]* moq as mock framework.
  15. [TDD] NCover for code coverage stats
  16. [CI]* TeamCity as build server
  17. [CI]* MSBuild Extension Pack for additional MSBuild tasks.  Also see the MSBuild.Community.Tasks
  18. [FR]* log4net as logging framework.  Also see Log4View for an excellent UI for the log files.
  19. [FR]* PostSharp as Aspect Oriented Programming framework
  20. [FR]* Ninject as IoC container
  21. [FR]* NHibernate as Object Relational Mapper.  MindScape LightSpeed also seems to be maturing very nicely.
  22. [UT]* Reflector to drill down to the guts of any code library (also check out the nice plug-ins)
  23. [UT] Silverlight Spy to dissect any Silverlight application.
  24. [UT] RegexBuddy for managing those difficult regular expressions.  Regulator is the best, free alternative I know of. 
  25. [UT]* SnippetCompiler to quickly test snippets of .NET code.
  26. [UT]* LINQPad as an easy way to query SQL databases using LINQ.
  27. [UT]* Fiddler to debug all your HTTP traffic in IE.   Also see the neXpert plugin for monitoring performance problems.
  28. [UT]* Web Development Helper to assist with testing ASP.NET applications running in IE.  Also see the IE Developer Toolbar for additional IE web development tools.
  29. [UT]* Firebug to assist with testing web applications running in Firefox. Also see YSlow add-on for performance testing and Web Developer add-on for additional Firefox web development tools.
  30. [CA]* FxCop to enforce .NET coding guidelines
  31. [CA] NDepend to get all the static code metrics I'd ever want
  32. [CA] ANTS Profiler for performance and memory profiling
  33. [MD] Enterprise Architect to do UML Modelling and Model Driven Design if required. Alternatively use Visio with these simple templates
  34. [MD]* FreeMind as mind mapping tool
  35. [TC]* ScrewTurn Wiki for team collaboration
  36. [QA]* Eviware soapUI for testing of SOA web services
  37. [QA]* WebAii for automated regression testing of Web 2.0 apps
  38. [DP]* Windows Installer XML (WiX) for creating Windows Installers