
Sunday, 01 November 2009

Code Metrics

NOTE: This is a repost of an old post as I am moving onto the Blogger platform


I've been wanting to do a post about code metrics for quite a while now - mostly to organize my thoughts on the topic, as it is something that I want to introduce at work, but also to get some feedback from other people on whether and how they are using metrics to assist them in crafting quality code.  After reading Jeremy Miller's post on the topic, I thought I might as well take the plunge.  I'll start by musing over which metrics I have found to be useful, then look at tool support for generating these metrics, and finish off by considering when to use them.

Metrics

I am not going to cover all the different metrics in detail.  Instead, I will highlight the metrics that seem most useful to me and refer to articles by people who have done an excellent job of covering them in detail.  Here are the reference articles that I used:
  1. Robert C. Martin's article on OO design quality metrics [1]
  2. Wikipedia's summary of these package metrics [2]
  3. Kirk Knoernschild's excellent introductory article on metrics with sample refactorings included [3]
  4. Patrick Smacchia's (developer of NDepend software) excellent coverage on all the types of metrics supported by NDepend [4]
  5. Write up on the software metrics supported by the Software Design Metrics software [5]

Size metrics

Size metrics are consistently good indicators of fault-proneness: large methods/classes/packages contain more faults [5].
  • Source Lines Of Code (SLOC) measures the number of lines of code.  To be really useful, comment lines and statements that have been broken across multiple lines need to be factored out.  Some people refer to this as logical LOC vs. physical LOC; the short sketch after the quote below illustrates the difference.
"2 significant advantages of logical LOC over physical LOC are:
  • Coding style doesn’t interfere with logical LOC. For example the LOC won’t change because a method call is spawn on several lines because of a high number of argument.
  • logical LOC is independent from the language. Values obtained from assemblies written with different languages are comparable and can be summed." [4]
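
To make the distinction concrete, here is a small, contrived C# sketch (the class and method names are invented for illustration).  The return statement is a single logical line of code, even though the formatter has spread it over several physical lines:

public static class LocDemo
{
    // One logical statement spread over several physical lines:
    // logical LOC counts it once, physical LOC counts every line.
    public static decimal CalculateTotal(decimal price, int quantity, decimal discount)
    {
        return (price * quantity)
               - (price
                  * quantity
                  * discount);
    }
}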

Complexity metrics

There is a direct correlation between complexity and the defect rate of software, so keeping code simple is a solid first step toward lowering the defect rate of software [3].
  • Cyclomatic Complexity (CC) measures code complexity by counting the number of linearly independent paths through the code. Complex conditionals and boolean operators increase the number of linear paths, resulting in a higher CC. Methods with a CC of five or higher are good refactoring candidates to help ensure the code remains easy to understand. [3]
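
As a rough illustration (a contrived C# example, not taken from the referenced articles; exact counting rules vary slightly between tools), a common convention gives the method below a cyclomatic complexity of 5: one for the base path, one for each of the three if statements and one for the && operator.  Anything much beyond this is a hint that the method wants to be split up.

public class Customer { public bool IsActive; }

public class Order
{
    public Customer Customer;
    public int LineCount;
    public decimal Total;

    // Cyclomatic complexity = 5: the base path (1), three 'if' statements (3)
    // and one '&&' operator (1).
    public bool IsValid()
    {
        if (Customer == null)                 // +1
            return false;

        if (LineCount == 0)                   // +1
            return false;

        if (Customer.IsActive && Total > 0)   // +1 for the 'if', +1 for '&&'
            return true;

        return false;
    }
}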

Package metrics

Coupling/Dependency metrics
Excessive dependencies between packages compromise architecture and design. Complex dependencies inhibit the testability of your system and present numerous other challenges, as discussed in [1] and [3].
  • Afferent Coupling (Ca) measures the number of types outside a package that depend on types within the package (incoming dependencies). High afferent coupling indicates that the package has many responsibilities. [1]
  • Efferent Coupling (Ce) measures the number of types inside a package that depend on types outside of the package (outgoing dependencies). High efferent coupling indicates that the package depends heavily on other packages. [1]
It stands to reason that:
"...afferent and efferent coupling allows you to more effectively evaluate the cost of change and the likelihood of reuse. For instance, maintaining a module with many incoming dependencies is more costly and risky since there is greater risk of impacting other modules, requiring more thorough integration testing. Conversely, a module with many outgoing dependencies is more difficult to test and reuse since all dependent modules are required ... Concrete modules with high afferent coupling will be difficult to change because of the high number of incoming dependencies. Modules with many abstractions are typically more extensible, so long as the dependencies are on the abstract portion of a module." [3]
  • Instability (I) measures the ratio of efferent coupling (Ce) to total coupling: I = Ce / (Ce + Ca). This metric is an indicator of the package's resilience to change. The range for this metric is 0 to 1, with I=0 indicating a completely stable package and I=1 indicating a completely unstable package. [1]
  • Abstractness (A) measures the ratio of the number of internal abstract types (i.e. abstract classes and interfaces) to the number of internal types. The range for this metric is 0 to 1, with A=0 indicating a completely concrete package and A=1 indicating a completely abstract package. [1]
  • Distance from main sequence (D) measures the perpendicular normalized distance of a package from the idealized line A + I = 1 (called the main sequence). This metric is an indicator of the package's balance between abstractness and stability. A package squarely on the main sequence is optimally balanced with respect to its abstractness and stability. Ideal packages are either completely abstract and stable (I=0, A=1) or completely concrete and unstable (I=1, A=0). The range for this metric is 0 to 1. [1][4]
"A value approaching zero indicates a module is abstract is relation to its incoming dependencies. As distance approaches one, a module is either concrete with many incoming dependencies or abstract with many outgoing dependencies. The first case represents a lack of design integrity, while the second is useless design." [3]
Cohesion metrics
"A low cohesive design element has been assigned many unrelated responsibilities. Consequently, the design element is more difficult to understand and therefore also harder to maintain and reuse. Design elements with low cohesion should be considered for refactoring, for instance, by extracting parts of the functionality to separate classes with clearly defined responsibilities." [5]
  • Relational Cohesion (H) measures the average number of internal relationships per type. Let R be the number of type relationships that are internal to this package (i.e. that do not connect to types outside the package). Let N be the number of types within the package. H = (R + 1) / N. The extra 1 in the formula prevents H = 0 when N = 1. Relational cohesion represents how strongly the types within the package are related to each other.  As classes inside a package should be strongly related, the cohesion should be high. On the other hand, too high a value may indicate over-coupling. A good range for relational cohesion is 1.5 to 4.0; packages where H < 1.5 or H > 4.0 might be problematic (a worked example follows this list). [4]
  • Lack of Cohesion of Methods (LCOM): the single responsibility principle states that a class should not have more than one reason to change; such a class is said to be cohesive. A high LCOM value generally pinpoints a poorly cohesive class. [4]
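To put numbers to relational cohesion (the figures are invented): a package with N = 8 types and R = 11 internal type relationships has H = (11 + 1) / 8 = 1.5, right at the lower edge of the suggested range. For LCOM, the contrived C# class below mixes two unrelated responsibilities: the report members never touch the mail members and vice versa, which is exactly the pattern a high LCOM value flags. Extracting a ReportWriter and a MailSender would yield two small, cohesive classes.

public class ReportManager
{
    // Responsibility 1: rendering a report. Only these members use reportTitle.
    private string reportTitle;
    public void SetTitle(string title) { reportTitle = title; }
    public string RenderReport() { return "== " + reportTitle + " =="; }

    // Responsibility 2: sending mail. Only these members use smtpHost.
    private string smtpHost;
    public void ConfigureMail(string host) { smtpHost = host; }
    public string DescribeMailServer() { return "Sending via " + smtpHost; }
}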
Inheritance metrics
"Deep inheritance structures are hypothesized to be more fault-prone. The information needed to fully understand a class situated deep in the inheritance tree is spread over several ancestor classes, thus more difficult to overview.  Similar to high export coupling, a modification to a design element with a large number of descendents can have a large effect on the system." [5]
  • Depth of Inheritance Tree (DIT) measures the number of base classes for a class or structure.  Types where DIT is higher than 6 might be hard to maintain. However, this is not a hard rule, since your classes might sometimes inherit from third-party (tier) classes that already have a high depth of inheritance. [4]
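A quick sketch of how DIT adds up (hypothetical C# types; whether System.Object itself is counted as a level depends on the tool):

public class Component { }              // DIT 1 (only object above it)
public class Control : Component { }    // DIT 2
public class Button : Control { }       // DIT 3
public class FancyButton : Button { }   // DIT 4 - each extra level spreads the
                                        // behaviour you need to understand over
                                        // one more ancestor class.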

Tools

When it comes to tools, the Mercedes-Benz of .NET code metrics tools, from my point of view, has to be NDepend 2.0.  NDepend 2 provides more than 60 metrics (including all of the metrics listed above) and integrates into your automated build via support for MSBuild, NAnt and CruiseControl.NET.  Browse to here for a sample report and here for a demo on how to integrate it into your build. 
There is a visual GUI (VisualNDepend) that allows you to browse your code structure and evaluate the metrics, as well as a console application with which you can generate the metrics.  Patrick has also created CQL (Code Query Language), which allows NDepend to treat your code as a database, with CQL being the query language with which you can check assertions against this database. As a consequence, CQL is similar to SQL and supports the typical SELECT TOP FROM WHERE ORDER BY patterns. Here is an example of a CQL query:

WARN IF Count > 0 IN SELECT METHODS
WHERE NbILInstructions > 200
ORDER BY NbILInstructions DESC
// METHODS WHERE NbILInstructions > 200 are extremely complex and
// should be split in smaller methods.
How cool is this! To quote:
"CQL constraints are customisable and typically tied with a particular application. For example, they can allow the specification of customized encapsulation constraints, such as, I want to ensure that this layer will never use this other layer or I want to ensure that this class will never be instantiated outside this particular namespace."
VisualNDepend also provides a CQL editor which supports intellisense and verbose compile error descriptions to make writing CQL queries a lot easier.  Enough said!  Browse to here for a complete overview of the NDepend 2 features.
Other tools that you can have a look at include SourceMonitor, DevMetrics, Software Design Metrics and vil, to name a few. vil does not support .NET 2.0 and does not seem to be under active development. DevMetrics, after being open-sourced, seems to have stagnated with no visible activity on SourceForge. SourceMonitor is actively under development and supports a variety of programming languages; however, it supports only a small subset of the metrics mentioned, which does not include important metrics like efferent and afferent coupling. Software Design Metrics takes a novel approach in that it measures complexity based on the UML models for the software. This has the advantage of being language independent, but you obviously need to have UML models to run the analysis.

When to use

When should one use these metrics? I agree with Jeremy Miller in his post that the metrics should not replace the visual inspection/QA process but should be performed in addition to it. It would be nice to have these metrics at hand to assist in the QA process, though. I also agree with Frank Kelly in his post that a working system with no severity 1/2 errors and happy end users is more important than getting the right balance of Ca/Ce or whatever metric you are interested in.
I think I will stick with an approach of identifying a subset of useful metrics and using these as part of an overall process of static code analysis on a regular basis. By regular basis I mean it should be part of your continuous build process, to prevent people from committing code into your repository that does not satisfy your constraints. With a tool like NDepend you can create your own custom acceptance criteria by which the build will fail or succeed, and exclude metrics that you feel should not apply to your code base.
As mentioned, the code metrics should form part of a bigger quality process that includes:
  • Visual inspection/QA via peer code reviews (as mentioned having the metrics via a tool like VisualNDepend can greatly assist in the QA)
  • Automated code standards check (I prefer FxCop)
  • Automated code metric check (NDepend seems like the tool to use here)
  • Automated code coverage statistics (I prefer NCover and NCoverExplorer)
Well, that basically covers my thoughts on the topic for now. I'm interested to know how many people are actively using metrics and what metrics they are using. I'd also like to know what processes or tools people are using to generate these metrics on their code. Let me know what's working for you in your environment.

Code Reviews

NOTE: This is a repost of an old post as I am moving onto the Blogger platform

Code reviews are a proven, effective way to minimize defects, improve code quality and keep code more maintainable.  They encourage team collaboration and assist with mentoring developers. Yet not many projects employ code reviews as a regular part of their development process. Why is this?  Programmer egos and the hassle of packaging source code for review are often cited as reasons for not doing code reviews.

I felt that code reviews should form part of a good code quality control process that includes:

  • Visual inspection/QA via peer code reviews
  • Automated code standards check (I prefer FxCop)
  • Automated code metrics check (I prefer NDepend)
  • Automated code coverage statistics (I prefer NCover and NCoverExplorer)

In this post I will spend some time looking at code reviews.  I'll start by considering different styles of code review and some code review metrics. I'll then move on to some thoughts on review frequencies and best practices for peer code review.  I'll finish off by considering some tools that can assist with the code review process itself.

FxCop

NOTE: This is a repost of an old post as I am moving onto the Blogger platform

In previous posts about Code Metrics and Code Reviews, I explored some metrics and techniques that I felt should form part of any good software quality control process.  One of the tools that I mentioned is FxCop.  In this post I take a closer look at FxCop.  I start by looking at how FxCop works and how you can fit it into your development process.  I then consider the different rule sets to use and look at how you can utilise FxCop to guide your VS 2003/2005/2008 development efforts.  I finish off the article by linking to articles that show you how to develop your own custom FxCop rules.

Continuous Integration in .NET: From Theory to Practice

NOTE: This is a repost of an old post as I am moving onto the Blogger platform

Continuous Integration (CI) is a popular incremental integration process whereby each change made to the system is integrated into the latest build. These integration points can occur continuously on every commit to the repository or at regular intervals, say every 30 minutes.  They should, however, never be more than a day apart, i.e. you need to integrate at least once daily.  In this article I take a closer look at CI.  The article is divided into two main sections: theory and practice.  In the theory section, I consider some CI best practices, look at the benefits of integrating frequently and reflect on some recommendations for introducing CI into your environment.  The practice section provides an in-depth example of how to implement a CI process using .NET development tooling.  I conclude the article by providing some additional, recommended reading.

 

References

I used the following references in creating the article:

Continuous Integration in .NET: From Theory to Practice 2nd Edition

NOTE: This is a repost of an old post as I am moving onto the Blogger platform


During last year I created a guide on implementing Continuous Integration (CI) for a .NET 2.0 development environment.  The guide illustrates how to create a complete CI setup using VS 2005 and MSBuild (no NAnt) together with tools like FxCop, NCover, TypeMock, NUnit, Subversion, InstallShield, QTP, NDepend, Sandcastle and CruiseControl.NET.  The good news is that I spent some time over the last two weeks greatly improving the setup for use on a new VS 2008 project, and I have decided to release a 2nd edition of the guide covering the much-improved setup.  Instead of creating another series of blog posts to cover the content, I'm releasing the 2nd edition only as a downloadable PDF guide together with all the associated code and build artefacts.  This will allow new teams to get up and running with CI a lot quicker.


For readers of the first edition of the guide, the most notable differences between the second edition and the first edition of the guide are:

  1. Updated to use VS 2008, .NET 3.5 and MSBuild 3.5 (including new MSBuild features like parallel builds and multi-targeting).
  2. All tools (NUnit, NDepend, NCover etc.) are now stored in a separate Tools folder and kept under source control. The only development tools a developer needs to install are VS 2008, SQL Server 2005 and Subversion. The rest of the tools are retrieved from the mainline along with the latest version of the source code.
  3. Added the CruiseControl.NET configuration (custom style sheets, server setup etc.) to source control and created a single step setup process for the build server. This greatly simplifies the process of setting up a new build server.
  4. Changed from using InstallShield to Windows Installer XML (WiX) for creating a Windows installer (msi).
  5. Added support for running MbUnit tests in addition to the NUnit tests.
  6. Added support for running standalone FxCop in addition to running VS 2008 Managed Code Analysis.
  7. Added targets to test the install and uninstall of the Windows installer created.
  8. Consolidated the CodeDocumentationBuild to become part of the DeploymentBuild.
  9. Removed the QTP integration as this was not a requirement for the new project. If you want to integrate QTP, please refer to the QtpBuild of the first edition of the guide.
  10. Used the latest version of all the tools available.  The tools used in the guide are VS 2008, Subversion, CruiseControl.NET, MSBuild, MSBuild.Community.Tasks, NUnit/MbUnit, FxCop, TypeMock/Rhino.Mocks, WiX, Sandcastle, NCover, NCoverExplorer and NDepend.

I hope you find it to be a useful resource for assisting you with creating your own CI process by harnessing the power of MSBuild!  If you have any questions, additional remarks or any suggestions, feel free to drop me a comment.

Download

Here are the links:
  1. PDF Guide
  2. Code and Build artifacts