Measuring Code Complexity Metrics


Introducing code complexity metrics

by Anand B Pillai

Let us consider two common scenarios that occur in software development companies.

  1. You have been given the job of maintaining some code written by a previous employee. The code is totally new to you since you were never part of this project.
  2. You have been asked to test some new code written by another person.
Both scenarios require you to understand the code and the style of programming. Though these jobs seem trivial, they often become non-trivial because of the complexity of the code. If sufficient documentation is not provided along with the code (a very common situation), you will have to work back from the code to understand:
  • What the code is supposed to do at a macro level
  • What is happening in the guts of the code at a micro level
  • How the code is designed to achieve what it is supposed to do
  • The programming style of the original coder
  • The environment in which the code is run
  • Assumptions about all of the above
Code complexity derives directly from these aspects. It can be defined as a metric that is directly proportional to the amount of effort required to understand the code and modify it correctly. Code complexity metrics are therefore directly related to the maintainability and testability of the code: the more complex the code is, the less maintainable and testable it is. If the code is object oriented, the complexity metrics also have a direct bearing on the extensibility and modularity of the code.

Maintenance metrics can be subdivided into formatting metrics and logical metrics. Formatting metrics deal with aspects such as indentation conventions, code comment guidelines, naming conventions and white space usage. Logical metrics give you information about aspects directly associated with program logic and code flow, such as the number of paths through a program, the depth of conditional statements and blocks, and the number of parameters to functions.
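Two of the logical metrics mentioned above, the number of parameters and the depth of nested blocks, can be estimated with Python's standard ast module. The following is only a rough sketch: the names BLOCK_NODES, max_nesting_depth, logical_metrics and the sample function are invented for this illustration, and the choice of which statements count as a "block" is an assumption. Note that Python represents an elif chain as nested if nodes, so this sketch reports such chains as deeper nesting.

```python
import ast

# Statement types treated as "blocks" for nesting purposes.
# This selection is an assumption of the sketch, not a standard.
BLOCK_NODES = (ast.If, ast.For, ast.While, ast.With, ast.Try)


def max_nesting_depth(node, depth=0):
    """Return the deepest level of nested block statements under node."""
    deepest = depth
    for child in ast.iter_child_nodes(node):
        child_depth = depth + 1 if isinstance(child, BLOCK_NODES) else depth
        deepest = max(deepest, max_nesting_depth(child, child_depth))
    return deepest


def logical_metrics(source):
    """Map each function name in source to its parameter count and nesting depth."""
    results = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            results[node.name] = {
                "parameters": len(node.args.args),  # positional parameters only
                "max_depth": max_nesting_depth(node),
            }
    return results


# Hypothetical sample code to measure.
SAMPLE = '''
def find_first_negative(rows, default):
    for row in rows:
        for value in row:
            if value < 0:
                return value
    return default
'''

metrics = logical_metrics(SAMPLE)
print(metrics)
```

Here the sample function takes two parameters and nests an if inside two for loops, so the sketch reports a depth of three.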


McCabe Cyclomatic Complexity Metric

The McCabe cyclomatic complexity metric is probably the most useful logical metric. It was introduced by Thomas McCabe in 1976.

The McCabe metric measures the number of linearly independent paths through a program. For example, a simple function with no conditionals has only one path; a function with one conditional has two paths. The metric is based on the reasoning that programs with simple conditionals are easier to follow and hence less complex, while those with multiple conditionals and intertwined logic are harder to follow and hence more complex. The complexity also increases when the program has multiple conditional exits.

The McCabe cyclomatic complexity is defined as,

M = E - N + 2P

where,

M  => The McCabe metric,
E  => The number of edges of the control-flow graph of the program,
N  => The number of nodes of the graph,
P  => The number of connected components of the graph

For a single function or method the control-flow graph forms one connected component, so P is 1 and the formula reduces to M = E - N + 2.
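As a minimal sketch of the graph formula, the code below computes the metric in the form M = E - N + 2P from a list of control-flow edges. The graph representation, the node labels and the function names are assumptions made for this illustration; real tools build the control-flow graph from the source code themselves.

```python
def connected_components(nodes, edges):
    """Count connected components, treating edges as undirected (union-find)."""
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the representative, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(n) for n in nodes})


def cyclomatic_complexity(edges):
    """Return M = E - N + 2P for a graph given as (source, target) pairs."""
    nodes = {n for edge in edges for n in edge}
    p = connected_components(nodes, edges)
    return len(edges) - len(nodes) + 2 * p


# Hypothetical graph of a function with no conditionals.
STRAIGHT_LINE = [("entry", "exit")]

# Hypothetical graph of a function with a single if/else.
IF_ELSE = [
    ("cond", "then"),
    ("cond", "else"),
    ("then", "exit"),
    ("else", "exit"),
]

print(cyclomatic_complexity(STRAIGHT_LINE))
print(cyclomatic_complexity(IF_ELSE))
```

The straight-line graph has 1 edge and 2 nodes, giving 1 - 2 + 2 = 1; the if/else graph has 4 edges and 4 nodes, giving 4 - 4 + 2 = 2.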

There is a simpler way of thinking about this metric. It can be considered as the number of decision points in a program plus one for the program's entry point.

That is,

M = D + 1

where,

M => The McCabe metric,
D => The number of decision points inside the code

For example, the following Python code has a cyclomatic complexity of four:

#!python
def func(x):
    if x == 0:
        return 3
    elif x == 1:
        return 4
    elif x == 2:
        return 5
    else:
        return 0

This is because the function has three decision points: the if test and the two elif tests. The final else branch does not add a decision, since it is simply where control goes when every test fails. Add one for the entry point and you get a McCabe metric of four.
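The counting rule M = D + 1 can be sketched with Python's standard ast module. This is a rough illustration only, not a description of how PyMetrics or any other tool works. The names DECISION_NODES, count_decision_points and mccabe are invented for this example, and the choice of which constructs count as decision points is an assumption; tools differ on details such as boolean operators and comprehensions, which this sketch handles only partly.

```python
import ast

# Node types counted as one decision point each (an assumption of this sketch).
DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def count_decision_points(source):
    """Count decision points in a piece of Python source code."""
    count = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, DECISION_NODES):
            count += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" has three operands and two extra branches.
            count += len(node.values) - 1
    return count


def mccabe(source):
    """Return M = D + 1 for the given source code."""
    return count_decision_points(source) + 1


# The example function from the text, as a string to be parsed.
SOURCE = '''
def func(x):
    if x == 0:
        return 3
    elif x == 1:
        return 4
    elif x == 2:
        return 5
    else:
        return 0
'''

print(count_decision_points(SOURCE), mccabe(SOURCE))
```

Python parses each elif as an if nested in the previous else branch, so the example function yields three if nodes and the sketch counts three decision points, giving a metric of four.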

A cyclomatic complexity of more than 10 for a single method is commonly treated as a sign that the method is complex; McCabe himself suggested 10 as a reasonable upper limit. Here are some commonly quoted ranges of this metric and the associated code complexity:


Cyclomatic Complexity    Code Complexity
1-10                     A simple program, without much risk
11-20                    More complex, moderate risk
21-50                    Complex, high risk
51+                      Untestable, very high risk
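As a small illustration, the ranges in the table can be turned into a lookup that labels a measured value. The names RISK_BANDS, HIGHEST_BAND and risk_band are hypothetical; the thresholds and descriptions are copied from the table above.

```python
# Upper bound of each band and its description, taken from the table above.
RISK_BANDS = [
    (10, "A simple program, without much risk"),
    (20, "More complex, moderate risk"),
    (50, "Complex, high risk"),
]
HIGHEST_BAND = "Untestable, very high risk"


def risk_band(metric):
    """Return the table's description for a cyclomatic complexity value."""
    if metric < 1:
        raise ValueError("cyclomatic complexity is at least 1")
    for upper, description in RISK_BANDS:
        if metric <= upper:
            return description
    return HIGHEST_BAND


for value in (4, 15, 35, 80):
    print(value, "=>", risk_band(value))
```

A check like this could be run over the output of a metrics tool to flag the methods that deserve a closer look first.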

There are a number of tools available for measuring this metric. The IBM Eclipse project provides a metrics plugin. To measure this metric for Python code, there is the excellent open source tool called PyMetrics. More on that tool in the next article.

Apart from the cyclomatic metric, there are other code complexity metrics that you should apply to your code occasionally, in order to make sure that it remains maintainable and extensible. Some of these are the essential complexity metric, the design complexity metric and the data complexity metric.

This page has been accessed 8,088 times.

This page was last modified 20:00, 13 December 2005.