
1. Introduction

Complexity measurement tools provide several pieces of information. They help to:

  1. locate suspicious areas in unfamiliar code
  2. get an idea of how much effort may be required to understand that code
  3. get an idea of the effort required to test a code base
  4. remind yourself that what you have written may be obvious to you but not to others. It is useful to have a hint about which code others may find harder to understand, and then to decide whether some rework is in order.

But why another complexity analyzer? Even though the McCabe analysis tool already exists (pmccabe), I think the job it does is too rough for gauging complexity, though it is ideal for gauging the testing effort. Each code path should be tested and the pmccabe program provides a count of code paths. That, however, is not the only issue affecting human comprehension. This program attempts to take into account other factors that affect a human’s ability to understand.



1.1 Code Length

Since pmccabe does not factor code length into its score, some folks have taken to saying that either long functions or a high McCabe score flag functions requiring attention. But that means examining two separate factors, with no visibility into how the length is obscuring the code.

The technique used by this program is to count 1 for each line that a statement spans, plus the complexity score of control expressions (for, while, and if expressions). The value for a block of code is the sum of these, multiplied by a nesting factor (see section nesting-penalty option (-n)). This score is then added to the score of the encompassing block. All other things being equal, a procedure that is twice as long as another will have double the score, whereas pmccabe scores them identically.
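The scoring rule above can be sketched in a few lines of C. This is a toy model, not Complexity's actual implementation, and the nesting factor of 2 is a hypothetical stand-in for the tunable -n value:

```c
#include <assert.h>

/* Toy sketch of the scoring rule described above.  A block's raw value
 * is the sum of its statement line counts, its control-expression
 * scores, and the scores of its nested blocks; that sum is multiplied
 * by a nesting penalty before being added to the enclosing block. */
#define NESTING_FACTOR 2          /* hypothetical stand-in for -n */

struct block {
    int stmt_lines;               /* lines spanned by plain statements */
    int control_score;            /* score of for/while/if control expressions */
    int n_children;               /* number of directly nested blocks */
    const struct block *children; /* the nested blocks themselves */
};

int block_score(const struct block *b)
{
    int sum = b->stmt_lines + b->control_score;
    for (int i = 0; i < b->n_children; i++)
        sum += block_score(&b->children[i]); /* children already carry their penalty */
    return sum * NESTING_FACTOR;
}
```

With this model, a function with twice the statement lines of another (and otherwise identical structure) gets exactly double the score, which is the doubling property described above.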



1.2 Switch Statement

pmccabe has changed the scoring of switch statements because their scores seemed too high: switch statements are now “free” in its analysis. That is wrong, too. The code length still needs to be counted, and the code within a switch statement adds more to the difficulty of comprehension than code at a shallower logic level.

This program will multiply the score of the switch statement content by the nesting score factor (see section nesting-penalty option (-n)).
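For instance, in a hypothetical function like the one below, the case arms sit one logic level below the function body, so under this scheme their lines are charged the nesting penalty rather than being treated as “free”:

```c
#include <assert.h>

/* Hypothetical example: the statements inside the case arms are one
 * nesting level deeper than the function body, so this program would
 * multiply their line counts by the nesting factor once more than the
 * code surrounding the switch. */
int is_weekend(int day)           /* day: 0 = Sunday .. 6 = Saturday */
{
    switch (day) {
    case 0:
    case 6:
        return 1;                 /* scored at switch-body nesting depth */
    default:
        return 0;                 /* likewise */
    }
}
```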



1.3 Logic Conditions

pmccabe does not score logic conditions very well. It overcharges for simple logical operations, it does not charge for comma operators, and it undercharges for mixing assignment operators, relational operators, and the logical and and or operators.

For example:

xx = (A && B) || (C && D) || (E && F);

scores as 6. Strictly speaking, there are, indeed, six code paths there. Yet that is a fairly straightforward expression, not nearly as complicated as this:

  if (A) {
    if (B) {
      if (C) {
        if (D)
          a_b_c_and_d();
      } else if (E) {
        a_b_no_c_and_e();
      }
    }
  }

and yet this scores exactly the same. This program charges very little for a sequence of conditions at the same level (that is, all and operators or all or operators), so the raw scores for these examples are 4 and 35, respectively (1 and 2 after scaling, see section --scale).

If you nest boolean expressions, there is a small cost, assuming you parenthesize grouped expressions so that and and or operators do not appear at the same parenthesized level, and assuming that you do not mix assignment, relational, and boolean operators all together. If you do not parenthesize these into subexpressions, their small scores get multiplied in ways that can wind up producing a much higher score.
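As a hypothetical illustration of that advice, the two predicates below compute the same value, but the first keeps each parenthesized level to a single kind of logical operator, while the second mixes and and or at one level and leans on operator precedence, which is the pattern this program charges more for:

```c
#include <assert.h>
#include <stdbool.h>

/* Both predicates are logically equivalent; only their structure differs. */
bool grouped(bool a, bool b, bool c)
{
    return a && (b || c);        /* one operator kind per parenthesized level */
}

bool mixed(bool a, bool b, bool c)
{
    return a && b || a && c;     /* && and || at the same level */
}
```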

The intent here is to encourage easy-to-understand boolean expressions. This is done by charging very little for sequences of conditions at the same level, and progressively more for nested and mixed-operator expressions.



1.4 Personal Experience

I have used pmccabe on a number of occasions. As a first-order approximation, it does okay. However, I was interested in zeroing in on the modules that needed the most help, and there were many of them. I found I was looking at some functions when I ought to have been looking at others. So I put this together to see whether an analysis tool could produce scores that correlate better with what seems to me like hard code.

This has worked much better. I ran complexity and pmccabe against several million lines of code. I correlated the scores. Where the two tools disagreed noticeably in relative ranking, I took a closer look. I found that ‘complexity’ did, indeed, seem to be more appropriate in its scoring.



1.5 Rationale Summary

Ultimately, complexity is in the eye of the beholder, and even in the particular mood of the beholder. It is difficult to tune a tool to properly accommodate these variables.

complexity will readily score as zero functions that are extremely simple, while code that is long, with many levels of logic nesting, will wind up scoring much higher than with pmccabe, barring extreme changes to the default values of the tunables.

I have included several adjustments so that scores can be tweaked to suit personal taste or gathered experience (see section nesting score factor and section nested expression scoring factor, but also section normalization scaling factor, which adjusts scores to approximate those rendered by pmccabe).



This document was generated by Bruce Korb on May 15, 2011 using texi2html 1.82.