
# 1. Introduction

Complexity measurement tools provide several pieces of information. They help to:

1. locate suspicious areas in unfamiliar code
2. get an idea of how much effort may be required to understand that code
3. get an idea of the effort required to test a code base
4. remind yourself how your code reads to others. What you have written may seem obvious to you, but not to everyone. It is useful to have a hint about which code others may find harder to understand, so you can decide whether some rework is in order.

But why another complexity analyzer? Even though the McCabe analysis tool already exists (`pmccabe`), I think the job it does is too rough for gauging complexity, though it is ideal for gauging the testing effort. Each code path should be tested and the `pmccabe` program provides a count of code paths. That, however, is not the only issue affecting human comprehension. This program attempts to take into account other factors that affect a human’s ability to understand.


## 1.1 Code Length

Since `pmccabe` does not factor code length into its score, some folks have taken to saying that either a long function or a high McCabe score identifies a function requiring attention. But that means looking at two separate factors without any visibility into how the length is obfuscating the code.

The technique used by this program is to count 1 for each line that a statement spans, plus the complexity score of control expressions (`for`, `while`, and `if` expressions). The value for a block of code is the sum of these multiplied by a nesting factor (see the section on the nesting-penalty option, `-n`). This score is then added to the score of the encompassing block. With all other things equal, a procedure that is twice as long as another will have double the score. `pmccabe` scores them identically.
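The following is a minimal sketch of one reading of that description, not the program's actual implementation. The `struct block` type, its field names, and the `block_score` function are all hypothetical, and the sketch assumes that a nested block's score is multiplied by the nesting factor at the point where it is added to the encompassing block.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a code block; not the tool's real data structure. */
struct block {
    int lines;                 /* lines spanned by statements directly in this block */
    int control_score;         /* summed score of its for/while/if expressions */
    const struct block *kids;  /* nested blocks, or NULL when n_kids is 0 */
    size_t n_kids;
};

/* Raw score under the stated assumption: own lines plus control-expression
 * scores, plus each nested block's score multiplied by the nesting factor. */
double block_score(const struct block *b, double nesting_factor)
{
    assert(b != NULL && nesting_factor > 0.0);

    double score = b->lines + b->control_score;
    for (size_t i = 0; i < b->n_kids; i++)
        score += nesting_factor * block_score(&b->kids[i], nesting_factor);
    return score;
}
```

Under this model, doubling every line count and control-expression score doubles the result, which matches the "twice as long, double the score" property described above. The nesting factor is passed in as a parameter because it is a tunable.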


## 1.2 Switch Statement

`pmccabe` has changed its scoring of `switch` statements because the scores seemed too high. `switch` statements are now “free” in this new analysis. That’s wrong, too. The code length needs to be counted, and the code within a `switch` statement adds more to the difficulty of comprehension than code at a shallower logic level.

This program multiplies the score of the `switch` statement content by the nesting score factor (see the section of that name).


## 1.3 Logic Conditions

`pmccabe` does not score logic conditions very well. It overcharges for simple logical operations, it doesn’t charge for comma operators, and it undercharges for mixing assignment operators, relational operators, and the `and` and `or` logical operators.

For example:

```
xx = (A && B) || (C && D) || (E && F);
```

scores as `6`. Strictly speaking, there are, indeed, six code paths there. But that is a fairly straightforward expression, not nearly as complicated as this:

```
if (A) {
  if (B) {
    if (C) {
      if (D)
        a-b-c-and-d;
    } else if (E) {
      a-b-no_c-and-e;
    }
  }
}
```

and yet this scores exactly the same. This program reduces the cost to very little for a sequence of conditions at the same level (that is, all `and` operators or all `or` operators). As a result, the raw scores for these examples are 4 and 35, respectively (1 and 2 after scaling; see the section on `--scale`).

If you nest boolean expressions, there is a small cost, provided that you parenthesize grouped expressions so that `and` and `or` operators do not appear at the same parenthesized level, and provided that you do not mix assignment, relational, and boolean operators all together. If you do not parenthesize these into subexpressions, their small scores get multiplied in ways that sometimes wind up as a much higher score.

The intent here is to encourage easy-to-understand boolean expressions. This is done by:

• not combining them with assignment statements
• canonicalizing them (two level expressions with all `&&` operators at the bottom level and all `||` operators in the nested level, or vice versa)
• parenthesizing for visual clarity (relational operations parenthesized before being joined into larger `&&` or `||` expressions)
• breaking them up into multiple `if` statements, if convenient.
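As an illustration of these guidelines, here is a small sketch with two equivalent functions. The function names, parameters, and the condition itself are invented for the example, and no particular score is claimed for either version; the point is only the difference in readability.

```c
#include <stdbool.h>

/* Harder to read: the assignment is embedded in the test and the
 * relational operations are not individually parenthesized. */
bool accept_mixed(int raw, int lo, int hi, bool force)
{
    int v;
    return ((v = raw * 2) >= lo && v <= hi) || (force && raw != 0);
}

/* Same logic following the guidelines: the assignment is a separate
 * statement, each relational operation is parenthesized, and the
 * condition is split into two simple tests. */
bool accept_clear(int raw, int lo, int hi, bool force)
{
    int v = raw * 2;

    if ((v >= lo) && (v <= hi))
        return true;
    return force && (raw != 0);
}
```

Both functions return the same result for every input. The second keeps each test to one kind of logical operator at a time, which is the canonical form described in the list above.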


## 1.4 Personal Experience

I have used `pmccabe` on a number of occasions. For a first-order approximation, it does okay. However, I was interested in zeroing in on the modules that needed the most help, and there were a lot of modules needing help. I found myself looking at some functions when I ought to have been looking at others. So, I put this together to see if there was a better correlation between what seemed like hard code to me and the score derived by an analysis tool.

This has worked much better. I ran `complexity` and `pmccabe` against several million lines of code. I correlated the scores. Where the two tools disagreed noticeably in relative ranking, I took a closer look. I found that `complexity` did, indeed, seem to be more appropriate in its scoring.


## 1.5 Rationale Summary

Ultimately, complexity is in the eye of the beholder and even depends on the beholder’s particular mood. It is difficult to tune a tool to properly accommodate these variables.

`complexity` will readily score extremely simple functions as zero, and code that is long with many levels of logic nesting will wind up scoring much higher than with `pmccabe`, barring extreme changes to the default values for the tunables.

I have included several adjustments so that scores can be tweaked to suit personal taste or gathered experience. (See the sections on the nesting score factor and the nested expression scoring factor. See also the section on the normalization scaling factor, which adjusts scores to approximate those rendered by `pmccabe`.)


This document was generated by Bruce Korb on May 15, 2011 using texi2html 1.82.