Metrics of Complexity
Determining whether one unit of code is more complex than another is often difficult and subjective. What follows is a short attempt to define several metrics for evaluating the complexity of a unit of code. For every metric, a higher value means a more complex unit and a lower value means a simpler one.
I do not intend for these metrics to measure the comprehensibility of a unit. Complexity and comprehensibility are correlated, but comprehensibility also depends on factors that these metrics do not capture; the use of shared knowledge and assumptions is one. Comprehensibility is so subjective that a formal definition of it may be impossible.
It’s neither possible nor practical to combine these metrics into a single measurement of complexity. Each metric must be taken on its own, and a simplification made with respect to one metric may complicate another. A single number would therefore not accurately represent the complexity of a unit. Remember as well that, in general, the simpler a system is, the less it is capable of doing, so simplicity is not necessarily the end goal.
Syntactic Complexity
Definition
Syntactic complexity is primarily a line-oriented metric. To determine the complexity of larger units of software, simply add the complexity of the constituent lines.
The syntactic complexity of a line is the number of unique tokens in that line, where what counts as a token is defined by the language.
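As a rough illustration of this definition, here is a minimal sketch for Python source, using the standard library's tokenize module to decide what a token is. The names line_complexities and unit_complexity are my own, and ignoring comments and layout tokens (newlines, indentation) is an assumption rather than part of the definition.

```python
import io
import tokenize
from collections import defaultdict

# Assumption: layout tokens and comments do not count toward complexity.
_IGNORED = {
    tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
    tokenize.DEDENT, tokenize.ENDMARKER, tokenize.COMMENT,
}


def line_complexities(source):
    """Map each line number to the count of unique tokens starting on it."""
    unique = defaultdict(set)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type not in _IGNORED:
            unique[tok.start[0]].add(tok.string)
    return {row: len(strings) for row, strings in unique.items()}


def unit_complexity(source):
    """Complexity of a larger unit: the sum over its constituent lines."""
    return sum(line_complexities(source).values())


SOURCE = "x = x + 1\ny = f(x, x)\n"
print(line_complexities(SOURCE))  # {1: 4, 2: 7}
print(unit_complexity(SOURCE))    # 11
```

The first line scores 4 rather than 5 because x appears twice but is only counted once.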
Implications
Note that the definition above is not language-agnostic, and a thorough understanding of the language syntax is required in order to determine the complexity of a line of code.
Syntactic complexity clarifies the concepts of verbosity and readability.
Parametric Complexity
Definition
A parameter is anything used as explicit input to a unit of code. Parameters mostly take the form of function or method arguments, but they can also be values passed to an object constructor or arguments provided at the command line. Parameters may be optional or required.
The complexity of a unit with respect to its parameters is the sum of the total number of required parameters and twice the total number of optional parameters. Optional parameters increase the complexity by two because they require the user to be aware of the unit’s behavior when that parameter is not provided.
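A minimal sketch of this calculation for Python callables, using inspect.signature from the standard library. The function name parametric_complexity and the example function connect are hypothetical. Treating *args and **kwargs as one optional parameter each is my own assumption; the definition above does not say how to count them.

```python
import inspect


def parametric_complexity(func):
    """Required parameters count once, optional parameters count twice."""
    required = 0
    optional = 0
    for param in inspect.signature(func).parameters.values():
        variadic = param.kind in (
            inspect.Parameter.VAR_POSITIONAL,
            inspect.Parameter.VAR_KEYWORD,
        )
        # Assumption: *args and **kwargs may always be omitted, so they
        # are counted as optional.
        if variadic or param.default is not inspect.Parameter.empty:
            optional += 1
        else:
            required += 1
    return required + 2 * optional


def connect(host, port, timeout=30.0, retries=3):
    """Hypothetical function used only to exercise the metric."""


print(parametric_complexity(connect))  # 6
```

Here connect has two required and two optional parameters, giving 2 + 2 * 2 = 6.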
Implications
Note that although optional parameters have a greater impact on complexity than required parameters, this does not mean that they make the unit harder to comprehend. Sensible defaults for optional parameters can take advantage of shared assumptions about the behavior of a unit of code, and allow users to leverage their intuition when predicting its behavior.
Dependency Complexity
Definition
A dependency is a unit of code that is mentioned in the body of the unit of code under measure. If that dependency is also a parameter of the unit, then it is an explicit dependency. If it is not a parameter, then it is an implicit dependency.
The complexity of the code with respect to dependencies is the sum of the total number of explicit dependencies and twice the total number of implicit dependencies.
Implicit dependencies contribute more to complexity because they are not immediately evident to the user of the code. Thus they have more potential for surprise and unexpected behavior.
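One way this count might be approximated for a single Python function is sketched below, using the standard library's ast module. It rests on several simplifying assumptions of my own: every distinct name read in the body is treated as a dependency, names assigned inside the body are treated as locals rather than dependencies, builtins count as implicit dependencies, and nested scopes are not handled. The function name dependency_complexity and the example save are hypothetical.

```python
import ast


def dependency_complexity(source):
    """Explicit dependencies count once, implicit dependencies count twice.

    `source` must contain a single function definition.
    """
    func = ast.parse(source).body[0]
    args = func.args
    params = {a.arg for a in args.posonlyargs + args.args + args.kwonlyargs}
    if args.vararg:
        params.add(args.vararg.arg)
    if args.kwarg:
        params.add(args.kwarg.arg)

    loaded, stored = set(), set()
    # Walk only the body so defaults and annotations are not counted.
    for stmt in func.body:
        for node in ast.walk(stmt):
            if isinstance(node, ast.Name):
                if isinstance(node.ctx, ast.Load):
                    loaded.add(node.id)
                else:
                    stored.add(node.id)

    explicit = loaded & params            # mentioned and passed in
    implicit = loaded - params - stored   # mentioned but not passed in
    return len(explicit) + 2 * len(implicit)


SOURCE = '''
def save(record, store):
    payload = json.dumps(record)
    store.write(payload)
    logger.info("saved")
'''
print(dependency_complexity(SOURCE))  # 6
```

In this example record and store are explicit (2), while json and logger are implicit (2 * 2 = 4), for a total of 6. Passing the logger in as a parameter would lower the dependency score to 5 while raising the parametric score by one.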
Implications
Minimizing dependency complexity is related to minimizing coupling. It’s not always the correct choice to remove a dependency, just as it isn’t always the correct choice to remove a coupling. Care and judgment need to be used. There’s also a relationship between dependency complexity and parametric complexity. It is sometimes the case that their definitions coincide for a particular unit, but not always.
Behavioral Complexity
Definition
A behavior is defined as a specific, distinct path of execution for a unit of code. The method of determining a behavior varies based on the unit, so anyone applying this metric will need to agree on a method beforehand. What follows are some examples.
To measure the behaviors of an object, count the distinct types of messages it may receive from its public interface.
To measure the behaviors of a function or method, count how many conditional branches it has.
To minimize behavioral complexity, minimize the number of behaviors present in the code.
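The two counting rules above can be sketched for Python as follows. Both functions are hypothetical and make assumptions that any real application of the metric would need to agree on: function_behaviors counts only if/elif statements, conditional expressions and while loops as conditional branches, and object_behaviors treats every callable attribute whose name does not start with an underscore as a message in the public interface.

```python
import ast
import inspect

# Assumption: these node types are what we mean by a "conditional branch".
# An elif is parsed as a nested ast.If, so it is counted separately.
_BRANCH_NODES = (ast.If, ast.IfExp, ast.While)


def function_behaviors(source):
    """Count the conditional branches in a function's source code."""
    tree = ast.parse(source)
    return sum(isinstance(node, _BRANCH_NODES) for node in ast.walk(tree))


def object_behaviors(cls):
    """Count the public callables a class exposes."""
    members = inspect.getmembers(cls, callable)
    return sum(1 for name, _ in members if not name.startswith("_"))


SOURCE = '''
def classify(n):
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    return "positive"
'''


class Stack:
    """Hypothetical class used only to exercise the metric."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def peek(self):
        return self._items[-1]

    def _grow(self):
        pass


print(function_behaviors(SOURCE))  # 2
print(object_behaviors(Stack))     # 3
```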
Implications
This corresponds with some common rules of thumb, such as “small objects are better” and “small functions are better”, as small units tend to have fewer behaviors. There’s a rough correlation with the Conditions component of the ABC (Assignments, Branches, Conditions) metric.
Dominating Factors
It has been my experience that the most dominant contributors to the complexity of a system are dependency complexity and behavioral complexity. As dependency and behavior are the defining characteristics of a system, it makes sense that they would be the dominating contributors to complexity. They are also the two factors that are language-agnostic, which seems to indicate a more fundamental nature.
Too much syntactic and parametric complexity can obfuscate the underlying behavior and dependencies of a system, so they are important to minimize as well. However, they should not be minimized to the point of complicating the other factors.