The ACPATH Software Metric by sebastianbergmann · Pull Request #25 · sebastianbergmann/complexity

sebastianbergmann · 2026-05-22T11:55:31Z

This implements the ACPATH complexity metric described in

Bagnara, Roberto & Bagnara, Abramo & Benedetti, Alessandro & Hill, Patricia. (2016). The ACPATH Metric: Precise Estimation of the Number of Acyclic Paths in C-like Languages.

PDF

New/updated paper:

Bagnara, Roberto & Bagnara, Abramo & Benedetti, Alessandro & Hill, Patricia. (2024). The ACPATH Structural Complexity Metric.

PDF

This software metric answers the question: How many acyclic execution paths exist through a function?

The number of acyclic paths through a function is a direct measure of its testability: since each acyclic path represents a distinct execution scenario, a thorough test suite should exercise all (or a significant fraction) of them. A function with 200 acyclic paths is fundamentally harder to test than one with 4.

ACPATH improves on two predecessors:

Cyclomatic complexity (McCabe) does not distinguish between different control-flow structures (e.g., conditionals vs. loops, sequential vs. nested), and therefore correlates poorly with testing effort.
NPATH (Nejmeh, 1988) was intended to count acyclic paths but its definition fails to do so, even for simple programs. NPATH can both underestimate and overestimate the true count. For example, NPATH does not account for short-circuit evaluation of &&/||, does not handle the backward jump in while loops, and does not model switch fall-through correctly.

ACPATH is proven (Theorem 2 in the paper) to yield the exact number of acyclic paths for all controlled function bodies, that is, function bodies that contain no backward gotos and no jumps into a loop from outside. In practice this covers virtually all real-world code.

How Is ACPATH Calculated?

ACPATH works by structural induction over the abstract syntax tree. It performs a single traversal of the function body, propagating a set of path counts through each statement and expression.

Core Concepts

Path Counters for Expressions (Table 3 in the paper)

For every expression E, three functions are defined:

| Symbol | Meaning |
| --- | --- |
| t(E) | Number of execution paths through `E` that may evaluate to true |
| f(E) | Number of execution paths through `E` that may evaluate to false |
| p(E) | Total number of execution paths through `E` when its value is not used for branching (not t + f in general: a leaf has t = f = p = 1) |

The key insight is that short-circuit operators (&&, ||) introduce branching within an expression:

E1 && E2: E2 is only evaluated when E1 is true. So t = t(E1) * t(E2), f = f(E1) + t(E1) * f(E2).
E1 || E2: E2 is only evaluated when E1 is false. So t = t(E1) + f(E1) * t(E2), f = f(E1) * f(E2).
!E1: swaps t and f.
E1 ? E2 : E3 (ternary): t = t(E1)*t(E2) + f(E1)*t(E3), f = t(E1)*f(E2) + f(E1)*f(E3).
Leaf expressions (variables, function calls, etc.): t = f = p = 1.

For non-boolean operators (arithmetic, comparison, assignment, etc.), t = f = p since we cannot determine the boolean outcome statically.
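The composition rules above can be tried out on a small expression. The following is a minimal sketch, not the code from this PR: the helper names (`leafPaths`, `andPaths`, `orPaths`, `notPaths`) and the array shape are made up for illustration, and the `p` formulas for `&&` and `||` are taken from the `expressionPaths()` table further down in this description.

```php
<?php

declare(strict_types=1);

// Hypothetical helpers; each returns ['t' => int, 'f' => int, 'p' => int].

function leafPaths(): array
{
    // A variable, literal or call: one path each way.
    return ['t' => 1, 'f' => 1, 'p' => 1];
}

function andPaths(array $l, array $r): array
{
    // The right operand is only reached on the true paths of the left one.
    return [
        't' => $l['t'] * $r['t'],
        'f' => $l['f'] + $l['t'] * $r['f'],
        'p' => $l['f'] + $l['t'] * $r['p'],
    ];
}

function orPaths(array $l, array $r): array
{
    // The right operand is only reached on the false paths of the left one.
    return [
        't' => $l['t'] + $l['f'] * $r['t'],
        'f' => $l['f'] * $r['f'],
        'p' => $l['t'] + $l['f'] * $r['p'],
    ];
}

function notPaths(array $e): array
{
    // Negation swaps the true and false counts.
    return ['t' => $e['f'], 'f' => $e['t'], 'p' => $e['p']];
}

// $a && ($b || $c)
$result = andPaths(leafPaths(), orPaths(leafPaths(), leafPaths()));

print_r($result); // t = 2, f = 2, p = 3
```

Note that `p` is 3 here while `t + f` is 4: when the value of the whole expression is not branched on, the last operand contributes `p` rather than `t` and `f` separately.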

Double-Traversal Functions (Tables 4-7 in the paper)

While-loop conditions can be traversed twice in a single acyclic path: once when entering the loop (evaluating to true) and once when exiting (evaluating to false). To correctly count paths through a while loop, four additional functions are needed for the condition expression:

| Symbol | Meaning |
| --- | --- |
| tt(E) | Ways `E` can be traversed twice, both times evaluating to true, on non-overlapping arcs |
| tf(E) | Ways `E` can be traversed twice, first true then false (or vice versa), on non-overlapping arcs |
| ff(E) | Ways `E` can be traversed twice, both times evaluating to false, on non-overlapping arcs |
| pp(E) | Total ways `E` can be traversed twice on non-overlapping arcs |

For a simple leaf expression: tt = 0, tf = 1, ff = 0, pp = 0 (there is exactly one arc, and it can be used once for true and once for false).

These compose for &&, ||, !, and ternary in the same way single-traversal functions do, following the structure of the expression's control-flow graph.

Path Counters for Statements (Definitions 6-7, equations 37-53)

Each statement is analyzed by computing a tuple of five values:

| Symbol | Meaning |
| --- | --- |
| ft (fall-through) | Number of acyclic paths that "fall through" the statement and continue to the next |
| bp (break paths) | Cumulative paths that reach a `break` (not inside a nested switch/loop) |
| cp (continue paths) | Cumulative paths that reach a `continue` (not inside a nested loop) |
| rp (return paths) | Cumulative paths that reach a `return` statement |
| gt (goto paths) | Partial function mapping label identifiers to path counts (not tracked by this implementation) |

The ft for incoming paths is threaded through the statement sequence: each statement receives the ft produced by its predecessor. The final ACPATH value for a function body is ft_out + rp (equation 53): all paths that fall off the end of the function plus all paths that exit via return.

The key formulas for statements are:

Expression statement E;: ft_out = p(E) * ft. No branching, just multiply by expression paths.
Sequential composition S1 S2: process S1 with incoming ft, then process S2 with S1's ft_out. bp, cp, rp accumulate.
return: ft_out = 0, rp = ft. All incoming paths divert to return.
return E: ft_out = 0, rp = p(E) * ft. As above, after multiplying by the paths through the returned expression.
if (E) S1 else S2: S1 receives t(E) * ft paths, S2 receives f(E) * ft paths. ft_out = ft1 + ft2.
if (E) S1 (no else): S1 receives t(E) * ft paths. ft_out = ft1 + f(E) * ft.
while (E) S: with the body S entered with t(E) * ft incoming paths, ft_out = f(E) * ft + bp_S + (ft_S + cp_S) * tf(E) / t(E). This accounts for: (1) paths that skip the loop (f(E) * ft), (2) paths that enter the loop and leave it via break (bp_S), (3) paths that complete the body, or reach a continue, and loop back through the condition, which requires a double traversal of E (first true, then false). Because ft_S and cp_S already contain one factor of t(E) from entering the body, dividing by t(E) normalizes back to "per incoming path" before multiplying by tf(E). The implementation computes this differently (see the processWhile() section).
do S while (E): ft_out = f(E) * ft_S + bp_S. The body always executes once; only the false-exit paths from the condition leave the loop.
for (E1; E2; E3) S: desugared to E1; while (E2) { S; E3; }.
break: ft_out = 0, bp = ft.
continue: ft_out = 0, cp = ft.
switch (E) S: Each case label adds st (switch-to) incoming paths. Fall-through between cases is handled by sequential processing. bp from the switch body becomes ft_out (break exits the switch). If there is no default, an additional p(E) * ft paths pass through without matching.
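To see how ft and rp are threaded through a statement sequence, here is a small sketch that covers only expression statements, `return`, and `if`. It is an illustration under assumptions, not the PR's `AcpathCalculator`: the toy node shapes and the function names (`toySequence`, `toyStatement`, `toyAcpath`) are invented, bp and cp are left out, and the t/f values of each condition are supplied by hand.

```php
<?php

declare(strict_types=1);

// Toy statement nodes (hypothetical shapes, not PHP-Parser nodes):
//   ['expr', p]                          expression statement with p(E) paths
//   ['return']                           return without an expression
//   ['if', t, f, thenStmts, elseStmts]   elseStmts is null when there is no else

function toySequence(array $stmts, int $ft): array
{
    $rp = 0;

    foreach ($stmts as $stmt) {
        // Each statement receives the ft produced by its predecessor.
        [$ft, $r] = toyStatement($stmt, $ft);
        $rp += $r;
    }

    return [$ft, $rp];
}

function toyStatement(array $stmt, int $ft): array
{
    switch ($stmt[0]) {
        case 'expr':
            return [$stmt[1] * $ft, 0];

        case 'return':
            return [0, $ft];

        case 'if':
            [, $t, $f, $then, $else] = $stmt;
            [$ft1, $rp1] = toySequence($then, $t * $ft);
            [$ft2, $rp2] = $else === null
                ? [$f * $ft, 0]                  // false paths fall through directly
                : toySequence($else, $f * $ft);

            return [$ft1 + $ft2, $rp1 + $rp2];
    }

    throw new InvalidArgumentException('Unknown toy statement: ' . $stmt[0]);
}

function toyAcpath(array $stmts): int
{
    [$ft, $rp] = toySequence($stmts, 1);

    return $ft + $rp;
}

// if ($a && $b) { return; }  if ($c) { foo(); }  bar();
$body = [
    ['if', 1, 2, [['return']], null],   // t($a && $b) = 1, f($a && $b) = 2
    ['if', 1, 1, [['expr', 1]], null],  // leaf condition
    ['expr', 1],
];

echo toyAcpath($body), PHP_EOL; // 5
```

The result of 5 can be checked by hand: one path returns early, and the two ways of failing `$a && $b` each combine with the two outcomes of `$c`.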

Implementation in `src/Visitor/AcpathCalculator.php`

The implementation is a PHP class that takes a list of PHP-Parser Stmt nodes (the body of a function/method) and returns the ACPATH count as an integer. It implements the paper's formulas from Section 4, adapted to PHP's syntax.

Entry Point: `calculate()`

public function calculate(array $statements): int
{
    ['ft' => $ft, 'bp' => $bp, 'cp' => $cp, 'rp' => $rp] =
        $this->statements($statements, 1, 0);

    return max(1, $ft + $rp);
}

Corresponds to equation (53): apc_i^b[B] := ft_out + rp. Initial ft = 1 (one path enters the function), initial st = 0 (no switch-to paths). The max(1, ...) ensures even an empty function returns at least 1. The gt (goto) component from the paper is not implemented: goto is rarely used in PHP function bodies and is outside this tool's scope.

Statement Sequence: `statements()`

private function statements(array $statements, int $ft, int $st): array

Implements equation (38): sequential composition. Iterates through statements, threading ft from one to the next. Accumulates bp, cp, rp by summation. The st parameter carries the switch-to path count for switch statements.

Individual Statement Dispatch: `statement()`

Routes each statement type to its handler. Implements the paper's equations as follows:

| Statement | Paper Eq. | Implementation |
| --- | --- | --- |
| `Expression` (expr stmt) | (37) | ft_out = p(E) * ft |
| `Return_` (no expr) | (39) | ft=0, rp=ft |
| `Return_` (with expr) | (40) | ft=0, rp=p(E)*ft |
| `If_` | (41)/(42) | `processIf()` |
| `Switch_` | (43) | `processSwitch()` |
| `While_` | (44) | `processWhile()` |
| `Do_` | (45) | `processDo()` |
| `For_` | (46) | `processFor()` |
| `Foreach_` | n/a | `processForeach()` |
| `Break_` | (47) | ft=0, bp=ft |
| `Continue_` | (48) | ft=0, cp=ft |
| `TryCatch` | n/a | `processTryCatch()` |
| `Block` | (51) | delegates to `statements()` |
| Other (echo, noop, etc.) | (52) | ft unchanged |

`processIf()`: Conditional Statements

Implements equations (41) and (42).

Computes t, f, p for the condition expression.
Elseif chains: desugared into nested if/else. The first elseif is extracted, a new If_ node is constructed with remaining elseifs and the else clause, and the else branch is processed as this synthetic inner if. This mirrors how the paper treats elseif as syntactic sugar.
If/else: the then-branch receives t * ft incoming paths, the else-branch receives f * ft incoming paths. ft_out = ft1 + ft2.
If without else: the then-branch receives t * ft paths, and f * ft paths fall through directly. ft_out = ft1 + f * ft.

bp, cp, rp from both branches accumulate by addition.

`processSwitch()`: Switch Statements

Implements equation (43).

Computes p for the switch condition expression.
Sets switchSt = p * ft; this is the "switch-to" count: the number of incoming paths that each case label contributes.
Delegates to processSwitchBody() which processes cases sequentially.
Each case label adds st to the current ft (modeling "switch-to" entry). Case body statements are processed normally, allowing fall-through between cases (ft flows from one case to the next unless interrupted by break).
After processing: ftOut = ftS + bpS (fall-through plus break paths). If there is no default case, adds p * ft paths for the "no match" case.
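The case-by-case bookkeeping can be sketched as follows. This is a simplified illustration, not the PR's `processSwitchBody()`: `toySwitchPaths` is a hypothetical function, every case body is assumed to be straight-line code that either ends in `break` or falls through, and the count flowing into the first case label is assumed to start at 0.

```php
<?php

declare(strict_types=1);

/**
 * @param list<array{default: bool, breaks: bool}> $cases toy case descriptions
 * @param int $p  p(E) of the switch condition
 * @param int $ft incoming fall-through paths
 */
function toySwitchPaths(array $cases, int $p, int $ft): int
{
    $st         = $p * $ft; // "switch-to" paths contributed by each label
    $current    = 0;        // assumption: nothing falls into the first case
    $bp         = 0;
    $hasDefault = false;

    foreach ($cases as $case) {
        $hasDefault = $hasDefault || $case['default'];

        // Entry via this label, plus whatever fell through from above.
        $current += $st;

        if ($case['breaks']) {
            $bp     += $current;
            $current = 0;
        }
    }

    // Paths that fall off the end of the body plus paths that broke out.
    $ftOut = $current + $bp;

    if (!$hasDefault) {
        $ftOut += $p * $ft; // no label matched
    }

    return $ftOut;
}

// switch ($x) { case 1: a(); break; case 2: b(); case 3: c(); break; }
$cases = [
    ['default' => false, 'breaks' => true],
    ['default' => false, 'breaks' => false],
    ['default' => false, 'breaks' => true],
];

echo toySwitchPaths($cases, 1, 1), PHP_EOL; // 4
```

The four paths are: enter at case 1, enter at case 2 and fall through into case 3, enter at case 3, and match no label at all.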

`processWhile()`: While Loops

Implements equation (44).

$ftOut = $f * $ft + $bpS * $t + ($ftS + $cpS) * $tf;

However, note an important difference from the paper's formula. The paper states:

ft_out = f(E) * ft + bp_S + (ft_S + cp_S) * tf(E) / t(E)

where the body is entered with t(E) * ft incoming paths. The implementation instead passes the un-multiplied $ft to the body (line 261: $this->statements($stmt->stmts, $ft, $st)), not $t * $ft. It then compensates by multiplying the break term by $t and by dropping the division by t(E) in the loop-back term. This is algebraically equivalent, because every count produced by the body scales linearly with its incoming ft:

Paper: body gets t*ft paths, so ft_S is proportional to t*ft. Then (ft_S + cp_S) * tf / t normalizes out one factor of t.
Implementation: body gets ft paths (no multiplication by t). Then (ft_S + cp_S) * tf is already correct because ft_S is proportional to ft (not t*ft), and multiplying by tf (which already accounts for one true-then-false traversal) gives the right count.

The bp from the body is multiplied by $t because break paths must have entered the loop (condition was true), accounting for the condition's true-paths. Return paths pass through unchanged.
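The equivalence can be checked numerically on a small case. The sketch below is an illustration only: `toyLoopBody`, `whilePaperStyle` and `whileImplementationStyle` are hypothetical names, the body is modelled as a function that scales linearly with its incoming ft, and the paper-style variant assumes the skip-loop term carries the incoming ft (f(E) * ft).

```php
<?php

declare(strict_types=1);

// Toy loop body `if ($c) { break; }`: per incoming path, one path falls
// through, one path breaks, and there are no continue paths.
function toyLoopBody(int $ft): array
{
    return ['ft' => $ft, 'bp' => $ft, 'cp' => 0];
}

// Paper-style: body entered with t * ft, loop-back term normalised by t.
function whilePaperStyle(int $t, int $f, int $tf, int $ft): int
{
    $s = toyLoopBody($t * $ft);

    return $f * $ft + $s['bp'] + intdiv(($s['ft'] + $s['cp']) * $tf, $t);
}

// Implementation-style: body entered with ft, break term scaled by t.
function whileImplementationStyle(int $t, int $f, int $tf, int $ft): int
{
    $s = toyLoopBody($ft);

    return $f * $ft + $s['bp'] * $t + ($s['ft'] + $s['cp']) * $tf;
}

// while ($a || $b) { if ($c) { break; } }
// For `$a || $b` with leaf operands: t = 2, f = 1, tf = 1.
echo whilePaperStyle(2, 1, 1, 1), PHP_EOL;          // 4
echo whileImplementationStyle(2, 1, 1, 1), PHP_EOL; // 4
```

Both variants give 4, which matches a manual enumeration: one path skips the loop, two paths enter and break, and one path completes the body and then leaves through the condition on arcs it has not used before.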

`processDo()`: Do-While Loops

Implements equation (45).

$ftOut = $f * $ftS + $bpS;

The body always executes once (receives $ft paths directly). Then:

$f * $ftS: paths that complete the body and exit via the condition evaluating to false.
$bpS: paths that break out of the loop.

Note: the implementation does not use double-traversal functions for do-while. The paper's equation (45) is ft_out = f(E) * ft_S + bp_S, which is simpler than the while formula: the backward arc in a do-while loop goes from the condition back to the body entry, and in the reference CFG this arc cannot be part of an acyclic path (it would revisit a node). So only one traversal of the condition is needed.
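As a quick worked instance of the do-while formula, consider the sketch below. `toyDoWhilePaths` is a hypothetical helper, and the body counts are supplied by hand rather than computed from an AST.

```php
<?php

declare(strict_types=1);

// ft_out = f(E) * ft_S + bp_S, with the body already processed.
function toyDoWhilePaths(int $f, int $ftS, int $bpS): int
{
    return $f * $ftS + $bpS;
}

// do { if ($c) { break; } } while ($a && $b);
// Body entered with ft = 1: ft_S = 1 ($c false), bp_S = 1 ($c true).
// Condition `$a && $b` with leaf operands: f = 2.
echo toyDoWhilePaths(2, 1, 1), PHP_EOL; // 3
```

The three paths are: break out of the body, finish the body with `$a` false, and finish the body with `$a` true and `$b` false.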

`processFor()`: For Loops

Implements equation (46) by desugaring to E1; while(E2) { S; E3; }:

Processes init expressions, multiplying ft by each expression's p.
Combines condition expressions with BooleanAnd (or uses true if empty).
Appends loop expressions as Expression statements to the body.
Applies the while-loop formula.

`processForeach()`: Foreach Loops

Not in the paper (PHP-specific). Treated as a while loop with a leaf condition (t=1, f=1, tf=1), giving:

$ftOut = 1 * $ft + $bpS * 1 + ($ftS + $cpS) * 1;

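This means: the loop is either skipped entirely (the $ft term) or entered once, after which a path leaves either via break or by finishing the body (or reaching continue) and exiting at the next check of the implicit condition.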
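A small worked instance of the foreach rule, as a sketch: `toyForeachPaths` is a hypothetical helper, and the body counts are entered by hand for a body that was processed with the incoming ft.

```php
<?php

declare(strict_types=1);

// While-loop formula with the leaf-condition values t = f = tf = 1.
function toyForeachPaths(int $ft, int $ftS, int $bpS, int $cpS): int
{
    [$t, $f, $tf] = [1, 1, 1];

    return $f * $ft + $bpS * $t + ($ftS + $cpS) * $tf;
}

// foreach ($xs as $x) { if ($x) { continue; } foo(); }
// Body entered with ft = 1: ft_S = 1, bp_S = 0, cp_S = 1.
echo toyForeachPaths(1, 1, 0, 1), PHP_EOL; // 3
```

The three paths are: skip the loop, enter and `continue`, and enter and run `foo()`.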

`processTryCatch()`: Try/Catch/Finally

Not in the paper (the paper covers C, which has no exceptions). The implementation treats each catch block as an alternative path: it receives the same ft as the try block (modeling that an exception could occur at the start of the try block). All ft values are summed. A finally block, if present, is threaded after the combined try+catch ft (it always executes).
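The described treatment of try/catch/finally can be sketched like this. It is an illustration under assumptions, not the PR's `processTryCatch()`: `toyTryCatchPaths` is a hypothetical function, each block is modelled as a callable mapping an incoming ft to `[ftOut, rp]`, and summing rp across blocks is an assumption (the description above only spells out the ft handling).

```php
<?php

declare(strict_types=1);

/**
 * @param callable(int): array{0: int, 1: int}       $try
 * @param list<callable(int): array{0: int, 1: int}> $catches
 */
function toyTryCatchPaths(int $ft, callable $try, array $catches, ?callable $finally = null): array
{
    [$ftOut, $rp] = $try($ft);

    foreach ($catches as $catch) {
        // Each catch block receives the same incoming ft as the try block.
        [$c, $r] = $catch($ft);
        $ftOut  += $c;
        $rp     += $r;
    }

    if ($finally !== null) {
        // finally is threaded after the combined try + catch fall-through.
        [$ftOut, $r] = $finally($ftOut);
        $rp         += $r;
    }

    return [$ftOut, $rp];
}

$straightLine = fn (int $ft): array => [$ft, 0]; // e.g. a single call
$returns      = fn (int $ft): array => [0, $ft]; // e.g. `return;`

// try { a(); } catch (A $e) { b(); } catch (B $e) { return; } finally { c(); }
[$ftOut, $rp] = toyTryCatchPaths(1, $straightLine, [$straightLine, $returns], $straightLine);

echo $ftOut + $rp, PHP_EOL; // 3
```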

Expression Path Counting: `expressionPaths()`

Returns {t, f, p} for an expression. Implements Table 3 from the paper:

| Expression Type | t | f | p |
| --- | --- | --- | --- |
| `BooleanNot` (!E) | f(E) | t(E) | p(E) |
| `BooleanAnd` (&&) | t1*t2 | f1 + t1*f2 | f1 + t1*p2 |
| `BooleanOr` (\|\|) | t1 + f1*t2 | f1*f2 | t1 + f1*p2 |
| `Ternary` (E1?E2:E3) | t1*t2 + f1*t3 | t1*f2 + f1*f3 | t1*p2 + f1*p3 |
| `Ternary` (E1?:E2, elvis) | t1 + f1*t2 | f1*f2 | t1 + f1*p2 |
| `Coalesce` (??) | same as \|\| | same as \|\| | same as \|\| |
| `Match_` | sum of all arm paths | same | same |
| `Assign`/`AssignOp` | p(var)*p(expr) | same | same |
| `BinaryOp` (non-boolean) | p1*p2 | same | same |
| `Cast`, `UnaryMinus/Plus` | p(E) | same | same |
| Leaf (variable, literal, call) | 1 | 1 | 1 |

PHP-specific additions beyond the paper:

LogicalAnd/LogicalOr (PHP's and/or keywords): treated identically to &&/||.
Coalesce (??): treated as || (short-circuit on non-null).
Match_: sums up paths from all arm conditions and bodies.

Double-Traversal: `expressionPathsDouble()`

Returns {tt, tf, ff, pp}. Implements Tables 4-7 from the paper. Used only for while/for loop conditions.

The key formulas for &&:

tt = tt1 * tt2
tf = tf1 * t2 + tt1 * tf2
ff = ff1 + 2*tf1*f2 + tt1*ff2
pp = ff1 + 2*tf1*p2 + tt1*pp2

For ||:

tt = tt1 + 2*tf1*t2 + ff1*tt2
tf = tf1*f2 + ff1*tf2
ff = ff1 * ff2
pp = tt1 + 2*tf1*p2 + ff1*pp2

For !E: swaps tt/ff, tf stays the same.

For non-short-circuit expressions (comparisons, arithmetic, etc.), the base case is: tt=0, tf=p, ff=0, pp=0. This means the expression has p independent arcs, each of which can be used once for true and once for false.
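The double-traversal formulas can be exercised on a nested condition. The sketch below is not the PR's `expressionPathsDouble()`: the `dt*` helper names and the combined array shape are invented for illustration, and each helper carries t, f and p alongside the double-traversal counts because the composition rules need them.

```php
<?php

declare(strict_types=1);

// Hypothetical helpers; each returns t, f, p plus tt, tf, ff, pp.

function dtLeaf(): array
{
    return ['t' => 1, 'f' => 1, 'p' => 1, 'tt' => 0, 'tf' => 1, 'ff' => 0, 'pp' => 0];
}

function dtAnd(array $l, array $r): array
{
    return [
        't'  => $l['t'] * $r['t'],
        'f'  => $l['f'] + $l['t'] * $r['f'],
        'p'  => $l['f'] + $l['t'] * $r['p'],
        'tt' => $l['tt'] * $r['tt'],
        'tf' => $l['tf'] * $r['t'] + $l['tt'] * $r['tf'],
        'ff' => $l['ff'] + 2 * $l['tf'] * $r['f'] + $l['tt'] * $r['ff'],
        'pp' => $l['ff'] + 2 * $l['tf'] * $r['p'] + $l['tt'] * $r['pp'],
    ];
}

function dtOr(array $l, array $r): array
{
    return [
        't'  => $l['t'] + $l['f'] * $r['t'],
        'f'  => $l['f'] * $r['f'],
        'p'  => $l['t'] + $l['f'] * $r['p'],
        'tt' => $l['tt'] + 2 * $l['tf'] * $r['t'] + $l['ff'] * $r['tt'],
        'tf' => $l['tf'] * $r['f'] + $l['ff'] * $r['tf'],
        'ff' => $l['ff'] * $r['ff'],
        'pp' => $l['tt'] + 2 * $l['tf'] * $r['p'] + $l['ff'] * $r['pp'],
    ];
}

$and = dtAnd(dtLeaf(), dtLeaf()); // $a && $b
$or  = dtOr($and, dtLeaf());      // ($a && $b) || $c

echo $and['tf'], ' ', $and['ff'], PHP_EOL; // 1 2
echo $or['tt'], ' ', $or['tf'], PHP_EOL;   // 2 3
```

For `($a && $b) || $c`, tf = 3 agrees with a manual count: of the six (true path, false path) pairs, exactly three share no arc.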

Notable Design Decisions in the Implementation

No gt (goto) tracking: the paper tracks goto target paths via a partial function gt. The implementation omits this entirely, which is appropriate for PHP where gotos in function bodies are extremely rare and outside this tool's scope.
max(1, ...): the result is clamped to a minimum of 1. An empty function body or a function consisting only of unreachable code still reports ACPATH = 1.
Elseif desugaring: rather than implementing a separate elseif rule, the code constructs a synthetic nested If_ node. This is mathematically equivalent and reduces code duplication.
While-loop formula variant: the implementation passes un-multiplied ft to the loop body rather than t(E) * ft as the paper does, then adjusts the combining formula accordingly. See the processWhile() section above for the algebraic equivalence argument.
Foreach as while with leaf condition: a pragmatic choice. The iteration variable binding is not modeled as a branching expression.
Try/catch as alternative paths: each catch block is treated as a parallel branch with the same incoming ft as the try block. This is a reasonable extension for exception-handling semantics not covered by the original paper. Exceptions are modelled as if they always fire at try-entry; the analysis does not track which statements within the try block can throw, nor does it propagate uncaught exceptions out of the function.
Constant-true loop conditions are not specialised: a while (true) or for (;;) loop is treated as having a leaf condition (f = 1), so the formulas always count one "skip the loop" path even though the condition can never evaluate to false. This produces a small over-count for infinite loops that exit only via break/return/throw. The behaviour is intentional and locked in by the test suite (e.g., for (;;) { break; } returns 2). Eliminating the phantom path would require constant folding on the condition expression and is outside the current scope.
continue inside switch follows C semantics: the implementation propagates cp (continue paths) through switch statements, treating a bare continue inside a case as targeting the enclosing loop. This matches the paper's C-language model. PHP differs: at level 1, continue inside a switch behaves like break (and emits an E_WARNING since PHP 7.3). Code that uses bare continue directly inside a switch will therefore be modelled as continuing an outer loop rather than breaking out of the switch. This is an edge case; in idiomatic PHP, continue 2 is used instead, which is not handled by the current dispatcher either.

github-actions · 2026-05-22T12:00:06Z

API Surface Changes

If any of the additions below are not intended as public API, mark them with @internal in the docblock.

New API Surface

Classes

SebastianBergmann\Complexity\AcpathCalculator — class SebastianBergmann\Complexity\AcpathCalculator
SebastianBergmann\Complexity\AcpathControlFlowDotVisitor — class SebastianBergmann\Complexity\AcpathControlFlowDotVisitor
SebastianBergmann\Complexity\AcpathControlFlowGraph — class SebastianBergmann\Complexity\AcpathControlFlowGraph
SebastianBergmann\Complexity\AcpathDecompositionDotVisitor — class SebastianBergmann\Complexity\AcpathDecompositionDotVisitor
SebastianBergmann\Complexity\AcpathPathEnumerationDotVisitor — class SebastianBergmann\Complexity\AcpathPathEnumerationDotVisitor
SebastianBergmann\Complexity\ExpressionPathAnalyzer — class SebastianBergmann\Complexity\ExpressionPathAnalyzer

Methods

SebastianBergmann\Complexity\Complexity::acpath — public function acpath(): int
SebastianBergmann\Complexity\ComplexityCollection::acpath — public function acpath(): int
SebastianBergmann\Complexity\AcpathCalculator::calculate — public function calculate(array $statements): int
SebastianBergmann\Complexity\AcpathControlFlowDotVisitor::generate — public function generate(array $statements): string
SebastianBergmann\Complexity\AcpathControlFlowGraph::fromStatements — static public function fromStatements(array $statements): self
SebastianBergmann\Complexity\AcpathControlFlowGraph::nodes — public function nodes(): array
SebastianBergmann\Complexity\AcpathControlFlowGraph::edges — public function edges(): array
SebastianBergmann\Complexity\AcpathControlFlowGraph::entryId — public function entryId(): int
SebastianBergmann\Complexity\AcpathControlFlowGraph::exitId — public function exitId(): int
SebastianBergmann\Complexity\AcpathDecompositionDotVisitor::generate — public function generate(array $statements): string
SebastianBergmann\Complexity\AcpathPathEnumerationDotVisitor::generate — public function generate(array $statements): string
SebastianBergmann\Complexity\ExpressionPathAnalyzer::expressionPaths — static public function expressionPaths(PhpParser\Node\Expr $expr): array
SebastianBergmann\Complexity\ExpressionPathAnalyzer::expressionPathsDouble — static public function expressionPathsDouble(PhpParser\Node\Expr $expr): array

Modified API Surface

Methods

SebastianBergmann\Complexity\Complexity::__construct

- public function __construct(string $name, int $cyclomaticComplexity)
+ public function __construct(string $name, int $cyclomaticComplexity, int $acpath)

codecov · 2026-05-22T12:00:39Z

Codecov Report

❌ Patch coverage is 99.84721% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 99.30%. Comparing base (b164899) to head (ffb2126).
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/Visitor/AcpathPathEnumerationDotVisitor.php | 98.34% | 2 Missing ⚠️ |

Additional details and impacted files

@@             Coverage Diff              @@
##               main      #25      +/-   ##
============================================
+ Coverage     93.93%   99.30%   +5.36%     
- Complexity       57      315     +258     
============================================
  Files             6       12       +6     
  Lines           132     1440    +1308     
============================================
+ Hits            124     1430    +1306     
- Misses            8       10       +2


sebastianbergmann changed the title ~~Issue 4/acpath~~ The ACPATH Software Metric May 22, 2026

sebastianbergmann force-pushed the issue-4/acpath branch from 90402da to c8549d6 Compare May 22, 2026 11:59

sebastianbergmann added the enhancement New feature or request label May 22, 2026

sebastianbergmann added 3 commits May 22, 2026 14:06

Initial work on #4 (including visualization)

ed497fa

Extract ExpressionPathAnalyzer class

12ab8a9

Fix CS/WS issues

ffb2126

sebastianbergmann force-pushed the issue-4/acpath branch from c8549d6 to ffb2126 Compare May 22, 2026 12:06

The ACPATH Software Metric #25
sebastianbergmann wants to merge 3 commits into main from issue-4/acpath
