module avail O(N²) performance regression in displayElementList column optimizer with large module counts

Hi Xavier--

It's me, the 16,000 modules guy again 🙂. Claude debugged this problem and gave me the text I'm copying below. Thanks! --Myles

---
### Environment

- **Environment Modules version:** 5.6.1 (likely affects earlier versions)
- **Tcl version:** 8.6.8
- **OS:** RHEL 8.10 (x86\_64)
- **Terminal width:** 129 columns
- **MODULEPATH contents:** \~16,250 modulefiles across multiple directories

### Description

`module avail` takes **\~45 seconds** to complete when MODULEPATH contains \~16,000 modules, while `module avail --terse` completes in **\~2 seconds**. The bottleneck is pure CPU time in the column layout optimizer inside `displayElementList`, not NFS I/O, pager startup, or terminal rendering.

### Root Cause

The `last_round` optimization loop in `displayElementList` (`modulecmd.tcl`, around line 4540) exhibits O(N²) behavior. When the optimizer finds that `cur_cols` columns don't fit, it enters `last_round` mode and increments `cur_rows` **one at a time** from `ceil(N/cur_cols)` up to `rows` (the row count from the previous successful column count). Each increment re-scans all N elements.

With our module set:

| Parameter | Value |
|---|---|
| N (module count) | 16,250 |
| max module name length | 60 chars (+2 suffix = 62) |
| tty\_cols | 129 |
| Initial `cur_cols` | `floor(129/62)` = 2 |
| `rows` after `cur_cols=2` succeeds | 8,125 |
| `cur_rows` when trying `cur_cols=3` | `ceil(16250/3)` = 5,417 |
| `last_round` increments needed | 8,125 − 5,417 = **2,708** |
| Inner iterations per increment | \~5,417 |
| **Total inner loop iterations** | **\~14.7 million** |

At Tcl's interpreted throughput of \~500K iterations/sec for this workload, this takes \~30s of pure user CPU time.

### How to Reproduce

```bash
# Create a directory with ~16,000 modulefiles with long names
mkdir -p /tmp/modules_perf_test/toolname
for i in $(seq 1 16000); do
    printf '#%%Module\n' > "/tmp/modules_perf_test/toolname/some_long_module_name_prefix_${i}"
done

# Slow (~30-45s depending on terminal width and CPU)
MODULEPATH=/tmp/modules_perf_test time modulecmd.tcl bash avail --no-pager

# Fast (~1s) — bypasses the column optimizer entirely
MODULEPATH=/tmp/modules_perf_test time modulecmd.tcl bash avail --terse
```

### Suggested Fix

In the `last_round` branch, replace the linear `incr cur_rows` with a direct jump to `rows`:

```diff
          if {!$restart_loop} {
             if {$last_round} {
-               incr cur_rows
+               set cur_rows $rows
             } else {
                set cur_rows [expr {int(ceil(double($elt_cnt) / $cur_cols))}]
             }
```

**Why this works:** The purpose of `last_round` is to determine whether `cur_cols` columns can fit if given more rows. By jumping `cur_rows` directly to `rows` (the previous successful row count — the most generous possible value), the very next overflow triggers the `cur_rows == rows` bailout and immediately decrements `cur_cols`. This reduces the `last_round` phase from O(N) outer iterations to O(1).

**Tradeoff:** In rare cases the chosen layout may have 1–2 more rows than the theoretical minimum for that column count. This is invisible in practice (e.g., 8,125 rows vs 8,123 rows for 16,000 modules).

**Measured improvement:**

| Metric | Before | After |
|---|---|---|
| Inner loop iterations (N=16,250) | >14,700,000 | \~43,000 |
| Wall time | \~45s | \~2s |
| Computational complexity | O(N²) | O(N) |

### Workarounds

For users who cannot patch `modulecmd.tcl`:

- `export MODULES_TERM_WIDTH=1` — forces single-column output, bypasses the optimizer
- `module avail --terse` — uses `join` instead of column formatting
- `export MODULES_TERM_WIDTH=61` — any value less than `max_name_len` forces single-column

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

module avail O(N²) performance regression in displayElementList column optimizer with large module counts #622

Environment

Description

Root Cause

How to Reproduce

Suggested Fix

Workarounds

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Parameter	Value
N (module count)	16,250
max module name length	60 chars (+2 suffix = 62)
tty_cols	129
Initial `cur_cols`	`floor(129/62)` = 2
`rows` after `cur_cols=2` succeeds	8,125
`cur_rows` when trying `cur_cols=3`	`ceil(16250/3)` = 5,417
`last_round` increments needed	8,125 − 5,417 = 2,708
Inner iterations per increment	~5,417
Total inner loop iterations	~14.7 million

Metric	Before	After
Inner loop iterations (N=16,250)	>14,700,000	~43,000
Wall time	~45s	~2s
Computational complexity	O(N²)	O(N)

module avail O(N²) performance regression in displayElementList column optimizer with large module counts #622

Description

Environment

Description

Root Cause

How to Reproduce

Suggested Fix

Workarounds

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions