Skip to content

module avail O(N²) performance regression in displayElementList column optimizer with large module counts #622

@smprather

Description

@smprather

Hi Xavier--

It's me, the 16,000 modules guy again 🙂. Claude debugged this problem and gave me the text I'm copying below. Thanks! --Myles


Environment

  • Environment Modules version: 5.6.1 (likely affects earlier versions)
  • Tcl version: 8.6.8
  • OS: RHEL 8.10 (x86_64)
  • Terminal width: 129 columns
  • MODULEPATH contents: ~16,250 modulefiles across multiple directories

Description

module avail takes ~45 seconds to complete when MODULEPATH contains ~16,000 modules, while module avail --terse completes in ~2 seconds. The bottleneck is pure CPU time in the column layout optimizer inside displayElementList, not NFS I/O, pager startup, or terminal rendering.

Root Cause

The last_round optimization loop in displayElementList (modulecmd.tcl, around line 4540) exhibits O(N²) behavior. When the optimizer finds that cur_cols columns don't fit, it enters last_round mode and increments cur_rows one at a time from ceil(N/cur_cols) up to rows (the row count from the previous successful column count). Each increment re-scans all N elements.

With our module set:

Parameter Value
N (module count) 16,250
max module name length 60 chars (+2 suffix = 62)
tty_cols 129
Initial cur_cols floor(129/62) = 2
rows after cur_cols=2 succeeds 8,125
cur_rows when trying cur_cols=3 ceil(16250/3) = 5,417
last_round increments needed 8,125 − 5,417 = 2,708
Inner iterations per increment ~5,417
Total inner loop iterations ~14.7 million

At Tcl's interpreted throughput of ~500K iterations/sec for this workload, this takes ~30s of pure user CPU time.

How to Reproduce

# Create a directory with ~16,000 modulefiles with long names
mkdir -p /tmp/modules_perf_test/toolname
for i in $(seq 1 16000); do
    printf '#%%Module\n' > "/tmp/modules_perf_test/toolname/some_long_module_name_prefix_${i}"
done

# Slow (~30-45s depending on terminal width and CPU)
MODULEPATH=/tmp/modules_perf_test time modulecmd.tcl bash avail --no-pager

# Fast (~1s) — bypasses the column optimizer entirely
MODULEPATH=/tmp/modules_perf_test time modulecmd.tcl bash avail --terse

Suggested Fix

In the last_round branch, replace the linear incr cur_rows with a direct jump to rows:

          if {!$restart_loop} {
             if {$last_round} {
-               incr cur_rows
+               set cur_rows $rows
             } else {
                set cur_rows [expr {int(ceil(double($elt_cnt) / $cur_cols))}]
             }

Why this works: The purpose of last_round is to determine whether cur_cols columns can fit if given more rows. By jumping cur_rows directly to rows (the previous successful row count — the most generous possible value), the very next overflow triggers the cur_rows == rows bailout and immediately decrements cur_cols. This reduces the last_round phase from O(N) outer iterations to O(1).

Tradeoff: In rare cases the chosen layout may have 1–2 more rows than the theoretical minimum for that column count. This is invisible in practice (e.g., 8,125 rows vs 8,123 rows for 16,000 modules).

Measured improvement:

Metric Before After
Inner loop iterations (N=16,250) >14,700,000 ~43,000
Wall time ~45s ~2s
Computational complexity O(N²) O(N)

Workarounds

For users who cannot patch modulecmd.tcl:

  • export MODULES_TERM_WIDTH=1 — forces single-column output, bypasses the optimizer
  • module avail --terse — uses join instead of column formatting
  • export MODULES_TERM_WIDTH=61 — any value less than max_name_len forces single-column

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions