Add configurable console metrics filter with presets by YujiaBao · Pull Request #523 · thinking-machines-lab/tinker-cookbook

YujiaBao · 2026-03-23T20:55:50Z

Summary

Add a configurable filter system for the console metrics table (PrettyPrintLogger), with named presets that users can select via CLI or Config.

Depends on #522 (telemetry unification).

Motivation

The console table shows ~50 metrics per step. Most users only care about a handful (reward, progress, key timing). The rest is useful for debugging but clutters the default view. All metrics are always written to metrics.jsonl regardless of the console filter.

Changes

`ConsoleMetricsFilter` dataclass (`ml_log.py`)

Pattern-based filter using fnmatch globs:

include: list of patterns — metric must match at least one to be shown
exclude: list of patterns — metric is hidden if it matches any (takes precedence)
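
The include/exclude rule above can be sketched in a few lines. This is a minimal illustration, not the code from `ml_log.py`: the method names `should_show` and `apply` and the field defaults are assumptions, and it uses `fnmatchcase` so that matching stays case-sensitive on every platform.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class ConsoleMetricsFilter:
    # Defaults are assumptions for this sketch: show everything, hide nothing.
    include: list[str] = field(default_factory=lambda: ["*"])
    exclude: list[str] = field(default_factory=list)

    def should_show(self, name: str) -> bool:
        # A metric matching any exclude pattern is hidden, even if it is also included.
        if any(fnmatchcase(name, pattern) for pattern in self.exclude):
            return False
        # Otherwise it must match at least one include pattern.
        return any(fnmatchcase(name, pattern) for pattern in self.include)

    def apply(self, metrics: dict[str, float]) -> dict[str, float]:
        # Keep only the metrics that pass the filter, preserving insertion order.
        return {k: v for k, v in metrics.items() if self.should_show(k)}


example = ConsoleMetricsFilter(include=["time/*", "env/*"], exclude=["*:count"])
print(example.apply({"time/total": 36.0, "time/sampling:count": 4.0, "optim/lr": 1e-5}))
```

Note that fnmatch globs do not treat `/` specially, so a pattern such as `time/*` also matches nested names like `time/a/b`.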

Three presets (`CONSOLE_METRICS_PRESETS`)

"compact" (default for training Configs): Shows key metrics — reward, correct/format, rollout stats (turns, tokens), progress, optim, and top-level timing (time/total, time/sampling, time/train_step). ~16 rows instead of ~50.
"detailed" (default for direct setup_logging() callers): Shows everything except :total/:count aggregates.
"all": No filtering — shows every metric.

Config integration

New console_metrics: str = "compact" field on all three training Configs (RL, SL, DPO). Passed through to setup_logging().

Discoverability

On the first log_metrics call where metrics are filtered, a one-time hint is logged:

Console metrics using 'compact' preset (37 metrics hidden).
Set console_metrics='all' to show all, or 'detailed' for more.
All metrics are always written to metrics.jsonl.
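
One way to get the "log once, only if something was hidden" behavior is a boolean flag on the object that does the filtering. The sketch below assumes a hypothetical wrapper class `HintingConsoleFilter` and a hypothetical logger name; the real PrettyPrintLogger may structure this differently.

```python
import logging
from fnmatch import fnmatchcase

logger = logging.getLogger("console_metrics_sketch")  # hypothetical logger name


class HintingConsoleFilter:
    """Hypothetical wrapper: filters metrics and emits the hint at most once."""

    def __init__(self, preset_name: str, include: list[str], exclude: list[str]):
        self.preset_name = preset_name
        self.include = include
        self.exclude = exclude
        self._hint_shown = False

    def filter(self, metrics: dict[str, float]) -> dict[str, float]:
        shown = {
            name: value
            for name, value in metrics.items()
            if any(fnmatchcase(name, p) for p in self.include)
            and not any(fnmatchcase(name, p) for p in self.exclude)
        }
        hidden = len(metrics) - len(shown)
        # Only the first call that actually hides something triggers the hint.
        if hidden > 0 and not self._hint_shown:
            self._hint_shown = True
            logger.info(
                "Console metrics using %r preset (%d metrics hidden). "
                "Set console_metrics='all' to show all, or 'detailed' for more. "
                "All metrics are always written to metrics.jsonl.",
                self.preset_name,
                hidden,
            )
        return shown
```

Keeping the flag on the instance rather than at module level means each logger gets its own hint, which seems the safer choice if several runs share a process.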

CLI usage

console_metrics=compact       # curated summary (default)
console_metrics=detailed      # everything except :total/:count
console_metrics=all           # no filtering
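
To make the three values concrete, here is a rough sketch of how preset names could map to glob lists. The PR description names the presets but does not list their patterns, so every glob below is a guess, including the assumption that the aggregates appear as `:total` and `:count` suffixes; `PRESETS_SKETCH`, `visible`, and the metric name `time/sampling/wait` are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical pattern lists; the actual CONSOLE_METRICS_PRESETS may differ.
PRESETS_SKETCH: dict[str, dict[str, list[str]]] = {
    "compact": {
        "include": [
            "env/*/reward/total",
            "env/*/correct",
            "env/*/format",
            "env/*/turns_per_episode",
            "env/*/*_tokens_per_turn",
            "progress/*",
            "optim/*",
            "time/total",
            "time/sampling",
            "time/train_step",
        ],
        "exclude": ["*:total", "*:count"],
    },
    "detailed": {"include": ["*"], "exclude": ["*:total", "*:count"]},
    "all": {"include": ["*"], "exclude": []},
}


def visible(preset: str, names: list[str]) -> list[str]:
    """Return the metric names the given preset would show, in input order."""
    spec = PRESETS_SKETCH[preset]
    return [
        name
        for name in names
        if any(fnmatchcase(name, p) for p in spec["include"])
        and not any(fnmatchcase(name, p) for p in spec["exclude"])
    ]


names = [
    "env/all/reward/total",
    "time/total",
    "time/sampling:count",
    "time/sampling/wait",
    "optim/lr",
]
for preset in PRESETS_SKETCH:
    print(preset, visible(preset, names))
```

Under these guessed patterns, `env/all/reward/total` survives the `*:total` exclusion because it ends in `/total`, not `:total`.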

Backward compatibility

Direct setup_logging() callers (e.g., rl_loop.py, sl_loop.py) that don't pass console_metrics get "detailed" — same behavior as before.
Training Config-based callers get "compact" — UX change (fewer metrics shown), but the hint message tells users how to see more.
All metrics are always written to metrics.jsonl, Wandb, Neptune regardless of the console filter.
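
The two defaults described above can be pictured as one default on the function and a different one on the Config. The sketch below uses stand-in names (`setup_logging_sketch`, `TrainConfigSketch`, `main_sketch`) because the real `setup_logging()` signature is not shown in this description; rejecting unknown preset names with a `ValueError` is also an assumption.

```python
from dataclasses import dataclass

VALID_PRESETS = ("compact", "detailed", "all")


def setup_logging_sketch(log_dir: str, console_metrics: str = "detailed") -> str:
    """Stand-in for setup_logging(); returns the preset that would be applied."""
    if console_metrics not in VALID_PRESETS:
        raise ValueError(
            f"Unknown console_metrics preset {console_metrics!r}; "
            f"expected one of {VALID_PRESETS}"
        )
    return console_metrics


@dataclass
class TrainConfigSketch:
    """Stand-in for a training Config (RL, SL, or DPO)."""

    log_path: str
    console_metrics: str = "compact"  # Config-based callers default to compact


def main_sketch(config: TrainConfigSketch) -> str:
    # The Config field is passed through explicitly, overriding the function default.
    return setup_logging_sketch(config.log_path, console_metrics=config.console_metrics)


print(setup_logging_sketch("/tmp/run"))                # direct caller
print(main_sketch(TrainConfigSketch("/tmp/run")))      # Config-based caller
```

With this split, existing scripts that call the logging setup directly keep the old default without any code change, while Config-based entry points opt into the shorter table.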

Before/after console output (RL sync, step 0)

Before (59 keys in metrics.jsonl, all shown in console):
39 time/* rows + 13 env/* rows + 7 other = 59 rows

After (compact preset, 16 rows in console, 51 keys in metrics.jsonl):

│ env/all/ac_tokens_per_turn    │ 5.000000  │
│ env/all/correct               │ 0.000000  │
│ env/all/format                │ 0.000000  │
│ env/all/ob_tokens_per_turn    │ 38.500000 │
│ env/all/reward/total          │ -0.100000 │
│ env/all/turns_per_episode     │ 1.000000  │
│ optim/entropy                 │ 0.000774  │
│ optim/kl_sample_train_v1      │ 0.000025  │
│ optim/kl_sample_train_v2      │ 0.000000  │
│ optim/lr                      │ 0.000010  │
│ progress/batch                │ 0         │
│ progress/done_frac            │ 0.500000  │
│ time/run_evaluations_parallel │ 0.000016  │
│ time/sampling                 │ 6.675279  │
│ time/total                    │ 36.008096 │
│ time/train_step               │ 28.200690 │

Test plan

All 100 unit tests pass
ruff format, ruff check, pyright clean (0 new errors)
test_recipe_math_rl::test_math_rl_sync e2e smoke test passes with compact preset
Hint message displays correctly on first step
All metrics still written to metrics.jsonl (verified: 51 keys)

🤖 Generated with Claude Code

Add ConsoleMetricsFilter and preset system ("compact", "detailed", "all") to control which metrics appear in the console table. Default is "compact" which shows only key metrics (reward, progress, timing phases). All metrics are always written to metrics.jsonl regardless of the console filter. A one-time hint is logged when metrics are filtered, telling users how to see more: set console_metrics='all' or 'detailed'. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Include turns_per_episode, ac_tokens_per_turn, ob_tokens_per_turn in the compact preset so users see rollout behavior at a glance. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

YujiaBao and others added 2 commits March 24, 2026 14:55

Add rollout metrics to compact preset

f68be50


YujiaBao force-pushed the yujia/configurable-console-metrics branch from 3ea9585 to f68be50 on March 24, 2026 21:57

YujiaBao wants to merge 2 commits into thinking-machines-lab:main from YujiaBao:yujia/configurable-console-metrics
