feat: subprocess visualization feasibility prototype + ADR (Phase 4) #1279

@Jammy2211

Overview

Phase 4 of the (now-archived) z_features/jax_visualization.md roadmap. Today PyAutoFit's iterations_per_quick_update runs analysis.visualize(fit) synchronously inside the search loop, so the search pauses while plots are written. Phases 0-3 made JAX visualization the default path, which widens the gap between "search ready for next iteration" and "viz finished writing PNGs". This task is the feasibility prototype plus Architecture Decision Record (ADR) that unlocks the Phase 4 production work (dispatching viz to a worker process), which in turn unlocks Phase 5 (live Jupyter/Colab cell update). It ships a prototype branch in PyAutoFit, one demo script in autolens_workspace_test, and an ADR. It does NOT ship the production implementation.

Plan

  • Stand up a worker harness module in PyAutoFit (autofit/non_linear/visualization/) that owns the lifecycle of a render-side subprocess
  • Wire the new dispatch behind a feature gate in Analysis.fit_for_visualization / iterations_per_quick_update plumbing (synchronous path stays default for the prototype)
  • Investigate FitImaging picklability post-pytree-registration (highest-risk question; Fitness.__getstate__ is the existing precedent for stripping JAX-cached attrs)
  • Empirically compare two Python-native IPC options (multiprocessing.Process+Queue vs concurrent.futures.ProcessPoolExecutor) on a realistic pixelization fit; fall back to disk-staged subprocess.Popen only if both crash or stall under JAX
  • Pick a backpressure policy (drop / queue / coalesce) and a failure-isolation pattern; demonstrate both on the demo script
  • Write the ADR at PyAutoFit/docs/adr/0001-subprocess-visualization.md answering the four open architectural questions
  • Author a follow-up prompt PyAutoPrompt/autofit/visualization_subprocess_production.md scoping the production implementation from the ADR's findings
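
The feature gate in the second bullet could look roughly like the sketch below. It is a minimal illustration, not PyAutoFit's API: the environment variable name `PYAUTOFIT_VIZ_SUBPROCESS`, the `dispatch_visualization` signature, and the worker's `send_render` method are all assumptions made for the example. The synchronous path stays the default, and the worker path is taken only when the gate is on and a worker exists.

```python
import os

# Hypothetical environment variable name; the plan only says "a kwarg or env var".
VIZ_SUBPROCESS_ENV = "PYAUTOFIT_VIZ_SUBPROCESS"


def dispatch_visualization(visualize, fit, worker=None, environ=None):
    """Route a quick-update render to a worker if the gate is on, else run it inline.

    Returns "worker" or "sync" so callers (and tests) can see which path ran.
    """
    environ = os.environ if environ is None else environ
    gate_on = environ.get(VIZ_SUBPROCESS_ENV, "0") == "1"
    if gate_on and worker is not None:
        worker.send_render(fit)  # non-blocking hand-off (hypothetical worker API)
        return "worker"
    visualize(fit)  # default: synchronous, so the search blocks here
    return "sync"
```

Falling back to the synchronous call when no worker is available keeps the gate safe to enable even if the worker failed to start.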

Detailed implementation plan

Affected Repositories

  • PyAutoFit (primary, library)
  • autolens_workspace_test (one new demo script in scripts/jax_assertions/)

Work Classification

Both (library work first; the demo script is a single workspace file shipped alongside)

Branch Survey

| Repository | Current Branch | Dirty? |
|---|---|---|
| ./PyAutoFit | main | clean |
| ./autolens_workspace_test | main | clean (canonical); knn-barycentric task holds it via worktree |

Suggested branch: feature/viz-subprocess-feasibility
Worktree root: ~/Code/PyAutoLabs-wt/viz-subprocess-feasibility/ (created later by /start_library)

Conflict resolution: File-level coexistence on autolens_workspace_test. knn-barycentric writes scripts/jax_assertions/knn_barycentric.py; this task writes scripts/jax_assertions/subprocess_viz_prototype.py. Same directory, different filenames, no edit overlap. Matches the ag-interferometer-kwargs precedent (parallel worktrees on file-disjoint scope).

Four Architectural Questions (must be answered in the ADR)

  1. IPC mechanism — measure mp.Process+Queue vs ProcessPoolExecutor on a pixelization imaging fit. Fall back to disk-staged Popen only if both Python-native options crash under JAX. (Known caveat: os.fork() warnings already appear in smoke logs.)
  2. FitImaging picklability — pytree-registered ≠ picklable. Determine: does populated FitImaging round-trip through pickle.dumps/loads? If not, where does it break (closures over Analysis, JAX-cached _jitted_fit_from, cosmology distance caches, NUFFT operators)? Fix: strip JAX-cached attrs in __getstate__, or send raw arrays + reconstruct in worker?
  3. Backpressure — pick drop / queue / coalesce. Default recommendation: drop (quick-update is a snapshot, not a frame). Confirm with measurements: how often does the producer outpace the consumer?
  4. Failure isolation — deliberately raise inside the renderer and demonstrate: worker death detected within one quick-update cycle, search continues, single-shot user warning fires.
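
For question 2, the "strip JAX-cached attrs in `__getstate__`" option can be sketched as below. This is an illustration under stated assumptions: `CachedFit` is a hypothetical stand-in for a populated fit object, a local lambda stands in for a JAX-jitted closure, and `find_unpicklable_attrs` is a hypothetical diagnostic helper. None of these names come from PyAutoFit.

```python
import pickle


class CachedFit:
    """Hypothetical stand-in for a fit object that holds an unpicklable cache."""

    _STRIP_ON_PICKLE = ("_jitted_fn",)

    def __init__(self, data):
        self.data = data
        # A local lambda stands in for a JAX-jitted closure; pickle cannot serialize it.
        self._jitted_fn = lambda x: x * 2

    def __getstate__(self):
        # Copy the instance dict and drop the cached attributes before pickling.
        state = self.__dict__.copy()
        for name in self._STRIP_ON_PICKLE:
            state.pop(name, None)
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self._jitted_fn = None  # to be rebuilt lazily on the worker side


def find_unpicklable_attrs(obj):
    """Return the names of instance attributes that fail pickle.dumps on their own."""
    broken = []
    for name, value in vars(obj).items():
        try:
            pickle.dumps(value)
        except Exception:
            broken.append(name)
    return broken
```

Running a helper like `find_unpicklable_attrs` on a real populated fit is one way to report where the round trip breaks. The alternative named in the question, sending raw arrays and reconstructing in the worker, avoids pickling the fit object at all.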

Implementation Steps

  1. Picklability spike (highest-risk first) — write a one-page script that constructs a populated FitImaging, runs pickle.dumps(...) / pickle.loads(...), and reports the breakage. This decides Q2 and constrains Q1.

  2. Worker harness: PyAutoFit/autofit/non_linear/visualization/ (new module). One file containing WorkerHandle (start/stop, send-render, health check), the chosen IPC primitive's wiring, the backpressure policy, and the failure-isolation glue. Keep it simple; this is prototype only.

  3. Dispatch gate — extend PyAutoFit/autofit/non_linear/analysis/analysis.py:fit_for_visualization (or add a sibling dispatch_visualization) to optionally route through the worker. Default path stays synchronous. Wire the new dispatch behind a kwarg or env var.

  4. iterations_per_quick_update hook: PyAutoFit/autofit/non_linear/fitness.py (or wherever the quick-update trigger lives). Replace the synchronous analysis.visualize(fit) call with the new dispatch behind the feature gate.

  5. End-to-end demo: autolens_workspace_test/scripts/jax_assertions/subprocess_viz_prototype.py. Run a pixelization imaging fit with the new dispatch. Measure: search wall-clock with vs without subprocess viz, render latency, dropped-frame count. Include a deliberate worker crash to demonstrate failure isolation.

  6. ADR: PyAutoFit/docs/adr/0001-subprocess-visualization.md (new docs/adr/ subfolder). Capture answers to Q1-Q4 with measurements where applicable.

  7. Follow-up prompt: PyAutoPrompt/autofit/visualization_subprocess_production.md. Production scope derived from the ADR's recommendations.
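
A worker harness along the lines of step 2 might be sketched as follows, assuming the multiprocessing.Process+Queue option and the drop backpressure policy. The method names (`start`, `send_render`, `is_healthy`, `stop`), the `offer` helper, and the `dropped` counter are illustrative assumptions, not the prototype's actual interface. The sketch does not address whether JAX tolerates the chosen start method, which is exactly what Q1 is meant to measure.

```python
import multiprocessing as mp
import queue
import warnings


def _render_loop(jobs, render_fn):
    """Worker side: render payloads until a None sentinel arrives."""
    while True:
        payload = jobs.get()
        if payload is None:
            break
        render_fn(payload)  # an exception here kills the worker, not the search


def offer(jobs, payload):
    """Drop policy: enqueue if there is room, otherwise discard this snapshot."""
    try:
        jobs.put_nowait(payload)
        return True
    except queue.Full:
        return False


class WorkerHandle:
    """Hypothetical lifecycle owner for one render-side subprocess."""

    def __init__(self, render_fn, maxsize=1):
        self._render_fn = render_fn  # must be picklable under the spawn start method
        self._maxsize = maxsize
        self._jobs = None
        self._proc = None
        self._warned = False
        self.dropped = 0  # snapshots discarded because the worker was still busy

    def start(self):
        self._jobs = mp.Queue(maxsize=self._maxsize)
        self._proc = mp.Process(
            target=_render_loop, args=(self._jobs, self._render_fn), daemon=True
        )
        self._proc.start()

    def is_healthy(self):
        return self._proc is not None and self._proc.is_alive()

    def send_render(self, payload):
        """Never blocks the caller; returns True only if the job was queued."""
        if not self.is_healthy():
            if not self._warned:  # single-shot user warning
                warnings.warn("visualization worker is not running; skipping renders")
                self._warned = True
            return False
        if not offer(self._jobs, payload):
            self.dropped += 1
            return False
        return True

    def stop(self, timeout=5.0):
        if not self.is_healthy():
            return
        try:
            self._jobs.put(None, timeout=timeout)
        except queue.Full:
            self._proc.terminate()  # worker is stalled; do not wait for it
        self._proc.join(timeout)
```

Because `send_render` checks liveness on every call, a crashed worker is noticed at the next quick-update, the search continues, and the warning fires once. A script that calls `start()` should do so under an `if __name__ == "__main__":` guard so it also works with the spawn start method.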

Key Files

  • PyAutoFit/autofit/non_linear/analysis/analysis.py (lines 36-138) — Analysis.__init__ + fit_for_visualization post-Phase-2
  • PyAutoFit/autofit/non_linear/fitness.py: iterations_per_quick_update plumbing + Fitness.__getstate__ pickle precedent
  • PyAutoFit/autofit/non_linear/visualization/ — new module
  • PyAutoFit/docs/adr/0001-subprocess-visualization.md — new ADR
  • autolens_workspace_test/scripts/jax_assertions/subprocess_viz_prototype.py — new demo script
  • autofit_workspace_test/scripts/jax_assertions/fitness_dispatch.py::assert_pickle_strips_jax_cached_attrs — existing precedent for stripping JAX-cached attrs

Hard out-of-scope

  • Production-clean error handling, logging config, user-facing config plumbing (config/visualize.yaml)
  • Phase 5 (Jupyter/Colab live cell)
  • GPU resource sharing between search and renderer
  • The autolens_workspace_test cleanup (~12 redundant use_jax_for_visualization=True calls)
  • Migrating PyAutoGalaxy / PyAutoLens visualizer dispatches — dispatch change lives in PyAutoFit's base Analysis only

Original Prompt

Starting prompt (verbatim from PyAutoPrompt/autofit/visualization_subprocess_feasibility.md):

Phase 4 of `z_features/jax_visualization.md` (archived under
`z_features/complete/jax_visualization.md`). Feasibility prototype — NOT
a production implementation.

Today PyAutoFit's `iterations_per_quick_update` mechanism triggers
visualization synchronously inside the search process:

```
search loop iteration N
  → fitness evaluation
  → if N % iterations_per_quick_update == 0:
        analysis.visualize(fit)   # <-- search blocks here
  → continue loop
```

For a fast Nautilus fit on a pixelization model the synchronous render
adds seconds of wall-clock per quick-update — and as Phase 1-3 made JAX
viz the default path, the gap between "search ready for next iteration"
and "viz finished writing PNGs" only widens.

The user-facing goal stated in the archived tracker is:

"Visualization (including via quick updates) would happen on a separate
process to the search so we don't 'pause' to visualize."

This task is the architectural decision that unlocks that goal. It
does NOT ship the production implementation. It ships:

  1. A prototype branch in PyAutoFit demonstrating the chosen approach
    working end-to-end on one workflow.
  2. A short Architecture Decision Record (ADR) — written as
    `@PyAutoFit/docs/adr/0001-subprocess-visualization.md` (new docs
    subfolder is OK) — answering the four open questions below with
    measurements where applicable.

The production implementation (Phase 4 proper) is a follow-up prompt
authored once this feasibility lands.

(For the full 189-line prompt with all four questions detailed, build
scope, success criteria, and references, see
`PyAutoPrompt/issued/visualization_subprocess_feasibility.md` in the
PyAutoPrompt repo — moved there by /start_dev.)
