
m4vic/AI-In-The-Loop


AITL & AEOS Research Repository

Paper 1: AITL Taxonomy | Paper 2: Sunk-Cost Fallacy | Paper 3: Modality Paradox | Website

Welcome to the consolidated research repository of Neuralchemy Labs for AI In The Loop (AITL) and Autonomous Empirical Optimization System (AEOS) research.

This repository contains the complete experimental code, active search loops, configurations, data loaders, and paper manuscripts for Paper 1 (Taxonomy & PoC), Paper 2 (Sunk-Cost Fallacy), and Paper 3 (Modality Paradox).


🏛️ Repository File Structure

To make navigation simple and intuitive, the repository is organized into five main areas:

AI-In-The-Loop/                     # Repository Root
├── archive/                        # Legacy and out-of-scope paper archives (ignored from GitHub)
│   ├── legacy_paper1/              # Paper 1 taxonomy manuscript drafts
│   ├── legacy_paper4/              # Paper 4 (Gatekeepers & MoE) archived sources
│   ├── legacy_paper5/              # Paper 5 (Lab Director & Closed-Loop) archives
│   ├── legacy_blind_nas/           # Legacy AITL Blind NAS PoC code
│   └── legacy_AITL_main/           # Archived raw files from the original AITL repo
├── docs/                           # Core research & architectural documentation
├── paper/                          # Scientific manuscript directories
│   ├── paper1_taxonomy/            # Paper 1: AITL Taxonomy manuscript, built PDF & figures
│   ├── paper2_sunk_cost/           # Paper 2: Sunk-Cost Fallacy manuscript & built PDF
│   └── paper3_modality_paradox/    # Paper 3: Modality Paradox LaTeX sources & PDF
├── aeos_sunk_cost/                 # ACTIVE CODE: Paper 2 (Sunk-Cost Fallacy)
│   ├── results/                    # Sunk-Cost single-agent run metrics
│   ├── agent.py                    # Monolithic autonomous agent
│   └── runner.py                   # Single-agent driver
└── experiments/                    # ACTIVE CODE: Active experimental sweeps
    ├── aitl_blind_nas/             # ACTIVE CODE: Paper 1 Proof-of-Concept (Blind NAS Tuner)
    │   ├── concept.md              # Blinding mechanism details
    │   ├── agent.py                # Architecture search agent
    │   ├── trainer.py              # PyTorch training sandbox
    │   └── runner.py               # Main tuner loop runner
    └── modality_paradox/           # ACTIVE CODE: Paper 3 (Modality Paradox)
        ├── results/                # Cross-modality JSON logs
        ├── runner_critic.py        # Asymmetric Coder-Reviewer loop
        ├── run_math_ablation.py    # Math prompt self-reflection sweep
        ├── aggregate_paper3.py     # Aggregator tool
        └── build_paper3_assets.py  # High-res chart generator

📖 The Research Trilogy

graph TD
    subgraph Theoretical Foundations
        P1[Paper 1: AITL Taxonomy] -->|Defines Paradigm| Loop[Autonomous Self-Improving Loops]
        Loop -->|Validates Tabula Rasa| POC[AITL Blind NAS PoC]
    end
    
    subgraph Core System Experiments
        Loop -->|Single-Agent Loops| P2[Paper 2: Sunk-Cost Fallacy]
        Loop -->|Asymmetric Dual-Agent Loops| P3[Paper 3: Modality Paradox]
    end
    
    style P1 fill:#7f1d1d,stroke:#b91c1c,stroke-width:2px,color:#fff
    style P2 fill:#064e3b,stroke:#047857,stroke-width:2px,color:#fff
    style P3 fill:#1e3a8a,stroke:#1d4ed8,stroke-width:2px,color:#fff
    style POC fill:#4b5563,stroke:#374151,stroke-width:2px,color:#fff

🔴 Paper 1: AITL Taxonomy & Foundations

  • Title: AI In The Loop (AITL): A Systems Taxonomy for Closed-Loop Autonomous Evaluation
  • Manuscript Directory: paper/paper1_taxonomy/
  • Active Codebase: experiments/aitl_blind_nas/
  • Paradigm: Establishes the core systems taxonomy of AI In The Loop. Demonstrates empirical self-improvement by "blinding" the search LLM, so that improvements cannot be attributed to zero-shot recall of memorized results on standard datasets.
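The blinding idea can be illustrated with a minimal sketch. It is not the mechanism documented in concept.md; the function names, the salt, and the prompt wording below are all hypothetical. The sketch assumes that blinding means replacing the dataset name and feature names with opaque tokens before any text reaches the search LLM, so the model only sees anonymous identifiers and past validation losses.

```python
import hashlib
from typing import Dict, List, Tuple


def blind_task(dataset_name: str, feature_names: List[str], salt: str = "demo-salt") -> Dict:
    """Replace identifying metadata with opaque tokens (hypothetical helper)."""
    # A salted hash gives a stable task id that does not reveal the dataset name.
    digest = hashlib.sha256((salt + dataset_name).encode("utf-8")).hexdigest()[:8]
    alias = {name: f"f{i}" for i, name in enumerate(feature_names)}
    return {
        "dataset_id": f"task_{digest}",
        "features": list(alias.values()),
        # The reverse map stays local and is never included in a prompt.
        "alias_to_original": {v: k for k, v in alias.items()},
    }


def build_prompt(blinded: Dict, history: List[Tuple[str, float]]) -> str:
    """Build the text shown to the search LLM from blinded metadata only."""
    lines = [f"Task {blinded['dataset_id']} with inputs {', '.join(blinded['features'])}."]
    for arch, loss in history:
        lines.append(f"tried {arch} -> val_loss={loss:.4f}")
    lines.append("Propose the next architecture.")
    return "\n".join(lines)


if __name__ == "__main__":
    blinded = blind_task("CIFAR-10", ["pixel_mean", "label"])
    print(build_prompt(blinded, [("mlp-64", 1.2345)]))
```

Under these assumptions, any downward trend in validation loss has to come from the feedback in the history rather than from the model recognizing the benchmark by name.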

🟒 Paper 2: Sunk-Cost Fallacy (AEOS Core)

  • Title: The Autonomous Sunk-Cost Fallacy: Stopping Failures and Meta-Reasoning in LLMs Deployed within AEOS
  • Manuscript Directory: paper/paper2_sunk_cost/
  • Active Codebase: aeos_sunk_cost/
  • Paradigm: Demonstrates how a monolithic, single-agent loop trapped in an open-ended code optimization environment struggles with cognitive anchoring, repeating failed strategies over dozens of iterations (the "sunk-cost fallacy" for AI agents).
flowchart TD
    A[Dataset] --> B[Monolithic Agent]
    B -->|Writes PyTorch Code| C(Isolated Sandbox)
    C -->|Calculates Val Loss| B
    B -->|Anchored Strategy Loop| B
    B -->|Fails to Terminate| B
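To make the "anchored strategy loop" concrete, here is a small sketch of how such behavior could be flagged from a run log. It is an illustration only and not the detection logic used in the paper; the `is_anchored` helper, the strategy labels, and the `window` and `min_gain` thresholds are assumptions.

```python
from typing import List


def is_anchored(strategies: List[str], val_losses: List[float],
                window: int = 5, min_gain: float = 1e-3) -> bool:
    """Return True when the last `window` iterations all reuse one strategy
    and fail to beat the best earlier validation loss by at least `min_gain`."""
    if len(strategies) != len(val_losses):
        raise ValueError("strategies and val_losses must have the same length")
    if len(strategies) <= window:
        return False  # not enough history to compare against
    recent = strategies[-window:]
    same_strategy = len(set(recent)) == 1
    best_before = min(val_losses[:-window])
    best_recent = min(val_losses[-window:])
    return same_strategy and (best_before - best_recent) < min_gain


if __name__ == "__main__":
    # Hypothetical log: the agent keeps widening an MLP with no real gain.
    strategies = ["lr_sweep", "lr_sweep"] + ["wider_mlp"] * 5
    losses = [0.9, 0.8, 0.81, 0.80, 0.82, 0.80, 0.805]
    print(is_anchored(strategies, losses))
```

A check like this could be run by an outer harness after every iteration; the single-agent loop in the diagram has no such external signal and must decide to stop on its own.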

🔵 Paper 3: The Modality Paradox

  • Title: The Modality Paradox in Autonomous LLM Engineering: Stopping Behaviors in Asymmetric Reviewer-Coder Loops
  • Manuscript Directory: paper/paper3_modality_paradox/
  • Active Codebase: experiments/modality_paradox/
  • Paradigm: Establishes that LLM stopping thresholds are highly task- and modality-dependent. Introduces an asymmetric dual-agent loop (Reviewer + Coder) and benchmarks it across structured tabular, text, and vision workloads.
flowchart TD
    Reviewer[Reviewer Agent <br> *Sets Strategy & Holds Stop Key*] -->|DIRECTIVE| Coder[Coder Agent <br> *Writes ML Code*]
    Coder -->|Execute| Sandbox(Isolated Sandbox)
    Sandbox -->|Metrics & Tracebacks| Reviewer
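The control flow in the diagram above can be sketched as a short loop. This is a hedged illustration, not the code in runner_critic.py: the `STOP` sentinel, the callable signatures, and the stub agents are all hypothetical stand-ins for real LLM calls and a real sandbox.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]  # (directive, sandbox feedback) pairs
STOP = "STOP"  # hypothetical sentinel meaning the Reviewer ends the run


def run_review_loop(reviewer: Callable[[str, History], str],
                    coder: Callable[[str], str],
                    sandbox: Callable[[str], str],
                    max_iters: int = 10) -> History:
    """Reviewer issues directives, Coder writes code, sandbox reports back.
    Only the Reviewer can stop the loop before `max_iters`."""
    history: History = []
    feedback = "no runs yet"
    for _ in range(max_iters):
        directive = reviewer(feedback, history)
        if directive == STOP:
            break
        code = coder(directive)
        feedback = sandbox(code)  # metrics or a traceback, as text
        history.append((directive, feedback))
    return history


# Stub agents so the sketch runs without any API key.
def demo_reviewer(feedback: str, history: History) -> str:
    return STOP if len(history) >= 3 else f"try variant {len(history)}"


def demo_coder(directive: str) -> str:
    return f"# code for: {directive}"


def demo_sandbox(code: str) -> str:
    return f"val_loss=0.5 for {len(code)} chars"


if __name__ == "__main__":
    for directive, feedback in run_review_loop(demo_reviewer, demo_coder, demo_sandbox):
        print(directive, "|", feedback)
```

The asymmetry is that the Coder never sees the stop decision or the full history; it only receives the current directive.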

🛠️ Execution & Reproducibility Guide

1. Environment Setup

Copy .env.example to .env inside either code directory and add your LLM API keys:

cp aeos_sunk_cost/.env.example aeos_sunk_cost/.env
# Or for Paper 3:
cp experiments/modality_paradox/.env.example experiments/modality_paradox/.env
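Before launching a run, it can help to confirm that the copied .env file actually contains non-empty keys. The following is a minimal standard-library sketch; the key name `EXAMPLE_API_KEY` is a placeholder, since the variable names expected by the runners are defined in .env.example and are not listed here.

```python
from pathlib import Path
from typing import Dict, List


def read_env_file(path: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(path: str, required: List[str]) -> List[str]:
    """Return the required keys that are absent or empty in the file."""
    env = read_env_file(path)
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    env_path = "aeos_sunk_cost/.env"
    if Path(env_path).exists():
        # EXAMPLE_API_KEY is a placeholder; use the names from .env.example.
        print(missing_keys(env_path, ["EXAMPLE_API_KEY"]))
```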

2. Run Paper 1 (Blind NAS Tuner PoC)

Run the blinded neural architecture search:

cd experiments/aitl_blind_nas
python runner.py

Inspect results/loss_curve.png to confirm that validation loss trends downward over iterations; this trend is the empirical evidence for AITL's self-improving behavior.
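The downward trend can also be checked numerically rather than by eye. The sketch below fits an ordinary least-squares line to a list of per-iteration validation losses and returns its slope; a negative slope indicates improvement. How runner.py stores its losses is not specified here, so the input list is a hypothetical example.

```python
from typing import List


def loss_slope(val_losses: List[float]) -> float:
    """Least-squares slope of validation loss against iteration index."""
    n = len(val_losses)
    if n < 2:
        raise ValueError("need at least two iterations to estimate a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(val_losses) / n
    cov = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(val_losses))
    var = sum((i - mean_x) ** 2 for i in range(n))
    return cov / var


if __name__ == "__main__":
    # Hypothetical losses from four tuner iterations.
    slope = loss_slope([1.0, 0.8, 0.6, 0.4])
    print("improving" if slope < 0 else "not improving", slope)
```

A single slope is a coarse summary; it will not distinguish steady improvement from one lucky iteration, so the plotted curve remains the primary check.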

3. Run Paper 2 (Sunk-Cost Fallacy)

Run the monolithic engineering loops:

cd aeos_sunk_cost
python runner.py

4. Run Paper 3 (Modality Paradox)

Run math-prompt self-reflection sweeps and aggregate figures:

cd experiments/modality_paradox
python run_math_ablation.py
python aggregate_paper3.py
python build_paper3_assets.py

All publication-ready SVG/PNG figures and LaTeX tables are saved in paper/paper3_modality_paradox/figures/.


Scientific Citation & Zenodo Indexing

Note

Manuscript Separation Policy

  • GitHub hosts the active, reproducible code, datasets, and configurations.
  • Zenodo hosts the official, permanent, immutable preprint PDFs.
  • LaTeX templates are included in paper/ for structural development. For formal scientific citations, please use the Zenodo records referenced in CITATION.cff.

Neuralchemy Labs Research Series - neuralchemy.in
