optimizedwf/creative-studio-pipeline

Creative Studio Pipeline

Distributed AI video creation pipeline — connect ComfyUI (image/video generation), Kokoro TTS (voiceover), and audio tools into an automated workflow.

License: MIT

What It Is

Creative Studio Pipeline orchestrates video production across multiple machines:

  • GPU Worker — runs ComfyUI with FLUX (image generation) and Wan2.2 (image-to-video animation)
  • Audio Worker — runs ACE-Step for music generation, Kokoro TTS for voiceover
  • Orchestrator — coordinates jobs, transfers assets, assembles final video with ffmpeg

Works on a single machine or spread across three. All communication between machines is SSH-based, with no cloud dependencies and no paid APIs.
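As a rough illustration of the SSH-based design, the sketch below builds an `ssh` argument list for running a command on a worker. The helper name `build_ssh_command` and the `BatchMode=yes` option are assumptions made for this example and are not taken from the project's scripts. The target and key values mirror the defaults listed under Environment Variables.

```python
import os
import shlex


def build_ssh_command(target, remote_cmd, port=22, key_path=None):
    """Return an argv list that runs remote_cmd on target over SSH.

    Hypothetical helper for illustration; not part of the repository.
    """
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    argv = ["ssh", "-p", str(port), "-o", "BatchMode=yes"]
    if key_path:
        argv += ["-i", os.path.expanduser(key_path)]
    argv.append(target)
    # ssh joins trailing arguments into one string for the remote shell,
    # so quote each part locally to keep arguments with spaces intact.
    argv.append(" ".join(shlex.quote(part) for part in remote_cmd))
    return argv


if __name__ == "__main__":
    cmd = build_ssh_command(
        os.environ.get("PIPBOY_SSH", "user@comfy-host"),
        ["ls", "-la", "/tmp/creative-studio-runs"],
        key_path=os.environ.get("PIPBOY_KEY"),
    )
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run` as is. Keeping the command as a list avoids a local shell, so only the remote side needs quoting.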

Architecture

┌─────────────┐     ┌──────────────┐     ┌──────────────┐
│ Orchestrator │────▶│  GPU Worker  │     │ Audio Worker │
│ (your dev   │ SSH │ (ComfyUI)    │ SSH │ (TTS/Music)  │
│  machine)   │◀────│ FLUX + Wan2.2│     │ Kokoro/ACE   │
└─────────────┘     └──────────────┘     └──────────────┘
       │                                        │
       └──────────── ffmpeg assembly ───────────┘
                        │
                   ┌────┴────┐
                   │  Output │
                   │  video  │
                   └─────────┘

The pipeline:

  1. Plan — an LLM generates a creative brief with scene descriptions, music prompts, narration
  2. Generate images — FLUX on the GPU worker creates stills for each scene
  3. Animate — Wan2.2 I2V turns stills into short video clips with motion
  4. Voiceover — Kokoro TTS generates narration audio for each scene
  5. Music — ACE-Step or ffmpeg procedural audio creates the soundtrack
  6. Assemble — ffmpeg concatenates clips, adds transitions, mixes audio, applies color grade
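The assemble step can be pictured with a minimal sketch built on ffmpeg's concat demuxer. This is an assumption about one plausible approach, not the project's actual assembly code: the function names and clip file names are hypothetical, and stream copy (`-c copy`) only works when all clips share the same codec parameters. Transitions, audio mixing, and color grading would need re-encoding with filters and are left out.

```python
def concat_list_text(clips):
    """Render the list file consumed by ffmpeg's concat demuxer."""
    lines = []
    for clip in clips:
        # Inside single quotes, a literal quote is written as '\''.
        escaped = str(clip).replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"


def concat_command(list_file, output):
    """Build an ffmpeg argv that joins the listed clips without re-encoding."""
    return [
        "ffmpeg", "-y",              # overwrite the output if it exists
        "-f", "concat", "-safe", "0",
        "-i", str(list_file),
        "-c", "copy",                # copy streams, no re-encode
        str(output),
    ]


if __name__ == "__main__":
    # Hypothetical clip names; write the text to a file before running ffmpeg.
    print(concat_list_text(["scene_01.mp4", "scene_02.mp4"]), end="")
    print(" ".join(concat_command("clips.txt", "final.mp4")))
```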

Prerequisites

  • Python 3.10+
  • ffmpeg (with libx264, aac support)
  • ComfyUI with FLUX GGUF models + Wan2.2 I2V (on the GPU worker)
  • Kokoro TTS server (or any OpenAI-compatible TTS API)
  • SSH access to remote machines (optional — works fully local too)
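Some of these prerequisites can be checked from Python before a run. The sketch below is a simplified stand-in for illustration and does not reproduce what the `preflight` command does internally. The function names are hypothetical, and it only checks the interpreter version and whether tools are on `PATH`, not ffmpeg's codec support or the remote services.

```python
import shutil
import sys


def python_ok(version=None, minimum=(3, 10)):
    """Return True if the given (or running) Python version meets the minimum."""
    version = tuple(version or sys.version_info[:2])
    return version[:2] >= minimum


def missing_tools(tools=("ffmpeg", "ssh")):
    """Return the entries of tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    print("python ok:", python_ok())
    print("missing tools:", missing_tools())
```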

Quick Start

# 1. Clone
git clone https://github.com/optimizedwf/creative-studio-pipeline.git
cd creative-studio-pipeline

# 2. Set up environment
cp .env.example .env
# Edit .env with your machine addresses and paths
pip install -r requirements.txt  # only if the repository includes a requirements.txt

# 3. Run the pipeline stage by stage
python scripts/creative_studio_local.py music --artifacts ./runs/my-video
python scripts/creative_studio_local.py tts --artifacts ./runs/my-video
python scripts/creative_studio_local.py images --artifacts ./runs/my-video
python scripts/creative_studio_local.py animate --artifacts ./runs/my-video
python scripts/creative_studio_local.py enhance --artifacts ./runs/my-video
python scripts/creative_studio_local.py assemble --artifacts ./runs/my-video

Or use the Archon workflow:

# Requires Archon (separate tool)
archon run creative-studio-local --args "your creative brief here"
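For a scripted run without Archon, the stage commands from step 3 can be chained from Python. The sketch below is an assumption about how one might do this, not a script shipped with the repository. The stage order is copied from the Quick Start commands, and `stage_commands` and `run_all` are hypothetical names.

```python
import subprocess
import sys

# Stage order copied from the Quick Start commands.
STAGES = ["music", "tts", "images", "animate", "enhance", "assemble"]


def stage_commands(artifacts, script="scripts/creative_studio_local.py"):
    """Build one argv per stage, all sharing the same artifacts directory."""
    return [
        [sys.executable, script, stage, "--artifacts", artifacts]
        for stage in STAGES
    ]


def run_all(artifacts, runner=subprocess.run):
    """Run the stages in order; return the name of the first failing stage."""
    for argv in stage_commands(artifacts):
        result = runner(argv)
        if result.returncode != 0:
            return argv[2]  # stop here, later stages depend on this one
    return None


if __name__ == "__main__":
    for argv in stage_commands("./runs/my-video"):
        print(" ".join(argv))
```

Calling `run_all("./runs/my-video")` from the repository root would execute the stages for real. The `runner` parameter exists so the loop can be exercised with a stub instead of launching processes.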

Available Commands

Command        Description
save-plan      Save an LLM-generated plan to artifacts
preflight      Check all dependencies and remote connectivity
music          Generate music (ACE-Step remote or ffmpeg procedural)
tts            Generate voiceover audio (Kokoro or Edge-TTS)
images         Generate FLUX stills on the GPU worker
animate        Run Wan2.2 I2V animation on the GPU worker
enhance        Post-process clips (upscale + frame interpolation)
assemble       Concatenate clips, add transitions, mix audio
qa             Validate the final video (duration, streams, quality)
critique       Per-scene quality analysis (brightness, motion, duration)
detect-beats   Detect beat/onset times from music for alignment
regen-scene    Regenerate a single failed scene
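To make the idea of aligning cuts to detected beats concrete, here is a minimal sketch of snapping transition times to the nearest beat. It is an assumption about the general technique and not the algorithm used by `detect-beats` or `assemble`. The function name and the `max_shift` tolerance are hypothetical.

```python
from bisect import bisect_left


def snap_to_beats(cut_times, beat_times, max_shift=0.25):
    """Move each cut to the nearest beat if it lies within max_shift seconds.

    Cuts with no beat inside the tolerance are left unchanged.
    """
    beats = sorted(beat_times)
    snapped = []
    for t in cut_times:
        i = bisect_left(beats, t)
        # Only the beats just before and just after t can be the nearest.
        candidates = beats[max(i - 1, 0):i + 1]
        nearest = min(candidates, key=lambda b: abs(b - t), default=None)
        if nearest is not None and abs(nearest - t) <= max_shift:
            snapped.append(nearest)
        else:
            snapped.append(t)
    return snapped


if __name__ == "__main__":
    print(snap_to_beats([1.1, 3.0], [0.5, 1.0, 1.5, 2.0]))
```

Capping the shift keeps scene durations close to the plan; a cut far from any beat stays where it was.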

Environment Variables

Variable                 Default                                   Description
PIPBOY_SSH               user@comfy-host                           SSH target for the GPU worker
PIPBOY_HOST              comfy-host.local                          Hostname/IP of GPU worker
PIPBOY_COMFY_PATH        /path/to/ComfyUI                          ComfyUI directory on GPU worker
PIPBOY_RUNS_DIR          /tmp/creative-studio-runs                 Working directory on GPU worker
PIPBOY_WORKSPACE         /home/user                                Remote workspace root
PIPBOY_MEDIA_DIR         /home/user/remotion-render/public/media   Media directory on remote
PIPBOY_BRIDGE_PATH       /home/user/comfyui-bridge/bridge.py       Remote bridge script path
PIPBOY_WIN_USER          User                                      Windows username for WSL file transfer
PIPBOY_WSL_DISTRO        Ubuntu                                    WSL distribution name
PIPBOY_KEY               $HOME/.ssh/id_ed25519                     SSH key path
DELL_SSH                 user@audio-worker                         SSH target for the audio worker
DELL_PORT                22                                        SSH port for audio worker
ACE_STEP_ROOT            /opt/ACE-Step-1.5                         ACE-Step installation directory
KOKORO_URL               http://localhost:8765                     Kokoro TTS server URL
KOKORO_VOICE             af_heart                                  Default TTS voice
CREATIVE_PUBLIC_ROOT     ./output                                  Output directory for final videos
CREATIVE_QUALITY         draft                                     Quality preset (draft / final)
CREATIVE_UPSCALE         none                                      Upscale mode (none / hd / 2x)
CREATIVE_INTERPOLATE     none                                      Frame interpolation (none / film / 2x)
CREATIVE_AUTO_REGEN      false                                     Auto-regenerate failed scenes
CREATIVE_SNAP_TO_BEATS   true                                      Snap transitions to detected beats
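The sketch below shows one way a script could read a few of these variables with the documented defaults. It is illustrative only: the `Settings` class, `load_settings`, and the accepted boolean spellings are assumptions, and the project's own scripts may parse these values differently.

```python
import os
from dataclasses import dataclass


def _as_bool(value):
    """Interpret common truthy spellings; anything else counts as False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    gpu_ssh: str
    audio_ssh: str
    audio_port: int
    kokoro_url: str
    kokoro_voice: str
    quality: str
    snap_to_beats: bool


def load_settings(env=None):
    """Read a subset of the variables, falling back to the listed defaults."""
    env = os.environ if env is None else env
    quality = env.get("CREATIVE_QUALITY", "draft")
    if quality not in {"draft", "final"}:
        raise ValueError(f"CREATIVE_QUALITY must be draft or final, got {quality!r}")
    return Settings(
        gpu_ssh=env.get("PIPBOY_SSH", "user@comfy-host"),
        audio_ssh=env.get("DELL_SSH", "user@audio-worker"),
        audio_port=int(env.get("DELL_PORT", "22")),
        kokoro_url=env.get("KOKORO_URL", "http://localhost:8765"),
        kokoro_voice=env.get("KOKORO_VOICE", "af_heart"),
        quality=quality,
        snap_to_beats=_as_bool(env.get("CREATIVE_SNAP_TO_BEATS", "true")),
    )


if __name__ == "__main__":
    print(load_settings())
```

Passing a plain dictionary as `env` makes the loader easy to test without touching the real environment.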

Configuration Reference

See .env.example for all available environment variables.

The config.yaml file can be used for workflow-level configuration.

Security Model

  • No embedded secrets. All credentials, API keys, and machine addresses come from environment variables or .env files.
  • SSH-based only. No cloud APIs, no telemetry, no external callbacks.
  • Local-first. Binds to localhost by default. Run on a single machine with no network for full isolation.
  • File paths are configurable. No hardcoded paths — everything is env-var driven.

License

MIT — see LICENSE.

Disclaimer

This software generates AI media content. You are responsible for:

  • Complying with the terms of service of any APIs you connect (ComfyUI, Kokoro, etc.)
  • Ensuring your content does not infringe on others' rights
  • Using appropriate safety measures when running on production systems

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND.
