
Bluesky Bot (Python)

Fine-tune Qwen3-8B on your posts and auto-post to Bluesky.

This is the original Python implementation that inspired the Elixir version.

What This Does

  1. Scrape your existing posts from Bluesky/Twitter
  2. Fine-tune Qwen3-8B-4bit using LoRA on Apple Silicon
  3. Generate new posts in your voice
  4. Post automatically to Bluesky on a schedule

Requirements

  • Apple Silicon Mac (M1/M2/M3/M4)
  • Python 3.10+
  • ~8GB RAM for training, ~5GB for inference
Install dependencies:

pip install -r requirements.txt

Quick Start

1. Set Up Credentials

cp .env.example .env
# Edit .env with your Bluesky credentials

Get an app password: Bluesky → Settings → App Passwords → Add App Password
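Once .env is filled in, the bot reads the credentials from the environment. A minimal sketch — the variable names BLUESKY_HANDLE and BLUESKY_APP_PASSWORD are assumptions, so match whatever your .env actually defines:

```python
import os

def load_credentials():
    # Hypothetical key names; use the ones your .env defines
    handle = os.environ.get("BLUESKY_HANDLE", "")
    app_password = os.environ.get("BLUESKY_APP_PASSWORD", "")
    if not handle or not app_password:
        raise RuntimeError("Missing Bluesky credentials; check your .env")
    return handle, app_password
```

Failing fast here beats a confusing auth error halfway through a scheduled run.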

2. Collect Training Data

# Fetch your Bluesky posts
python fetch_bluesky_posts.py @your-handle.bsky.social -o bluesky_posts.jsonl

# If you have Twitter data, combine them
python combine_data.py

3. Fine-Tune the Model

# Train with LoRA (takes 1-2 hours on M1 Pro)
mlx_lm.lora --config qwen3_4bit_config.yaml

4. Merge Adapters

# Fuse LoRA weights into base model
python merge_lora.py

5. Generate Posts

# Preview a generated post
python bluesky_bot.py generate

# Post to Bluesky
python bluesky_bot.py post

# Run on schedule (every 4 hours)
python bluesky_bot.py schedule --interval 4
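The schedule mode boils down to a post-then-sleep loop. A rough sketch — the function and parameter names are illustrative, not the bot's actual internals:

```python
import time

def run_on_schedule(post_fn, interval_hours, max_runs=None):
    """Call post_fn every interval_hours; max_runs=None loops forever."""
    runs = 0
    while max_runs is None or runs < max_runs:
        post_fn()
        runs += 1
        if max_runs is None or runs < max_runs:
            time.sleep(interval_hours * 3600)
    return runs
```

So `run_on_schedule(post, 4)` would post immediately and then every four hours after that.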

Files

File                     Purpose
bluesky_bot.py           Main bot: generate and post
fetch_bluesky_posts.py   Scrape posts from Bluesky
combine_data.py          Merge Twitter + Bluesky data
merge_lora.py            Fuse LoRA adapters into the model
qwen3_4bit_config.yaml   Training configuration
requirements.txt         Python dependencies

Training Configuration

model: lmstudio-community/Qwen3-8B-MLX-4bit
batch_size: 1
grad_accumulation_steps: 8    # Effective batch = 8
iters: 2000
learning_rate: 1e-5
max_seq_length: 256

lora_parameters:
  rank: 16
  dropout: 0.05
  scale: 32.0

Why These Settings?

  • 4-bit model: Fits in 5GB, trains on consumer Macs
  • LoRA rank 16: Good quality without massive VRAM
  • Learning rate 1e-5: Low to preserve base model knowledge
  • max_seq 256: Posts are short, no need for long context
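The arithmetic behind the batch settings, spelled out:

```python
batch_size = 1
grad_accumulation_steps = 8
iters = 2000

# Gradients accumulate over 8 micro-batches before each optimizer
# update, so the effective batch size is 8 even though only one
# example sits in memory at a time.
effective_batch = batch_size * grad_accumulation_steps
examples_seen = iters * effective_batch
```

This is how training stays inside ~8GB of RAM while still getting the smoothing benefit of a larger batch.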

Data Format

Training data is ChatML format in JSONL:

{"messages": [
  {"role": "user", "content": "Write a tweet in your authentic voice."},
  {"role": "assistant", "content": "your actual post here"}
]}

The fetch script uses varied prompts to help the model generalize.
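A sketch of how one raw post becomes a training record in that format — the prompt list here is illustrative, not the fetch script's actual wording:

```python
import json
import random

# Illustrative prompt variations; fetch_bluesky_posts.py uses its own set
PROMPTS = [
    "Write a tweet in your authentic voice.",
    "Write a short post in your usual style.",
    "Post something on Bluesky.",
]

def to_chatml_record(post_text, rng=random):
    record = {
        "messages": [
            {"role": "user", "content": rng.choice(PROMPTS)},
            {"role": "assistant", "content": post_text},
        ]
    }
    # One JSON object per line = JSONL
    return json.dumps(record, ensure_ascii=False)
```

Rotating the user prompt keeps the model from memorizing a single instruction string and helps it respond in-voice to any phrasing.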

macOS Service

Run the bot as a background service:

# Install LaunchAgent
./install.sh

# Check status
launchctl list | grep bluesky

# View logs
tail -f ~/Desktop/bskybot.log

# Uninstall
./uninstall.sh

Generation Settings

GENERATION_CONFIG = {
    "temp": 0.8,      # Higher = more random
    "top_p": 0.9,     # Nucleus sampling
    "max_tokens": 280 # token cap; Bluesky's limit is 300 characters
}

Architecture

┌─────────────────────────────────────────────────────────────┐
│                    Training Pipeline                         │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐   │
│  │   Your       │    │   ChatML     │    │   Combined   │   │
│  │   Posts      │ -> │   Format     │ -> │   Dataset    │   │
│  └──────────────┘    └──────────────┘    └──────────────┘   │
│                                                │             │
│                                                ▼             │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐   │
│  │   Qwen3-8B   │    │    LoRA      │    │   Trained    │   │
│  │   4-bit      │ -> │   Training   │ -> │   Adapters   │   │
│  └──────────────┘    └──────────────┘    └──────────────┘   │
│                                                │             │
│                                                ▼             │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐   │
│  │   Base +     │    │    Fuse      │    │   Fused      │   │
│  │   Adapters   │ -> │   Weights    │ -> │   Model      │   │
│  └──────────────┘    └──────────────┘    └──────────────┘   │
│                                                              │
├─────────────────────────────────────────────────────────────┤
│                    Inference Pipeline                        │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐   │
│  │   Prompt     │    │   MLX-LM     │    │   Generated  │   │
│  │   Template   │ -> │   Generate   │ -> │   Post       │   │
│  └──────────────┘    └──────────────┘    └──────────────┘   │
│                                                │             │
│                                                ▼             │
│                                         ┌──────────────┐    │
│                                         │   Bluesky    │    │
│                                         │   API Post   │    │
│                                         └──────────────┘    │
│                                                              │
└─────────────────────────────────────────────────────────────┘

How It Works

MLX on Apple Silicon

MLX is Apple's ML framework optimized for M-series chips:

  • Unified Memory: CPU and GPU share RAM - no copying
  • Lazy Evaluation: Builds compute graph, executes efficiently
  • Native Quantization: 4-bit inference with minimal quality loss

LoRA Fine-Tuning

Instead of training all 8B parameters:

  1. Freeze the base model weights
  2. Add small "adapter" matrices (rank 16)
  3. Only train the adapters (~0.1% of params)
  4. Merge adapters back into model after training

This lets you fine-tune on a laptop in 1-2 hours.
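The "~0.1% of params" figure follows from the adapter shapes: a frozen d_out x d_in weight gets two small trainable matrices, A (rank x d_in) and B (d_out x rank), so the trainable count per adapted matrix is rank * (d_in + d_out). A back-of-envelope check — the 4096 hidden size is an assumption for illustration, not a spec of Qwen3-8B:

```python
def lora_param_count(d_in, d_out, rank):
    # A: rank x d_in, plus B: d_out x rank
    return rank * (d_in + d_out)

# One hypothetical 4096x4096 projection, adapted at rank 16
full = 4096 * 4096
lora = lora_param_count(4096, 4096, 16)
fraction = lora / full  # under 1% of that matrix; across the whole
                        # 8B model (most weights not adapted), the
                        # trained share drops further still
```

That tiny trainable footprint is why the job fits on a laptop.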

Generation

from mlx_lm import load, generate

model, tokenizer = load("./fused_model")
response = generate(
    model,
    tokenizer,
    prompt="<|im_start|>user\nWrite a post<|im_end|>\n<|im_start|>assistant\n",
    max_tokens=280,
    temp=0.8
)

Troubleshooting

"mlx_lm API changed"

The library updated: temperature and top-p now go through a sampler object instead of keyword arguments.

from mlx_lm.sample_utils import make_sampler

sampler = make_sampler(temp=0.8, top_p=0.9)
# then pass it in: generate(model, tokenizer, prompt=..., sampler=sampler)

Out of Memory

  • Reduce batch_size to 1
  • Reduce max_seq_length to 128
  • Close other applications

Posts Too Long

  • Reduce max_tokens to 200
  • Add "keep it brief" to prompt
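Since max_tokens caps tokens rather than characters, a length check before posting doesn't hurt either. A hypothetical helper (not part of the bot) that truncates at a word boundary, using Bluesky's 300-character limit:

```python
def trim_to_limit(text, limit=300):
    """Truncate text to at most limit characters, cutting at a word boundary."""
    if len(text) <= limit:
        return text
    cut = text[:limit]
    # Back up to the last space so we don't slice a word in half
    if " " in cut:
        cut = cut[: cut.rfind(" ")]
    return cut.rstrip()
```

Dropping a trailing word reads far better on the timeline than a post chopped mid-syllable.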

The Elixir Version

This Python version works great. But I wanted to run inference in Elixir.

That required:

  • Forking EMLX to add quantization NIFs
  • Writing a safetensors parser
  • Implementing Qwen3 architecture in Elixir
  • Building a Phoenix app around it

See bobby_posts for the insane Elixir version.

License

MIT
