Releases · bkataru/powerglide

06 Mar 10:13

bkataru

v0.3.2

2b8f81d

v0.3.2 — 9B 17/17 Measured, Security Fixes, igllama v0.3.11 Latest

Latest

Highlights

9B Achieves 17/17 at All Quantizations

The full 9B T01-T17 trial is complete. Every quantization variant (Q4, Q5, Q6, Q8, BF16) passes all 17 agentic tasks — code generation, JSON round-trip, error recovery, multi-source synthesis, everything. 9B is the gold standard for local agentic tool use.

9B Variant	Score	Turns	Time
9B-Q4	17/17	38	9127s
9B-Q5	17/17	49	14239s
9B-Q6	17/17	39	8642s
9B-Q8	17/17	39	17324s
9B-BF16	17/17	43	15722s

Security and Correctness Fixes

OAIResponse use-after-free — send() now dupes all string fields before returning (previously returned dangling pointers into freed JSON parse tree)
Config save malformed JSON — save() rewrote using std.fmt instead of broken manual string concatenation
getToolCalls page_allocator leak — replaced leaking ArrayList with proper allocator-based allocation
Auth header memory leak — Bearer {key} string now freed after HTTP request

igllama v0.3.11

Strip residual </think> tokens when --no-think is active (both streaming and non-streaming paths)

Verified

195/195 tests pass, 0 leaks
GH Pages rebuilt and deployed

Assets 2

05 Mar 12:30

bkataru

v0.3.1

07fe638

v0.3.1 — Context Sensitivity Harness + Trial Filter + 4B T01-T17 Measured

What's New

Context Length Sensitivity Harness

New examples/ctx_sensitivity.zig measures 2B-Q6 accuracy across ctx-size 512/1024/2048/4096 × T01-T17. Run with zig build ctx.

Trial Quantization Filter

trial-quant now accepts optional model names as CLI arguments:

./zig-out/bin/trial-quant 9B-Q4 9B-Q5     # only these two
./zig-out/bin/trial-quant                  # all 16 models

4B-Q4 T01-T17 Fully Measured

Result: 15/17, 63 turns, ~9050s
T04 (multi-step write) and T16 (Zig compile recovery) fail via turn exhaustion at 1.3 tok/s
These are characteristic failure modes for 4B at Q4 — not correctness failures

Verified

195/195 tests pass, 0 leaks
9B T01-T17 re-run in progress (background); showcase will be updated when complete

Assets 2

05 Mar 08:41

bkataru

v0.3.0

70b08f9

v0.3.0 — 4B Quant Curve, Throughput Benchmark, Extended Task Suite

What's New

New: Throughput Benchmark (`examples/bench.zig`)

zig build bench

Measures tokens/second for each Qwen3.5 model across Q4/Q8/BF16 precision levels. Reports tok/s, file size, and RAM (RSS). Requires igllama v0.3.10+ for accurate usage.completion_tokens.

Measured results (CPU-only, 4 threads):

Model	tok/s	RAM
0.8B-Q8	3.4	0.8 GB
0.8B-BF16	2.9	1.5 GB
2B-Q4	2.9	1.3 GB
2B-Q8	2.6	1.9 GB
2B-BF16	1.9	3.6 GB
4B-Q4	1.3	2.7 GB
4B-Q8	0.1	~4 GB (swap!)

Key finding: RAM is the hard limit. Models that exceed physical RAM fall off a cliff (4B-Q8: 0.1 tok/s from swap thrashing). 4B-Q4 is the practical ceiling on systems with ≤6 GB free RAM.

Extended: 4B Full Quant Curve

Added Qwen3.5-4B-Q4_K_M, Q5_K_M, and Q6_K to trial_quant.zig. All three pass 13/17 — 4B is saturated at Q4. 4B-Q4 (2.6 GB) is the recommended production config: full accuracy, minimum file size.

Extended: T01–T17 in `trial_quant.zig`

The quantization sensitivity harness now runs all 17 tasks (was T01–T13), adding code generation with zig fmt validation (T14), JSON round-trip (T15), error recovery (T16), and multi-source synthesis (T17) across all 16 quantization variants.

trial_quant.zig now covers: 16 models × 17 tasks = 272 test cases per full run.

igllama v0.3.10 Upstream Fix

Patched usage.completion_tokens in igllama's non-streaming /v1/chat/completions handler — was hardcoded 0, now returns real counts. Fix upstreamed as igllama PR #82, released as igllama v0.3.10.

Upgrade Notes

Requires igllama v0.3.10+ for accurate bench token counts (fallback estimate still works with older builds)
Download 4B quant GGUFs: igllama pull unsloth/Qwen3.5-4B-GGUF -f Qwen3.5-4B-Q4_K_M.gguf

Stats

195/195 tests pass
3 harness executables: trial, trial-quant, bench
16 models × 17 tasks in quantization harness

Assets 2

04 Mar 18:23

bkataru

v0.2.7

19cbf69

v0.2.7 — BF16 precision trials, igllama grammar fix, LICENSE

What's new in v0.2.7

Added

BF16 in trial_quant.zig — 2B-BF16 and 9B-BF16 added to QUANT_MODELS; harness now covers the full Q4/Q5/Q6/Q8/BF16 precision curve
LICENSE — MIT license file added

Fixed

igllama v0.3.10 — streaming json_mode use-after-free — streaming handler freed the grammar string while the sampler held a pointer to it; replaced with direct JSON_GRAMMAR comptime const (matches non-streaming handler)
trial_quant.zig — changed response_format from json_object to text; the grammar sampler in the bundled llama.cpp crashes during generation for 2B+ model vocabularies; system prompt JSON constraint is sufficient for 4B+

Changed

showcase.smd — documents igllama json_mode crash finding, expanded trial task suite to T01–T17, updated framework version

Build

zig build trial-quant   # Q4/Q5/Q6/Q8/BF16 sensitivity trial for 2B + 9B
zig build trial         # T01-T17 across all 4 weight classes
zig build test          # 195/195 tests

Assets 2

04 Mar 09:45

bkataru

v0.2.2

71ec8c1

v0.2.2 — Session summary, igllama port scan, json_mode, Showcase

What's new in v0.2.2

Features

Session summary output — powerglide run now emits a structured completion block with steps, elapsed time, agent/model, and the <POWERGLIDE_DONE> or <POWERGLIDE_ERROR> terminal signal
igllama port scan — powerglide doctor scans :8090–8099 and reports all running igllama instances simultaneously
json_mode on OpenAIClient — sets response_format: {"type":"json_object"} to force constrained JSON output from igllama and other local endpoints

New Showcase page

Live at bkataru.github.io/powerglide/showcase — four case studies documenting powerglide dogfooding with Qwen3.5 0.8B and 4B models via igllama, including the honest tool calling triage and performance table.

Bug fix

Loop step count increments test now uses an isolated /tmp session file; previously picked up real .powerglide/session.json from dogfooding runs

Stats

195/195 tests passing, 0 memory leaks
Fully local stack: powerglide + igllama + Qwen3.5-4B, no API keys required

Assets 2

04 Mar 09:10

bkataru

v0.2.1

171a597

v0.2.1 — 195 tests, bug fixes

v0.2.1

Test Coverage Expansion

195/195 tests passing (up from 170)
New test modules: SSE parser, HTTP response, persistence manager
Root module now covers all submodules via refAllDecls

Bug Fixes (uncovered by expanded coverage)

stream.zig: unmanaged ArrayList API fixes (Zig 0.15.2 compliance)
terminal/pool.zig: sessions.size → sessions.count()
terminal/session.zig: array literal syntax fix, orphaned test code removed

195/195 tests, 0 leaked.

Assets 2

04 Mar 08:36

bkataru

v0.2.0

43d7bde

v0.2.0 — MCP Integration

What's New in v0.2.0

MCP Integration

MCP Server — powerglide mcp starts powerglide as a JSON-RPC 2.0 MCP server over stdin/stdout, exposing all registered tools to any MCP-compatible client
MCP Client — connect to external MCP servers; their tools become first-class powerglide tools prefixed as mcp_{server}_{tool}
Tool Bridge — transparent McpTool → Tool conversion for seamless integration
Config support — mcp_servers array in ~/.config/powerglide/config.json

Fixes

Stdin API fix for Zig 0.15.2 (posix.read byte-by-byte pattern)
Favicon 404 on GitHub Pages resolved
Homepage title deduplication

138/138 tests passing.

Assets 2

Releases: bkataru/powerglide

v0.3.2 — 9B 17/17 Measured, Security Fixes, igllama v0.3.11

Highlights

9B Achieves 17/17 at All Quantizations

Security and Correctness Fixes

igllama v0.3.11

Verified

Uh oh!

v0.3.1 — Context Sensitivity Harness + Trial Filter + 4B T01-T17 Measured

What's New

Context Length Sensitivity Harness

Trial Quantization Filter

4B-Q4 T01-T17 Fully Measured

Verified

Uh oh!

v0.3.0 — 4B Quant Curve, Throughput Benchmark, Extended Task Suite

What's New

New: Throughput Benchmark (examples/bench.zig)

Extended: 4B Full Quant Curve

Extended: T01–T17 in trial_quant.zig

igllama v0.3.10 Upstream Fix

Upgrade Notes

Stats

Uh oh!

v0.2.7 — BF16 precision trials, igllama grammar fix, LICENSE

What's new in v0.2.7

Added

Fixed

Changed

Build

Uh oh!

v0.2.2 — Session summary, igllama port scan, json_mode, Showcase

What's new in v0.2.2

Features

New Showcase page

Bug fix

Stats

Uh oh!

v0.2.1 — 195 tests, bug fixes

v0.2.1

Test Coverage Expansion

Bug Fixes (uncovered by expanded coverage)

Uh oh!

v0.2.0 — MCP Integration

What's New in v0.2.0

MCP Integration

Fixes

Uh oh!

New: Throughput Benchmark (`examples/bench.zig`)

Extended: T01–T17 in `trial_quant.zig`