
Same model. Same Mac. 30 vs 71 tok/s. That's why I built asiai.

🦞 I'm Jean-Marc (druide67) — I build tools for local LLM inference on Apple Silicon.

asiai: Benchmark, monitor & compare 6 inference engines (Ollama, LM Studio, mlx-lm, llama.cpp, vllm-mlx, Exo). One CLI. Real numbers.

Built because my AI agents needed to monitor their own inference. So I gave them asiai's API. They started monitoring themselves.

Bench your claw!

Recent discoveries

  • MLX is 2.3x faster than llama.cpp for MoE architectures on Apple Silicon
  • DeltaNet KV cache stays flat from 64k to 256k context (same VRAM!)
  • Same model, same Mac: 30 tok/s on one engine, 71 tok/s on another
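Engine-to-engine gaps like the 30 vs 71 tok/s above are easy to reproduce yourself. Here is a minimal sketch (not asiai's actual code, just a stand-in) that times any streaming token iterator and reports decode throughput:

```python
import time
from typing import Iterable, Tuple

def measure_tok_per_s(stream: Iterable[str]) -> Tuple[int, float]:
    """Count tokens and wall-clock time over a streaming generation.

    Works with any iterator of tokens -- e.g. the chunks an engine's
    streaming API yields. Returns (token_count, tokens_per_second).
    """
    start = time.perf_counter()
    count = sum(1 for _ in stream)          # drain the stream, counting tokens
    elapsed = time.perf_counter() - start
    return count, (count / elapsed if elapsed > 0 else 0.0)

# Usage with a dummy generator standing in for a real engine's stream:
def fake_engine(n: int, delay_s: float):
    for i in range(n):
        time.sleep(delay_s)                 # simulate per-token decode latency
        yield f"tok{i}"

tokens, tps = measure_tok_per_s(fake_engine(50, 0.001))
print(f"{tokens} tokens at {tps:.0f} tok/s")
```

Point the same harness at two engines serving the same model and the comparison falls out directly.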

claude-whisper: Your Claude Code instances can now talk to each other. 240 lines of bash, zero daemon. The filesystem is the message bus.
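The "filesystem is the message bus" idea fits in a few lines. This is a hypothetical Python stand-in (claude-whisper itself is bash): a sender writes each message to a temp file and atomically renames it into the receiver's inbox directory, so readers never observe a half-written message; receiving is just listing and deleting files. No daemon needed.

```python
import json
import os
import tempfile
import time
from pathlib import Path

def send(inbox: Path, payload: dict) -> Path:
    """Atomically drop a message into another agent's inbox directory."""
    inbox.mkdir(parents=True, exist_ok=True)
    # Write to a temp file first, then rename: rename is atomic on POSIX
    # within one filesystem, so a reader never sees a partial message.
    fd, tmp = tempfile.mkstemp(dir=inbox, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(payload, f)
    final = inbox / f"{time.time_ns()}.msg"   # timestamp name: sorts by arrival
    os.rename(tmp, final)
    return final

def receive(inbox: Path) -> list:
    """Consume all pending messages, oldest first."""
    messages = []
    for path in sorted(inbox.glob("*.msg")):
        messages.append(json.loads(path.read_text()))
        path.unlink()                          # consuming = deleting the file
    return messages

# Usage: two "agents" sharing a directory as their message bus.
inbox = Path(tempfile.mkdtemp()) / "agent-b"
send(inbox, {"from": "agent-a", "text": "benchmark done"})
send(inbox, {"from": "agent-a", "text": "71 tok/s"})
msgs = receive(inbox)
print(msgs)
```

The same pattern works from bash with `mktemp` and `mv`, which is all the atomicity a message bus needs on a single machine.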

OpenClaw: contributor — multi-agent AI assistant.

Strasbourg, France | asiai.dev | @jmn67 on X | LinkedIn
