Kyu's Blog

Notes on local AI, agents, engineering, and experiments.

2026-05-11

Running Qwen3.6 27B dense locally on M4 Max: which path works?

Benchmarks of every practical local path for Qwen3.6 27B dense: oMLX, OptiQ, Ollama NVFP4/MXFP4, KV quantization, and long-context behavior, compared in Low vs High Power mode.

2026-05-10

Local LLM inference on an M4 Max 128GB

Benchmarking local LLM runtimes in Low vs High Power mode. oMLX, Ollama, ds4, Gemma 4, and DeepSeek V4 Flash compared.