Overview
How it works
brancher-mini reads your staged Git diff, feeds it to a local language model, and writes a Conventional Commit message for you — all without sending anything to the internet.
Stage your changes with git add, or pass --all to let brancher-mini stage everything for you.
Lock files and generated/build artifacts (source maps, minified assets,
dist/ output) are detected automatically and committed with a
static message — no LLM call needed for them.
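The detection step described above could be sketched as a simple path classifier. This is a minimal illustration, not brancher-mini's actual rule set: the function names `classifyFile` and `staticMessage`, the exact patterns, and the wording of the static messages are all assumptions.

```typescript
// Hypothetical classifier: patterns and messages are illustrative guesses,
// not brancher-mini's real detection rules.
const LOCK_FILES = new Set(["package-lock.json", "yarn.lock"]);

type FileKind = "lock" | "generated" | "source";

function classifyFile(path: string): FileKind {
  const normalized = path.replace(/\\/g, "/");
  const name = normalized.split("/").pop() ?? normalized;
  if (LOCK_FILES.has(name)) return "lock";
  const inDist = normalized.startsWith("dist/") || normalized.includes("/dist/");
  if (inDist || name.endsWith(".map") || /\.min\.(js|css)$/.test(name)) {
    return "generated";
  }
  return "source";
}

// Returns a static message for files that skip the LLM, or null for
// files that should go through normal generation.
function staticMessage(path: string): string | null {
  const kind = classifyFile(path);
  if (kind === "lock") return "chore: update lock file";
  if (kind === "generated") return "chore: update generated files";
  return null;
}
```

Only files for which `staticMessage` returns null would need an LLM call in this sketch.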
Core concept
Two generation modes
Both modes analyse each file individually first, then diverge in how they group and present the results.
Compact
brancher-mini
Analyses each file individually, then squashes all per-file summaries into a single commit message covering all staged changes.
Best for: focused, single-purpose changes — a bug fix, a feature addition, a refactor.
Extended
--mode extended
Analyses each file individually, groups related files by semantic similarity, then generates one commit message per group and commits them separately.
Best for: large changesets that touch multiple concerns at once — e.g. a feature plus a docs update plus a dependency bump.
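The text does not say how extended mode measures semantic similarity. One plausible approach, sketched here purely as an assumption, is to compare an embedding vector per file summary using cosine similarity and group greedily. The `FileSummary` shape, the `groupBySimilarity` name, and the 0.8 threshold are all hypothetical.

```typescript
// Assumed input: one embedding vector per file summary (all the same length).
type FileSummary = { path: string; embedding: number[] };

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Greedy grouping: a file joins the first group whose first member is
// similar enough; otherwise it starts a new group.
function groupBySimilarity(files: FileSummary[], threshold = 0.8): string[][] {
  const groups: FileSummary[][] = [];
  for (const file of files) {
    const home = groups.find(
      (group) => cosine(group[0].embedding, file.embedding) >= threshold,
    );
    if (home) home.push(file);
    else groups.push([file]);
  }
  return groups.map((group) => group.map((f) => f.path));
}
```

Each resulting group would then get its own commit message and its own commit.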
How large diffs are handled
Each file's diff goes through a three-tier pipeline before reaching the LLM:
- Fits the context budget: one LLM call (the normal path).
- Exceeds budget, multiple hunks: split on @@ boundaries into batches, each summarised separately, then merged.
- Single oversized hunk (minified line, binary blob): no LLM call; falls back to chore: update <filename>.
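The three tiers above can be sketched as a planning function. This is a minimal sketch under stated assumptions: the budget is measured in characters as a stand-in for tokens, a multi-hunk diff with any single hunk over budget is also treated as the fallback case, and the names `splitHunks` and `planDiff` are hypothetical.

```typescript
type Plan =
  | { tier: "single" | "batched"; batches: string[] }
  | { tier: "fallback"; message: string };

// Split a unified diff into hunks on "@@" lines. File header lines that
// precede the first hunk are dropped.
function splitHunks(diff: string): string[] {
  const hunks: string[] = [];
  for (const line of diff.split("\n")) {
    if (line.startsWith("@@")) hunks.push(line);
    else if (hunks.length > 0) hunks[hunks.length - 1] += "\n" + line;
  }
  return hunks;
}

function planDiff(filename: string, diff: string, budget: number): Plan {
  // Tier 1: the whole diff fits, so one LLM call is enough.
  if (diff.length <= budget) return { tier: "single", batches: [diff] };

  const hunks = splitHunks(diff);
  // Tier 3: nothing to split on, or a hunk that cannot fit on its own.
  if (hunks.length < 2 || hunks.some((h) => h.length > budget)) {
    return { tier: "fallback", message: `chore: update ${filename}` };
  }

  // Tier 2: pack hunks greedily into batches that stay within the budget.
  const batches: string[] = [];
  let current = "";
  for (const hunk of hunks) {
    if (current !== "" && current.length + 1 + hunk.length > budget) {
      batches.push(current);
      current = hunk;
    } else {
      current = current === "" ? hunk : current + "\n" + hunk;
    }
  }
  if (current !== "") batches.push(current);
  return { tier: "batched", batches };
}
```

In the batched tier, each batch would be summarised by its own LLM call and the summaries merged afterwards.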
Privacy & performance
Local-first design
Every part of the inference pipeline runs on your machine. Your code, diffs, and commit messages are never sent anywhere.
Resumable model downloads
Models are downloaded from Hugging Face on first use and cached in
~/.brancher-mini/models/.
An interrupted download picks up where it left off. GGUF magic-byte
verification catches corrupt downloads before they reach the model loader.
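A magic-byte check of the kind described above is small enough to show in full. GGUF files begin with the four ASCII bytes "GGUF"; the function names here are hypothetical. The text does not say how resumption is implemented, so the `Range` header helper is an assumption (it only works when the server honours HTTP Range requests).

```typescript
// GGUF files start with the ASCII bytes "G", "G", "U", "F".
const GGUF_MAGIC = [0x47, 0x47, 0x55, 0x46];

// Check the first bytes of a downloaded file before handing it to the loader.
function hasGgufMagic(head: Uint8Array): boolean {
  return head.length >= 4 && GGUF_MAGIC.every((byte, i) => head[i] === byte);
}

// Assumed resume strategy: ask the server for the bytes we do not have yet.
function resumeHeaders(bytesOnDisk: number): Record<string, string> {
  return bytesOnDisk > 0 ? { Range: `bytes=${bytesOnDisk}-` } : {};
}
```

A file that fails `hasGgufMagic` (for example, an HTML error page saved under a .gguf name) would be rejected and downloaded again.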
Load-on-exec mode
Loads the model fresh on each run, generates, then unloads. Ideal if you commit infrequently or want zero background processes.
Daemon mode
Keeps the model loaded in a background process so every subsequent
commit is instant — no reload wait. The daemon starts automatically
and is managed via brancher-mini daemon.
GPU with CPU fallback
Inference runs with all layers on the GPU by default. If the inference process exits within 8 seconds of starting, which usually indicates an out-of-memory (OOM) failure, it automatically restarts in CPU-only mode.
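The fallback rule can be expressed as a small decision function. This is a sketch, not the actual supervisor code: `serverArgs` and `nextDevice` are hypothetical names, and the layer count 999 is just a conventional "offload everything" value. `--model` and `--n-gpu-layers` are llama.cpp's own flags.

```typescript
const EARLY_EXIT_MS = 8000; // the 8-second window from the text above

type Device = "gpu" | "cpu";

// Build llama-server arguments: a large layer count offloads every layer
// to the GPU, and 0 keeps all layers on the CPU.
function serverArgs(modelPath: string, device: Device): string[] {
  return ["--model", modelPath, "--n-gpu-layers", device === "gpu" ? "999" : "0"];
}

// Decide what to do after the server process exits.
// Returns the device to restart on, or null for no automatic restart.
function nextDevice(current: Device, uptimeMs: number): Device | null {
  if (current === "gpu" && uptimeMs <= EARLY_EXIT_MS) return "cpu";
  return null;
}
```

An exit after the window, or any exit while already on the CPU, is not treated as a VRAM problem in this sketch.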
Capabilities
Features
- Conventional Commits: always generates properly typed messages (feat:, fix:, chore:, refactor: and more).
- Two generation modes: compact (single message) and extended (per-group commits for multi-concern changesets).
- Interactive workflow: accept, edit inline, regenerate, or exit without committing. Edit uses your existing message as the default.
- Staging area safety: your staging area is always restored to its original state on cancel, ignore, or crash.
- Lock file & generated file detection: package-lock.json, yarn.lock, source maps, minified assets, and dist/ output get static messages, so no LLM call is wasted.
- Large diff handling: oversized diffs are chunked by hunk and summarised in batches; single-hunk monster files fall back gracefully.
- Background daemon mode: keeps the model warm between commits for instant generation on subsequent runs.
- GPU inference with automatic CPU fallback: uses all available GPU layers; restarts in CPU mode if VRAM is insufficient.
- Debug logging: optionally captures every LLM prompt, output, and crash to dated log files in ~/.brancher-mini/logs/.
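For readers unfamiliar with the Conventional Commits header shape (type, optional scope, optional "!" for breaking changes, then a subject), here is a small parser sketch. The feature list above only names feat, fix, chore, and refactor; the other types in the pattern are common community conventions, included as an assumption. `parseHeader` is a hypothetical helper, not part of brancher-mini.

```typescript
// type(scope)!: subject, where the scope and the "!" are optional.
const HEADER =
  /^(feat|fix|chore|refactor|docs|test|perf|style|build|ci)(\([^)]+\))?(!)?: (.+)$/;

type ParsedHeader = {
  type: string;
  scope: string | null;
  breaking: boolean;
  subject: string;
};

function parseHeader(line: string): ParsedHeader | null {
  const match = HEADER.exec(line);
  if (!match) return null;
  return {
    type: match[1],
    scope: match[2] ? match[2].slice(1, -1) : null, // strip the parentheses
    breaking: match[3] === "!",
    subject: match[4],
  };
}
```

A check like this could be used to reject or regenerate a model output whose first line is not a well-formed header.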
Reference
Usage
Generate a commit message
    brancher-mini                   # analyse staged changes
    brancher-mini --all             # stage everything, then generate
    brancher-mini --mode extended   # one message per logical change group
    brancher-mini --message-only    # print message to stdout, skip prompts
    brancher-mini --single-line     # title only, no body
Configuration
Run brancher-mini config to open the interactive setup wizard.
Settings are saved to ~/.brancher-minirc.json.
| Option | Values | Default |
|---|---|---|
| model | gemma-4:e4b (Gemma 3 4B) · qwen2.5-coder:3b (Qwen2.5-Coder 3B) | Set at first run |
| gitBehavior | staged-only · stage-all | Set at first run |
| executionMode | load-on-exec · always-loaded | Set at first run |
| telemetry | true · false | true (stable releases only) |
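The options in the table map naturally onto a typed settings object. The sketch below assumes that ~/.brancher-minirc.json is a flat JSON object keyed by the option names above; the actual file layout is not documented here, and `parseConfig` is a hypothetical helper.

```typescript
type Config = {
  model: "gemma-4:e4b" | "qwen2.5-coder:3b";
  gitBehavior: "staged-only" | "stage-all";
  executionMode: "load-on-exec" | "always-loaded";
  telemetry: boolean;
};

// Allowed values, copied from the options table.
const ALLOWED = {
  model: ["gemma-4:e4b", "qwen2.5-coder:3b"],
  gitBehavior: ["staged-only", "stage-all"],
  executionMode: ["load-on-exec", "always-loaded"],
} as const;

// Parse and validate the settings file contents (assumed flat JSON).
function parseConfig(json: string): Config {
  const raw = JSON.parse(json) as Record<string, unknown>;
  for (const key of ["model", "gitBehavior", "executionMode"] as const) {
    const allowed: readonly string[] = ALLOWED[key];
    const value = raw[key];
    if (typeof value !== "string" || !allowed.includes(value)) {
      throw new Error(`Invalid value for ${key}`);
    }
  }
  if (typeof raw["telemetry"] !== "boolean") {
    throw new Error("Invalid value for telemetry");
  }
  return raw as unknown as Config;
}
```

Under this assumption, a file containing `{"model": "qwen2.5-coder:3b", "gitBehavior": "staged-only", "executionMode": "always-loaded", "telemetry": false}` would select the Qwen model, leave staging to you, and keep the daemon running.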
Daemon
Daemon mode keeps the model loaded between commits for instant generation.
It starts automatically when executionMode is set to
always-loaded.
    brancher-mini daemon status   # check if daemon is running
    brancher-mini daemon stop     # stop the daemon
Debug logging
When enabled, every commit run appends a structured log to
~/.brancher-mini/logs/YYYY-MM-DD.log
containing the full LLM prompt, output, and any crash details.
Crash logs are always written regardless of the debug flag.
    brancher-mini debug enable    # turn on verbose logging
    brancher-mini debug disable   # turn off verbose logging
    brancher-mini debug status    # show log file count and size
    brancher-mini debug clear     # delete all log files
Under the hood
Tech stack
brancher-mini is a single self-contained binary with no runtime dependencies for end users.
Runtime
Bun + TypeScript — compiled to a self-contained native binary (~60–80 MB). No Node or Bun required on the end-user's machine.
LLM inference
llama-server (llama.cpp's built-in HTTP server) — exposes an OpenAI-compatible API on localhost. Bundled in the release tarball.
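Because llama-server speaks the OpenAI chat-completions format, a client only needs to POST a JSON body to /v1/chat/completions. The sketch below is an illustration, not brancher-mini's actual client: the prompt text, sampling parameters, and port (8080 is llama-server's default, but the port brancher-mini uses is not stated here) are all assumptions.

```typescript
type ChatMessage = { role: "system" | "user"; content: string };
type ChatRequest = { messages: ChatMessage[]; temperature: number; max_tokens: number };

// Hypothetical prompt and sampling settings.
function buildChatRequest(diff: string): ChatRequest {
  return {
    messages: [
      { role: "system", content: "Write a Conventional Commit message for this diff." },
      { role: "user", content: diff },
    ],
    temperature: 0.2,
    max_tokens: 200,
  };
}

// Assumed local endpoint (llama-server's default port).
const ENDPOINT = "http://127.0.0.1:8080/v1/chat/completions";

// Send the request to the local server and return the generated text.
async function generateMessage(diff: string): Promise<string> {
  const response = await fetch(ENDPOINT, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildChatRequest(diff)),
  });
  if (!response.ok) throw new Error(`llama-server returned ${response.status}`);
  const data = (await response.json()) as {
    choices: { message: { content: string } }[];
  };
  return data.choices[0].message.content.trim();
}
```

The request never leaves localhost, which is what makes the local-first guarantee possible.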
Models
GGUF Q4_K_M quantised models from Hugging Face. Downloaded on first use, cached locally. Verified with GGUF magic-byte check after download.
CLI framework
Commander.js for subcommands and flags. Inquirer (@inquirer/select, @inquirer/input) for interactive prompts.
Distribution
Shipped as a .tar.gz and installed by install.sh (Linux/macOS) or install.ps1 (Windows). No package manager, no runtime, no PATH gymnastics beyond a single symlink.