
Usage

do and doer are the same program. Use whichever your fingers prefer.

one-shot

do "find files larger than 100MB"
do "current git branch?"
do "what's my public IP?"

Direct question → direct answer. No > prompts. No conversation.

piped

When you pipe to do, stdin becomes the context. Your query becomes the instruction.

cat error.log | do "what broke"
git diff      | do "review this"
git log -20   | do "release notes"
curl -s api.io | do "summarize"
echo '{"a":1}' | do "to yaml"

piped = no markdown

doer detects TTY. When its output goes to a pipe or file, it strips markdown.
Just the answer. Clean. Parseable.
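The same check is easy to replicate in your own scripts. A minimal sketch of TTY-aware output — an illustration of the technique, not doer's actual code:

```shell
#!/bin/sh
# Print markdown on a terminal, plain text when piped or redirected.
emit() {
  if [ -t 1 ]; then
    printf '**23 files**\n'   # stdout is a TTY: keep formatting
  else
    printf '23 files\n'       # stdout is a pipe/file: strip it
  fi
}
emit         # formatted when run interactively
emit | cat   # inside a pipe: plain "23 files"
```

`[ -t 1 ]` asks whether file descriptor 1 (stdout) is attached to a terminal; that single test is all the "detection" there is.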

multimodal

Pass --img, --audio, or --video to route through a VLM. Auto-selects the right model based on modality mix. Requires pip install 'doer-cli[vlm]' on Apple Silicon.

do --img screenshot.png  "what's in this UI?"
do --audio call.wav      "transcribe + bullet action items"
do --video clip.mp4      "describe what's happening"

# multiple images
do --img a.png --img b.png "spot the differences"

# mixed modality → omni model
do --img slide.png --audio speaker.wav "summarize this presentation"

Flags can appear anywhere in the command. The agent sees the raw bytes along with your query + stdin + all the usual prompt context.

chained

Because output is clean, chaining with |, tee, xargs, awk, jq just works.

# top 5 big files → long-format listing
do "list top 5 largest files in cwd" | xargs -n1 ls -lh

# fetch → filter → jq
curl -s api.com/users | do "filter admins only" | jq .

# branch graph → plain english
git log --graph --oneline -30 | do "explain this branch topology"

# extract → email
tail -200 sales.csv | do "top 3 customers by revenue" | mail -s "weekly" boss@co

env knobs

No config file. Every knob is an env var. Put them in your .zshrc/.bashrc or inline.
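For example, a persistent setup in your `~/.zshrc` (values here are illustrative, not recommendations):

```shell
# ~/.zshrc — pin a local, offline setup
export DOER_PROVIDER=ollama
export DOER_MODEL=qwen3:4b
export DOER_HISTORY=20        # inject more Q/A pairs into the prompt
```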

| var | default | purpose |
|---|---|---|
| `DOER_PROVIDER` | (auto: mlx-vlm → bedrock → mlx → ollama) | `bedrock` \| `mlx` \| `mlx-vlm` \| `ollama` |
| `DOER_HISTORY` | 10 | Q/A pairs injected into prompt |
| `DOER_SHELL_HISTORY` | 20 | shell history lines in prompt |

Bedrock

| var | default | purpose |
|---|---|---|
| `DOER_BEDROCK_MODEL` | `global.anthropic.claude-opus-4-7` | bedrock model id or inference profile |
| `DOER_BEDROCK_REGION` | `us-west-2` | bedrock region |
| `DOER_ANTHROPIC_BETA` | `context-1m-2025-08-07` (Claude only) | comma-sep anthropic_beta headers |
| `DOER_MAX_TOKENS` | 128000 (Opus 4.7 native max) | cap output length |
| `DOER_TEMPERATURE` | (unset — Opus 4.7 rejects non-default) | sampling temperature |
| `DOER_TOP_P` | (unset — Opus 4.7 rejects non-default) | nucleus sampling |
| `DOER_CACHE_PROMPT` | (unset) | enable prompt caching |
| `AWS_BEARER_TOKEN_BEDROCK` | (preferred) | bedrock bearer token |

Ollama

| var | default | purpose |
|---|---|---|
| `DOER_MODEL` | `qwen3:1.7b` | any model Ollama can run |
| `OLLAMA_HOST` | `http://localhost:11434` | point at a remote Ollama |

MLX (Apple Silicon, opt-in)

| var | default | purpose |
|---|---|---|
| `DOER_MLX_MODEL` | `mlx-community/Qwen3-1.7B-4bit` | base MLX model |
| `DOER_ADAPTER` | (unset) | path to LoRA adapter (hot-swap) |

MLX-VLM (multimodal, opt-in)

| var | default | purpose |
|---|---|---|
| `DOER_MLX_VLM_MODEL` | `mlx-community/Qwen2.5-VL-3B-Instruct-4bit` | vision model (`--img`) |
| `DOER_MLX_AUDIO_MODEL` | `mlx-community/gemma-3n-E2B-it-4bit` | audio model (`--audio`) |
| `DOER_MLX_OMNI_MODEL` | `mlx-community/Qwen3-Omni-30B-A3B-Instruct-4bit` | omni (image + audio) |
| `DOER_VLM_ADAPTER` | (unset) | path to VLM LoRA adapter |

HuggingFace upload (opt-in)

| var | default | purpose |
|---|---|---|
| `DOER_HF_REPO` | `<user>/doer-training` | target dataset repo for `--upload-hf` |
| `HF_TOKEN` | (fallback: `~/.cache/huggingface/token`) | HF auth — or `huggingface-cli login` |

# default: Claude Opus 4.7 on Bedrock (1M context, 128k output)
do "review this pull request" < diff.patch

# force ollama for offline/private work
DOER_PROVIDER=ollama DOER_MODEL=qwen3:4b do "rewrite idiomatic" < utils.py

# pin a specific Bedrock model (Sonnet 4 for cost savings)
DOER_BEDROCK_MODEL=us.anthropic.claude-sonnet-4-20250514-v1:0 do "summarize"

# disable the 1M context beta (saves a header, ≤200k ctx)
DOER_ANTHROPIC_BETA="" do "short one"

# MLX with your trained adapter (see: Train on yourself)
DOER_PROVIDER=mlx DOER_ADAPTER=~/.doer_adapter do "terse answer"

# remote ollama (on your beefy box)
OLLAMA_HOST=http://gpu-box:11434 DOER_PROVIDER=ollama do "explain this codebase"

Opus 4.7 breaking changes

  • temperature / top_p are not sent by default — Opus 4.7 returns 400 on any non-default value.
  • output-300k-2026-03-24 (seen in SDKs) is not yet accepted by Bedrock. Opus 4.7's real cap is 128k; doer defaults to that.
  • Both issues go away for older models (Sonnet 4, Opus 4.6) — just pin DOER_BEDROCK_MODEL.

train on yourself

do --train-status         # show corpus stats + HF sync state
do --train                # 200 text LoRA iters (needs `[mlx]`)
do --train 500            # 500 iters
do --train-vlm 300        # vision LoRA on image/audio/video records (needs `[vlm]`)
do --upload-hf            # push corpus to <user>/doer-training (private)
do --upload-hf-public     # push corpus to <user>/doer-training (public)

Every do "..." call appends a full training record automatically. See Train on yourself → for the collect → train → hot-swap loop.

when to use which binary

| situation | prefer |
|---|---|
| inline one-liner, tight pipe | `do` |
| script where clarity wins | `doer` |
| aliasing to something shorter | `do` |

Both resolve to the same Python. Zero difference in behavior.
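So aliasing is safe either way; e.g. a hypothetical shortcut in your shell rc:

```shell
# hypothetical one-letter alias — both names hit the same entry point
alias q='do'
```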

exit codes

| code | meaning |
|---|---|
| 0 | clean run |
| 1 | uncaught error (usually Ollama down) |
| 130 | Ctrl-C |

Use in scripts:

if ! do "is there a TODO in $FILE?" < "$FILE" | grep -qi yes; then
    echo "clean" && exit 0
fi
echo "has todos" && exit 1
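A generic pattern for branching on all three codes; `ask` below is a stand-in so the sketch runs without doer installed — in real use, replace it with an actual `do "..."` call:

```shell
#!/bin/sh
# Stand-in that exits with whatever code you pass it.
ask() { return "$1"; }

handle() {
  ask "$1"
  case $? in
    0)   echo "clean run" ;;
    130) echo "interrupted (Ctrl-C)" ;;
    *)   echo "uncaught error (check provider)" ;;
  esac
}
handle 0     # → clean run
handle 1     # → uncaught error (check provider)
handle 130   # → interrupted (Ctrl-C)
```

Note that `$?` must be read immediately after the command; any intervening command overwrites it.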

interactive mode

Run do with no args, no pipe → interactive REPL.

$ do
🦆 > how many files here?
23
🦆 > which is the biggest?
video.mp4  412 MB
🦆 > ^D

Each turn writes to ~/.doer_history, which is re-injected next session. No active connection. No websocket. Just files.

Read the Cookbook → for 40+ real-world recipes.