# Changelog
## v0.7.1 — 2026-04-20
### New env knob: `DOER_LOAD_TOOLS_FROM_DIR`
- Controls whether `Agent(load_tools_from_directory=...)` is `True` (the default, which hot-reloads `./tools/*.py`) or `False`.
- Fixes a threading race in `hf_jobs/gen_dataset.py` — Strands' hot-reload watcher isn't thread-safe at construction.
- Default preserves the old behaviour — `1` (on) unless explicitly set to a falsy value.
- Accepted falsy values: `0`, `false`, `no`, `off`, `""` (case-insensitive). Everything else is truthy.
- Use case 1: bulk/concurrent training-script runs — `DOER_LOAD_TOOLS_FROM_DIR=0 doer ...`
- Use case 2: tiny HF Job containers where `./tools/` doesn't exist and file watching costs RAM.
- Use case 3: ephemeral/sandboxed agents that should not auto-load user tools.
- `gen_dataset.py` now sets `DOER_LOAD_TOOLS_FROM_DIR=0` by default, so dataset generation is thread-safe out of the box.
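A minimal sketch of the truthiness check described above (the helper name is hypothetical; only the env-var semantics come from this entry):

```python
import os

FALSY = {"0", "false", "no", "off", ""}

def load_tools_from_dir_enabled() -> bool:
    # Hypothetical helper: defaults to on; only the documented falsy
    # strings (case-insensitive) disable ./tools/*.py hot-reload.
    value = os.environ.get("DOER_LOAD_TOOLS_FROM_DIR", "1")
    return value.strip().lower() not in FALSY
```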
## v0.7.0 — 2026-04-20
### Cloud dataset generation (HF Jobs)
- New: `doer --hf-jobs gen` — generates dense training records on HF infrastructure and auto-appends to `cagataydev/doer-training`.
- New: `doer/hf_jobs/gen_dataset.py` — a single UV script, ~290 lines, thread-safe, dedupe-by-sha256.
- New: `doer/hf_jobs/prompts.example.txt` — 59 seed prompts covering pipe/shell/coding/self-awareness/agentic/meta/style patterns.
- Input flexibility: a local file, `hf://user/ds[:column]`, or stdin (`-`).
- Provider matrix: Bedrock (default), Ollama, Anthropic, OpenAI — the launcher auto-wires the right secret.
- Provenance: every generated record carries `generated_by = "doer --hf-jobs gen @ <job_id>"`, so filtering synthetic vs. human records is trivial.
- Idempotent: rerun the same prompts and nothing happens (sha256 dedupe against the existing dataset).
- Cost: ~$0.60 per 500 records (Bedrock Opus 4.7 @ concurrency=8).
- The full loop is now cloud-native: generate → train → deploy, zero laptop involvement.
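The dedupe idea can be sketched roughly as follows (`record_key` and `dedupe` are hypothetical names; the entry only specifies sha256 hashing against the existing dataset):

```python
import hashlib
import json

def record_key(record: dict) -> str:
    # Hash the canonical JSON form so key order doesn't matter.
    blob = json.dumps(record, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()

def dedupe(generated: list[dict], existing: list[dict]) -> list[dict]:
    # Keep only records whose hash is absent from the dataset,
    # so rerunning the same prompts appends nothing.
    seen = {record_key(r) for r in existing}
    return [r for r in generated if record_key(r) not in seen]
```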
### Packaging
- `doer.hf_jobs` package-data now includes `*.txt`, so `prompts.example.txt` ships in the wheel.
- Version bump: `0.6.1 → 0.7.0` (minor — new feature, fully backward-compatible).
### Migration
No breaking changes. The `gen` subcommand is additive; existing `text` / `vlm` / `omni` / `ps` / `logs` / `hw` workflows are unchanged.
## v0.6.1 — 2026-04-20
### Packaging polish
- `hf_jobs/` now ships inside the wheel — no more `git clone` required to use HF Jobs training.
- New `doer --hf-jobs` CLI shortcut — dispatches to the bundled `launch.sh`:
    - `doer --hf-jobs` — prints the bundled directory path.
    - `doer --hf-jobs text|vlm|omni` — launches cloud training (was `./hf_jobs/launch.sh ...`).
    - `doer --hf-jobs ps|logs|hw` — manages running jobs / shows hardware pricing.
- Wheel size: 16 KB → 50 KB (still tiny; adds 5 bundled files).
- Fixes pipx/conda install conflicts — a clean install now works across shells.
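A sketch of how a wheel-safe dispatch to the bundled `launch.sh` might look, using `importlib.resources` (the function and its exact behaviour are assumptions; only the subcommands and the print-the-path behaviour come from this entry):

```python
import subprocess
from importlib.resources import files

def hf_jobs(argv: list[str], package: str = "doer.hf_jobs") -> int:
    # Resolve the package directory inside the installed wheel;
    # this works for pip, pipx, and conda installs alike, no clone.
    pkg_dir = files(package)
    if not argv:
        print(pkg_dir)  # bare `doer --hf-jobs` prints the bundled path
        return 0
    # text|vlm|omni|ps|logs|hw are forwarded to the bundled script.
    return subprocess.call(["bash", str(pkg_dir / "launch.sh"), *argv])
```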
### Migration
- Old: `./hf_jobs/launch.sh text`
- New: `doer --hf-jobs text` (the old form still works from a git clone)
## Compressed history

Newest first.
## v0.6.0 — cloud training (HuggingFace Jobs)
- `hf_jobs/` suite — burn HF credits instead of battery for scale-up training.
- `hf_jobs/train_text_lora.py` — any causal LM → LoRA → merged push (Qwen3-1.7B default).
- `hf_jobs/train_vlm.py` — Qwen2.5-VL-3B image+text LoRA.
- `hf_jobs/train_omni.py` — Qwen2.5-Omni-7B text+audio+image LoRA.
- `hf_jobs/launch.sh` — one-shot dispatcher (text/vlm/omni/ps/logs/hw).
- One file per trainer with inline UV deps — no repo setup, no Dockerfile; `hf jobs uv run` handles everything.
- Raw JSONL loading via `hf_hub_download` — bypasses Arrow schema churn on heterogeneous multimodal records.
- Merge + push by default — the output is a drop-in for `transformers.AutoModelForCausalLM.from_pretrained`, with no `peft` glue on the consumer side.
- Tool calls preserved — Strands `toolUse`/`toolResult` become native `<tool_call>`/`<tool_result>` tags, so the chat template lays down real tool-call tokens.
- Validated end-to-end: T4-medium, 522 records → 468/53, 50 steps / 33 min, eval_loss 0.149, token accuracy 97.6%, 3.44 GB merged model auto-pushed.
- Local `--train`/`--train-vlm` unchanged — cloud is opt-in; laptop-first stays the default.
- New docs: `train.md#train-in-the-cloud-huggingface-jobs`
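The tool-call preservation step might look roughly like this (field names follow the Strands `toolUse`/`toolResult` content-block shape; both converters are hypothetical sketches):

```python
import json

def tool_use_to_tag(block: dict) -> str:
    # A Strands toolUse block carries a tool name plus input; emitting
    # it as a <tool_call> span lets the chat template tokenize real
    # tool-call tokens instead of look-alike prose.
    payload = {"name": block["name"], "arguments": block["input"]}
    return "<tool_call>\n" + json.dumps(payload) + "\n</tool_call>"

def tool_result_to_tag(block: dict) -> str:
    # Symmetric treatment for toolResult blocks.
    return "<tool_result>\n" + json.dumps(block["content"]) + "\n</tool_result>"
```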
## v0.5.0 — multimodal + dataset publishing
- Multimodal input — `--img`, `--audio`, `--video` flags route to `mlx-vlm` automatically:
    - vision-only → `Qwen2.5-VL-3B`
    - audio-only → `gemma-3n-E2B-it`
    - mixed (image + audio) → `Qwen3-Omni-30B-A3B`
- VLM LoRA training — `do --train-vlm [iters]` trains on image/audio/video records.
- HuggingFace upload — `do --upload-hf` / `do --upload-hf-public` publishes the corpus as an HF dataset (private by default). Idempotent, one atomic commit, reuses `huggingface-cli login`.
- `--train-status` refreshed — shows sha256, modality breakdown (text/image/audio/video), HF sync state.
- Structural refactor — same CLI surface, clearer internals (PR #5). ~730 lines total.
- New env knobs: `DOER_MLX_VLM_MODEL`, `DOER_MLX_AUDIO_MODEL`, `DOER_MLX_OMNI_MODEL`, `DOER_VLM_ADAPTER`, `DOER_HF_REPO`, `DOER_CACHE_PROMPT`, `DOER_BEDROCK_GUARDRAIL_ID`/`VERSION`, `DOER_ADDITIONAL_REQUEST_FIELDS`
- New opt-in extras: `[vlm]` (mlx-vlm + datasets), `[hf]` (huggingface-hub), `[all]`
## v0.4.0 — closed the loop
- Self-training — every `do "..."` call appends a dense, self-contained record to `~/.doer_training.jsonl` (full system prompt + messages + tool specs).
- In-process LoRA via `do --train [iters]` — calls `mlx_lm.tuner` directly, no `strands-mlx` trainer indirection (~50 lines).
- Native tool-call tokens — `_strands_to_openai()` preserves `tool_calls` as structured data, so tokenizer chat templates emit real `<tool_call>` tokens (Qwen/Llama), not string mimicry.
- MLX provider — `DOER_PROVIDER=mlx` for Apple Silicon on-device inference with LoRA hot-swap via `DOER_ADAPTER`.
- Corpus inspector — `do --train-status` shows turn count, KB, path.
- Auto-detect extended — provider order is now `bedrock → mlx (Apple Silicon) → ollama`.
- New env knobs: `DOER_MLX_MODEL`, `DOER_ADAPTER`, `DOER_DEBUG`
- Opt-in extra: `pip install 'doer-cli[mlx]'` pulls `strands-mlx` + `mlx-lm` (~500 MB) — the default install stays lean.
- ~420 LOC (up from 221) at the time — one file, one default dep.
- New docs: Train on yourself
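Appending a dense, self-contained record is essentially a one-line JSONL write. A sketch, where the exact record schema is an assumption (the entry specifies only that the full system prompt, messages, and tool specs are captured):

```python
import json
from pathlib import Path

def append_record(system: str, messages: list, tools: list, path: Path) -> None:
    # One self-contained line per call: nothing external is needed
    # to replay this record at training time.
    record = {"system": system, "messages": messages, "tools": tools}
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
```

In doer itself the path is `~/.doer_training.jsonl`.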
## v0.3.0 — frontier by default
- Default model: `global.anthropic.claude-opus-4-7` on Bedrock (was Ollama-only).
- Auto-detect provider — Bedrock if AWS creds exist, else Ollama fallback.
- 1M context window auto-enabled via the `context-1m-2025-08-07` beta header.
- 128k max output (Opus 4.7's native cap; raise via `DOER_MAX_TOKENS`).
- Opt-in `temperature`/`top_p` — Opus 4.7+ rejects non-default sampling, so doer skips them unless explicitly set.
- New env knobs: `DOER_PROVIDER`, `DOER_BEDROCK_MODEL`, `DOER_BEDROCK_REGION`, `DOER_ANTHROPIC_BETA`, `DOER_ADDITIONAL_REQUEST_FIELDS`
- 221 LOC (up from 164) — still one file, still one dep.
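The auto-detect order might be implemented roughly like this (the specific credential checks are assumptions; the override knob and the Bedrock-then-Ollama order are from this entry):

```python
import os
from pathlib import Path

def detect_provider() -> str:
    # DOER_PROVIDER wins when set; otherwise pick Bedrock if AWS
    # credentials are visible, else fall back to local Ollama.
    explicit = os.environ.get("DOER_PROVIDER")
    if explicit:
        return explicit
    has_aws = bool(os.environ.get("AWS_ACCESS_KEY_ID")) or \
        (Path.home() / ".aws" / "credentials").exists()
    return "bedrock" if has_aws else "ollama"
```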
## v0.2.1 — curl or pipx
- `do` shortcut alongside `doer` (less typing).
- One-line installer (`curl | sh`) planned via GitHub Releases.
- Renamed to `doer-cli` on PyPI (`doer` was squatted).
- Repo moved to `github.com/cagataycali/doer-cli`.
- Docs migrated to mkdocs-material (mobile-first, proper nav, dark/light, cookbook).
## v0.2.0 — new brand
- Bold, solid, pipe-first identity (orange `#FF3D00` + black + paper).
- Custom SVG logo.
- Clean README, stripped marketing copy.
- Auto-inject `SOUL.md` + `AGENTS.md` into the system prompt.
## v0.1.x — the primordial soup
- 164 LOC — fits on one screen (barely).
- Only dep: `strands-agents`.
- Ollama-only (local, private, no keys).
- Injects its own source, `$HOME/.bash_history`, `$HOME/.zsh_history`, `~/.doer_history`.
- Hot-reloads tools from `./tools/*.py`.
- PyInstaller + Nuitka standalone binaries (linux/macos).
- Rename: `tiny` → `doer` (a better verb).
## pre-history
- Spawned from DevDuck — 60+ tools, every protocol
- DevDuck asked itself at 4am: what if we deleted almost everything?
- Two hours later: `doer`.
the cathedral teaches you which stones are load-bearing.