cyber witchery lab

heartbeat: a cyclical contributor bot for your github org

a cron-driven claude code agent that opens and reviews a small slate of prs each cycle
git clone git@github.com:cyberwitchery/cwl-agents


heartbeat is a cyclical autonomous agent that contributes to a github org using claude code. it runs every few hours and, in each cycle, opens a small slate of prs (2 small, 2 medium, 1 large), then reviews them and emails a summary.

it lives in the cwl-agents repo, which is the bucket where i’m collecting the long-running claude code agents i use for the cyber witchery lab org.

why?

i have about 15 actively developed repos under cyberwitchery, most of them rust libraries that need the usual routine maintenance: dep bumps, clippy warnings, typos, missing tests, the occasional half-implemented feature someone (me) left behind. i wanted something that would chip away at that backlog steadily in the background, without me having to remember or schedule it.

the obvious alternative is “just have claude code do it interactively when i notice something”. but the work i notice isn’t the work that actually needs doing. what’s visible is the stuff blocking me, not the quiet backlog in repos i haven’t opened in weeks. heartbeat sweeps the whole org every cycle and picks its own work, which catches things i wouldn’t.

how it works

the core loop is a single script, tick.sh, that cron fires every 30 minutes:

cron (every 30min)
  └─ tick.sh
       ├─ usage check (optional, skip if burning too fast)
       ├─ heartbeat_prompt.md          orchestrator
       │    ├─ sync repos
       │    ├─ sweep for work
       │    └─ for each topic:
       │         └─ heartbeat_topic_prompt.md   worker (own session, 30min cap)
       ├─ reviewer_prompt.md           reviews the prs just opened
       └─ release_check_prompt.md      (every few days) opens "publish vX.Y.Z" issues
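
the diagram has cron firing every 30 minutes, while the post describes a full cycle every 2 to 2.5 hours with jitter. it doesn't say how the two fit together. one possible arrangement, purely an assumption here, is that tick.sh stores the time the next cycle is due and exits early until then. the state file and every name below are hypothetical:

```shell
#!/usr/bin/env bash
# sketch: let cron fire often, but only start a cycle when one is due.
# the state file and all names here are hypothetical, not taken from tick.sh.

MIN_GAP_SECS=$((2 * 60 * 60))   # shortest gap between cycles: 2h
JITTER_SECS=$((30 * 60))        # up to 30min of random jitter on top

# cycle_due STATE_FILE NOW
# succeeds if NOW (epoch seconds) has reached the due time stored in
# STATE_FILE, or if no due time has been stored yet.
cycle_due() {
  local state="$1" now="$2" due=0
  [ -f "$state" ] && due=$(cat "$state")
  [ "$now" -ge "$due" ]
}

# schedule_next STATE_FILE NOW
# stores the next due time: NOW + 2h + a random jitter of 0..30min.
schedule_next() {
  local state="$1" now="$2"
  local jitter=$((RANDOM % (JITTER_SECS + 1)))
  echo $((now + MIN_GAP_SECS + jitter)) > "$state"
}

# near the top of a tick script this could be used as:
#   now=$(date +%s)
#   cycle_due "$STATE_FILE" "$now" || exit 0
#   schedule_next "$STATE_FILE" "$now"
```

with a gate like this, most cron firings are no-ops, and the jitter keeps cycles from landing on the same wall-clock minutes every day.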

each topic (a single concrete task like “fix this clippy warning” or “implement the feature in issue #42”) runs in its own claude session with a 30-minute timeout. that isolation is deliberate: a runaway implementation on one topic can’t eat the cycle’s budget or take down the rest of the slate.
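
a minimal sketch of that isolation, assuming the cap is enforced with coreutils `timeout` (the post doesn't say how the real scripts do it). `run_topic`, `run_slate`, `TOPIC_TIMEOUT` and `SLATE_FAILED` are hypothetical names:

```shell
#!/usr/bin/env bash
# sketch: run each topic as its own process with a hard time cap.
# all names here are hypothetical, not taken from the repo.

TOPIC_TIMEOUT="${TOPIC_TIMEOUT:-30m}"

# run_topic TOPIC CMD [ARGS...]
# runs CMD under `timeout`; returns CMD's exit status, or 124 if it was
# killed for exceeding TOPIC_TIMEOUT.
run_topic() {
  local topic="$1"; shift
  local status=0
  timeout "$TOPIC_TIMEOUT" "$@" || status=$?
  if [ "$status" -eq 124 ]; then
    echo "topic '$topic': timed out after $TOPIC_TIMEOUT" >&2
  elif [ "$status" -ne 0 ]; then
    echo "topic '$topic': failed with status $status" >&2
  fi
  return "$status"
}

# run_slate CMD TOPIC...
# runs CMD once per topic, passing the topic as its only argument.
# a failed or timed-out topic is counted in SLATE_FAILED and the loop
# moves on, so one bad topic never stops the rest of the slate.
run_slate() {
  local cmd="$1"; shift
  local topic
  SLATE_FAILED=0
  for topic in "$@"; do
    run_topic "$topic" "$cmd" "$topic" || SLATE_FAILED=$((SLATE_FAILED + 1))
  done
}
```

in a real cycle, CMD would be whatever starts one worker session (for example a wrapper script around a non-interactive claude code call). that invocation isn't shown in the post, so it is left out here.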

tick.sh only schedules; it doesn’t decide what to work on. the orchestrator prompt reads the repos, picks topics, and spawns workers. prompts are plain markdown with ${VAR} placeholders that envsubst expands at runtime from config.env.
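
the placeholder expansion could look roughly like this. it's a sketch that assumes config.env is a plain file of `NAME=value` lines that a shell can source. `render_prompt` and the variable names in the note below are hypothetical:

```shell
#!/usr/bin/env bash
# sketch: expand ${VAR} placeholders in a markdown prompt from config.env.
# render_prompt is a hypothetical helper, not a function from the repo.

# render_prompt CONFIG_FILE PROMPT_FILE
# prints PROMPT_FILE with ${VAR} placeholders replaced by the values
# assigned in CONFIG_FILE.
render_prompt() {
  local config="$1" prompt="$2"
  (
    # subshell: the exported config values don't leak into the caller
    set -a          # auto-export every variable assigned while sourcing
    # shellcheck disable=SC1090
    . "$config"
    set +a
    envsubst < "$prompt"
  )
}
```

called with no arguments, envsubst replaces every `$NAME` and `${NAME}` it finds, and unset variables become empty strings. if a prompt needs to contain a literal dollar sign, envsubst also accepts a list of the variables it may touch, e.g. `envsubst '${ORG} ${MAX_PRS}'` (both names made up for illustration).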

pieces

five prompts and a script:

design choices

a few things that matter more than they look:

what’s next

it’s running for the cyberwitchery org now, quietly ticking away in the background on the pi 500 on my desk (a cycle every 2 to 2.5 hours, with jitter). the interesting question is less “does it work” (it does, mostly) and more “what kinds of work does it reliably do well, and where does it need a human in the loop”. i’m keeping a log of the patterns that keep surfacing (topics that always time out, categories of review feedback that repeat across prs) and using that to tune the prompts.

longer term i want to factor out the bits that are org-agnostic so the same harness can host other agents. the release checker is already a second tenant, and there are obvious candidates like a dependency triage agent or a flaky-test sweeper. that’s what the repo name hints at.


  1. the split matters because it’s the only way to keep the reviewer from being able to merge its own recommendations, even in principle.