Hook runtime API
Python/Bun scripts reach maand through an in-process HTTP API on the CLI host. For when to use hooks and event names, see cli/hooks.md.
Execution model
1. Open DB + kv.Initialize
2. StartRuntimeAPI (HTTP on localhost:8080 in maand process)
3. Resolve worker set for the event (active, allocated, rollout batch, or filtered)
4. Resolve rollout order (rollout_order KV when valid, else catalog order)
5. Split workers into batches (max_concurrent_upgrades, or max_concurrent_starts for first-start hooks)
6. For each batch (sequential):
   - For each allocation in the batch (parallel within batch):
     - Stage tmp/workers/<ip>/jobs/<job>/ from job_files + certs from KV
     - Run script on CLI host via bash (python3 or bun)
     - Script reaches API at HOOK_API_HOST=127.0.0.1
7. Commit (CLI path) or return error to caller (deploy / health_check)
Important details:
- Where code runs: bash on the CLI host, working directory = bucket root. Output is logged as structured lines in logs/<worker_ip>.log (or logs/maand.log for bucket-local). Each line includes timestamp, run id, command phase, and payload. Per-run copies live under logs/runs/<run_id>/.
- Per-allocation context: one process per allocation; env vars identify worker IP and allocation UUID even though the process is local.
- Worker access: use Python run_ssh, run_runner_target, or run_make_target to execute on ALLOCATION_IP. Bun scripts must shell out to ssh themselves or call a Python helper.
- Staging: tmp/workers/<ip>/jobs/<job>/ mirrors deploy layout (files + certs) so scripts can read local copies if needed.
- Batching: every hook event uses the same rollout order and batch width rules (see below). Batches run one after another; allocations inside a batch run in parallel.
Environment variables (hook scripts)
Per allocation
| Variable | Meaning |
|---|---|
| ALLOCATION_ID | Stable allocation UUID |
| ALLOCATION_IP | Worker host for this invocation |
| ALLOCATION_INDEX | Zero-based index among non-removed peers (same as <job>_allocation_index in per-allocation KV) |
| JOB | Job name |
| EVENT | Event name (pre_deploy, cli, …) |
| COMMAND | Command name |
| DISABLED | 0 or 1 (1 = allocation marked disabled in catalog) |
| CURRENT_VERSION | Running version on this allocation (0.0.0 before first promote) |
| NEW_VERSION | Target version from the current build/deploy plan |
| HOOK_API_HOST | Host to reach runtime API (127.0.0.1) |
| BUCKET_ROOT | Absolute path to the maand bucket on the CLI host |
The target version is available as the job-level KV key maand/job/<job>/version and as the env var NEW_VERSION. For rollout logic, the running and target versions live in the catalog (hash.current_version, allocations.new_version) and in the template fields .CurrentVersion / .NewVersion.
Per batch
Every hook invocation also receives batch context for the current wave:
| Variable | Meaning |
|---|---|
| BATCH_ALLOCATIONS | Comma-separated worker IPs in this batch |
| BATCH_INDEX | Zero-based batch index |
| BATCH_COUNT | Total batches in this run |
| DEPLOY_PHASE | Phase label (new, update, stop, health_check, pre_deploy, job_control, cli, …) |
| ROLLOUT_ORDER | Full comma-separated order list for this worker set |
| ROLLOUT_ORDER_SOURCE | kv or default |
JOB is set on every script; batch hooks set it again alongside the batch vars above.
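The per-allocation and per-batch variables above can be read once at script start. The following is a minimal sketch that assumes only the env var names listed in this document; `HookContext` and `load_hook_context` are hypothetical names for illustration and are not part of maand.py.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping


@dataclass(frozen=True)
class HookContext:
    """Hypothetical container for the hook env vars documented above."""
    allocation_id: str
    allocation_ip: str
    allocation_index: int
    job: str
    event: str
    disabled: bool
    batch_allocations: List[str]
    batch_index: int
    batch_count: int


def _split_csv(value: str) -> List[str]:
    # BATCH_ALLOCATIONS is documented as a comma-separated list of worker IPs.
    return [item.strip() for item in value.split(",") if item.strip()]


def load_hook_context(env: Mapping[str, str] = os.environ) -> HookContext:
    """Build a HookContext from the environment (or any mapping, for tests)."""
    return HookContext(
        allocation_id=env["ALLOCATION_ID"],
        allocation_ip=env["ALLOCATION_IP"],
        allocation_index=int(env["ALLOCATION_INDEX"]),
        job=env["JOB"],
        event=env["EVENT"],
        disabled=env.get("DISABLED", "0") == "1",
        batch_allocations=_split_csv(env.get("BATCH_ALLOCATIONS", "")),
        batch_index=int(env.get("BATCH_INDEX", "0")),
        batch_count=int(env.get("BATCH_COUNT", "1")),
    )
```

Passing a plain dict instead of `os.environ` keeps the parsing testable outside a hook run. The defaults used for missing batch variables are an assumption of this sketch, not documented maand behavior.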
Event-specific
| Event | Extra variables |
|---|---|
| Deploy job_control | NEW_ALLOCATIONS, UPDATED_ALLOCATIONS |
| maand job / jobcontrol job_control | TARGET (start, stop, restart, reload, or a Makefile target) |
Batching
Worker order comes from rollout_order in KV when the list matches the current worker set; otherwise maand uses catalog order. Override order in pre_deploy or cli with put_rollout_order() — see kv/namespaces.md.
Batch width comes from the job manifest:
| Phase | Manifest field |
|---|---|
| First deploy starts and after_allocation_started on start | max_concurrent_starts (0 = one batch of all new allocations) |
| Everything else (upgrades, hooks, health commands, cli, pre_deploy, …) | max_concurrent_upgrades (minimum 1) |
| Event | Workers included |
|---|---|
| pre_deploy, post_deploy, post_build, cli | All non-removed allocations (includes disabled) |
| health_check (commands) | All non-removed allocations (includes disabled) |
| after_allocation_started | Workers in the current Makefile rollout batch |
| after_allocation_stopped | Stopped allocations for the job (grouped and batched per job during reconcile) |
| Deploy job_control | All allocated workers (including disabled) |
| jobcontrol job_control | Active workers matching --allocations filter |
Manifest health_check probes (tcp / http / ssh) run on active allocations only, in rollout order with max_concurrent_upgrades batch width. Probes are internal maand checks — they do not run user scripts and do not receive batch env vars. health_check commands use non-removed allocations (includes disabled) with the same order and batch width.
Within a batch, all allocations (and all probe checks × workers in that batch) run in parallel. The next batch starts only after the current batch succeeds.
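The ordering and batching rules in this section can be sketched as two small functions. This is an illustration under stated assumptions, not maand's implementation: "matches the current worker set" is interpreted here as "contains exactly the same IPs with no duplicates", and both function names are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def resolve_rollout_order(
    kv_order: Optional[Sequence[str]],
    catalog_order: Sequence[str],
) -> Tuple[List[str], str]:
    """Return (order, source), where source mirrors ROLLOUT_ORDER_SOURCE."""
    # Assumption: the KV list is valid only if it names exactly the current workers.
    if (
        kv_order
        and len(set(kv_order)) == len(kv_order)
        and set(kv_order) == set(catalog_order)
    ):
        return list(kv_order), "kv"
    return list(catalog_order), "default"


def split_batches(order: Sequence[str], width: int) -> List[List[str]]:
    """Split workers into consecutive batches of at most `width` entries.

    width <= 0 yields a single batch, matching max_concurrent_starts = 0.
    """
    if not order:
        return []
    if width <= 0:
        return [list(order)]
    return [list(order[i:i + width]) for i in range(0, len(order), width)]
```

With three workers and a width of 2, the second batch holds the last worker and starts only after the first batch succeeds.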
Use acquire_semaphore when parallel allocations must serialize a critical section (migrations, rate-limited APIs). Failures aggregate into a run error listing per-worker errors.
For a single batch run (BATCH_INDEX=0, BATCH_COUNT=1), see guides/hook-one-shot.md. For run-once logic inside a hook script, use is_one_shot() / isOneShot() — see hook-one-shot.md#guard-in-the-script.
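The helper table later in this document describes is_one_shot() as BATCH_INDEX == "0" and ALLOCATION_INDEX == "0". A minimal sketch of that predicate, reading directly from the environment, might look as follows; `is_leader_invocation` is a hypothetical name used here to avoid implying this is the embedded helper's source.

```python
import os
from typing import Mapping


def is_leader_invocation(env: Mapping[str, str] = os.environ) -> bool:
    """True only for the first allocation of the first batch."""
    return env.get("BATCH_INDEX") == "0" and env.get("ALLOCATION_INDEX") == "0"


def run_once(action, env: Mapping[str, str] = os.environ) -> bool:
    """Call `action` only on the leader invocation; report whether it ran."""
    if not is_leader_invocation(env):
        return False
    action()
    return True
```

Every other allocation in the run skips the action, so work such as a schema migration runs once per hook invocation rather than once per worker.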
Runtime HTTP API
Started by hooks.StartRuntimeAPI(tx) for the lifetime of build/deploy/health_check/hook sessions.
| Property | Value |
|---|---|
| Listen address | localhost:8080 (not exposed outside the host) |
| Request body | JSON, Content-Type: application/json |
| Allocation scope | Header X-ALLOCATION-ID (required on every route) |
| Event scope | Header EVENT (required; must match the running hook) |
| Command scope | Header COMMAND (required; used by /demands and semaphore scoping) |
Embedded maand.py / maand.ts set these headers automatically from env vars.
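For scripts that cannot use the embedded helpers, the required headers can be set by hand. The sketch below builds (but does not send) a GET /kv request with Python's standard library, using the JSON body of namespace + key that this document describes for GET /kv. Port 8080 comes from the listen address above; `build_kv_get` is a hypothetical name, and the fallback host value is an assumption.

```python
import json
import os
import urllib.request
from typing import Mapping


def build_kv_get(
    namespace: str,
    key: str,
    env: Mapping[str, str] = os.environ,
    port: int = 8080,
) -> urllib.request.Request:
    """Build a GET /kv request carrying the three required scope headers."""
    host = env.get("HOOK_API_HOST", "127.0.0.1")
    body = json.dumps({"namespace": namespace, "key": key}).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{host}:{port}/kv",
        data=body,          # the API reads namespace/key from a JSON body
        method="GET",
        headers={
            "Content-Type": "application/json",
            "X-ALLOCATION-ID": env["ALLOCATION_ID"],
            "EVENT": env["EVENT"],
            "COMMAND": env["COMMAND"],
        },
    )
```

Inside a running hook, `urllib.request.urlopen(build_kv_get("vars/job/api", "db_url"))` would send the request. The embedded helpers remain the preferred path.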
Endpoint summary
| Method | Path | Purpose |
|---|---|---|
| GET | /kv | Read a key from an allowed namespace |
| PUT | /kv | Write a non-secret key under vars/job/<current job> only |
| DELETE | /kv | Delete a key under vars/job/<current job> only |
| PUT | /kv/secret | Write encrypted secret under secrets/job/<current job> |
| DELETE | /kv/secret | Delete secret under secrets/job/<current job> |
| GET | /kv/keys | List keys under job-level namespaces |
| GET | /demands | List downstream hooks that depend on this job+hook |
| POST | /semaphore/acquire | Block until this allocation holds a slot |
| POST | /semaphore/release | Release a held slot |
| GET | /semaphore/status?name=... | Inspect holders and waiters |
KV read vs write
| Namespace pattern | GET /kv | PUT/DELETE /kv | PUT/DELETE /kv/secret |
|---|---|---|---|
| maand/bucket, maand/worker, maand/worker/<ip>, tags | ✓ | ✗ | ✗ |
| vars/bucket, vars/bucket/job/<job> | ✓ | ✗ | ✗ |
| maand/job/<job>/worker/<ip> | ✓ | ✗ | ✗ |
| maand/job/<job> key rollout_order only | ✓ | ✓ on pre_deploy or cli | ✗ |
| vars/job/<job> (current job) | ✓ | ✓ | ✗ |
| secrets/job/<job> (current job) | ✓ (decrypted) | ✗ | ✓ |
| Upstream demand jobs (maand/job/*, vars/job/*, secrets/job/*) | ✓ if declared in manifest demands | ✗ | ✗ |
Writes to other maand/* keys, vars/bucket/*, and upstream jobs are rejected. Use put_rollout_order / putRolloutOrder (or PUT /kv on maand/job/<job> + key rollout_order) to override rollout order for one deploy; build resets it on the next maand build.
PUT rollout_order body:
{
"namespace": "maand/job/api",
"key": "rollout_order",
"value": "10.0.0.2,10.0.0.1"
}
from maand import put_rollout_order
put_rollout_order(["10.0.0.2", "10.0.0.1"]).raise_for_status()
GET /kv body:
{ "namespace": "vars/job/api", "key": "db_url" }
Response (200):
{ "namespace": "vars/job/api", "key": "db_url", "value": "postgres://..." }
PUT /kv body (value required; optional ttl in seconds, 0 = no expiry):
{ "namespace": "vars/job/api", "key": "db_url", "value": "postgres://...", "ttl": 3600 }
PUT /kv/secret body (optional ttl in seconds):
{ "namespace": "secrets/job/api", "key": "db_password", "value": "plain-text-secret", "ttl": 86400 }
Values are encrypted with AES-256-GCM using secrets/kv.key before storage in maand.db. Expired keys are tombstoned on maand build; maand gc purges tombstones per --retain-days — see kv/persistence.md.
DELETE /kv and /kv/secret use the same JSON body as GET (namespace + key; no value).
GET /kv/keys — optional body { "namespace": "vars/job/api" }. Omit namespace to list both vars/job/<job> and secrets/job/<job> (secret listing returns key names only, never values).
KV writes during health_check
PUT, DELETE, and /kv/secret writes are rejected when the EVENT header is health_check. Health check scripts must be read-only with respect to KV — use pre_deploy or post_deploy to create or update vars and secrets.
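The write rules from the KV table and the health_check restriction can be summarized as a single predicate. This is a reading of the documented rules for illustration only; `kv_write_allowed` is a hypothetical function, and the server remains the authority on what is accepted.

```python
def kv_write_allowed(
    namespace: str,
    key: str,
    job: str,
    event: str,
    secret: bool = False,
) -> bool:
    """Return True if a PUT/DELETE would be permitted under the documented rules."""
    # health_check hooks are read-only with respect to KV.
    if event == "health_check":
        return False
    # /kv/secret writes are limited to the current job's secrets namespace.
    if secret:
        return namespace == f"secrets/job/{job}"
    # /kv writes are limited to the current job's vars namespace ...
    if namespace == f"vars/job/{job}":
        return True
    # ... plus the rollout_order key, and only from pre_deploy or cli.
    if namespace == f"maand/job/{job}" and key == "rollout_order":
        return event in ("pre_deploy", "cli")
    return False
```

A script can call this before issuing a write to fail fast with a clearer message than the 400 returned by the API.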
GET /demands
Returns hooks whose manifest demands point at this job and this command name (reverse dependency lookup).
Response (200) — array of:
{
"job": "api",
"hook": "hook_migrate",
"demand_config": { "min_version": "2.0.0" }
}
When to use: a shared upstream command (e.g. hook_schema on database) can inspect who depends on it and tailor behavior using demand_config (feature flags, schema versions, etc.).
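As one way to tailor behavior from the response, an upstream hook could look for the highest version any dependent asks for. The sketch assumes the response array shape shown above and treats min_version as a dotted numeric string; that key appears only in this document's example, so real demand_config contents may differ. Both function names are hypothetical.

```python
from typing import Any, Dict, List, Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '2.0.0' into (2, 0, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def highest_min_version(demands: List[Dict[str, Any]]) -> Optional[str]:
    """Return the largest demand_config.min_version among dependents, if any."""
    versions = [
        entry["demand_config"]["min_version"]
        for entry in demands
        if "min_version" in (entry.get("demand_config") or {})
    ]
    if not versions:
        return None
    return max(versions, key=parse_version)
```

Comparing parsed tuples rather than raw strings keeps 10.1.0 ahead of 2.0.0.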
Semaphores
Semaphores coordinate cross-allocation locks inside one hook session. They are scoped by job + EVENT + name, so the same name under pre_deploy and under post_deploy refers to two independent semaphores.
| Field | Default | Limit |
|---|---|---|
| capacity | 1 | 1–64 |
| timeout_seconds | 600 | max 3600 |
POST /semaphore/acquire body:
{ "name": "migration", "capacity": 1, "timeout_seconds": 600 }
Response (200):
{
"name": "migration",
"allocation_id": "<uuid>",
"capacity": 1,
"acquired": true
}
POST /semaphore/release body: { "name": "migration" }
GET /semaphore/status?name=migration response:
{
"name": "migration",
"capacity": 1,
"holders": ["<allocation-uuid>"],
"waiting": 0,
"available": 0
}
When to use semaphores:
| Pattern | capacity | Example |
|---|---|---|
| Single-writer migration | 1 | Only one allocation runs DDL at a time |
| Rolling batch | N | Allow N concurrent restarts against an external API |
| Leader bootstrap | 1 | First acquirer writes shared KV, others read |
Always release_semaphore in a finally / try/finally block so a failed script does not hold the lock for the rest of the deploy session.
Example (Python):
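from maand import acquire_semaphore, release_semaphore, allocation_ip

# run_migration_for is a placeholder for your own migration function.
acquire_semaphore("migrate", capacity=1, timeout_seconds=900).raise_for_status()
try:
    run_migration_for(allocation_ip())
finally:
    # Always release, even if the migration raised.
    release_semaphore("migrate")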
Semaphores exist only in memory for the current maand process session — they do not survive CLI restart.
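The acquire / try / finally pattern can be wrapped in a context manager so the release cannot be forgotten. The sketch below takes the acquire and release functions as parameters, which keeps it runnable without maand; `held_semaphore` is a hypothetical name and not part of maand.py.

```python
from contextlib import contextmanager
from typing import Any, Callable, Iterator


@contextmanager
def held_semaphore(
    name: str,
    acquire: Callable[..., Any],
    release: Callable[[str], Any],
    **acquire_kwargs: Any,
) -> Iterator[None]:
    """Hold semaphore `name` for the duration of a with-block."""
    # If acquire raises (for example on timeout), release is never called.
    acquire(name, **acquire_kwargs)
    try:
        yield
    finally:
        # Runs on success and on exceptions raised inside the with-block.
        release(name)
```

In a hook script, `acquire` could be a small wrapper that calls `acquire_semaphore(name, **kwargs).raise_for_status()` and `release` could be `release_semaphore`, following the example above.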
Python and Bun helpers
Embedded maand.py / maand.ts wrap the HTTP API. Prefer these over raw HTTP.
Context (env)
| Python | Bun | Returns |
|---|---|---|
| allocation_id() | allocationId() | ALLOCATION_ID |
| allocation_ip() | allocationIp() | ALLOCATION_IP |
| allocation_index() | allocationIndex() | ALLOCATION_INDEX |
| job_name() | jobName() | JOB |
| hook_event() | hookEvent() | EVENT |
| hook_name() | hookName() | HOOK |
| is_allocation_disabled() | isAllocationDisabled() | DISABLED == "1" |
| is_one_shot() | isOneShot() | BATCH_INDEX == "0" && ALLOCATION_INDEX == "0" (leader; run once per invocation) |
Aliases: get_allocation_id, get_job, kv_get, etc. (both runtimes).
KV
| Python | Bun | API |
|---|---|---|
| get_store_value(ns, key) | getStoreValue(ns, key) | GET /kv → Response |
| get_kv_value(ns, key) | (parse JSON yourself) | GET /kv → plaintext value |
| put_rollout_order(order) | putRolloutOrder(order) | PUT /kv → maand/job/<job>/rollout_order |
| get_rollout_order() | getRolloutOrder() | GET /kv → rollout_order |
| put_job_variable(key, val) | putJobVariable(key, val) | PUT /kv |
| put_job_secret(key, val) | putJobSecret(key, val) | PUT /kv/secret |
| delete_job_variable(key) | deleteJobVariable(key) | DELETE /kv |
| delete_job_secret(key) | deleteJobSecret(key) | DELETE /kv/secret |
| list_job_keys(ns=None) | listJobKeys(ns?) | GET /kv/keys |
Demands and semaphores
| Python | Bun | API |
|---|---|---|
| list_hook_demands() | listHookDemands() | GET /demands |
| acquire_semaphore(name, capacity=1, timeout_seconds=600) | acquireSemaphore(...) | POST /semaphore/acquire |
| release_semaphore(name) | releaseSemaphore(name) | POST /semaphore/release |
| semaphore_status(name) | semaphoreStatus(name) | GET /semaphore/status |
Worker SSH (Python only)
| Function | Purpose |
|---|---|
| load_ssh() | Parse maand.conf → (user, key_path, use_sudo) |
| run_ssh(worker_ip, remote_cmd, ...) | Arbitrary remote command over SSH |
| run_runner_target(target, ...) | runner.py <target> --jobs <job> on worker (same as deploy) |
| run_make_target(target, ...) | make -C /opt/worker/<bucket>/jobs/<job> <target> |
Bun scripts that need SSH should invoke ssh directly or call a thin Python wrapper script.
Python virtualenv (recommended)
Create a venv per job under _hooks (not copied into tmp/ during runs; maand calls the workspace interpreter):
cd workspace/jobs/<job>/_hooks
python3 -m venv .venv
source .venv/bin/activate # optional for manual work
pip install -r requirements.txt
pip install requests # required if scripts use maand.py
Maand uses, in order:
1. workspace/jobs/<job>/_hooks/.venv/bin/python3
2. workspace/jobs/<job>/_hooks/venv/bin/python3
3. python3 on your PATH
.venv, venv, node_modules, and __pycache__ are skipped during maand build file indexing.
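The interpreter lookup order can be expressed as a short function, which is handy for checking which Python a job's hooks will run under. This sketch only mirrors the documented order; `resolve_hook_python` is a hypothetical name, and treating "exists as a file" as the selection test is an assumption.

```python
from pathlib import Path
from typing import Union


def resolve_hook_python(hooks_dir: Union[str, Path]) -> str:
    """Pick the interpreter for a job's _hooks directory in the documented order."""
    base = Path(hooks_dir)
    candidates = (
        base / ".venv" / "bin" / "python3",   # 1. .venv inside _hooks
        base / "venv" / "bin" / "python3",    # 2. venv inside _hooks
    )
    for candidate in candidates:
        if candidate.is_file():
            return str(candidate)
    # 3. Fall back to whatever python3 is on PATH.
    return "python3"
```

Calling `resolve_hook_python("workspace/jobs/<job>/_hooks")` with a real job name returns the venv interpreter when one exists and `"python3"` otherwise.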
Bun
Install Bun on the CLI host. Per job:
cd workspace/jobs/<job>/_hooks
bun install
KV persistence by context
| Context | When KV writes persist to maand.db |
|---|---|
| maand build | End of main build transaction (PersistToTransaction). |
| post_build hooks | Separate session transaction at end of hook pass (failures fail build). |
| maand hooks | Successful CLI commit. |
| maand deploy | kv.PersistSession() after each job's pre_deploy and after each deployJob. |
| maand health_check | KV writes rejected (read-only). |
Use pre_deploy to write secrets consumed by .tpl on the same deploy. Full persistence and purge rules: kv/persistence.md. Namespace keys: kv/namespaces.md.
HTTP API errors
| HTTP | Message | Typical cause |
|---|---|---|
| 400 | X-ALLOCATION-ID header is missing | Raw HTTP call without header |
| 404 | Invalid allocation ID | Stale or wrong allocation UUID |
| 400 | Both namespace and key are required | Incomplete JSON body |
| 400 | Invalid or unauthorized namespace | Write to read-only namespace, wrong job, or upstream not in demands |
| 404 | KV get operation failed | Key does not exist |
| 400 | KV writes are not allowed during health_check | PUT/DELETE during health_check event |
| 408 | Timed out waiting for semaphore | timeout_seconds elapsed |
| 409 | Semaphore acquire or release failed | Release without hold, or internal conflict |
| 415 | Content-Type must be application/json | Missing or wrong content type |
| 400 | Invalid JSON format | Malformed request body |
Check logs/<worker_ip>.log and CLI output when --verbose is set.
Related
- cli/hooks.md — events, patterns, checklist
- kv/namespaces.md · kv/persistence.md
- cli/build.md · cli/deploy.md