Hooks

Hooks are Python or Bun scripts under workspace/jobs/<job>/_hooks/ that maand runs on the CLI host. Each invocation is scoped to one allocation (job on a worker).

Doc                       Contents
manifest.md               commands block in manifest.json
hook-api.md               HTTP API, env vars, Python/Bun helpers
guides/hooks-tutorial.md  Hands-on walkthrough
guides/hook-one-shot.md   Single batch (BATCH_INDEX=0, BATCH_COUNT=1)

When to use hooks

Need                                            Prefer
Static config in git                            vars.conf, bucket.jobs*.conf
Process lifecycle on workers                    Makefile + runner.py (default deploy)
Custom full rollout                             job_control command
Batched start/restart/reload + per-batch hooks  after_allocation_started + parallel counts (guides/rolling-deploy.md)
Drain on stop                                   after_allocation_stopped
Secrets before .tpl render                      pre_deploy + put_job_secret
Post-rollout smoke test                         post_deploy
Probes                                          manifest health_check and/or health_check command (health-check.md)
Operator one-off                                cli + maand hooks
One batch / all workers at once                 Raise max_concurrent_upgrades (hook-one-shot.md)

By default, scripts run on the CLI host, not on workers. Use the Python helpers run_ssh / run_runner_target to reach workers.

Disabled allocations still run post_build, pre_deploy, post_deploy, health_check, and cli hooks (non-removed fan-out). Deploy lifecycle (start/restart/reload/rsync) skips disabled allocations. after_allocation_stopped runs during reconcile when a disabled allocation is stopped. See disable and drain.


CLI (ad-hoc)

maand hooks <hook_name> [job] [--verbose]

When job is omitted, the command runs on every job in the catalog that registers it for the cli event.

Examples:

maand hooks hook_cluster_status
maand hooks hook_migrate api --verbose
maand hook hook_migrate api   # alias

The command must list cli in its manifest executed_on. It runs in batches of max_concurrent_upgrades over non-removed allocations, in rollout order. For a single batch (BATCH_INDEX=0, BATCH_COUNT=1), see hook-one-shot.md. KV writes are committed on success.
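The batching arithmetic can be illustrated with a small sketch. This is not maand's implementation; it assumes batches are consecutive slices of the rollout order, and make_batches and the worker addresses are hypothetical names used only for illustration.

```python
from typing import List, Sequence


def make_batches(allocations: Sequence[str], max_concurrent: int) -> List[List[str]]:
    """Split allocations (already in rollout order) into consecutive batches."""
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    return [
        list(allocations[i:i + max_concurrent])
        for i in range(0, len(allocations), max_concurrent)
    ]


# Hypothetical worker addresses, listed in rollout order.
workers = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4", "10.0.0.5"]
batches = make_batches(workers, 2)
for index, batch in enumerate(batches):
    # A script in this batch would see BATCH_INDEX=index and BATCH_COUNT=len(batches).
    print(f"BATCH_INDEX={index} BATCH_COUNT={len(batches)} workers={batch}")
```

Under the same assumption, raising the limit to at least the number of allocations collapses the run into one batch, which is the BATCH_INDEX=0, BATCH_COUNT=1 case.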


Command events (executed_on)

All events below run hooks in rollout order, batched by max_concurrent_upgrades (or max_concurrent_starts for first-start after_allocation_started hooks). Scripts receive BATCH_*, DEPLOY_PHASE, and ROLLOUT_ORDER — see hook-api.md.
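A hook script might read this batch context from its environment roughly as follows. The variable names come from this page, but their formats are assumptions here: integers for BATCH_INDEX and BATCH_COUNT, and a comma-separated list for ROLLOUT_ORDER (mirroring the put_rollout_order argument shown later). The "restart" value for DEPLOY_PHASE is only a sample. hook-api.md is the authority on the real formats.

```python
import os
from typing import Mapping, NamedTuple, Tuple


class BatchContext(NamedTuple):
    index: int
    count: int
    phase: str
    order: Tuple[str, ...]


def read_batch_context(env: Mapping[str, str] = os.environ) -> BatchContext:
    """Parse batch variables; the defaults describe a single batch (assumed)."""
    raw_order = env.get("ROLLOUT_ORDER", "")
    order = tuple(part.strip() for part in raw_order.split(",") if part.strip())
    return BatchContext(
        index=int(env.get("BATCH_INDEX", "0")),
        count=int(env.get("BATCH_COUNT", "1")),
        phase=env.get("DEPLOY_PHASE", ""),
        order=order,
    )


def is_last_batch(ctx: BatchContext) -> bool:
    """True when this invocation belongs to the final batch."""
    return ctx.index == ctx.count - 1


# Sample values for illustration only.
sample = {
    "BATCH_INDEX": "1",
    "BATCH_COUNT": "3",
    "DEPLOY_PHASE": "restart",
    "ROLLOUT_ORDER": "10.0.0.2, 10.0.0.1",
}
ctx = read_batch_context(sample)
print(ctx, is_last_batch(ctx))
```

Passing the environment as a mapping keeps the parsing testable outside a real maand run.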

Event                     Triggered by                 Summary
post_build                maand build                  After catalog commit; build fails on error; runs in deployment_seq order
pre_deploy                maand deploy                 Before rsync; secrets/vars for templates; failure skips job for this deploy
post_deploy               maand deploy                 After successful rollout; failure fails that job
job_control               maand deploy, maand job run  Custom lifecycle; deploy sets NEW_/UPDATED_ALLOCATIONS; CLI sets TARGET
health_check              maand health_check, deploy   Manifest probes on active allocations first, then commands on non-removed allocations; KV read-only for commands (health-check.md)
cli                       maand hooks                  Operator-triggered; all catalog jobs when job name omitted
after_allocation_started  maand deploy                 After each Makefile start/restart/reload batch, before health gate
after_allocation_stopped  maand deploy                 After batched allocation stops during reconcile

pre_deploy may override rollout_order with put_rollout_order() for one deploy; build resets it on the next maand build. See hook-api.md.

Demands

Optional upstream dependency for deployment_seq. See deployment-sequence.md.


Event behavior (detail)

post_build

Validate artifacts, codegen, seed vars/job/*, cross-job checks. Runs in deployment_seq order. Any failure fails maand build.

pre_deploy

Fetch secrets, write secrets/job/* and vars/job/* for .tpl on this deploy, external prerequisite checks. Override rollout order with put_rollout_order("10.0.0.2,10.0.0.1"). Failure: job not staged this deploy.
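As a rough sketch, a pre_deploy hook could build the order string so that a canary worker goes first. canary_first and the worker list are hypothetical. Importing put_rollout_order from maand is an assumption based on the other helpers on this page; see hook-api.md for the actual import path and return value.

```python
from typing import Iterable, List


def canary_first(workers: Iterable[str], canary: str) -> str:
    """Return a comma-separated rollout order with the canary worker first."""
    ordered: List[str] = list(workers)
    if canary not in ordered:
        raise ValueError(f"canary {canary!r} is not one of the workers")
    ordered.remove(canary)
    return ",".join([canary] + ordered)


def main() -> None:
    # Deferred import so the helper above can be exercised without maand installed.
    # Assumption: put_rollout_order is importable from the maand module.
    from maand import put_rollout_order

    order = canary_first(["10.0.0.1", "10.0.0.2"], canary="10.0.0.2")
    put_rollout_order(order)


# Hypothetical worker addresses; a real hook would obtain them from maand.
print(canary_first(["10.0.0.1", "10.0.0.2"], canary="10.0.0.2"))
```

A real hook would call main(); it is left uncalled here so the string-building part runs anywhere.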

post_deploy

Smoke tests, cache warming, service registration. Runs after rollout (and health gate when configured).
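A smoke test usually needs a few retries while the service warms up. The sketch below shows one way to structure that, assuming the hook signals failure by exiting non-zero (as the WorkerFailure row under CLI errors suggests). wait_until_ok and the probe callable are hypothetical, and the probe itself (HTTP call, TCP connect, and so on) is left to the job.

```python
import sys
import time
from typing import Callable


def wait_until_ok(probe: Callable[[], bool], attempts: int = 5, delay: float = 1.0) -> bool:
    """Call probe until it returns True or the attempts run out."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # pause between attempts, not after the last one
    return False


def main(probe: Callable[[], bool]) -> None:
    # A non-zero exit status marks the hook, and so this job's post_deploy, as failed.
    sys.exit(0 if wait_until_ok(probe) else 1)


# Demo with a fake probe that succeeds on the second call.
results = iter([False, True])
print(wait_until_ok(lambda: next(results), attempts=3, delay=0.0))
```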

job_control

Custom rollout via NEW_ALLOCATIONS / UPDATED_ALLOCATIONS (deploy) or TARGET (maand job run). Batched like other events. Still followed by health check and post_deploy when invoked from deploy.
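A job_control script has to decide what to do with each allocation it is handed. The following is a minimal sketch that assumes NEW_ALLOCATIONS and UPDATED_ALLOCATIONS are comma-separated lists and that the presence of TARGET indicates a maand job run invocation. Neither format is confirmed on this page, and the "start" / "restart" action names are illustrative.

```python
import os
from typing import Dict, List, Mapping


def split_list(value: str) -> List[str]:
    """Split a comma-separated value, dropping blanks (assumed format)."""
    return [item.strip() for item in value.split(",") if item.strip()]


def plan_actions(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Map each allocation to an action: new ones start, updated ones restart."""
    plan: Dict[str, str] = {}
    for alloc in split_list(env.get("UPDATED_ALLOCATIONS", "")):
        plan[alloc] = "restart"
    for alloc in split_list(env.get("NEW_ALLOCATIONS", "")):
        plan[alloc] = "start"
    return plan


def triggered_by(env: Mapping[str, str] = os.environ) -> str:
    """Guess the caller: TARGET is set by the CLI, the allocation lists by deploy."""
    return "maand job run" if env.get("TARGET") else "maand deploy"


# Sample environment for illustration only.
sample = {"NEW_ALLOCATIONS": "10.0.0.3", "UPDATED_ALLOCATIONS": "10.0.0.1,10.0.0.2"}
print(triggered_by(sample), plan_actions(sample))
```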

health_check

See health-check.md. Probes and command scripts share rollout order and max_concurrent_upgrades. KV writes rejected.

after_allocation_started / after_allocation_stopped

Hooks for cluster join, leader election, or drain. after_allocation_started runs on the same worker batch as the Makefile start/restart/reload that preceded it. after_allocation_stopped runs on batched stops during reconcile. See guides/rolling-deploy.md.

cli

Operator maintenance and testing. Batched over non-removed allocations; KV commits on success.


File layout

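Rule          Detail
Command name  Prefix hook_
Script        Exactly one of hook_<name>.py, hook_<name>.ts, or hook_<name>.js under _hooks/
Manifest      Register the command in manifest.json with executed_on and optional demands:

{
  "hooks": {
    "hook_init": {
      "executed_on": ["pre_deploy", "cli"],
      "demands": { "job": "", "hook": "", "config": {} }
    }
  }
}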
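Before running maand build, a manifest fragment like the one above could be sanity-checked with a short script. This is a sketch only: check_hooks is hypothetical, the event list is copied from the Command events table on this page, and maand's own validation may be stricter or different.

```python
import json
from typing import Any, Dict, List

# Event names as listed under "Command events (executed_on)".
KNOWN_EVENTS = {
    "post_build", "pre_deploy", "post_deploy", "job_control",
    "health_check", "cli", "after_allocation_started", "after_allocation_stopped",
}


def check_hooks(manifest: Dict[str, Any]) -> List[str]:
    """Return human-readable problems found in the manifest's hooks block."""
    problems: List[str] = []
    for name, spec in manifest.get("hooks", {}).items():
        if not name.startswith("hook_"):
            problems.append(f"{name}: name must start with hook_")
        events = spec.get("executed_on", [])
        if not events:
            problems.append(f"{name}: executed_on is empty")
        for event in events:
            if event not in KNOWN_EVENTS:
                problems.append(f"{name}: unknown event {event!r}")
    return problems


manifest = json.loads("""
{
  "hooks": {
    "hook_init": {
      "executed_on": ["pre_deploy", "cli"],
      "demands": { "job": "", "hook": "", "config": {} }
    }
  }
}
""")
print(check_hooks(manifest))
```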

Default deploy without job_control

Deploy runs pre_deploy, Makefile lifecycle (batched), after_allocation_started (if registered), health checks, post_deploy.
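The order above can be written down as a tiny function, which is handy when reasoning about where a new hook will land. It is a simplified model, not maand's scheduler: default_deploy_stages and the "makefile_lifecycle" label are hypothetical, and the per-batch repetition of the lifecycle and after_allocation_started steps is flattened into a single entry each.

```python
from typing import List


def default_deploy_stages(has_after_started: bool) -> List[str]:
    """Stage order for a deploy that does not register job_control."""
    stages = ["pre_deploy", "makefile_lifecycle"]
    if has_after_started:
        # Runs after each lifecycle batch and before the health gate.
        stages.append("after_allocation_started")
    stages += ["health_check", "post_deploy"]
    return stages


print(default_deploy_stages(has_after_started=True))
print(default_deploy_stages(has_after_started=False))
```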

Full pipeline: deploy.md.


Example: secret bootstrap + template

hook_bootstrap.py (pre_deploy):

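from maand import put_job_secret, put_job_variable

# fetch_from_vault() is a placeholder for your own secret lookup (not imported from maand).
put_job_secret("db_password", fetch_from_vault()).raise_for_status()
# Job variable read by the template below.
put_job_variable("db_host", "db.internal").raise_for_status()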

config.toml.tpl:

db_host = "{{ get "vars/job/api" "db_host" }}"
db_password = "{{ getSecret "secrets/job/api" "db_password" }}"

Checklist

  1. Add hook_<name>.py (or .ts) under _hooks/.
  2. Register in manifest.json with executed_on and optional demands.
  3. maand build
  4. Test: maand hooks hook_<name> [job] --verbose (if cli listed)
  5. Wire deploy events as needed.

Single-batch runs (BATCH_COUNT=1): hook-one-shot.md.


CLI errors

Error           Meaning
NotFoundError   Command not allowed for this event on this job
RunError        One or more allocations failed
WorkerFailure   Script exited non-zero or SSH failure
File not found  Missing or duplicate .py/.ts/.js implementation

HTTP API errors: hook-api.md.
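The "File not found" case can be checked locally before a build. The sketch below assumes the rule implied by the table: a command needs exactly one implementation among the .py, .ts, and .js files in _hooks/. resolve_script is a hypothetical helper, not part of maand.

```python
import tempfile
from pathlib import Path

EXTENSIONS = (".py", ".ts", ".js")


def resolve_script(hooks_dir: Path, hook_name: str) -> Path:
    """Return the single script for hook_name, or raise if missing or duplicated."""
    candidates = [hooks_dir / f"{hook_name}{ext}" for ext in EXTENSIONS]
    matches = [path for path in candidates if path.is_file()]
    if len(matches) != 1:
        found = ", ".join(path.name for path in matches) or "none"
        raise FileNotFoundError(f"{hook_name}: expected exactly one script, found {found}")
    return matches[0]


# Demo in a throwaway directory standing in for workspace/jobs/<job>/_hooks/.
with tempfile.TemporaryDirectory() as tmp:
    hooks_dir = Path(tmp)
    (hooks_dir / "hook_init.py").write_text("print('ok')\n")
    print(resolve_script(hooks_dir, "hook_init").name)
```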


Who runs what

                    post_build   pre_deploy   deploy roll   post_deploy   health_check   cli
maand build              ✓            —            —            —              —         —
maand deploy             —            ✓            ✓*           ✓              ✓*        —
maand health_check       —            —            —            —              ✓         —
maand hooks              —            —            —            —              —         ✓

* deploy: health_check runs after the lifecycle step (start/restart/reload) or job_control. roll = job_control or the Makefile path, plus after_allocation_started / after_allocation_stopped when registered.