11. Health and monitoring


Health checks

maand health_check verifies workers and jobs after deploy or on demand:

  1. SSH gate — every worker in the catalog (worker_ip:ssh_port)
  2. Manifest probes (tcp/http/ssh) on active allocations, if declared
  3. health_check commands on non-removed allocations (includes disabled), after probes pass

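Probes and commands share the rollout order and the max_concurrent_upgrades batch width, but they target different allocation sets: probes run only on active allocations, while commands run on every non-removed allocation, including disabled ones.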
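The difference between the two target sets, and the shared batch width, can be sketched as follows. This is only an illustration: the Allocation record, its disabled and removed fields, and the three helper functions are hypothetical names, not maand internals, and the sketch assumes that "active" means neither disabled nor removed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Allocation:
    # Hypothetical record; maand's real allocation schema may differ.
    worker_ip: str
    job: str
    disabled: bool = False
    removed: bool = False


def probe_targets(allocs):
    """Manifest probes: active allocations only (assumed: not disabled, not removed)."""
    return [a for a in allocs if not a.removed and not a.disabled]


def command_targets(allocs):
    """health_check commands: every non-removed allocation, disabled ones included."""
    return [a for a in allocs if not a.removed]


def batches(targets, max_concurrent_upgrades):
    """Split targets, already in rollout order, into batches of the configured width."""
    if max_concurrent_upgrades < 1:
        raise ValueError("max_concurrent_upgrades must be at least 1")
    return [
        targets[i : i + max_concurrent_upgrades]
        for i in range(0, len(targets), max_concurrent_upgrades)
    ]
```

With four allocations of which one is disabled and one is removed, probe_targets returns two allocations and command_targets returns three; batches(command_targets(allocs), 2) then yields one batch of two followed by one batch of one.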

maand health_check --jobs api
maand health_check --jobs api --wait --verbose

Deploy can run health checks between rolling batches when configured.

Many jobs also define a test Makefile target for application-level checks (an HTTP curl, for example). Deploy does not call this target automatically; it runs only if you wire it into a hook.
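As an illustration of what such an application-level check might do, here is a minimal sketch of an HTTP check that a test target could invoke in place of curl. The function names, the /healthz path, and the port are assumptions made for the example; nothing here is provided by maand.

```python
import urllib.error
import urllib.request


def http_ok(url, timeout=5.0):
    """Return True if a GET to url answers with a 2xx status within timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:
        # URLError, HTTPError (4xx/5xx), and socket timeouts all subclass OSError.
        return False


def main(argv):
    """Return a process exit code: 0 when the endpoint is healthy, 1 otherwise."""
    # Hypothetical default endpoint; pass the job's real health URL as argv[1].
    url = argv[1] if len(argv) > 1 else "http://127.0.0.1:8080/healthz"
    return 0 if http_ok(url) else 1
```

A script built on this would end with sys.exit(main(sys.argv)), so that the Makefile target fails when the endpoint is unhealthy.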

Reference: health-check.md.


Prometheus (optional)

Jobs may include a _prometheus/ directory. Build validates its contents and stores them in job_files; scrape configs sync to KV for a prometheus job to consume.

Guide: prometheus.md.


Observability (where to read more)

Concern                     Document
Deploy/rsync debug output   logging.md; maand logs show, logs/<worker>.log
Deploy skipped or partial   debugging-deploy.md
App output on workers       Your Makefile (compose, systemd) or SSH, same as without maand

Maand records its own command history on the CLI host. Application stdout and metrics stay in whatever stack you run on workers.


Next

12 — Day-2 operations.