Job resources, bucket overrides, and environment selectors
Maand uses three layers to place jobs and validate capacity:
- manifest.json: declares min/max memory and CPU for a job (portable bounds checked into git).
- bucket.jobs*.conf: sets the actual reservation (current_memory_mb / current_cpu_mhz) for the current environment; values must stay within manifest min/max.
- workers.json: declares host capacity and labels used for placement and validation.
Environment file naming: job_config_selector in maand.conf picks the override file. An empty selector reads bucket.jobs.conf; a non-empty one (e.g. "prod") reads bucket.jobs.prod.conf. General pattern: bucket.jobs.<selector>.conf.
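The naming rule can be sketched in a few lines of Python. The function name `override_file_name` is hypothetical and only mirrors the pattern described here; it is not Maand's own code.

```python
def override_file_name(selector: str) -> str:
    """Return the override file name for a job_config_selector value.

    Hypothetical helper: it only illustrates the documented pattern
    bucket.jobs.<selector>.conf, with the empty selector as a special case.
    """
    if selector == "":
        return "bucket.jobs.conf"
    return f"bucket.jobs.{selector}.conf"
```

For example, `override_file_name("staging")` yields `bucket.jobs.staging.conf`, which build would look for under workspace/.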
After edits, run maand build. Build fails if a reservation exceeds manifest bounds, or if active allocations on a worker need more memory/CPU than that worker declares.
Related: configuration.md · concepts.md · build.md
How the pieces fit together
      manifest.json         bucket.jobs.prod.conf        workers.json
    min / max bounds   →    memory / cpu override   →  capacity + labels
            │                         │                        │
            └──────────── build ──────┴────────────────────────┘
                                      │
                          allocations (label match)
                           ValidateWorkerResources
| Layer | File | What it controls |
|---|---|---|
| Job bounds | workspace/jobs/<job>/manifest.json | Allowed memory/CPU range (min / max) |
| Job reservation | workspace/bucket.jobs[.<env>].conf | Current memory/CPU charged against workers (must be within bounds) |
| Worker capacity | workspace/workers.json | Available memory/CPU; labels for placement |
Capacity validation counts only active allocations (removed=0 and disabled=0); disabled or removed allocations are excluded.
Memory and CPU in manifest.json
Declare optional limits under resources:
{
"selectors": ["worker", "prod"],
"resources": {
"memory": { "min": "256 mb", "max": "2 gb" },
"cpu": { "min": "500 mhz", "max": "2000 mhz" }
}
}
Rules
| Rule | Behavior |
|---|---|
| Omitted min / max | Treated as 0 (no bound on that side) |
| min set, max omitted or 0 | max defaults to min |
| min > max | Build fails (ErrInvalidManifest) |
| Units | Memory: mb, gb, tb (case-insensitive). CPU: mhz, ghz, thz |
| Plain numbers | "512" → 512 MB or 512 MHz |
If both memory and CPU are omitted (all zeros), the job does not participate in worker resource validation unless a bucket override sets memory or cpu.
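The rules above can be sketched as a small parser plus a bounds normalizer. This is an illustration, not Maand's implementation: the names `parse_quantity` and `normalize_bounds` are hypothetical, and the factor of 1024 between memory units is an assumption (the page does not state whether 1 gb is 1024 or 1000 mb).

```python
import re

# Multipliers to the base unit (MB for memory, MHz for CPU).
# ASSUMPTION: memory units step by 1024; the docs do not specify this.
MEMORY_UNITS = {"mb": 1, "gb": 1024, "tb": 1024 * 1024}
CPU_UNITS = {"mhz": 1, "ghz": 1000, "thz": 1000 * 1000}


def parse_quantity(text: str, units: dict) -> float:
    """Parse strings like "512", "256 mb" or "2 GHz" into the base unit."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([a-zA-Z]*)\s*", text)
    if match is None:
        raise ValueError(f"cannot parse quantity: {text!r}")
    number, unit = match.groups()
    unit = unit.lower()  # units are case-insensitive
    if unit == "":
        return float(number)  # plain number: already MB or MHz
    if unit not in units:
        raise ValueError(f"unknown unit in {text!r}")
    return float(number) * units[unit]


def normalize_bounds(min_value: float, max_value: float) -> tuple:
    """Apply the min/max rules; omitted values are passed in as 0."""
    if min_value > 0 and max_value == 0:
        max_value = min_value  # max defaults to min
    if min_value > max_value:
        # Maand reports this case as ErrInvalidManifest.
        raise ValueError("invalid manifest: min > max")
    return min_value, max_value
```

With these helpers, `normalize_bounds(parse_quantity("256 mb", MEMORY_UNITS), 0)` gives `(256.0, 256.0)`, matching the "max defaults to min" row.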
What build stores
| DB / KV field | Source |
|---|---|
| min_memory_mb, max_memory_mb | Manifest resources.memory |
| min_cpu_mhz, max_cpu_mhz | Manifest resources.cpu |
| current_memory_mb, current_cpu_mhz | Manifest max, or bucket override (below) |
KV namespace maand/job/<job> exposes min_memory_mb, max_memory_mb, memory, min_cpu_mhz, max_cpu_mhz, cpu for templates and commands.
Bucket-level override (bucket.jobs.conf)
Per-job reservations live in TOML under workspace/. They set what build stores as current_memory_mb and current_cpu_mhz — the values summed per worker during validation.
[api]
memory = "512 mb"
cpu = "1500 mhz"
[worker]
memory = "128 mb"
Bounds checking
The override must fall inside the manifest min/max:
manifest min ≤ bucket.jobs.conf value ≤ manifest max
Examples:
| Manifest | bucket.jobs.conf | Result |
|---|---|---|
| min: 128, max: 512 | memory = "192 mb" | OK: current = 192 |
| min: 128, max: 256 | memory = "512 mb" | Build fails: above max |
| no memory in manifest | memory = "256 mb" | OK: min and max set to 256 |
The same rules apply to cpu.
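The bounds check and the three example rows can be reproduced with a short sketch. The function `resolve_reservation` is hypothetical; it assumes values are already converted to MB (or MHz) and that an omitted manifest bound is represented as 0.

```python
from typing import Optional


def resolve_reservation(min_value: float, max_value: float,
                        override: Optional[float]) -> tuple:
    """Return (min, max, current) for one resource of one job.

    Hypothetical sketch of the documented rules, not Maand's code.
    """
    if override is None:
        # No bucket override: current falls back to the manifest max.
        return min_value, max_value, max_value
    if min_value == 0 and max_value == 0:
        # Resource absent from the manifest: override defines min and max.
        return override, override, override
    if not (min_value <= override <= max_value):
        # Maand reports this case as ErrUnsupportedResourceConfiguration.
        raise ValueError(
            f"override {override} outside bounds [{min_value}, {max_value}]"
        )
    return min_value, max_value, override
```

Calling `resolve_reservation(128, 512, 192)` returns `(128, 512, 192)`, while `resolve_reservation(128, 256, 512)` raises, mirroring the second row of the table.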
Environment-specific files (job_config_selector)
Set job_config_selector in maand.conf to pick which override file build reads:
# maand.conf
job_config_selector = "prod"
| job_config_selector | File read |
|---|---|
| "" (default) | workspace/bucket.jobs.conf |
| "prod" | workspace/bucket.jobs.prod.conf |
| "staging" | workspace/bucket.jobs.staging.conf |
Example layout for two environments on one bucket checkout:
workspace/
├── bucket.jobs.conf # default / dev
├── bucket.jobs.staging.conf
└── bucket.jobs.prod.conf
# bucket.jobs.staging.conf
[api]
memory = "256 mb"
cpu = "500 mhz"
# bucket.jobs.prod.conf
[api]
memory = "1 gb"
cpu = "2000 mhz"
Switch environments by changing job_config_selector in maand.conf, then maand build. Job manifests stay the same; only the reservation file changes.
Override keys are also copied to KV namespace vars/bucket/job/<job> (along with any other keys in that job’s TOML section).
Worker capacity (workers.json)
Workers must declare capacity when any active allocation on that host reserves memory or CPU:
[
{
"host": "10.0.0.1",
"labels": ["worker", "prod"],
"memory": "8192 mb",
"cpu": "4000 mhz"
},
{
"host": "10.0.0.2",
"labels": ["worker", "staging"],
"memory": "4096 mb",
"cpu": "2000 mhz"
}
]
Build sums current_memory_mb and current_cpu_mhz across all active allocations on each worker and compares the totals to that worker's available_memory_mb / available_cpu_mhz.
Typical failure:
worker_ip 10.0.0.1, available memory is 4096.00 MB, required memory is 5120.00 MB
Fix by raising worker capacity, lowering a job reservation in bucket.jobs.conf, or moving a job to another worker (selectors).
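The per-worker check can be approximated as follows. This is a sketch under stated assumptions: the function name, the dictionary shapes, and the `disabled` / `removed` keys are hypothetical stand-ins for build's internal data, and the message wording only imitates the examples on this page.

```python
from collections import defaultdict

UNITS = {"memory": "MB", "cpu": "MHz"}


def validate_worker_resources(workers: dict, allocations: list) -> list:
    """Return error messages for workers whose active allocations
    need more memory or CPU than the worker declares.

    workers:     {host: {"memory": mb, "cpu": mhz}} (keys may be missing)
    allocations: [{"host", "memory", "cpu", "disabled", "removed"}, ...]
    """
    required = defaultdict(lambda: {"memory": 0.0, "cpu": 0.0})
    for alloc in allocations:
        if alloc.get("disabled") or alloc.get("removed"):
            continue  # only active allocations are counted
        for key in UNITS:
            required[alloc["host"]][key] += alloc.get(key, 0)

    errors = []
    for host, need in required.items():
        capacity = workers.get(host, {})
        for key, unit in UNITS.items():
            if need[key] == 0:
                continue  # nothing reserved, so no capacity needed
            if key not in capacity:
                errors.append(
                    f"worker_ip {host} must specify {key} in workers.json"
                )
            elif need[key] > capacity[key]:
                errors.append(
                    f"worker_ip {host}, available {key} is "
                    f"{capacity[key]:.2f} {unit}, required {key} is "
                    f"{need[key]:.2f} {unit}"
                )
    return errors
```

Two active allocations of 2560 MB on a worker declaring 4096 MB produce the failure shown above; disabling one of them makes the check pass.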
To discover capacity from live hosts:
maand collect facts
maand collect facts --generate-workers > workspace/workers.json
maand build
See collect.md.
Selectors for different environments
Selectors are worker labels. Build creates an allocation for job × worker only when every job selector appears on the worker. When selectors is omitted from the manifest, the job name is the selector. The label worker is added automatically to every host.
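Label matching amounts to a subset test. The sketch below uses hypothetical function names and plain dictionaries shaped like the manifest.json and workers.json examples on this page; it is not Maand's allocation code.

```python
def effective_selectors(job_name: str, manifest: dict) -> set:
    """Selectors from the manifest, or the job name when omitted."""
    selectors = manifest.get("selectors")
    if selectors is None:
        return {job_name}
    return set(selectors)


def effective_labels(worker: dict) -> set:
    """Declared labels plus the automatically added "worker" label."""
    return set(worker.get("labels", [])) | {"worker"}


def build_allocations(jobs: dict, workers: list) -> list:
    """Return (job, host) pairs where every job selector is a worker label."""
    pairs = []
    for job_name, manifest in jobs.items():
        needed = effective_selectors(job_name, manifest)
        for worker in workers:
            if needed <= effective_labels(worker):  # subset test
                pairs.append((job_name, worker["host"]))
    return pairs
```

A job with `"selectors": ["worker", "prod"]` matches only hosts labeled `prod`, and a job named `gpu` with no selectors matches only hosts that carry a `gpu` label.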
Pattern: one job, environment labels
Workers
[
{ "host": "10.0.0.10", "labels": ["worker", "prod"], "memory": "16 gb", "cpu": "8 ghz" },
{ "host": "10.0.0.20", "labels": ["worker", "staging"], "memory": "8 gb", "cpu": "4 ghz" }
]
Job (workspace/jobs/api/manifest.json)
{
"selectors": ["worker", "prod"],
"resources": {
"memory": { "min": "256 mb", "max": "4 gb" },
"cpu": { "min": "200 mhz", "max": "4000 mhz" }
}
}
api allocates only on 10.0.0.10. Staging workers never receive it.
For a staging copy, add workspace/jobs/api-staging/ (or reuse the same tree) with:
{ "selectors": ["worker", "staging"], "resources": { ... } }
Pattern: shared manifest bounds, per-env reservations
Keep one manifest with wide bounds in git:
"resources": {
"memory": { "min": "128 mb", "max": "4 gb" },
"cpu": { "min": "100 mhz", "max": "8000 mhz" }
}
Tune actual reservations per environment in the matching bucket.jobs.<env>.conf without editing the job directory.
Pattern: one bucket per environment
Many teams use separate bucket directories (or separate maand.conf / selector) per environment instead of mixing prod and staging workers in one workers.json. Selectors then separate job types (gpu, arm) within that environment.
Inspecting placement
maand build
maand cat allocations
maand cat allocations --jobs api
Each row is a (job, worker) pair. If a job has no rows, no worker matched all selectors (manifest selectors, or the job name when selectors are omitted).
End-to-end example
Goal: api runs on prod with 1 GB RAM; api-staging on staging with 256 MB.
- workers.json: prod and staging hosts with labels and capacity (see above).
- jobs/api/manifest.json:

      {
        "selectors": ["worker", "prod"],
        "resources": {
          "memory": { "min": "256 mb", "max": "2 gb" },
          "cpu": { "min": "500 mhz", "max": "2000 mhz" }
        }
      }

- jobs/api-staging/manifest.json: same resources, "selectors": ["worker", "staging"].
- maand.conf: job_config_selector = "prod".
- bucket.jobs.prod.conf:

      [api]
      memory = "1 gb"
      cpu = "1500 mhz"

- Switch to staging: set job_config_selector = "staging" and use bucket.jobs.staging.conf:

      [api-staging]
      memory = "256 mb"
      cpu = "500 mhz"

- Build and verify:

      maand build
      maand cat allocations
      maand cat kv --jobs api,api-staging

  KV for each job includes memory, min_memory_mb, max_memory_mb, and the same for CPU.
Troubleshooting
| Symptom | Likely cause |
|---|---|
| ErrInsufficientResource on build | Sum of job reservations on a worker exceeds workers.json capacity |
| worker_ip … must specify memory in workers.json | Job reserves memory but worker has no memory field |
| ErrUnsupportedResourceConfiguration | bucket.jobs.conf memory/CPU outside manifest min/max |
| Job has no allocations | No worker has all required selector labels |
| Override ignored | Wrong job_config_selector or typo in TOML job section name (must match job directory name) |
| Validation ignores a job | Allocation is disabled or removed |
Related
- configuration.md: maand.conf, bucket.jobs.conf, job_config_selector
- build.md: build pipeline and validation step
- concepts.md: how label matching creates allocations
- KV persistence: maand/job/<job> and vars/bucket/job/<job> namespaces