# One-click deploy templates
Stand up agentmemory on managed infrastructure without rolling your own
Docker host. Each template ships a self-contained Dockerfile that pulls
`@agentmemory/agentmemory` from npm at build time and copies the iii
engine binary in from the official `iiidev/iii` image, so no pre-built
agentmemory image is required. Storage mounts at `/data`; an HMAC secret
is generated by the first-boot entrypoint and persisted to the volume.
The entrypoint overwrites the npm-bundled iii config with a
deploy-tuned one that binds `0.0.0.0` and uses absolute `/data` paths,
then drops privileges from `root` to `node` via `gosu` before
exec'ing the agentmemory CLI.
| Platform | Pitch | Cost floor |
|----------|-------|------------|
| [fly.io](./fly/README.md) | Single machine with auto-stop. Cheapest idle cost on a managed host; cold-start on first request after sleep. | ~$0.15/month at full idle |
| [Railway](./railway/README.md) | Push from GitHub, volume in the dashboard. Easiest managed dashboard flow. | $5/month (Hobby plan flat fee) |
| [Render](./render/README.md) | Blueprint-driven; persistent disk attaches automatically. Most "set it and forget it." | $7.25/month (Starter web + 1 GB disk) |
| [Coolify](./coolify/README.md) | Self-hosted on your own VPS. Same Docker Compose stack, you own the host and the data. | VPS cost only (Hetzner CX22 ~€3.79/month) |
## What every template guarantees
- **Volume mounted at `/data`.** Matches the path the engine has used
since v0.9.10.
- **HMAC secret generated on first boot** via `openssl rand -hex 32`,
written to `/data/.hmac` with `chmod 600`, and printed to stdout
exactly once so the operator can capture it from the deploy logs.
Subsequent boots load the secret from the file. The secret is never
committed to a config file or set as a platform env var.
- **Only port 3111 is exposed publicly.** The viewer on port 3113
stays bound to the container's localhost. Reach it via SSH tunnel
(see each platform's README).
- **TLS upstream of the container.** Every managed platform terminates
TLS at its edge proxy; the templates publish a single internal port
(`3111`) to that proxy, never to the host. Integration plugins
configured with `AGENTMEMORY_REQUIRE_HTTPS=1` will refuse to send the
bearer over plaintext HTTP to a non-loopback host, so a
misconfigured TLS layer fails loud instead of silently leaking the
secret.
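
The first-boot secret handling described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the shipped entrypoint: the function name `ensure_hmac_secret` and the log wording are hypothetical, while `openssl rand -hex 32`, `/data/.hmac`, and `chmod 600` come from the guarantees listed above.

```shell
#!/bin/sh
# Illustrative sketch only; the real entrypoint may differ.
# ensure_hmac_secret is a hypothetical helper name.
ensure_hmac_secret() {
  data_dir="${1:-/data}"
  secret_file="$data_dir/.hmac"

  # Later boots: the file already exists, so reuse it and print nothing.
  if [ -s "$secret_file" ]; then
    return 0
  fi

  # First boot: create the file with owner-only permissions from the start.
  ( umask 077 && openssl rand -hex 32 > "$secret_file" ) || return 1
  chmod 600 "$secret_file"

  # Print the secret once so the operator can copy it from the deploy logs.
  printf 'agentmemory HMAC secret (shown once): %s\n' "$(cat "$secret_file")"
}
```

Calling the function a second time against the same directory produces no output, which is the "printed exactly once" behavior: the secret only appears in the logs of the boot that created it.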
## Pick a platform
- Pick **fly.io** if you want the lowest idle cost and don't mind a
cold-start latency hit on the first request after sleep.
- Pick **Railway** if you want a clicky dashboard flow and a flat
monthly bill.
- Pick **Render** if you want the most "set it and forget it"
Blueprint flow with automatic disk snapshots on paid plans.
- Pick **Coolify** if you already run a VPS and want a self-hosted
control plane: same Docker Compose stack, and no third-party host
holds your memories.
All four give you the same agentmemory API at the same port (3111)
with the same auth model. Migrating between them later is a `tar` of
`/data` and a re-import; see each platform's README for the exact
commands.
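
As a generic sketch of the `tar`-based migration mentioned above (the platform READMEs remain the source for the exact commands), the two helpers below pack a data directory and unpack it elsewhere. The names `export_data` and `import_data` are hypothetical, and how you reach the volume on each platform is not covered here.

```shell
#!/bin/sh
# Generic sketch; each platform's README has the exact commands.
# export_data and import_data are hypothetical helper names.

# Pack the contents of a data directory (dotfiles included) into a
# gzip-compressed archive. Keep the archive outside the source directory.
export_data() {
  src_dir="$1"
  archive="$2"
  tar -czf "$archive" -C "$src_dir" .
}

# Unpack an archive into a data directory, creating it if needed and
# preserving file permissions.
import_data() {
  archive="$1"
  dest_dir="$2"
  mkdir -p "$dest_dir"
  tar -xzpf "$archive" -C "$dest_dir"
}
```

Because `/data/.hmac` is part of the archive, the HMAC secret moves with the data. Stopping the instance before archiving is a sensible precaution so that files are not changing mid-copy.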
## Optional: LLM + embedding provider keys
Every template runs out of the box without any LLM or embedding key:
search falls back to BM25-only mode, and synthetic (zero-LLM)
compression keeps memories indexable. To unlock LLM-powered
compression and hybrid (BM25 + vector) recall, add the relevant
variables below to your platform's environment (Fly:
`flyctl secrets set`; Railway / Render / Coolify: dashboard
*Variables / Environment* tab):
| Variable | Purpose |
|---------------------------|----------------------------------------------------------|
| `ANTHROPIC_API_KEY` | LLM-backed compression + summarization |
| `GEMINI_API_KEY` | LLM provider alternative |
| `OPENROUTER_API_KEY` | LLM provider alternative |
| `OPENAI_API_KEY` | Embedding provider (text-embedding-3-small by default) |
| `VOYAGE_API_KEY` | Embedding provider alternative |
| `AGENTMEMORY_AUTO_COMPRESS=true` | Run LLM compression on every observation batch |
| `AGENTMEMORY_INJECT_CONTEXT=true` | Inject recalled memories back into agent prompts |
The defaults are intentionally conservative: provider keys default to
absent (no third-party calls), `AGENTMEMORY_AUTO_COMPRESS` is off,
and `AGENTMEMORY_INJECT_CONTEXT` is off. Opt in only after you've
confirmed your provider quota can absorb the workload.
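
Before opting in, it can help to check which kinds of keys a given environment actually carries. The helper below is a hypothetical pre-flight sketch and not part of agentmemory: it only inspects the variable names from the table above and groups them by the table's Purpose column (LLM provider versus embedding provider).

```shell
#!/bin/sh
# Hypothetical pre-flight helper; not part of agentmemory.
# It only inspects the environment variables named in the table above.
describe_keys() {
  # Any one of the three LLM provider keys counts as "present".
  if [ -n "${ANTHROPIC_API_KEY:-}${GEMINI_API_KEY:-}${OPENROUTER_API_KEY:-}" ]; then
    llm_key="present"
  else
    llm_key="absent"
  fi

  # Either embedding provider key counts as "present".
  if [ -n "${OPENAI_API_KEY:-}${VOYAGE_API_KEY:-}" ]; then
    embedding_key="present"
  else
    embedding_key="absent"
  fi

  printf 'llm_key=%s embedding_key=%s\n' "$llm_key" "$embedding_key"
}
```

With both reported as `absent`, the deployment is in the default state described above: BM25-only search with synthetic compression and no third-party calls.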
## Cold-start budget
Measured against fly.io's `iad` region with a 1 GB volume:
```
machine image prepared : 5.1 s
volume mount + format : 2.5 s
firecracker boot : 1.0 s
entrypoint + chown : 0.5 s
iii-engine ready : 3.0 s
agentmemory worker reg : 2.0 s
─────────────────────────────────
healthcheck passes : ~9-10 s
```
Every template's health-check `grace_period` (or compose
`start_period`) is set to 30 s for a 3x safety margin. Tune lower
once you've measured your own platform's image-pull characteristics.
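
The margin arithmetic is simple enough to script when you re-tune for your own platform. The sketch below assumes you have measured your own time-to-healthy in whole seconds; `grace_period` is a hypothetical helper name, and the default factor of 3 mirrors the safety margin stated above.

```shell
#!/bin/sh
# grace_period is a hypothetical helper: measured seconds until the
# healthcheck passes, multiplied by a safety factor (default 3).
# Shell arithmetic is integer-only, so round the measurement up first.
grace_period() {
  measured_s="$1"
  factor="${2:-3}"
  echo $(( $measured_s * $factor ))
}
```

For example, `grace_period 10` prints `30`, the value the templates ship with for the ~10 s upper bound measured above.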