# One-click deploy templates

Stand up agentmemory on managed infrastructure without rolling your own
Docker host. Each template ships a self-contained Dockerfile that pulls
`@agentmemory/agentmemory` from npm at build time and copies the iii
engine binary in from the official `iiidev/iii` image, so no pre-built
agentmemory image is required. Storage mounts at `/data`; an HMAC secret
is generated by the first-boot entrypoint and persisted to the volume.
The entrypoint overwrites the npm-bundled iii config with a
deploy-tuned one that binds `0.0.0.0` and uses absolute `/data` paths,
then drops privileges from `root` to `node` via `gosu` before
exec'ing the agentmemory CLI.

| Platform | Pitch | Cost floor |
|----------|-------|------------|
| [fly.io](./fly/README.md) | Single machine with auto-stop. Cheapest idle cost on a managed host; cold-start on first request after sleep. | ~$0.15/month at full idle |
| [Railway](./railway/README.md) | Push from GitHub, volume in the dashboard. Easiest managed dashboard flow. | $5/month (Hobby plan flat fee) |
| [Render](./render/README.md) | Blueprint-driven; persistent disk attaches automatically. Most "set it and forget it." | $7.25/month (Starter web + 1 GB disk) |
| [Coolify](./coolify/README.md) | Self-hosted on your own VPS. Same Docker Compose stack, you own the host and the data. | VPS cost only (Hetzner CX22 ~€3.79/month) |

## What every template guarantees

- **Volume mounted at `/data`.** Matches the path the engine has used
  since v0.9.10.
- **HMAC secret generated on first boot** via `openssl rand -hex 32`,
  written to `/data/.hmac` with `chmod 600`, and printed to stdout
  exactly once so the operator can capture it from the deploy logs.
  Subsequent boots load the secret from the file. The secret is never
  committed to a config file or set as a platform env var.
- **Only port 3111 is exposed publicly.** The viewer on port 3113
  stays bound to the container's localhost. Reach it via SSH tunnel
  (see each platform's README).
- **TLS upstream of the container.** Every managed platform terminates
  TLS at its edge proxy; the templates publish a single internal port
  (`3111`) to that proxy, never to the host. Integration plugins
  configured with `AGENTMEMORY_REQUIRE_HTTPS=1` will refuse to send the
  bearer over plaintext HTTP to a non-loopback host, so a
  misconfigured TLS layer fails loud instead of silently leaking the
  secret.
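
The first-boot secret handling described in the list above can be sketched roughly as follows. This is an illustration, not the templates' actual entrypoint: the function names `gen_hex32` and `ensure_hmac_secret` are hypothetical, the `od` fallback is added only so the sketch runs on machines without `openssl`, and a temporary directory stands in for the `/data` volume.

```shell
#!/bin/sh
set -eu

# gen_hex32: print 32 random bytes as 64 hex characters.
gen_hex32() {
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -hex 32
  else
    # Fallback for machines without openssl; an addition for this sketch only.
    od -An -N32 -tx1 /dev/urandom | tr -d ' \n'
    echo
  fi
}

# ensure_hmac_secret DIR: create DIR/.hmac on first boot and print it once.
# On later boots the file already exists, so nothing is printed.
ensure_hmac_secret() {
  secret_file="$1/.hmac"
  if [ -s "$secret_file" ]; then
    return 0
  fi
  # Restrictive umask so the file is never readable by other users, even briefly.
  ( umask 077 && gen_hex32 > "$secret_file" )
  chmod 600 "$secret_file"
  printf 'HMAC secret (shown once, capture it now): %s\n' "$(cat "$secret_file")"
}

# Demo: a throwaway directory stands in for the /data volume.
DATA_DIR="$(mktemp -d)"
first_boot_output="$(ensure_hmac_secret "$DATA_DIR")"
second_boot_output="$(ensure_hmac_secret "$DATA_DIR")"
secret="$(cat "$DATA_DIR/.hmac")"
file_mode="$(ls -l "$DATA_DIR/.hmac" | cut -c1-10)"
```

Because the secret is printed only when the file is first created, an operator who misses it in the deploy logs would need to read it from the volume rather than wait for a second printout.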

## Pick a platform

- Pick **fly.io** if you want the lowest idle cost and don't mind a
  cold-start latency hit on the first request after sleep.
- Pick **Railway** if you want a clicky dashboard flow and a flat
  monthly bill.
- Pick **Render** if you want the most "set it and forget it"
  Blueprint flow with automatic disk snapshots on paid plans.
- Pick **Coolify** if you already run a VPS and want a self-hosted
  control plane: same Docker Compose stack, and no third-party host has
  your memories.

All four give you the same agentmemory API at the same port (3111)
with the same auth model. Migrating between them later is a `tar` of
`/data` and a re-import; see each platform's README for the exact
commands.
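
As a rough illustration of the archive-and-restore half of a migration, the sketch below round-trips a stand-in data directory through `tar`. It is not any platform's actual procedure: the temporary directories stand in for the old and new `/data` volumes, `store/example.db` is a made-up file, and the steps for copying the archive between hosts and re-importing are the ones in each platform's README.

```shell
#!/bin/sh
set -eu

# Stand-ins for the old and new hosts' /data volumes (assumption for the demo).
OLD_DATA="$(mktemp -d)"
NEW_DATA="$(mktemp -d)"
ARCHIVE="$(mktemp -d)/agentmemory-data.tar"

# Fake contents: the HMAC secret plus one hypothetical data file.
printf 'example-secret\n' > "$OLD_DATA/.hmac"
chmod 600 "$OLD_DATA/.hmac"
mkdir -p "$OLD_DATA/store"
printf 'example-record\n' > "$OLD_DATA/store/example.db"

# Archive with paths relative to the volume root; "." includes dotfiles such as .hmac.
tar -C "$OLD_DATA" -cf "$ARCHIVE" .

# Restore on the new volume; -p keeps the recorded file modes.
tar -C "$NEW_DATA" -xpf "$ARCHIVE"
```

Archiving relative to the volume root means the same archive unpacks cleanly wherever the new platform mounts `/data`, and the `.hmac` file travels with it, so existing clients keep their bearer secret.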

## Optional: LLM + embedding provider keys

Every template runs out of the box without any LLM or embedding key:
search falls back to BM25-only mode and synthetic (zero-LLM)
compression keeps memories indexable. To unlock LLM-powered
compression and hybrid (BM25 + vector) recall, add one of the
following to your platform's environment variables (Fly:
`flyctl secrets set`; Railway / Render / Coolify: dashboard
*Variables / Environment* tab):

| Variable                          | Purpose                                                  |
|-----------------------------------|----------------------------------------------------------|
| `ANTHROPIC_API_KEY`               | LLM-backed compression + summarization                   |
| `GEMINI_API_KEY`                  | LLM provider alternative                                 |
| `OPENROUTER_API_KEY`              | LLM provider alternative                                 |
| `OPENAI_API_KEY`                  | Embedding provider (text-embedding-3-small by default)   |
| `VOYAGE_API_KEY`                  | Embedding provider alternative                           |
| `AGENTMEMORY_AUTO_COMPRESS=true`  | Run LLM compression on every observation batch           |
| `AGENTMEMORY_INJECT_CONTEXT=true` | Inject recalled memories back into agent prompts         |

The defaults are intentionally conservative: provider keys default to
absent (no third-party calls), `AGENTMEMORY_AUTO_COMPRESS` is off,
and `AGENTMEMORY_INJECT_CONTEXT` is off. Opt in only after you've
confirmed your provider quota can absorb the workload.
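
Before opting in, a small pre-flight check can show which mode a given set of variables would select. The sketch below encodes one reading of the table above (any LLM key enables LLM compression, any embedding key enables hybrid recall). The function names `describe_mode` and `clear_keys` are hypothetical and this is not agentmemory's own selection logic, which may differ in detail.

```shell
#!/bin/sh
set -eu

# describe_mode: report the compression and search mode implied by the
# provider keys currently set in the environment (assumed mapping).
describe_mode() {
  if [ -n "${ANTHROPIC_API_KEY:-}${GEMINI_API_KEY:-}${OPENROUTER_API_KEY:-}" ]; then
    compression="llm"
  else
    compression="synthetic"
  fi
  if [ -n "${OPENAI_API_KEY:-}${VOYAGE_API_KEY:-}" ]; then
    search="hybrid"
  else
    search="bm25-only"
  fi
  printf 'compression=%s search=%s\n' "$compression" "$search"
}

# clear_keys: start each demo case from a clean slate.
clear_keys() {
  unset ANTHROPIC_API_KEY GEMINI_API_KEY OPENROUTER_API_KEY OPENAI_API_KEY VOYAGE_API_KEY
}

# Demo: the out-of-the-box case, then one LLM key plus one embedding key.
default_mode="$(clear_keys; describe_mode)"
full_mode="$(clear_keys; ANTHROPIC_API_KEY=placeholder; OPENAI_API_KEY=placeholder; describe_mode)"
echo "$default_mode"
echo "$full_mode"
```

Running the check in the same environment the container will see makes it easier to confirm that a key was actually picked up before enabling `AGENTMEMORY_AUTO_COMPRESS`.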

## Cold-start budget

Measured against fly.io's `iad` region with a 1 GB volume:

```
machine image prepared : 5.1 s
volume mount + format : 2.5 s
firecracker boot : 1.0 s
entrypoint + chown : 0.5 s
iii-engine ready : 3.0 s
agentmemory worker reg : 2.0 s
─────────────────────────────────
healthcheck passes : ~9-10 s
```

Every template's health-check `grace_period` (or compose
`start_period`) is set to 30 s for a 3x safety margin. Tune lower
once you've measured your own platform's image-pull characteristics.
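
When re-measuring on your own platform, the arithmetic behind the margin is easy to redo. The sketch below plugs in the figures from the table above. Note that the steps after image preparation sum to 9.0 s, which matches the ~9-10 s healthcheck figure, while all six steps sum to 14.1 s. That the healthcheck figure excludes image preparation is an inference from these numbers, not something the measurements state.

```shell
#!/bin/sh
set -eu

# Recompute the cold-start budget and the safety margin of a 30 s grace period.
# LC_ALL=C keeps the decimal separator a dot regardless of the host locale.
budget="$(LC_ALL=C awk 'BEGIN {
  image = 5.1; mount = 2.5; boot = 1.0; entry = 0.5; engine = 3.0; worker = 2.0
  after_image = mount + boot + entry + engine + worker
  all_steps = image + after_image
  grace = 30
  printf "after_image=%.1f all_steps=%.1f margin_on_10s=%.1f margin_on_all_steps=%.1f\n",
    after_image, all_steps, grace / 10, grace / all_steps
}')"
echo "$budget"
```

With the figures above this prints `after_image=9.0 all_steps=14.1 margin_on_10s=3.0 margin_on_all_steps=2.1`, so the 30 s grace period still leaves roughly a 2x margin even if a full image pull lands inside the window.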