# One-click deploy templates

Stand up agentmemory on managed infrastructure without rolling your own
Docker host. Each template ships a self-contained Dockerfile that pulls
`@agentmemory/agentmemory` from npm at build time and copies the iii
engine binary in from the official `iiidev/iii` image, so no pre-built
agentmemory image is required. Storage mounts at `/data`; an HMAC secret
is generated by the first-boot entrypoint and persisted to the volume.
The entrypoint overwrites the npm-bundled iii config with a
deploy-tuned one that binds `0.0.0.0` and uses absolute `/data` paths,
then drops privileges from `root` to `node` via `gosu` before
exec'ing the agentmemory CLI.
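
The first-boot secret step can be sketched as follows. This is a minimal Python illustration of the generate-once, reuse-afterwards behaviour, not the templates' actual entrypoint. The helper name `load_or_create_secret` is hypothetical, and `secrets.token_hex(32)` stands in for `openssl rand -hex 32` because both yield 64 hex characters.

```python
import os
import secrets
from pathlib import Path
from typing import Tuple


def load_or_create_secret(data_dir: str) -> Tuple[str, bool]:
    """Return (secret, created). Hypothetical helper for illustration only."""
    secret_path = Path(data_dir) / ".hmac"
    if secret_path.exists():
        # Later boots: load the persisted secret and report nothing new.
        return secret_path.read_text().strip(), False

    # First boot: 32 random bytes rendered as 64 hex characters.
    secret = secrets.token_hex(32)
    # Create the file owner-only from the start, failing if it already exists.
    fd = os.open(secret_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(secret + "\n")
    os.chmod(secret_path, 0o600)  # enforce mode 600 regardless of umask
    return secret, True
```

A caller would print the secret only when `created` is true, which gives the "shown exactly once in the deploy logs" behaviour.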

| Platform | Pitch | Cost floor |
|----------|-------|------------|
| [fly.io](./fly/README.md) | Single machine with auto-stop. Cheapest idle cost on a managed host; cold-start on first request after sleep. | ~$0.15/month at full idle |
| [Railway](./railway/README.md) | Push from GitHub, volume in the dashboard. Easiest managed dashboard flow. | $5/month (Hobby plan flat fee) |
| [Render](./render/README.md) | Blueprint-driven; persistent disk attaches automatically. Most "set it and forget it." | $7.25/month (Starter web + 1 GB disk) |
| [Coolify](./coolify/README.md) | Self-hosted on your own VPS. Same Docker Compose stack, you own the host and the data. | VPS cost only (Hetzner CX22 ~€3.79/month) |

## What every template guarantees

- **Volume mounted at `/data`.** Matches the path the engine has used
  since v0.9.10.
- **HMAC secret generated on first boot** via `openssl rand -hex 32`,
  written to `/data/.hmac` with `chmod 600`, and printed to stdout
  exactly once so the operator can capture it from the deploy logs.
  Subsequent boots load the secret from the file. The secret is never
  committed to a config file or set as a platform env var.
- **Only port 3111 is exposed publicly.** The viewer on port 3113
  stays bound to the container's localhost. Reach it via SSH tunnel
  (see each platform's README).
- **TLS upstream of the container.** Every managed platform terminates
  TLS at its edge proxy; the templates publish a single internal port
  (`3111`) to that proxy, never to the host. Integration plugins
  configured with `AGENTMEMORY_REQUIRE_HTTPS=1` will refuse to send the
  bearer token over plaintext HTTP to a non-loopback host, so a
  misconfigured TLS layer fails loudly instead of silently leaking the
  secret.
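
The plaintext guard in the last bullet can be sketched as a small client-side check. This is an illustrative Python version, not the integration plugins' real code: `may_send_bearer` and `is_loopback` are hypothetical names, and the sketch assumes that plaintext is allowed when `AGENTMEMORY_REQUIRE_HTTPS` is unset, which the text above does not state.

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback(host: str) -> bool:
    """True for 'localhost' and for loopback IP literals (127.0.0.0/8, ::1)."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # any other hostname is treated as non-loopback


def may_send_bearer(url: str, require_https: bool) -> bool:
    """Decide whether a bearer token may be sent to `url`."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        return True
    if not require_https:
        return True  # assumption: no restriction unless the flag is set
    # Plaintext HTTP with the flag set: only loopback targets are acceptable.
    return is_loopback(parts.hostname or "")
```

With the flag set, `http://127.0.0.1:3111` passes (for example, through an SSH tunnel) while `http://memory.example.com:3111` is refused.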

## Pick a platform

- Pick **fly.io** if you want the lowest idle cost and don't mind a
  cold-start latency hit on the first request after sleep.
- Pick **Railway** if you want a clicky dashboard flow and a flat
  monthly bill.
- Pick **Render** if you want the most "set it and forget it"
  Blueprint flow with automatic disk snapshots on paid plans.
- Pick **Coolify** if you already run a VPS and want a self-hosted
  control plane: same Docker Compose stack, and no third-party host has
  your memories.

All four give you the same agentmemory API at the same port (3111)
with the same auth model. Migrating between them later is a `tar` of
`/data` and a re-import; see each platform's README for the exact
commands.
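
As a rough picture of what that migration involves, the sketch below archives a data directory and unpacks it elsewhere using Python's standard `tarfile` module. It is illustrative only: the function names are hypothetical, the "re-import" step on a given platform may involve more than unpacking, and the platform READMEs remain the source for the exact commands.

```python
import tarfile
from pathlib import Path


def archive_data(data_dir: str, archive_path: str) -> None:
    """Pack the contents of data_dir into a gzip-compressed tarball."""
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname="." stores paths relative to the data directory,
        # so the archive unpacks cleanly into any target directory.
        tar.add(data_dir, arcname=".")


def restore_data(archive_path: str, data_dir: str) -> None:
    """Unpack a tarball produced by archive_data into data_dir."""
    Path(data_dir).mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(data_dir)
```

Stop the service before archiving so the files on the volume are not changing mid-copy.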

## Optional: LLM + embedding provider keys

Every template runs out of the box without any LLM or embedding key:
search falls back to BM25-only mode and synthetic (zero-LLM)
compression keeps memories indexable. To unlock LLM-powered
compression and hybrid (BM25 + vector) recall, add one of the
following to your platform's environment variables (Fly:
`flyctl secrets set`; Railway / Render / Coolify: dashboard
*Variables / Environment* tab):

| Variable                          | Purpose                                                |
|-----------------------------------|--------------------------------------------------------|
| `ANTHROPIC_API_KEY`               | LLM-backed compression + summarization                 |
| `GEMINI_API_KEY`                  | LLM provider alternative                               |
| `OPENROUTER_API_KEY`              | LLM provider alternative                               |
| `OPENAI_API_KEY`                  | Embedding provider (text-embedding-3-small by default) |
| `VOYAGE_API_KEY`                  | Embedding provider alternative                         |
| `AGENTMEMORY_AUTO_COMPRESS=true`  | Run LLM compression on every observation batch         |
| `AGENTMEMORY_INJECT_CONTEXT=true` | Inject recalled memories back into agent prompts       |

The defaults are intentionally conservative: provider keys default to
absent (no third-party calls), `AGENTMEMORY_AUTO_COMPRESS` is off,
and `AGENTMEMORY_INJECT_CONTEXT` is off. Opt in only after you've
confirmed your provider quota can absorb the workload.
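
To see what a given set of variables would enable before deploying, a preflight summary along these lines may help. It is a sketch built only from the table above. `describe_mode` is a hypothetical helper, and agentmemory's real provider resolution may differ, for instance in how it picks between several keys.

```python
from typing import Dict, Mapping

# Grouping taken from the table above; not read from agentmemory itself.
LLM_KEYS = ("ANTHROPIC_API_KEY", "GEMINI_API_KEY", "OPENROUTER_API_KEY")
EMBEDDING_KEYS = ("OPENAI_API_KEY", "VOYAGE_API_KEY")


def describe_mode(env: Mapping[str, str]) -> Dict[str, object]:
    """Summarise which features a set of environment variables would unlock."""
    has_llm = any(env.get(key) for key in LLM_KEYS)
    has_embedding = any(env.get(key) for key in EMBEDDING_KEYS)
    return {
        "search": "hybrid (BM25 + vector)" if has_embedding else "BM25-only",
        "compression": "LLM-capable" if has_llm else "synthetic (zero-LLM)",
        # Both flags are off unless explicitly set to "true".
        "auto_compress": env.get("AGENTMEMORY_AUTO_COMPRESS") == "true",
        "inject_context": env.get("AGENTMEMORY_INJECT_CONTEXT") == "true",
    }
```

Calling `describe_mode(os.environ)` in a local shell that mirrors the platform's variables gives a quick check that nothing is opted in by accident.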

## Cold-start budget

Measured against fly.io's `iad` region with a 1 GB volume:

```
machine image prepared :  5.1 s
volume mount + format  :  2.5 s
firecracker boot       :  1.0 s
entrypoint + chown     :  0.5 s
iii-engine ready       :  3.0 s
agentmemory worker reg :  2.0 s
─────────────────────────────────
healthcheck passes     : ~9-10 s
```

Every template's health-check `grace_period` (or Compose
`start_period`) is set to 30 s, roughly a 3x safety margin over the
~9-10 s measured above. Tune it lower once you've measured your own
platform's image-pull characteristics.
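
A quick arithmetic check on the budget, using only the figures listed above. Note that the six stage timings add up to more than the ~9-10 s healthcheck figure. The source does not say whether stages overlap, so the sketch treats the sequential sum as a conservative upper bound rather than a correction.

```python
# Stage timings in seconds, copied from the cold-start table above.
STAGES_S = {
    "machine image prepared": 5.1,
    "volume mount + format": 2.5,
    "firecracker boot": 1.0,
    "entrypoint + chown": 0.5,
    "iii-engine ready": 3.0,
    "agentmemory worker reg": 2.0,
}
GRACE_PERIOD_S = 30.0
MEASURED_HEALTHY_S = (9.0, 10.0)  # the "~9-10 s" healthcheck figure

# Upper bound if every stage ran strictly one after another.
sequential_total = round(sum(STAGES_S.values()), 1)

margin_measured_low = GRACE_PERIOD_S / MEASURED_HEALTHY_S[1]
margin_measured_high = GRACE_PERIOD_S / MEASURED_HEALTHY_S[0]
margin_sequential = GRACE_PERIOD_S / sequential_total

print(f"sequential sum of stages : {sequential_total} s")
print(f"margin vs measured       : {margin_measured_low:.1f}x to {margin_measured_high:.1f}x")
print(f"margin vs sequential sum : {margin_sequential:.1f}x")
```

Even against the sequential sum, the 30 s grace period leaves about 2x headroom, so lowering it is best done only after measuring on your own platform.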