	# One-click deploy templates

	Stand up agentmemory on managed infrastructure without rolling your own
	Docker host. Each template ships a self-contained Dockerfile that pulls
	`@agentmemory/agentmemory` from npm at build time and copies the iii
	engine binary in from the official `iiidev/iii` image — no pre-built
	agentmemory image required. Storage mounts at `/data`; an HMAC secret
	is generated by the first-boot entrypoint and persisted to the volume.
	The entrypoint overwrites the npm-bundled iii config with a
	deploy-tuned one that binds `0.0.0.0` and uses absolute `/data` paths,
	then drops privileges from `root` to `node` via `gosu` before
	exec'ing the agentmemory CLI.
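
The first-boot secret step can be sketched in a few lines. This is a minimal Python illustration, not the templates' actual entrypoint (which is a container script using `openssl rand -hex 32` and `chmod 600`). The function name `load_or_create_secret` is hypothetical, and the demo writes to a temporary directory rather than `/data`.

```python
import os
import secrets
import tempfile
from pathlib import Path


def load_or_create_secret(path: Path) -> tuple[str, bool]:
    """Return (secret, created). A secret is generated only if the file is missing."""
    if path.exists():
        # Subsequent boots: reuse the persisted secret.
        return path.read_text().strip(), False

    # 32 random bytes rendered as 64 hex characters, like `openssl rand -hex 32`.
    secret = secrets.token_hex(32)
    # O_EXCL refuses to overwrite an existing file; 0o600 is owner read/write only.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(secret + "\n")
    return secret, True


# Demo against a throwaway directory standing in for the mounted volume.
with tempfile.TemporaryDirectory() as data_dir:
    secret_file = Path(data_dir) / ".hmac"
    secret, created = load_or_create_secret(secret_file)
    if created:
        # The real entrypoint prints the secret itself, once; the demo prints its length.
        print(f"generated a new secret ({len(secret)} hex chars)")
    again, created_again = load_or_create_secret(secret_file)
    print("second boot reused the secret:", again == secret and not created_again)
```

Creating the file with restrictive permissions from the start avoids a window in which the secret is readable by other users before a later `chmod`.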

	| Platform | Pitch | Cost floor |
	|----------|-------|------------|
	| [fly.io](./fly/README.md) | Single machine with auto-stop. Cheapest idle cost on a managed host; cold-start on first request after sleep. | ~$0.15/month at full idle |
	| [Railway](./railway/README.md) | Push from GitHub, volume in the dashboard. Easiest managed dashboard flow. | $5/month (Hobby plan flat fee) |
	| [Render](./render/README.md) | Blueprint-driven; persistent disk attaches automatically. Most "set it and forget it." | $7.25/month (Starter web + 1 GB disk) |
	| [Coolify](./coolify/README.md) | Self-hosted on your own VPS. Same Docker Compose stack, you own the host and the data. | VPS cost only (Hetzner CX22 ~€3.79/month) |

	## What every template guarantees

	- Volume mounted at `/data`. Matches the path the engine has used
	since v0.9.10.
	- HMAC secret generated on first boot via `openssl rand -hex 32`,
	written to `/data/.hmac` with `chmod 600`, and printed to stdout
	exactly once so the operator can capture it from the deploy logs.
	Subsequent boots load the secret from the file. The secret is never
	committed to a config file or set as a platform env var.
	- Only port 3111 is exposed publicly. The viewer on port 3113
	stays bound to the container's localhost. Reach it via SSH tunnel
	(see each platform's README).
	- TLS terminates upstream of the container. Every managed platform
	terminates TLS at its edge proxy; the templates publish a single
	internal port (`3111`) to that proxy, never to the host. Integration
	plugins configured with `AGENTMEMORY_REQUIRE_HTTPS=1` refuse to send
	the bearer token over plaintext HTTP to a non-loopback host, so a
	misconfigured TLS layer fails loudly instead of silently leaking the
	secret.
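
The plaintext-HTTP refusal in the last guarantee can be illustrated with a small client-side check. This is a sketch of the rule as described above, not the integration plugins' actual code. The names `is_loopback` and `may_send_bearer` are hypothetical, the hostnames are placeholders, and the behavior when the flag is off is an assumption.

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback(host: str) -> bool:
    """True for `localhost` and for loopback IP literals such as 127.0.0.1 or ::1."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal: some other DNS name, treated as non-loopback.
        return False


def may_send_bearer(url: str, require_https: bool = True) -> bool:
    """Decide whether a bearer token may be sent to `url`."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        return True
    if parts.scheme != "http":
        return False  # unknown schemes are rejected outright
    host = parts.hostname or ""
    # Plaintext HTTP is tolerated only for loopback hosts when HTTPS is required.
    return (not require_https) or is_loopback(host)


for candidate in (
    "https://memory.example.com",
    "http://127.0.0.1:3111",
    "http://memory.example.com:3111",
):
    print(candidate, "->", may_send_bearer(candidate))
```

Allowing loopback over plain HTTP keeps local tunnels usable, since that traffic never leaves the machine.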

	## Pick a platform

	- Pick fly.io if you want the lowest idle cost and don't mind a
	cold-start latency hit on the first request after sleep.
	- Pick Railway if you want a clicky dashboard flow and a flat
	monthly bill.
	- Pick Render if you want the most "set it and forget it"
	Blueprint flow with automatic disk snapshots on paid plans.
	- Pick Coolify if you already run a VPS and want a self-hosted
	control plane — same Docker Compose stack, no third-party host has
	your memories.

	All four give you the same agentmemory API at the same port (3111)
	with the same auth model. Migrating between them later is a `tar` of
	`/data` and a re-import — see each platform's README for the exact
	commands.
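
To illustrate the archive half of a migration, here is a minimal Python sketch of a round trip through a gzip-compressed tarball. It is not a substitute for the per-platform commands: the function names are hypothetical, the demo uses temporary directories instead of `/data`, and the platform-specific re-import step is not shown.

```python
import tarfile
import tempfile
from pathlib import Path


def archive_data_dir(data_dir: Path, archive: Path) -> None:
    """Pack everything under data_dir (dotfiles included) into a .tar.gz."""
    with tarfile.open(archive, "w:gz") as tar:
        for child in sorted(data_dir.iterdir()):
            # arcname keeps paths relative, so the archive unpacks anywhere.
            tar.add(child, arcname=child.name)


def restore_data_dir(archive: Path, target: Path) -> None:
    """Unpack an archive produced by archive_data_dir into target."""
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        # Only unpack archives you created yourself.
        tar.extractall(target)


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "data"
    source.mkdir()
    (source / ".hmac").write_text("placeholder-secret\n")
    backup = Path(tmp) / "backup.tar.gz"
    archive_data_dir(source, backup)
    restored = Path(tmp) / "restored"
    restore_data_dir(backup, restored)
    print(sorted(p.name for p in restored.iterdir()))
```

Because the secret file lives on the volume, an archive of the data directory carries it along; treat the tarball as sensitive.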

	## Optional: LLM + embedding provider keys

	Every template runs out of the box without any LLM or embedding key:
	search falls back to BM25-only mode, and synthetic (zero-LLM)
	compression keeps memories indexable. To unlock LLM-powered
	compression and hybrid (BM25 + vector) recall, add one or more of the
	following to your platform's environment variables (Fly:
	`flyctl secrets set`; Railway / Render / Coolify: dashboard
	Variables / Environment tab):

	| Variable | Purpose |
	|---------------------------|----------------------------------------------------------|
	| `ANTHROPIC_API_KEY` | LLM-backed compression + summarization |
	| `GEMINI_API_KEY` | LLM provider alternative |
	| `OPENROUTER_API_KEY` | LLM provider alternative |
	| `OPENAI_API_KEY` | Embedding provider (text-embedding-3-small by default) |
	| `VOYAGE_API_KEY` | Embedding provider alternative |
	| `AGENTMEMORY_AUTO_COMPRESS=true` | Run LLM compression on every observation batch |
	| `AGENTMEMORY_INJECT_CONTEXT=true` | Inject recalled memories back into agent prompts |

	The defaults are intentionally conservative: provider keys default to
	absent (no third-party calls), `AGENTMEMORY_AUTO_COMPRESS` is off,
	and `AGENTMEMORY_INJECT_CONTEXT` is off. Opt in only after you've
	confirmed your provider quota can absorb the workload.
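
One way to reason about which features a given environment turns on is to resolve the variables into a small summary. The sketch below is illustrative only and is not agentmemory's own configuration code. The `Features` type and `resolve_features` function are hypothetical. It also assumes that any LLM key enables LLM compression, that any embedding key enables hybrid search, and that the two flags are on only when set to `true`.

```python
from dataclasses import dataclass
from typing import Mapping

LLM_KEYS = ("ANTHROPIC_API_KEY", "GEMINI_API_KEY", "OPENROUTER_API_KEY")
EMBEDDING_KEYS = ("OPENAI_API_KEY", "VOYAGE_API_KEY")


@dataclass(frozen=True)
class Features:
    compression: str      # "llm" or "synthetic"
    search: str           # "hybrid" or "bm25"
    auto_compress: bool
    inject_context: bool


def _flag(env: Mapping[str, str], name: str) -> bool:
    # Assumption: only the literal value "true" (any case) enables a flag.
    return env.get(name, "").strip().lower() == "true"


def resolve_features(env: Mapping[str, str]) -> Features:
    has_llm = any(env.get(key) for key in LLM_KEYS)
    has_embedding = any(env.get(key) for key in EMBEDDING_KEYS)
    return Features(
        compression="llm" if has_llm else "synthetic",
        search="hybrid" if has_embedding else "bm25",
        auto_compress=_flag(env, "AGENTMEMORY_AUTO_COMPRESS"),
        inject_context=_flag(env, "AGENTMEMORY_INJECT_CONTEXT"),
    )


# With nothing set, every feature stays at its conservative default.
print(resolve_features({}))
# Placeholder key values; real keys belong in the platform's secret store.
print(resolve_features({"ANTHROPIC_API_KEY": "placeholder", "VOYAGE_API_KEY": "placeholder"}))
```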

	## Cold-start budget

	Measured against fly.io's `iad` region with a 1 GB volume:

	```
	machine image prepared  : 5.1 s
	volume mount + format   : 2.5 s
	firecracker boot        : 1.0 s
	entrypoint + chown      : 0.5 s
	iii-engine ready        : 3.0 s
	agentmemory worker reg  : 2.0 s
	─────────────────────────────────
	healthcheck passes      : ~9-10 s
	```

	Every template's health-check `grace_period` (or compose
	`start_period`) is set to 30 s for a 3x safety margin. Tune lower
	once you've measured your own platform's image-pull characteristics.
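
When tuning the grace period, it helps to redo the arithmetic with your own measurements. The sketch below uses the stage timings listed above. Summed strictly back to back they come to 14.1 s, which is more than the reported ~9-10 s, so some stages presumably overlap or are excluded from the total. Leaving out image preparation gives 9.0 s, but the source does not confirm that reading.

```python
# Stage timings from the cold-start budget above, in seconds.
stages = {
    "machine image prepared": 5.1,
    "volume mount + format": 2.5,
    "firecracker boot": 1.0,
    "entrypoint + chown": 0.5,
    "iii-engine ready": 3.0,
    "agentmemory worker reg": 2.0,
}

grace_period = 30.0      # seconds, as configured in every template
reported_total = 10.0    # upper end of the reported ~9-10 s

# Worst case: every stage runs strictly one after another.
serial_total = round(sum(stages.values()), 1)
margin_vs_reported = grace_period / reported_total
margin_vs_serial = grace_period / serial_total

print(f"serial sum of stages : {serial_total} s")            # 14.1 s
print(f"margin vs reported   : {margin_vs_reported:.1f}x")   # 3.0x
print(f"margin vs serial sum : {margin_vs_serial:.2f}x")     # 2.13x
```

Even against the pessimistic serial sum, the 30 s grace period leaves roughly a 2x margin, so lowering it is only worthwhile after measuring your own platform.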