title: AgentMemory Python
emoji: 🧠
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false

agentmemory-python

Persistent memory for AI coding agents: pure Python, zero external databases.
Works with Claude Code, Cursor, Cline, Windsurf, Gemini CLI, and any MCP client.

English | 简体中文 | 日本語 | 한국어 | Español | हिन्दी | Português | Français | Deutsch

Python 3.10+ | SQLite WAL | Flask 3.0 | MCP Compatible | HuggingFace Space

Quick Start • Features • MCP • API • Config • Deploy • Viewer • Architecture


What Is This?

agentmemory-python is a Python reimplementation of the agentmemory persistent memory server. It exposes a REST API, WebSocket stream, and MCP tools endpoint that AI coding agents use to store and retrieve session observations, long-term memories, lessons, and pinned memory slots.

Key differences from the Node.js original:

  • No Node.js or iii-engine: runs with plain python src/app.py
  • SQLite instead of Dolt: single file, WAL mode, instant startup
  • HuggingFace Space ready: deploys in one click, data synced to an HF dataset repo
  • Same REST + MCP wire format: drop-in for any agent already wired to agentmemory

Your agent captures every tool call, stores each one as an observation, compresses observations into searchable memory, and automatically injects the relevant context at the start of every new session.


Quick Start

Run locally

# Clone
git clone https://github.com/Yash030/agentmemory-python.git
cd agentmemory-python

# Install dependencies (no build step)
pip install -r requirements.txt

# Start the server
python src/app.py

Server starts on http://localhost:3111. Open the viewer at http://localhost:3111/viewer.

Verify it works

# Health check
curl http://localhost:3111/agentmemory/livez
# {"status": "ok"}

# Save a memory
curl -X POST http://localhost:3111/agentmemory/remember \
  -H "Content-Type: application/json" \
  -d '{"content": "JWT auth uses jose middleware in src/middleware/auth.ts", "concepts": ["auth", "jwt"]}'

# Recall it
curl -X POST http://localhost:3111/agentmemory/search \
  -H "Content-Type: application/json" \
  -d '{"query": "authentication middleware", "limit": 5}'
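The same two calls can be made from Python. The following is a minimal sketch using only the standard library; the helper names (build_request, post_json, demo) are illustrative and not part of the project, and the response shape is not documented here, so the sketch returns whatever JSON the server sends. The optional Bearer header assumes AGENTMEMORY_SECRET is set on the server.

```python
import json
import urllib.request

BASE_URL = "http://localhost:3111/agentmemory"


def build_request(path, payload, secret=None):
    """Build (but do not send) a JSON POST request for an agentmemory endpoint."""
    headers = {"Content-Type": "application/json"}
    if secret:
        # Only needed when the server was started with AGENTMEMORY_SECRET set.
        headers["Authorization"] = f"Bearer {secret}"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}{path}", data=body, headers=headers, method="POST"
    )


def post_json(path, payload, secret=None):
    """Send the request and decode the JSON response body."""
    request = build_request(path, payload, secret)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def demo():
    """Mirror the curl examples above; requires a running server."""
    post_json("/remember", {
        "content": "JWT auth uses jose middleware in src/middleware/auth.ts",
        "concepts": ["auth", "jwt"],
    })
    return post_json("/search", {"query": "authentication middleware", "limit": 5})
```

Call demo() with the server running to save the memory and print or inspect the search result.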

Features

| Feature | Status | Notes |
|---|---|---|
| REST API (sessions, memories, observations) | ✅ | Full surface |
| WebSocket live stream | ✅ | /stream/mem-live/viewer |
| MCP tools endpoint | ✅ | 31 tools |
| Built-in HTML viewer | ✅ | Real-time dashboard at /viewer |
| BM25 keyword search | ✅ | Always on, no API key needed |
| Hybrid BM25 + vector search | ✅ | Requires GEMINI_API_KEY |
| 4-tier memory consolidation | ⚙️ | CONSOLIDATION_ENABLED=true + LLM key |
| Knowledge graph extraction | ⚙️ | GRAPH_EXTRACTION_ENABLED=true + LLM key |
| LLM observation compression | ⚙️ | AGENTMEMORY_AUTO_COMPRESS=true + LLM key |
| Lessons with confidence decay | ✅ | Fingerprinted, auto-strengthen on repeat |
| Memory slots (pinned context) | ✅ | CRUD + auto-reflect |
| Session replay | ✅ | Full timeline in viewer |
| Audit log | ✅ | Tracks every write with agent_id + timestamp |
| HuggingFace Space deploy | ✅ | One-click, data synced to dataset repo |
| Privacy filtering | ✅ | Strips API keys, tokens before storage |
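To give a feel for what privacy filtering before storage can look like, here is a minimal sketch. The patterns and the redact function are assumptions chosen for illustration; the project's actual rule set is not described in this README and may differ.

```python
import re

# Hypothetical patterns for well-known token shapes (not the project's real list).
TOKEN_PATTERNS = [
    re.compile(r"\bsk-[A-Za-z0-9_-]{20,}"),                        # sk-... style API keys
    re.compile(r"\bhf_[A-Za-z0-9]{20,}"),                          # HuggingFace-style tokens
    re.compile(r"bearer\s+[A-Za-z0-9._~+/=-]{16,}", re.IGNORECASE),  # Authorization values
]

# NAME=value or NAME: value where NAME mentions a key, token, secret, or password.
ASSIGNMENT = re.compile(
    r"\b(\w*(?:api_key|token|secret|password)\w*)(\s*[=:]\s*)\S+",
    re.IGNORECASE,
)


def redact(text):
    """Replace anything that looks like a credential with [REDACTED]."""
    for pattern in TOKEN_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    # Keep the variable name and separator, drop only the value.
    return ASSIGNMENT.sub(r"\g<1>\g<2>[REDACTED]", text)
```

Running the filter on the write path, before anything reaches the database, means a leaked key never has to be purged from storage or backups afterwards.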

4-Tier Memory Model

Inspired by how human memory works: raw experience → compressed episodes → extracted facts → learned patterns.

| Tier | What | When |
|---|---|---|
| Working | Raw observations from tool use | Every tool call |
| Episodic | Compressed session summaries | Session end |
| Semantic | Extracted facts and patterns | Consolidation |
| Procedural | Workflows and decision patterns | Consolidation |

MCP Integration

Wire agentmemory-python into your agent's MCP config. It speaks the same MCP protocol as the Node.js original.

Most agents (Cursor, Claude Desktop, Cline, Windsurf)

{
  "mcpServers": {
    "agentmemory": {
      "command": "npx",
      "args": ["-y", "@agentmemory/mcp"],
      "env": {
        "AGENTMEMORY_URL": "http://localhost:3111"
      }
    }
  }
}

Claude Code

Paste this prompt and your agent will wire everything:

Start agentmemory-python: run `python src/app.py` from the agentmemory-python directory.
Then add this MCP server to ~/.claude.json under mcpServers:
{
  "agentmemory": {
    "command": "npx",
    "args": ["-y", "@agentmemory/mcp"],
    "env": { "AGENTMEMORY_URL": "http://localhost:3111" }
  }
}
Verify with: curl http://localhost:3111/agentmemory/livez
Open the viewer at: http://localhost:3111/viewer

Available MCP Tools (31)

| Tool | Description |
|---|---|
| memory_save | Save a long-term insight, decision, or pattern |
| memory_recall | Search past observations by keyword |
| memory_smart_search | Hybrid BM25 + vector semantic search |
| memory_sessions | List recent sessions |
| memory_sessions_list | Retrieve all memory sessions |
| memory_timeline | Chronological observations for a session |
| memory_observations | Observations for a session |
| memory_profile | Per-project concept + file profile |
| memory_lessons | List active lessons with confidence scores |
| memory_lesson_save | Save a lesson (duplicate saves strengthen it) |
| memory_lesson_recall | Search lessons by query |
| memory_lesson_search | Search lessons by keywords |
| memory_consolidate | Run 4-tier memory consolidation |
| memory_reflect | Reflect on session, update context |
| memory_diagnose | Health check across all subsystems |
| memory_forget | Delete memory, session, or observations |
| memory_export | Export all memory data as JSON |
| agent_observe | Log agent execution observation |
| agent_remember | Save agent memory to long-term storage |
| memory_antigravity_sync | Sync Antigravity transcripts to memory |
| memory_antigravity_sync_all | Master sync: transcript + crystallize + reflect |
| memory_slot_list | List all pinned memory slots |
| memory_slot_get | Retrieve a specific pinned memory slot |
| memory_slot_create | Create/overwrite a pinned memory slot |
| memory_slot_append | Append text content to a pinned memory slot |
| memory_slot_replace | Replace pinned memory slot content |
| memory_slot_delete | Delete a pinned memory slot |
| memory_action_create | Create a new work item / action |
| memory_action_update | Update fields of an existing action |
| memory_frontier | Get active and pending actions sorted by priority |
| memory_crystallize | Crystallize/summarize observations in a session |

API Reference

Base URL: http://localhost:3111/agentmemory

Health

| Method | Path | Description |
|---|---|---|
| GET | /livez | Liveness probe, no auth required |

Sessions

| Method | Path | Description |
|---|---|---|
| POST | /session/start | Start a new session |
| POST | /session/end | End a session |
| POST | /session/commit | Commit session with summary |
| GET | /sessions | List all sessions |

Observations

| Method | Path | Description |
|---|---|---|
| POST | /observe | Ingest a hook event observation |
| POST | /agent/observe | Simplified observe for direct agent use |
| GET | /observations | List observations (?session_id=) |
| POST | /timeline | Chronological observation window |

Memories

| Method | Path | Description |
|---|---|---|
| POST | /remember | Save long-term memory |
| POST | /agent/remember | Simplified remember |
| POST | /forget | Delete memory / session / observations |
| POST | /search | BM25 + vector search |
| POST | /context | Compile context for a session + project |
| GET | /memories | List memories (?latest=true&limit=N) |
| POST | /evolve | Create a new memory version |

Lessons

| Method | Path | Description |
|---|---|---|
| GET | /lessons | List lessons |
| POST | /lessons | Create lesson |
| POST | /lessons/search | Search lessons |
| POST | /lessons/strengthen | Reinforce an existing lesson |
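The lesson model (fingerprinted, strengthened on repeat, decaying over time) can be sketched as follows. This is an illustration under stated assumptions, not the project's code: the SHA-256 fingerprint over normalized text, the 30-day half-life, the 0.2 boost, and all names here are hypothetical choices.

```python
import hashlib
from dataclasses import dataclass

HALF_LIFE_DAYS = 30.0  # assumed: confidence halves every 30 days without reinforcement
BOOST = 0.2            # assumed: amount added when the same lesson is saved again


@dataclass
class Lesson:
    text: str
    confidence: float
    updated_at: float  # unix seconds of the last save or strengthen


def fingerprint(text):
    """Stable ID for a lesson: case and whitespace differences are ignored."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]


def current_confidence(lesson, now):
    """Exponential decay since the lesson was last reinforced."""
    age_days = max(0.0, now - lesson.updated_at) / 86400.0
    return lesson.confidence * 0.5 ** (age_days / HALF_LIFE_DAYS)


def save_lesson(store, text, now, initial=0.5):
    """Insert a new lesson, or strengthen the existing one with the same fingerprint."""
    key = fingerprint(text)
    existing = store.get(key)
    if existing is None:
        store[key] = Lesson(text, initial, now)
    else:
        existing.confidence = min(1.0, current_confidence(existing, now) + BOOST)
        existing.updated_at = now
    return store[key]
```

Fingerprinting on normalized text is what lets a duplicate save reinforce the existing lesson instead of creating a second copy.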

Slots

| Method | Path | Description |
|---|---|---|
| GET | /slots | List all pinned slots |
| POST | /slot | Create or update a slot |
| GET | /slot | Get slot by name |
| DELETE | /slot | Delete a slot |
| POST | /slot/reflect | Auto-populate from session observations |

Graph + Profile

| Method | Path | Description |
|---|---|---|
| GET | /relations | Knowledge graph edges |
| POST | /relations | Add a relation |
| GET | /profile | Project profile (top concepts, files) |

Actions

| Method | Path | Description |
|---|---|---|
| GET | /actions | List actions |
| POST | /actions | Create an action |
| PATCH | /actions/<id> | Update action status / fields |
| GET | /frontier | Pending actions sorted by priority |
| GET | /insights | List insights |

Replay

| Method | Path | Description |
|---|---|---|
| GET | /replay/sessions | Sessions list for replay tab |
| GET | /replay/load | Full session + observations (?sessionId=) |

MCP

| Method | Path | Description |
|---|---|---|
| GET | /mcp/tools | MCP tool schema list |
| POST | /mcp/tools | MCP tool call dispatch |

Configuration

Create ~/.agentmemory/.env (no export prefix needed):

# Server port
III_REST_PORT=3111

# Vector search: enables Gemini 768-dim embeddings
GEMINI_API_KEY=your-gemini-key

# LLM for compression / consolidation / graph extraction
# Any one of these enables LLM features:
ANTHROPIC_API_KEY=your-anthropic-key
# OPENAI_API_KEY=your-openai-key
# GEMINI_API_KEY=your-key   (same key as above works for both)

# LLM-powered features (disabled by default because they spend tokens)
CONSOLIDATION_ENABLED=true
GRAPH_EXTRACTION_ENABLED=true
AGENTMEMORY_AUTO_COMPRESS=true

# Context injection limits
TOKEN_BUDGET=2000
MAX_OBS_PER_SESSION=500

# Auth: set to require a Bearer token on all endpoints
AGENTMEMORY_SECRET=your-secret

# Agent scope isolation
AGENT_ID=my-agent
AGENTMEMORY_AGENT_SCOPE=isolated   # only see this agent's data

# HuggingFace sync
HF_TOKEN=your-hf-token
AGENTMEMORY_DATASET_REPO=username/agentmemory-data

Full Variable Reference

| Variable | Default | Purpose |
|---|---|---|
| III_REST_PORT / PORT | 3111 | API server port |
| GEMINI_API_KEY / GOOGLE_API_KEY | (unset) | Enables 768-dim vector search |
| AGENTMEMORY_SECRET | (unset) | Bearer token auth on all endpoints |
| AGENT_ID | (unset) | Default agent ID for scope isolation |
| AGENTMEMORY_AGENT_SCOPE=isolated | (unset) | Filters data to current AGENT_ID |
| MAX_OBS_PER_SESSION | 500 | Hard cap on observations per session |
| TOKEN_BUDGET | 2000 | Max tokens in compiled context |
| GRAPH_EXTRACTION_ENABLED | false | Knowledge graph (needs LLM) |
| CONSOLIDATION_ENABLED | false | Memory consolidation (needs LLM) |
| AGENTMEMORY_AUTO_COMPRESS | false | LLM observation compression |
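For tooling that needs to read the same settings, a .env file in this format can be parsed with a few lines of standard-library Python. This is a hedged sketch rather than the server's own loader: parse_env, load_config, the accepted boolean spellings, and the choice to prefer III_REST_PORT over PORT are all assumptions.

```python
from pathlib import Path

ENV_PATH = Path.home() / ".agentmemory" / ".env"
TRUTHY = {"1", "true", "yes", "on"}  # assumed spellings for enabled flags


def parse_env(text):
    """Parse KEY=value lines; blank lines and # comments are skipped."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as "isolated   # only see ..."
        value = value.split(" #", 1)[0].strip().strip("'\"")
        values[key.strip()] = value
    return values


def load_config(env_text):
    """Apply the documented defaults on top of the parsed file contents."""
    env = parse_env(env_text)
    flag = lambda name: env.get(name, "false").lower() in TRUTHY
    return {
        "port": int(env.get("III_REST_PORT") or env.get("PORT") or 3111),
        "token_budget": int(env.get("TOKEN_BUDGET", 2000)),
        "max_obs_per_session": int(env.get("MAX_OBS_PER_SESSION", 500)),
        "consolidation": flag("CONSOLIDATION_ENABLED"),
        "graph_extraction": flag("GRAPH_EXTRACTION_ENABLED"),
        "auto_compress": flag("AGENTMEMORY_AUTO_COMPRESS"),
        "isolated": env.get("AGENTMEMORY_AGENT_SCOPE") == "isolated",
        "secret": env.get("AGENTMEMORY_SECRET"),
    }


def load_from_disk(path=ENV_PATH):
    """Read the file if it exists, otherwise fall back to defaults only."""
    return load_config(path.read_text() if path.exists() else "")
```

With no file present, load_from_disk() returns the defaults from the table above: port 3111, a 2000-token budget, and all LLM features off.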

Viewer

Built-in dashboard at http://localhost:3111/viewer.

| Tab | What You See |
|---|---|
| Dashboard | Session stats, memory counts, recent activity |
| Sessions | Browse sessions, inspect observations |
| Memories | Search, filter, and read long-term memories |
| Graph | Project folder visualization: nodes = folders, edges = shared concepts or parent path |
| Timeline | Per-session chronological observation view |
| Lessons | Confidence-scored lessons with decay tracking |
| Slots | Pinned memory slots editor |
| Replay | Scrub through past sessions frame by frame |

Deploy to HuggingFace

This project is designed to run as a HuggingFace Space. Data is stored in an HF dataset repo and restored on every boot, so no persistent disk is needed.

Setup

  1. Fork this repo as a HuggingFace Space (SDK: Docker)

  2. Create a dataset repo (e.g. your-username/agentmemory-data)

  3. Add Space secrets in the HF dashboard:

    | Secret | Value |
    |---|---|
    | HF_TOKEN | Your HF write token |
    | AGENTMEMORY_DATASET_REPO | your-username/agentmemory-data |
    | AGENTMEMORY_SECRET | A random secret (optional but recommended) |
    | GEMINI_API_KEY | Gemini key (optional, enables vector search) |
  4. The Space boots, restores agentmemory.db from the dataset repo, and starts the server

How sync works

sync.py uses mtime fingerprinting: it uploads only when the database file has actually changed, so idle periods trigger no unnecessary uploads.

# Manual backup
python sync.py

# Environment for sync
HF_TOKEN=...
AGENTMEMORY_DATASET_REPO=username/agentmemory-data
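The change-detection idea behind mtime fingerprinting can be sketched as below. This is not the contents of sync.py: the ChangeDetector class is hypothetical, and pairing the modification time with the file size is an added assumption that makes the check robust on filesystems with coarse timestamps.

```python
import os


def file_fingerprint(path):
    """Return (mtime in ns, size in bytes), or None if the file does not exist."""
    try:
        stat = os.stat(path)
    except FileNotFoundError:
        return None
    return (stat.st_mtime_ns, stat.st_size)


class ChangeDetector:
    """Remembers the last seen fingerprint and reports when it differs."""

    def __init__(self, path):
        self.path = path
        self.last = file_fingerprint(path)

    def changed(self):
        current = file_fingerprint(self.path)
        if current is None or current == self.last:
            return False  # missing or untouched file: nothing to upload
        self.last = current
        return True
```

A sync loop would call changed() on a timer and upload the database only when it returns True.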

Architecture

agentmemory-python/
├── src/
│   ├── app.py          Flask server: all endpoints, WebSocket broadcaster
│   ├── db.py           SQLite StateKV: WAL mode, audit_log table
│   ├── functions.py    Core logic: observe, remember, search, context
│   ├── search.py       BM25 + Gemini vector index + HybridSearch (RRF)
│   └── viewer/
│       └── index.html  Single-file HTML dashboard (no build step)
├── sync.py             HuggingFace dataset backup/restore
├── Dockerfile          HF Space container
├── start.sh            Boot script (restore → start server → start sync)
└── requirements.txt    6 Python dependencies, no external DB required

Database layout

Two SQLite tables in ~/.agentmemory/agentmemory.db:

-- All data lives here, namespaced by scope
CREATE TABLE kv_store (
  scope TEXT NOT NULL,    -- e.g. "mem:sessions", "mem:obs:{session_id}"
  key   TEXT NOT NULL,
  value TEXT NOT NULL,    -- JSON-serialized
  PRIMARY KEY (scope, key)
);

-- Audit trail replaces Dolt git versioning
CREATE TABLE audit_log (
  id       INTEGER PRIMARY KEY AUTOINCREMENT,
  ts       INTEGER NOT NULL,   -- unix millis
  agent_id TEXT NOT NULL,
  message  TEXT NOT NULL
);
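The two-table layout is small enough to exercise directly with Python's sqlite3 module. The sketch below is an illustration of the schema above, not the project's db.py: the helper names (open_db, kv_put, kv_get) and the audit message format are assumptions.

```python
import json
import sqlite3
import time

SCHEMA = """
CREATE TABLE IF NOT EXISTS kv_store (
  scope TEXT NOT NULL,
  key   TEXT NOT NULL,
  value TEXT NOT NULL,
  PRIMARY KEY (scope, key)
);
CREATE TABLE IF NOT EXISTS audit_log (
  id       INTEGER PRIMARY KEY AUTOINCREMENT,
  ts       INTEGER NOT NULL,
  agent_id TEXT NOT NULL,
  message  TEXT NOT NULL
);
"""


def open_db(path):
    """Open the database, request WAL mode, and create both tables."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")  # has no effect on ":memory:" databases
    conn.executescript(SCHEMA)
    return conn


def kv_put(conn, scope, key, value, agent_id="my-agent"):
    """Upsert a JSON value and record the write in the audit log atomically."""
    with conn:  # commits both statements together, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO kv_store (scope, key, value) VALUES (?, ?, ?)",
            (scope, key, json.dumps(value)),
        )
        conn.execute(
            "INSERT INTO audit_log (ts, agent_id, message) VALUES (?, ?, ?)",
            (int(time.time() * 1000), agent_id, f"put {scope}/{key}"),
        )


def kv_get(conn, scope, key):
    """Return the decoded value, or None when the key is absent."""
    row = conn.execute(
        "SELECT value FROM kv_store WHERE scope = ? AND key = ?", (scope, key)
    ).fetchone()
    return json.loads(row[0]) if row else None
```

Because every record is a JSON blob keyed by (scope, key), new record types need no schema migration, only a new scope prefix.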

Search pipeline

Query
  → BM25 (always)          - Porter-stemmed keyword matching
  → Vector (if Gemini key) - 768-dim cosine similarity
  → RRF fusion             - Reciprocal Rank Fusion (k=60)
  → Session diversify      - max 3 results per session
  → Return top-K
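The last three stages of the pipeline can be sketched in a few lines. Reciprocal Rank Fusion scores each document as the sum of 1 / (k + rank) over every ranking it appears in, with k=60 as stated above. The function names, the alphabetical tie-break, and the session_of mapping are assumptions made for this example, not the project's search.py.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


def diversify(doc_ids, session_of, max_per_session=3, top_k=10):
    """Walk the fused list in order, keeping at most N results per session."""
    taken = defaultdict(int)
    results = []
    for doc_id in doc_ids:
        session = session_of[doc_id]
        if taken[session] >= max_per_session:
            continue
        taken[session] += 1
        results.append(doc_id)
        if len(results) == top_k:
            break
    return results
```

When no Gemini key is configured, only the BM25 ranking is passed in and the fused order is simply the BM25 order. RRF needs only ranks, not raw scores, which is why it can combine BM25 and cosine similarity without normalizing either.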

vs Original agentmemory

| | agentmemory (Node.js) | agentmemory-python |
|---|---|---|
| Runtime | Node.js 20+ | Python 3.10+ |
| Storage | Dolt SQL (git-versioned MySQL) | SQLite WAL (single file) |
| Engine dependency | iii-engine (separate binary) | None, just Flask |
| Embeddings | 6 providers + local @xenova/transformers | Gemini 768-dim |
| MCP tools | 53 | 31 |
| REST endpoints | 128 | ~50 |
| Deploy | npm, Docker, fly.io, Railway, Render | Docker, HuggingFace Spaces |
| Cold boot | ~7s (iii engine warm-up) | <2s |
| Database size | ~232MB (417 Dolt chunk files) | ~20MB (single .db file) |
| Setup | npm install -g @agentmemory/agentmemory | pip install -r requirements.txt |

Choose the Python version for: simpler setup, HF Space deployment, single-file database, no Node.js, or Python ecosystem integration.

Choose the Node.js version for: the full 53-tool MCP surface, iii-engine observability, production multi-agent deployments, or the full auto-hook suite.


Contributing

See CONTRIBUTING.md. Issues and PRs welcome.

Priority areas: test coverage, additional embedding providers, more agent hook scripts.


License

Apache-2.0. See LICENSE.