A reference architecture built on Cloudflare AI Search. Store decisions, patterns, and context in R2. Recall them with hybrid search. Deploy to your own account.
One command. Opens Deploy to Cloudflare, then configures your MCP client.
```sh
npx create-ai-search-memory
```
Coming soon — not yet published to npm. See the manual setup below.
```sh
# Clone and install
git clone https://github.com/miguelcardoso/ai-search-memory
cd ai-search-memory
npm install --prefix app
npm install --prefix app/dashboard

# Build dashboard and deploy (R2 auto-provisioned)
npm run deploy

# Set your auth token
npx wrangler secret put AUTH_TOKEN --name ai-search-memory
```
Opens your browser to deploy the Worker to your account. R2 bucket and AI binding are provisioned automatically. You'll set the auth token during deploy.
Detects OpenCode, Claude Desktop, or Cursor and writes the MCP server config with your Worker URL and token.
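For clients the installer doesn't detect, you can add the server by hand. The exact config location and schema vary by client; a typical entry for an MCP client that supports remote servers might look like the following (the `mcpServers` key and `headers` field follow common client conventions and are assumptions, not this project's documented schema):

```json
{
  "mcpServers": {
    "ai-search-memory": {
      "url": "https://<your-worker>.workers.dev/mcp",
      "headers": { "Authorization": "Bearer YOUR_TOKEN" }
    }
  }
}
```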
Your agent can save and recall memories immediately. Open the dashboard at your Worker URL to browse and search.
```
# You get:
Dashboard  https://<your-worker>.workers.dev
MCP        https://<your-worker>.workers.dev/mcp
API        https://<your-worker>.workers.dev/api/memories
```
- `memory_save`: Store with type, scope, tags, metadata
- `memory_recall`: Semantic search
- `memory_ask`: AI answer from memories
- `memory_list`: List by scope
- `memory_get`: Fetch by ID
- `memory_delete`: Remove by ID
- `memory_clear`: Wipe a scope
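Over the wire, an MCP client invokes these tools with standard `tools/call` requests. A sketch of what a `memory_save` call might look like — the argument names here mirror the API fields and are assumptions about this server's tool schema, not a verified example:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "memory_save",
    "arguments": {
      "content": "Use R2 over KV for memory storage",
      "type": "decision",
      "scope": "project",
      "tags": ["storage", "architecture"]
    }
  }
}
```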
- `/remember`: Save something to memory
- `/recall`: Search memories by meaning
- `/ask-memory`: Question answered from memories
- `/forget`: Delete a memory by ID
- `/memories`: List all stored memories
```
POST   /api/memories               Save
GET    /api/memories/search?q=...  Recall
POST   /api/memories/ask           AI answer
GET    /api/memories               List
GET    /api/memories/:id           Get
DELETE /api/memories/:id           Delete
DELETE /api/memories               Clear
GET    /api/policy                 Save policy config
GET    /api/status                 Status
GET    /health                     Health check (public)
```
```sh
# All API requests require the auth header
curl https://<your-worker>.workers.dev/api/memories \
  -H "Authorization: Bearer YOUR_TOKEN"
```
Agent saves a memory
→ Markdown + YAML frontmatter written to R2
→ Stored at `{instance}/{scope}/{hash}/{id}.md`

Agent recalls memories
→ Query sent to the AI Search binding (`env.AI.aiSearch()`)
→ Hybrid retrieval: vector + keyword
→ Folder filter scopes results to user/project
→ Full memories fetched from R2 in parallel
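The save path can be sketched roughly from the `{instance}/{scope}/{hash}/{id}.md` layout. This is an illustrative reconstruction, not the Worker's actual code — the frontmatter fields and the use of a short hash shard are assumptions:

```python
import hashlib


def r2_key(instance: str, scope: str, memory_id: str) -> str:
    # Hypothetical: shard by a short hash of the id so R2 listings stay shallow
    shard = hashlib.sha256(memory_id.encode()).hexdigest()[:8]
    return f"{instance}/{scope}/{shard}/{memory_id}.md"


def to_markdown(memory: dict) -> str:
    # YAML frontmatter (assumed fields) followed by the memory body
    front = "\n".join(f"{k}: {memory[k]}" for k in ("id", "type", "importance"))
    return f"---\n{front}\n---\n\n{memory['content']}\n"


mem = {
    "id": "mem_abc123",
    "type": "decision",
    "importance": "high",
    "content": "Use R2 over KV for memory storage",
}
print(r2_key("default", "project", mem["id"]))
print(to_markdown(mem))
```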
One Worker, multiple isolated AI Search instances. Pass the `X-Memory-Instance` header to select which instance to use.
Each instance has its own search index and R2 prefix — fully isolated.
Created automatically on first recall.
```sh
# No header → uses the default instance
curl .../api/memories/search?q=... \
  -H "Authorization: Bearer TOKEN"

# Custom instance → created on demand
curl .../api/memories/search?q=... \
  -H "Authorization: Bearer TOKEN" \
  -H "X-Memory-Instance: my-project"
```
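The header-to-instance logic amounts to a small resolver. A sketch of how a Worker might select the instance — the default instance name and the slug validation rule here are assumptions, not this project's actual behavior:

```python
import re

DEFAULT_INSTANCE = "default"  # assumed name of the default instance


def resolve_instance(headers: dict) -> str:
    """Pick the AI Search instance from the X-Memory-Instance header.

    Falls back to the default instance when the header is absent or empty;
    the slug validation is an assumption, not the Worker's documented rule.
    """
    name = headers.get("X-Memory-Instance", "").strip()
    if not name:
        return DEFAULT_INSTANCE
    if not re.fullmatch(r"[a-z0-9][a-z0-9_-]*", name):
        raise ValueError(f"invalid instance name: {name!r}")
    return name


print(resolve_instance({}))                                   # default
print(resolve_instance({"X-Memory-Instance": "my-project"}))  # my-project
```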
Add memory guidance to your AGENTS.md so your agent knows what to save, when, and with what context. The server also exposes a `memory://policy` MCP resource with structured save policies that the agent reads at session start.
```markdown
# AGENTS.md — add to your project or global config

## Memory

### What to save

| Category    | When to save                      | Scope   |
|-------------|-----------------------------------|---------|
| Decisions   | Architectural choices + rationale | project |
| Patterns    | Discovered codebase conventions   | project |
| Preferences | User-stated style / tooling prefs | global  |
| Context     | Project structure, key file roles | project |

### Always include metadata

"metadata": {
  "files": ["src/auth.ts", "src/middleware.ts"],
  "branch": "feat/auth-refactor",
  "repository": "my-api"
}

### Recall before acting

At the start of a session, use memory_recall to check for relevant
prior context before re-discovering things.
```
The Worker serves a built-in dashboard at the root URL. Browse, search, and manage memories from your browser — no extra deployment needed. Switch between list and timeline views.
- Search: Semantic search across all memories
- Filters: Filter by scope, type, importance
- Timeline: Visual timeline grouped by day
- Metadata: Expand cards to see files, branch, repo
- Delete: Remove individual memories or clear all
- Token auth: Enter your Bearer token once, stored in a cookie
```sh
# Just open your Worker URL in a browser
open https://<your-worker>.workers.dev

# Enter your Bearer token when prompted — that's it.
```
Memories carry metadata — recently worked files, branch, repository — so you always know where and when something was decided. The dashboard shows metadata inline when you expand a memory card.
```sh
curl -X POST https://<worker>/api/memories \
  -H "Authorization: Bearer TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "content": "Use R2 over KV for memory storage — need files >25MB and list ops",
    "type": "decision",
    "importance": "high",
    "tags": ["storage", "architecture"],
    "metadata": {
      "files": ["src/memory-service.ts", "wrangler.jsonc"],
      "branch": "feat/storage-layer",
      "repository": "ai-search-memory"
    }
  }'

# Response:
{
  "ok": true,
  "action": "created",
  "memory": {
    "id": "mem_mm99z7fu_a97fc61b",
    "type": "decision",
    "metadata": { "files": [...], "branch": "feat/storage-layer", ... }
  }
}
```
This is a reference architecture showing what you can build with Cloudflare AI Search. Questions? Join the Cloudflare Discord or check the docs.