AI Search reference architecture

Memory for AI agents

A reference architecture built on Cloudflare AI Search. Store decisions, patterns, and context in R2. Recall them with hybrid search. Deploy to your own account.

Hybrid search (vector + keyword) · Contextual metadata · Save policy config · Dashboard + timeline · Auto-provisioning · Bearer token auth · 7 MCP tools + 5 prompts · Multi-instance · Markdown storage in R2

Get started

One command. Opens Deploy to Cloudflare, then configures your MCP client.

npx create-ai-search-memory

Coming soon — not yet published to npm. See the manual setup below.

Setup

# Clone and install
git clone https://github.com/miguelcardoso/ai-search-memory
cd ai-search-memory
npm install --prefix app
npm install --prefix app/dashboard

# Build dashboard and deploy (R2 auto-provisioned)
npm run deploy

# Set your auth token
npx wrangler secret put AUTH_TOKEN --name ai-search-memory

What happens

Deploy to Cloudflare

Opens your browser to deploy the Worker to your account. R2 bucket and AI binding are provisioned automatically. You'll set the auth token during deploy.

Configure your MCP client

Detects OpenCode, Claude Desktop, or Cursor and writes the MCP server config with your Worker URL and token.
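
If you'd rather wire up a client by hand, the generated config resembles the sketch below. The exact file location and key names vary by client, and the server name is just a label — treat this as illustrative, not the canonical schema:

```json
{
  "mcpServers": {
    "ai-search-memory": {
      "url": "https://<your-worker>.workers.dev/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_TOKEN"
      }
    }
  }
}
```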

Start using it

Your agent can save and recall memories immediately. Open the dashboard at your Worker URL to browse and search.

# You get:
Dashboard  https://<your-worker>.workers.dev
MCP        https://<your-worker>.workers.dev/mcp
API        https://<your-worker>.workers.dev/api/memories

MCP Tools

memory_save

Store with type, scope, tags, metadata

memory_recall

Hybrid (vector + keyword) search

memory_ask

AI answer from memories

memory_list

List by scope

memory_get

Fetch by ID

memory_delete

Remove by ID

memory_clear

Wipe a scope
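
As a sketch of the tool interface, a memory_save call might carry arguments like these. Field names follow the REST save example further down; the exact tool schema may differ:

```json
{
  "content": "Prefer zod for request validation",
  "type": "preference",
  "scope": "global",
  "importance": "medium",
  "tags": ["validation", "tooling"]
}
```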

Prompts

/remember

Save something to memory

/recall

Search memories by meaning

/ask-memory

Question answered from memories

/forget

Delete a memory by ID

/memories

List all stored memories

REST API

POST   /api/memories                Save
GET    /api/memories/search?q=...   Recall
POST   /api/memories/ask            AI answer
GET    /api/memories                List
GET    /api/memories/:id            Get
DELETE /api/memories/:id            Delete
DELETE /api/memories                Clear
GET    /api/policy                  Save policy config
GET    /api/status                  Status
GET    /health                      Health check (public)
# All API requests require the auth header (except /health)
curl https://<your-worker>.workers.dev/api/memories \
  -H "Authorization: Bearer YOUR_TOKEN"

How it works

Agent saves a memory
   Markdown + YAML frontmatter written to R2
   Stored at {instance}/{scope}/{hash}/{id}.md

Agent recalls memories
   Query sent to AI Search binding (env.AI.aiSearch())
   Hybrid retrieval: vector + keyword
   Folder filter scopes to user/project
   Full memory fetched from R2 in parallel
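
Concretely, a stored memory is a plain Markdown file with YAML frontmatter. Assuming the frontmatter mirrors the save payload (the exact field layout is illustrative), the object written to R2 might look like:

```markdown
---
id: mem_mm99z7fu_a97fc61b
type: decision
importance: high
tags: [storage, architecture]
metadata:
  branch: feat/storage-layer
  repository: ai-search-memory
---

Use R2 over KV for memory storage — need files >25MB and list ops
```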

Multi-instance

One Worker, multiple isolated AI Search instances. Pass the X-Memory-Instance header to select which instance to use. Each instance has its own search index and R2 prefix — fully isolated. Created automatically on first recall.

# No header → uses the default instance
curl .../api/memories/search?q=... \
  -H "Authorization: Bearer TOKEN"

# Custom instance → created on demand
curl .../api/memories/search?q=... \
  -H "Authorization: Bearer TOKEN" \
  -H "X-Memory-Instance: my-project"
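
In client code, the same header logic is a small helper. A minimal TypeScript sketch (the function name is ours, not part of the API):

```typescript
// Build headers for a memory API request.
// Omitting `instance` targets the default AI Search instance;
// passing one sets X-Memory-Instance so the Worker routes to
// (and lazily creates) that instance.
function memoryHeaders(token: string, instance?: string): Record<string, string> {
  const headers: Record<string, string> = {
    Authorization: `Bearer ${token}`,
  };
  if (instance) headers["X-Memory-Instance"] = instance;
  return headers;
}

// Usage:
// fetch(`${baseUrl}/api/memories/search?q=storage`, {
//   headers: memoryHeaders(token, "my-project"),
// });
```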

Configure your agent

Add memory guidance to your AGENTS.md so your agent knows what to save, when, and with what context. The server also exposes a memory://policy MCP resource with structured save policies the agent reads at session start.

# AGENTS.md — add to your project or global config

## Memory

### What to save
| Category    | When to save                      | Scope   |
|-------------|-----------------------------------|---------|
| Decisions   | Architectural choices + rationale | project |
| Patterns    | Discovered codebase conventions   | project |
| Preferences | User-stated style / tooling prefs | global  |
| Context     | Project structure, key file roles | project |

### Always include metadata
"metadata": {
  "files": ["src/auth.ts", "src/middleware.ts"],
  "branch": "feat/auth-refactor",
  "repository": "my-api"
}

### Recall before acting
At the start of a session, use memory_recall to check
for relevant prior context before re-discovering things.
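
The memory://policy resource gives agents the same guidance in machine-readable form. The source doesn't show its schema; a hypothetical sketch mirroring the table above:

```json
{
  "save": [
    { "category": "decision",   "when": "architectural choices + rationale", "scope": "project" },
    { "category": "preference", "when": "user-stated style / tooling prefs", "scope": "global" }
  ]
}
```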

Dashboard

The Worker serves a built-in dashboard at the root URL. Browse, search, and manage memories from your browser — no extra deployment needed. Switch between list and timeline views.

Search

Semantic search across all memories

Filters

Filter by scope, type, importance

Timeline

Visual timeline grouped by day

Metadata

Expand cards to see files, branch, repo

Delete

Remove individual memories or clear all

Token auth

Enter your Bearer token once, stored in a cookie

# Just open your Worker URL in a browser
open https://<your-worker>.workers.dev

# Enter your Bearer token when prompted — that's it.

Save with context

Memories carry metadata — recently touched files, branch, repository — so you always know where and when something was decided. The dashboard shows metadata inline when you expand a memory card.

curl -X POST https://<worker>/api/memories \
  -H "Authorization: Bearer TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "content": "Use R2 over KV for memory storage — need files >25MB and list ops",
    "type": "decision",
    "importance": "high",
    "tags": ["storage", "architecture"],
    "metadata": {
      "files": ["src/memory-service.ts", "wrangler.jsonc"],
      "branch": "feat/storage-layer",
      "repository": "ai-search-memory"
    }
  }'

# Response:
{
  "ok": true,
  "action": "created",
  "memory": {
    "id": "mem_mm99z7fu_a97fc61b",
    "type": "decision",
    "metadata": { "files": [...], "branch": "feat/storage-layer", ... }
  }
}

Build something with AI Search

This is a reference architecture showing what you can build with Cloudflare AI Search. Questions? Join the Cloudflare Discord or check the docs.