Question 1

What is an MCP memory layer?

Accepted Answer

MCP (Model Context Protocol) is the open standard Anthropic, OpenAI, and others use to expose tools to coding agents. An MCP memory layer is a server that speaks MCP and stores prior context — code chunks, decisions, bug fixes, architectural notes — so the agent can recall it instead of re-reading the whole repo every session. Imprint is a local-first implementation: vectors in Qdrant, facts in SQLite, embeddings via on-device ONNX.
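
To make the store-then-recall idea concrete, here is a minimal sketch of a memory store with similarity search. It is not Imprint's implementation: `MemoryStore`, `embed`, and `cosine` are hypothetical names, and the bag-of-words `embed` is a toy stand-in for a real embedding model and vector database.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


@dataclass
class MemoryStore:
    """Hypothetical in-memory store; a real layer would persist its vectors."""

    items: list = field(default_factory=list)

    def remember(self, text: str) -> None:
        # Store the text together with its (toy) vector.
        self.items.append((text, embed(text)))

    def recall(self, query: str, k: int = 2) -> list:
        # Rank stored items by similarity to the query and keep the top k.
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


store = MemoryStore()
store.remember("fixed null pointer bug in auth token refresh")
store.remember("decision: use postgres for the billing service")
store.remember("architecture note: api gateway fronts all services")

top = store.recall("why did the auth token refresh crash", k=1)
print(top)  # ['fixed null pointer bug in auth token refresh']
```

The agent would call something like `recall` through an MCP tool and receive only the matching notes, rather than re-reading every file.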

Question 2

How is this different from Cursor or ChatGPT built-in memory?

Accepted Answer

Built-in memories are tied to one provider, shallow, and at best persist across sessions within that one tool. Imprint is agent-agnostic, indexed per-project, semantically searchable, and structured (tags for language, layer, domain, topic). One memory store is shared across Claude Code, Cursor, Copilot, Codex CLI, and Cline: whatever you use today and whatever you switch to next.
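
As an illustration of what structured tags could buy you, the sketch below filters memories by tag before any semantic ranking happens. The `Memory` record and `filter_by_tags` helper are hypothetical; only the four tag names (language, layer, domain, topic) come from the answer above, and the real schema may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Memory:
    """Hypothetical record: one stored note plus its four tags."""

    text: str
    language: str
    layer: str
    domain: str
    topic: str


def filter_by_tags(memories, **wanted):
    """Keep memories whose tag fields match every requested key/value pair."""
    return [m for m in memories if all(getattr(m, k) == v for k, v in wanted.items())]


memories = [
    Memory("retry logic for webhook delivery", "go", "backend", "billing", "webhooks"),
    Memory("invoice table pagination fix", "typescript", "frontend", "billing", "tables"),
    Memory("login form validation rules", "typescript", "frontend", "auth", "forms"),
]

frontend_billing = filter_by_tags(memories, layer="frontend", domain="billing")
print([m.text for m in frontend_billing])  # ['invoice table pagination fix']
```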

Question 3

Does my code ever leave my machine?

Accepted Answer

Not by default. The Qdrant daemon runs on 127.0.0.1:6333. Embeddings come from EmbeddingGemma-300M via ONNX Runtime — no network call. The knowledge graph is local SQLite. If you want LLM-powered topic tagging (better granularity, not required) you opt in explicitly with `imprint config set tagger.llm true` and pick a provider (Anthropic, OpenAI, Gemini, Ollama, vLLM).
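
If you want to check the "stays on this machine" claim for a given endpoint, one simple approach is to test whether its host is a loopback address. This is a sketch using only the Python standard library; `is_local_endpoint` is a hypothetical helper, not part of Imprint, and it assumes the endpoint is written as a full URL with a scheme.

```python
import ipaddress
from urllib.parse import urlsplit


def is_local_endpoint(url: str) -> bool:
    """Return True if the URL's host is localhost or a loopback IP address."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (for example a public DNS name), so treat as remote.
        return False


print(is_local_endpoint("http://127.0.0.1:6333"))    # True
print(is_local_endpoint("https://api.example.com"))  # False
```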

Question 4

Why not just wait for bigger context windows?

Accepted Answer

With bigger contexts you still pay for every token at inference, and the model still forgets between sessions and still degrades on long conversations. Imprint flips the problem: the context window stays tight, and the agent pulls only the 5–10 chunks relevant to the current question. Our benchmark shows Claude Code dropping from 10.3M total tokens to 3.0M across the same 150-run suite, because the agent simply did not need the rest.
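
For a sense of scale, the benchmark totals quoted above work out as follows. The only inputs are the three numbers from the answer; the per-run averages assume each total covers all 150 runs.

```python
baseline_tokens = 10_300_000    # total without the memory layer
with_memory_tokens = 3_000_000  # total with the memory layer
runs = 150                      # assumed to apply to both totals

reduction = 1 - with_memory_tokens / baseline_tokens
per_run_before = baseline_tokens / runs
per_run_after = with_memory_tokens / runs

print(f"reduction: {reduction:.1%}")                # reduction: 70.9%
print(f"per run before: {per_run_before:,.0f}")     # per run before: 68,667
print(f"per run after:  {per_run_after:,.0f}")      # per run after:  20,000
```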

Question 5

Which agents does it work with?

Accepted Answer

Any MCP-capable host. `imprint setup all` auto-wires Claude Code (with hard enforcement via a PreToolUse hook), Cursor, Copilot (VSCode agent mode), Codex CLI, and Cline. Adding a new host is a config file and a CLI subcommand away.
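
Wiring a host usually comes down to adding one server entry to that host's MCP config file. The sketch below shows the idea on a plain dictionary. It assumes the `mcpServers` layout that several MCP hosts use (key names vary by host), and the `"imprint"` command with `["mcp"]` arguments is a hypothetical placeholder, not the actual entry `imprint setup` writes.

```python
import json


def add_server(config: dict, name: str, command: str, args: list) -> dict:
    """Return a copy of config with one MCP server entry added or replaced."""
    servers = dict(config.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    return {**config, "mcpServers": servers}


def remove_server(config: dict, name: str) -> dict:
    """Return a copy of config without the named MCP server entry."""
    servers = {k: v for k, v in config.get("mcpServers", {}).items() if k != name}
    return {**config, "mcpServers": servers}


existing = {"mcpServers": {"other-tool": {"command": "other", "args": []}}}

# Hypothetical command and args, for illustration only.
wired = add_server(existing, "imprint", "imprint", ["mcp"])
print(json.dumps(wired, indent=2))

# Removing the entry leaves other servers untouched.
unwired = remove_server(wired, "imprint")
```

Both helpers return new dictionaries instead of mutating their input, which makes it easy to compare a config before and after and to undo exactly what was written.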

Question 6

What about teams and multiple machines?

Accepted Answer

Project identity is derived from the manifest (package.json, go.mod, pyproject.toml, etc.), so the same project on two laptops has the same identity. `imprint sync serve --relay <host>` runs a WebSocket forwarder and pushes memory deltas between peers. Per-workspace isolation keeps client projects separate.
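
One way a manifest-derived identity could work is to hash the project name declared in the manifest, so that any checkout of the same project yields the same ID regardless of its path. This is a sketch under that assumption, not Imprint's actual scheme: `manifest_name` and `project_identity` are hypothetical, and only `package.json` and `go.mod` are handled here.

```python
import hashlib
import json
import re
import tempfile
from pathlib import Path


def manifest_name(project_dir: Path):
    """Read the declared project name from package.json or go.mod, if present."""
    pkg = project_dir / "package.json"
    if pkg.is_file():
        return json.loads(pkg.read_text(encoding="utf-8")).get("name")
    gomod = project_dir / "go.mod"
    if gomod.is_file():
        match = re.search(
            r"^module\s+(\S+)", gomod.read_text(encoding="utf-8"), re.MULTILINE
        )
        return match.group(1) if match else None
    return None


def project_identity(project_dir: Path):
    """Hash the manifest name so the ID does not depend on the local path."""
    name = manifest_name(project_dir)
    if name is None:
        return None
    return hashlib.sha256(name.encode("utf-8")).hexdigest()[:16]


# Two checkouts in different directories, standing in for two laptops.
with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
    for d in (a, b):
        (Path(d) / "package.json").write_text('{"name": "acme-api"}', encoding="utf-8")
    id_a = project_identity(Path(a))
    id_b = project_identity(Path(b))

print(id_a == id_b)  # True
```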

Question 7

Will it slow my agent down?

Accepted Answer

Search itself takes tens of milliseconds against a local Qdrant. Net turn count can go up slightly (search → answer takes one extra call), but each turn carries a tiny fraction of the tokens. Across the benchmark, duration drops in parallel with cost on debug and cross-project questions and stays flat on creation tasks.

Question 8

Can it index JS-heavy docs sites or complex documents?

Accepted Answer

Yes, via two opt-in extractors. For JS-rendered pages, install the obscura headless browser (Apache 2.0) and set `imprint config set ingest.use_obscura true` — `imprint ingest-url` will pipe every URL through a real browser before extraction. For richer document support (archives, Apple iWork, ODF, email/mbox, image OCR), install kreuzberg (`pip install kreuzberg[all]`) and set `ingest.use_kreuzberg true`. kreuzberg is under the Elastic License 2.0 — internal and self-hosted use is fine; SaaS re-hosting of its extraction is not.

Question 9

What does uninstalling look like?

Accepted Answer

`imprint disable` removes every MCP entry from every host config it previously wrote, stops the Qdrant daemon, and preserves `data/` and `.venv/` so `imprint enable` brings everything back instantly. Full removal takes one more command, `imprint wipe`, because everything is stored under `data/`.
