
Metadata Tagging

Every chunk gets a structured tag payload stored in Qdrant:

```
{
  "lang": "python",                  # from file extension
  "layer": "api",                    # from path (api/ui/tests/infra/...)
  "kind": "source",                  # source/test/migration/readme/types/...
  "domain": ["auth", "db"],          # keyword-matched topics
  "topics": ["jwt-validation", ...]  # zero-shot (default) or LLM-derived
}
```
```mermaid
graph LR
    CHUNK[chunk + rel_path] --> DET[1. Deterministic<br/>ext → lang<br/>path → layer/kind]
    CHUNK --> KW[2. Keyword dict<br/>regex match → domain]
    CHUNK --> ZS[3. Zero-shot — default<br/>cosine vs label prototypes]
    CHUNK -.opt-in.-> LLM[4. LLM tagging — replaces 3<br/>anthropic/openai/ollama/vllm/gemini]
    DET --> MERGE[merge into payload]
    KW --> MERGE
    ZS --> MERGE
    LLM --> MERGE
    MERGE --> QDR[(Qdrant payload)]
    style DET fill:#1a1a3a,stroke:#60a5fa,color:#fff
    style KW fill:#1a1a3a,stroke:#4ecdc4,color:#fff
    style ZS fill:#1a1a3a,stroke:#a78bfa,color:#fff
    style LLM fill:#1a1a3a,stroke:#fbbf24,color:#fff,stroke-dasharray: 5 5
```

Four tag sources, layered from cheap to rich:

| # | Source | What it produces | Cost | Default |
|---|--------|------------------|------|---------|
| 1 | Deterministic | `lang` (from extension), `layer` (from path), `kind` (from filename) | Free | Always on |
| 2 | Keyword dict | `domain[]` — 13 categories via regex: auth, db, api, math, rendering, ui, testing, infra, ml, perf, security, build, payments | Free | Always on |
| 3 | Zero-shot | `topics[]` — cosine similarity against pre-embedded label prototypes (threshold > 0.35) | 1 vector compare per chunk per label | On by default (opt-out: `IMPRINT_ZERO_SHOT_TAGS=0`) |
| 4 | LLM tagging | `topics[]` — 1–4 tags per chunk from an LLM | 1 API call per chunk | Opt-in (`IMPRINT_LLM_TAGS=1`), replaces zero-shot |
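To make source 3 concrete, here is a minimal sketch of the zero-shot pass, assuming plain list embeddings and the 0.35 threshold from the table. The names (and the toy 2-d vectors) are illustrative, not imprint's actual code:

```python
import math

THRESHOLD = 0.35  # default cut-off described above

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def zero_shot_topics(chunk_vec, label_prototypes):
    """Keep every label whose pre-embedded prototype clears the threshold."""
    return [
        label
        for label, proto in label_prototypes.items()
        if cosine(chunk_vec, proto) > THRESHOLD
    ]

# Toy 2-d "embeddings" for illustration only
protos = {"jwt-validation": [1.0, 0.0], "caching": [0.0, 1.0]}
zero_shot_topics([0.9, 0.1], protos)  # → ["jwt-validation"]
```

Because the label prototypes are embedded once up front, each chunk costs only one vector compare per label, as the table notes.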

When LLM tagging is enabled, it replaces zero-shot — there is no point running both. See `tagger.py` for the implementation.

Set IMPRINT_LLM_TAGS=1 and pick a provider:

| Provider | `IMPRINT_LLM_TAGGER_PROVIDER` | Default model | API key env var | Notes |
|----------|-------------------------------|---------------|-----------------|-------|
| Anthropic | `anthropic` (default) | `claude-haiku-4-5` | `ANTHROPIC_API_KEY` | Uses native Anthropic SDK |
| OpenAI | `openai` | `gpt-4o-mini` | `OPENAI_API_KEY` | OpenAI SDK |
| Gemini | `gemini` | `gemini-2.0-flash` | `GOOGLE_API_KEY` | Via OpenAI-compatible endpoint |
| Ollama | `ollama` | `llama3.2` | (none) | Local server, no API key needed |
| vLLM | `vllm` | `default` | (none) | Local server, no API key needed |
| Local (llama-cpp) | `local` | `qwen3-1.7b` (auto GGUF download) | (none) | In-process via `llama-cpp-python`, no server required. Tune via `tagger.local.model_repo`, `tagger.local.model_file`, `tagger.local.model_path`, `tagger.local.n_ctx`, `tagger.local.n_gpu_layers` |
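For example, opting in with a local Ollama server takes two variables from the tables above (the model export is shown only for clarity, since `llama3.2` is already the provider default):

```shell
# Opt in to LLM tagging with a local Ollama server (no API key needed)
export IMPRINT_LLM_TAGS=1
export IMPRINT_LLM_TAGGER_PROVIDER=ollama
export IMPRINT_LLM_TAGGER_MODEL=llama3.2   # optional: this is already the default
```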

Overrides:

| Env var | Purpose | Example |
|---------|---------|---------|
| `IMPRINT_LLM_TAGGER_MODEL` | Use a different model | `IMPRINT_LLM_TAGGER_MODEL=llama3.1:70b` |
| `IMPRINT_LLM_TAGGER_BASE_URL` | Custom API endpoint | `IMPRINT_LLM_TAGGER_BASE_URL=http://gpu-box:8000/v1` |
| `IMPRINT_LLM_TAGGER_API_KEY` | Fallback API key (for providers without a standard env var) | |

Ollama and vLLM use OpenAI-compatible APIs internally. Point `IMPRINT_LLM_TAGGER_BASE_URL` at any OpenAI-compatible server to use unlisted providers.
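To illustrate why any OpenAI-compatible server works, here is a rough sketch of the request body such a tagger might POST to `{IMPRINT_LLM_TAGGER_BASE_URL}/chat/completions`. The helper name and prompt wording are hypothetical; only the payload shape follows the standard chat-completions API:

```python
import json

def build_tag_request(chunk_text, model="llama3.2", max_tags=4):
    """Build an OpenAI-compatible /chat/completions request body.

    The system prompt here is illustrative, not imprint's actual prompt.
    """
    return {
        "model": model,
        "messages": [
            {"role": "system",
             "content": f"Return up to {max_tags} short topic tags as a JSON list."},
            {"role": "user", "content": chunk_text},
        ],
    }

body = json.dumps(build_tag_request("def verify_jwt(token): ..."))
# POST `body` to {IMPRINT_LLM_TAGGER_BASE_URL}/chat/completions
```

Any server that accepts this shape — Ollama, vLLM, or a hosted gateway — can serve as the tagging backend.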

MCP store always LLM-tags. When a memory is saved via the `store` MCP tool, the background job runs the LLM tagger regardless of the `tagger.llm` config flag and stamps `llm_tagged: true` on the point so a later `imprint retag` won't re-tag it. The global toggle still controls ingest/refresh passes over bulk files.

MCP search supports payload filters, so the model can narrow results by tag:

```python
mcp__imprint__search(
    query="JWT validation",
    lang="python",            # tags.lang
    layer="api",              # tags.layer
    domain="auth,security",   # any-match against tags.domain
    project="my-web-app",
    type="pattern",
    limit=10,
)
```

See mcp.md for the full MCP tool surface.