Knowledge Base
A RAG layer over your repos, docs, issues, and PRs — Meilisearch for instant full-text search, Qdrant for semantic vector search, Reciprocal Rank Fusion to combine them, and an optional reranker on top.
The knowledge base is ProxifAI’s retrieval layer. Every chat mode draws on it, every @-mention pulls from it, and every agent that needs grounding hits it before the model. It’s a hybrid retrieval system — Meilisearch for full-text speed, Qdrant for vector semantics, Reciprocal Rank Fusion to merge them, and an optional HuggingFace TEI reranker for the final ordering. Source: internal/knowledgebase/.
How it fits together
```
Source content       Ingestion                                Stores              Retrieval
──────────────       ─────────                                ──────              ─────────
git pushes   ─┐                                           Meilisearch        /api/v1/kb/search
docs saved   ─┤ → extraction → chunks ──────────→ index ─►  (instant)               ↑
issues / PRs ─┤                  │                           Qdrant                 │
@mentions    ─┘                  └─ embedding (TEI) → index ─► (semantic)           │
                                                                                    ↓
                                                                               ChatHandler
                                                                               ToolsForMode
                                                                               MCP / @-mentions
```
The pipeline is opt-in via KB_ENABLED=true plus connection details for Qdrant and Meilisearch. Without it, ProxifAI works fine but chat retrieval falls back to whatever the chat agent’s other tools (forge read, code intelligence, MCP) can surface.
Three search modes
POST /api/v1/kb/search accepts a mode field. Implementation in search/search.go.
| Mode | Engine | When |
|---|---|---|
| instant | Meilisearch full-text + typo tolerance | Exact keywords, identifiers, file paths |
| semantic | Qdrant vector similarity | Conceptual queries where wording differs from source |
| hybrid (default) | Both, run in parallel + Reciprocal Rank Fusion | Most queries — gets best of both |
When mode is omitted, hybrid is used. RRF combines the instant and semantic ranks via score = sum(1 / (k + rank_in_each_engine)) (default k=60). The hybrid path is a single round-trip to the API — both engines are queried in parallel, and the results are merged, deduplicated by chunk ID, and truncated to the requested limit.
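For intuition, here is a minimal sketch of the fusion step in Go. It is not the code in search/search.go; the function name, chunk IDs, and the standalone main are made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// rrfFuse merges two ranked lists of chunk IDs using Reciprocal Rank Fusion:
// score(id) = sum over engines of 1 / (k + rank_in_engine), ranks starting at 1.
// Deduplication by chunk ID falls out of accumulating scores in a map.
func rrfFuse(instant, semantic []string, k float64) []string {
	scores := map[string]float64{}
	for rank, id := range instant {
		scores[id] += 1.0 / (k + float64(rank+1))
	}
	for rank, id := range semantic {
		scores[id] += 1.0 / (k + float64(rank+1))
	}
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	// Highest fused score first.
	sort.Slice(ids, func(i, j int) bool { return scores[ids[i]] > scores[ids[j]] })
	return ids
}

func main() {
	// chk_b is mid-ranked by full-text but top-ranked by vector search,
	// so with k=60 it wins the fused ordering: [chk_b chk_a chk_d chk_c].
	fmt.Println(rrfFuse(
		[]string{"chk_a", "chk_b", "chk_c"}, // instant (Meilisearch) order
		[]string{"chk_b", "chk_d"},          // semantic (Qdrant) order
		60,
	))
}
```

A chunk that both engines surface collects two reciprocal-rank contributions, which is why hybrid usually beats either engine alone.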
Reranking
If a RerankerClient is configured (HuggingFace TEI, default model bge-reranker), the merged result list is reranked before being returned. The reranker scores each (query, chunk) pair end-to-end and bumps relevant chunks to the top — the trade-off is a ~50–200 ms latency hit. Disable by leaving the reranker URL unset.
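For reference, a TEI reranker typically exposes a POST /rerank route that takes the query plus candidate texts; the shape below is illustrative and the exact fields depend on your TEI version:

```json
{ "query": "auth middleware jwt", "texts": ["candidate chunk A", "candidate chunk B"] }
```

The response pairs each candidate's index with a relevance score, and those scores are what reorder the merged chunk list before it is returned.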
Embeddings
Vector embeddings come from a HuggingFace Text Embeddings Inference instance (embedding/client.go).
| Setting | Default (local dev) | Default (production) |
|---|---|---|
| Model | BAAI/bge-small-en-v1.5 | BAAI/bge-m3 |
| Vector size | 384 | 1024 |
| Configurable via | EMBEDDING_MODEL + VECTOR_SIZE | same |
bge-m3 is multilingual + supports both dense and sparse vectors; bge-small-en-v1.5 is the lightweight English-only choice for laptops and CI. Pick based on the corpus you index — switching mid-deployment requires re-embedding everything (vector size mismatch will reject queries).
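For reference, a TEI embedding request returns one vector per input, and the length of that vector is what VECTOR_SIZE must match; the request shape below is illustrative and exact fields depend on your TEI version:

```json
{ "inputs": "where do we validate JWT bearer tokens?" }
```

With bge-small-en-v1.5 the returned vector has 384 floats; with bge-m3 it has 1024, which is why the two models cannot share a Qdrant collection.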
Ingestion
Content is pushed onto the KNOWLEDGE JetStream stream under kb.ingest.> subjects (ingestion/). The kb-worker subscribes, extracts text per source type, chunks it, embeds each chunk, and writes the results to both the Meilisearch and Qdrant indexes atomically.
Source types ingested:
| Source | Trigger | Chunk strategy |
|---|---|---|
| Code | git.push events | Per-symbol via tree-sitter; falls back to fixed-window for unsupported languages |
| Documents | Save in the docs UI | Heading-aware splitting |
| Issues | issue.* events | Title + description + comments as a single chunk per issue |
| Pull requests | pr.* events | Title + body + diff summary + reviews per chunk |
| Commits | git.push events | Commit message + files changed |
Re-ingestion is incremental — only changed files are re-embedded on each push. The same pipeline is exposed via POST /api/v1/kb/ingest for custom sources.
The ingestion path uses the LLM gateway for embedding calls when configured to do so. Token usage attribution flows through the same usage stream as chat completions.
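Custom producers can either call POST /api/v1/kb/ingest or publish onto the stream directly. Here is a minimal Go sketch of the latter; the subject suffix and the payload fields are illustrative assumptions, since the worker's real contract lives in ingestion/:

```go
package main

import (
	"log"

	"github.com/nats-io/nats.go"
)

func main() {
	// Connect to the same NATS server the kb-worker uses (KB_NATS_URL).
	nc, err := nats.Connect("nats://localhost:4222")
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Drain()

	js, err := nc.JetStream()
	if err != nil {
		log.Fatal(err)
	}

	// Any subject under kb.ingest.> is captured by the KNOWLEDGE stream.
	// The subject suffix and JSON fields below are made up for illustration.
	payload := []byte(`{"sourceType":"doc","sourceId":"runbooks/oncall.md","content":"Escalation steps for the on-call rotation."}`)
	if _, err := js.Publish("kb.ingest.doc", payload); err != nil {
		log.Fatal(err)
	}
	log.Println("queued for ingestion")
}
```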
Search responses
A search call returns chunks plus enough metadata to render a citation:
```json
{
  "query": "auth middleware jwt",
  "mode": "hybrid",
  "totalHits": 42,
  "queryTimeMs": 87,
  "results": [
    {
      "documentId": "doc_…",
      "chunkId": "chk_…",
      "title": "internal/auth/auth.go",
      "content": "func Middleware(next http.Handler) http.Handler { … }",
      "sourceType": "code",
      "sourceId": "github.com/your-org/repo:internal/auth/auth.go",
      "score": 0.87,
      "highlights": { "content": ["…validates <em>JWT</em> Bearer tokens…"] }
    },
    …
  ]
}
```
The chat agent receives results in this shape via the search_knowledge_base tool. The web UI’s chat surface renders sourceType + title as clickable citation chips below each AI message.
Cross-org isolation
Every chunk is tagged with its source orgId. Search queries always include the caller’s org as a filter — Meilisearch via a filterable attribute, Qdrant via a payload filter. There’s no shared knowledge across orgs even on a single deployment.
Within an org, results respect RBAC — chunks from repos a user doesn’t have read access to are filtered out before returning. The filter happens in the API layer, not the index, so an admin browsing the underlying Qdrant or Meilisearch directly sees everything.
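To make the isolation concrete, the Qdrant half of an org-scoped query applies a payload filter roughly like the one below (the request layout, the exact payload key spelling, and the truncated vector are assumptions for illustration; Meilisearch gets an equivalent expression on its filterable orgId attribute):

```json
{
  "vector": [0.012, -0.034, 0.081],
  "limit": 10,
  "filter": {
    "must": [
      { "key": "orgId", "match": { "value": "org_1234" } }
    ]
  }
}
```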
@-mentions
The chat input recognizes @-mention prefixes that pull specific entities into the conversation context as full payloads (not just chunks):
| Mention | Brings in |
|---|---|
| @<repo-name> | Repo summary + recent activity |
| @<file-path> | Full file contents at the current branch HEAD |
| @<issue-id> | Issue title, description, comments, linked PRs |
| @<pr-number> | PR title, body, diff, reviews, CI status |
| @<doc-title> | Full document body |
Mentions are an autocomplete affordance in the chat input — typing @ opens a fuzzy filter over the user’s recent + relevant entities. Internally, mentioned entities are appended to the query’s source list and bypass retrieval ranking (they’re guaranteed to land in context).
The @proxifai mention on issue and PR comments has a different purpose — it triggers the comment-trigger workflow that dispatches the named agent.
Configuration
| Env var | Default | Effect |
|---|---|---|
| KB_ENABLED | false | Master switch — when false, /api/v1/kb/* returns 404 |
| QDRANT_URL | required if KB enabled | Qdrant gRPC/HTTP endpoint |
| QDRANT_API_KEY | optional | Qdrant cloud auth |
| MEILISEARCH_URL | required if KB enabled | Meilisearch endpoint |
| MEILISEARCH_API_KEY | optional | Meilisearch master key |
| EMBEDDING_URL | required if KB enabled | TEI service URL |
| EMBEDDING_MODEL | BAAI/bge-small-en-v1.5 | Tag for ingest metadata |
| VECTOR_SIZE | 384 | Must match the model |
| RERANKER_URL | optional | TEI reranker; if unset, no reranking |
| KB_NATS_URL | inherits | NATS for ingestion stream |
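A minimal local-dev configuration might look like this (hosts and ports are examples; only the variable names come from the table above):

```
KB_ENABLED=true
QDRANT_URL=http://localhost:6333
MEILISEARCH_URL=http://localhost:7700
EMBEDDING_URL=http://localhost:8080
EMBEDDING_MODEL=BAAI/bge-small-en-v1.5
VECTOR_SIZE=384
# RERANKER_URL left unset, so results are returned without reranking
```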
REST endpoints
| Method · Path | Purpose |
|---|---|
| POST /api/v1/kb/search | Search — {query, mode, sourceType, limit, offset} |
| POST /api/v1/kb/ingest | Manually queue a chunk for ingestion |
| GET /api/v1/kb/sources | List ingested source types and counts |
| GET /api/v1/kb/health | Embedding service + index health |
| DELETE /api/v1/kb/documents/{id} | Remove a document and its chunks |
pfai kb search, pfai kb sources, and pfai kb ingest wrap the most common ones.
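For example, a hybrid search scoped to code chunks sends a body like this to POST /api/v1/kb/search (field names from the table above; values are illustrative):

```json
{
  "query": "auth middleware jwt",
  "mode": "hybrid",
  "sourceType": "code",
  "limit": 10,
  "offset": 0
}
```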
See also
The four chat modes that draw on this layer — and which tools each one gets.
Where embedding calls go through — provider routing for the embedding model itself.
A separate graph layer for blast-radius and community detection — complementary, not duplicate.
Listen for git.push / issue.created events to drive your own ingestion pipelines.