Knowledge Base

A RAG layer over your repos, docs, issues, and PRs — Meilisearch for instant full-text search, Qdrant for semantic vector search, Reciprocal Rank Fusion to combine them, and an optional reranker on top.

The knowledge base is ProxifAI’s retrieval layer. Every chat mode draws on it, every @-mention pulls from it, and every agent that needs grounding hits it before the model. It’s a hybrid retrieval system — Meilisearch for full-text speed, Qdrant for vector semantics, Reciprocal Rank Fusion to merge them, and an optional HuggingFace TEI reranker for the final ordering. Source: internal/knowledgebase/.

How it fits together

Source content                Ingestion                  Stores                Retrieval
──────────────                ─────────                  ──────                ─────────
git pushes      ─┐                                      Meilisearch         /api/v1/kb/search
docs saved      ─┤  →  extraction → chunks  ─→  index ─►  (instant)            ↑
issues / PRs    ─┤      │                              Qdrant                  │
@mentions       ─┘      └─ embedding (TEI)  ─→  index ─► (semantic)            │

                                                                          ChatHandler
                                                                          ToolsForMode
                                                                          MCP / @-mentions

The pipeline is opt-in via KB_ENABLED=true plus connection details for Qdrant and Meilisearch. Without it, ProxifAI works fine but chat retrieval falls back to whatever the chat agent’s other tools (forge read, code intelligence, MCP) can surface.
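
A minimal sketch of what that master switch implies for routing, with a hypothetical middleware rather than ProxifAI's actual handler:

package kb

import (
	"net/http"
	"os"
)

// kbGate illustrates the KB_ENABLED master switch: when the flag is
// anything but "true", every /api/v1/kb/* route returns 404, matching
// the behavior listed in the configuration table below.
func kbGate(next http.Handler) http.Handler {
	enabled := os.Getenv("KB_ENABLED") == "true"
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !enabled {
			http.NotFound(w, r)
			return
		}
		next.ServeHTTP(w, r)
	})
}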

Three search modes

POST /api/v1/kb/search accepts a mode field. Implementation in search/search.go.

Mode               Engine                                       When
────               ──────                                       ────
instant            Meilisearch full-text + typo tolerance       Exact keywords, identifiers, file paths
semantic           Qdrant vector similarity                     Conceptual queries where wording differs from source
hybrid (default)   Both in parallel + Reciprocal Rank Fusion    Most queries — gets the best of both

When mode is omitted, hybrid is used. RRF combines the instant and semantic rankings via score = sum(1 / (k + rank_in_each_engine)) with a default of k=60. The hybrid path is a single round-trip to the API — both engines are queried in parallel, and the results are merged, deduplicated by chunk ID, and truncated to the requested limit.
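
The fusion step is small enough to sketch. This is an illustrative reimplementation rather than the code in search/search.go; it assumes each engine hands back chunk IDs in rank order:

package search

import "sort"

// rrfMerge fuses two rank-ordered ID lists with Reciprocal Rank Fusion.
// A chunk returned by both engines accumulates score from each, which is
// what deduplication by chunk ID amounts to here. Ranks are 1-based.
func rrfMerge(instant, semantic []string, k float64, limit int) []string {
	scores := make(map[string]float64)
	for rank, id := range instant {
		scores[id] += 1.0 / (k + float64(rank+1))
	}
	for rank, id := range semantic {
		scores[id] += 1.0 / (k + float64(rank+1))
	}
	merged := make([]string, 0, len(scores))
	for id := range scores {
		merged = append(merged, id)
	}
	sort.Slice(merged, func(i, j int) bool { return scores[merged[i]] > scores[merged[j]] })
	if len(merged) > limit {
		merged = merged[:limit]
	}
	return merged
}

With k=60, a chunk ranked first by both engines scores 2/61 ≈ 0.0328, comfortably ahead of a chunk ranked first by only one engine (1/61 ≈ 0.0164).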

Reranking

If a RerankerClient is configured (HuggingFace TEI, default model bge-reranker), the merged result list is reranked before being returned. The reranker scores each (query, chunk) pair jointly rather than comparing precomputed vectors, and bumps the most relevant chunks to the top — the trade-off is roughly 50–200 ms of added latency. Disable it by leaving RERANKER_URL unset.
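
Reranking is a second HTTP round-trip. The sketch below assumes TEI's /rerank route and its (index, score) response entries; verify both against the TEI version you deploy:

package search

import (
	"bytes"
	"context"
	"encoding/json"
	"net/http"
)

// rerankResult mirrors one entry of a TEI /rerank response: the position
// of the input text plus its relevance score for the query.
type rerankResult struct {
	Index int     `json:"index"`
	Score float64 `json:"score"`
}

// rerank scores each (query, text) pair and returns TEI's ordering.
// Sketch only: no retries, timeouts, or auth.
func rerank(ctx context.Context, rerankerURL, query string, texts []string) ([]rerankResult, error) {
	body, err := json.Marshal(map[string]any{"query": query, "texts": texts})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, rerankerURL+"/rerank", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	var results []rerankResult
	return results, json.NewDecoder(resp.Body).Decode(&results)
}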

Embeddings

Vector embeddings come from a HuggingFace Text Embeddings Inference (TEI) instance (embedding/client.go).

Setting            Default (local dev)              Default (production)
───────            ───────────────────              ────────────────────
Model              BAAI/bge-small-en-v1.5           BAAI/bge-m3
Vector size        384                              1024
Configurable via   EMBEDDING_MODEL + VECTOR_SIZE    same

bge-m3 is multilingual and supports both dense and sparse vectors; bge-small-en-v1.5 is the lightweight English-only choice for laptops and CI. Pick based on the corpus you index — switching models mid-deployment requires re-embedding everything, since a vector-size mismatch will cause queries to be rejected.
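
The embedding call itself is a plain HTTP round-trip. The sketch below assumes TEI's /embed route and its array-of-vectors JSON response; the production client is embedding/client.go, so treat this as illustrative:

package embedding

import (
	"bytes"
	"context"
	"encoding/json"
	"net/http"
)

// embed returns one dense vector per input string. With
// BAAI/bge-small-en-v1.5 each vector has 384 dimensions; with BAAI/bge-m3,
// 1024. The Qdrant collection's vector size must match (VECTOR_SIZE).
func embed(ctx context.Context, teiURL string, inputs []string) ([][]float32, error) {
	body, err := json.Marshal(map[string]any{"inputs": inputs})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, teiURL+"/embed", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	var vectors [][]float32
	return vectors, json.NewDecoder(resp.Body).Decode(&vectors)
}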

Ingestion

Content is pushed onto the KNOWLEDGE JetStream stream under kb.ingest.> subjects (ingestion/). The kb-worker subscribes, extracts text per source type, chunks it, embeds each chunk, and writes to both Meilisearch and Qdrant atomically.
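
The producer side of that contract could look like the following, using nats.go; the subject suffix and payload fields are illustrative, since the real message schema lives in ingestion/:

package main

import (
	"encoding/json"
	"log"
	"os"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(os.Getenv("KB_NATS_URL"))
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Drain()

	js, err := nc.JetStream()
	if err != nil {
		log.Fatal(err)
	}

	// Hypothetical payload; the actual ingest message schema is defined
	// in ingestion/ and may differ.
	payload, err := json.Marshal(map[string]any{
		"sourceType": "document",
		"title":      "Runbook: incident response",
		"content":    "When an alert fires, first check...",
	})
	if err != nil {
		log.Fatal(err)
	}

	// Publish under the kb.ingest.> subject space; the kb-worker picks
	// it up from the KNOWLEDGE stream.
	if _, err := js.Publish("kb.ingest.document", payload); err != nil {
		log.Fatal(err)
	}
}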

Source types ingested:

Source          Trigger                Chunk strategy
──────          ───────                ──────────────
Code            git.push events        Per-symbol via tree-sitter; falls back to fixed-window for unsupported languages
Documents       Save in the docs UI    Heading-aware splitting
Issues          issue.* events         Title + description + comments as a single chunk per issue
Pull requests   pr.* events            Title + body + diff summary + reviews per chunk
Commits         git.push events        Commit message + files changed

Re-ingestion is incremental — only changed files are re-embedded on each push. Custom sources can be queued directly via POST /api/v1/kb/ingest.

When configured to do so, the ingestion path routes embedding calls through the LLM gateway, and token usage attribution flows through the same usage stream as chat completions.

Search responses

A search call returns chunks plus enough metadata to render a citation:

{
  "query": "auth middleware jwt",
  "mode": "hybrid",
  "totalHits": 42,
  "queryTimeMs": 87,
  "results": [
    {
      "documentId": "doc_…",
      "chunkId": "chk_…",
      "title": "internal/auth/auth.go",
      "content": "func Middleware(next http.Handler) http.Handler { … }",
      "sourceType": "code",
      "sourceId": "github.com/your-org/repo:internal/auth/auth.go",
      "score": 0.87,
      "highlights": { "content": ["…validates <em>JWT</em> Bearer tokens…"] }
    },
    …
  ]
}

The chat agent receives results in this shape via the search_knowledge_base tool. The web UI’s chat surface renders sourceType + title as clickable citation chips below each AI message.
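
A Go client can mirror that shape directly. The structs below follow the JSON above and the documented request fields ({query, mode, sourceType, limit, offset}); the client itself is a sketch with auth omitted:

package kb

import (
	"bytes"
	"encoding/json"
	"net/http"
)

// SearchResult and SearchResponse mirror the JSON shape shown above.
type SearchResult struct {
	DocumentID string              `json:"documentId"`
	ChunkID    string              `json:"chunkId"`
	Title      string              `json:"title"`
	Content    string              `json:"content"`
	SourceType string              `json:"sourceType"`
	SourceID   string              `json:"sourceId"`
	Score      float64             `json:"score"`
	Highlights map[string][]string `json:"highlights"`
}

type SearchResponse struct {
	Query       string         `json:"query"`
	Mode        string         `json:"mode"`
	TotalHits   int            `json:"totalHits"`
	QueryTimeMs int            `json:"queryTimeMs"`
	Results     []SearchResult `json:"results"`
}

// Search performs one hybrid query against the KB API.
func Search(baseURL, query string) (*SearchResponse, error) {
	body, err := json.Marshal(map[string]any{"query": query, "mode": "hybrid", "limit": 10})
	if err != nil {
		return nil, err
	}
	resp, err := http.Post(baseURL+"/api/v1/kb/search", "application/json", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	var out SearchResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, err
	}
	return &out, nil
}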

Cross-org isolation

Every chunk is tagged with its source orgId. Search queries always include the caller’s org as a filter — Meilisearch via a filterable attribute, Qdrant via a payload filter. There’s no shared knowledge across orgs even on a single deployment.

Within an org, results respect RBAC — chunks from repos a user doesn’t have read access to are filtered out before returning. The filter happens in the API layer, not the index, so an admin browsing the underlying Qdrant or Meilisearch directly sees everything.
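
Concretely, each engine expresses the same org constraint in its own dialect. The helper below is an assumed wiring, not code from the source; it renders a Meilisearch filter expression and the equivalent Qdrant payload filter:

package kb

import "fmt"

// orgFilters renders the caller's org as an isolation filter for each
// engine: a Meilisearch filter expression over the filterable orgId
// attribute, and a Qdrant payload filter in its JSON "must" form.
func orgFilters(orgID string) (meiliFilter string, qdrantFilter map[string]any) {
	meiliFilter = fmt.Sprintf("orgId = %q", orgID)
	qdrantFilter = map[string]any{
		"must": []map[string]any{
			{"key": "orgId", "match": map[string]any{"value": orgID}},
		},
	}
	return meiliFilter, qdrantFilter
}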

@-mentions

The chat input recognizes @-mention prefixes that pull specific entities into the conversation context as full payloads (not just chunks):

Mention        Brings in
───────        ─────────
@<repo-name>   Repo summary + recent activity
@<file-path>   Full file contents at the current branch HEAD
@<issue-id>    Issue title, description, comments, linked PRs
@<pr-number>   PR title, body, diff, reviews, CI status
@<doc-title>   Full document body

Mentions are an autocomplete affordance in the chat input — typing @ opens a fuzzy filter over the user’s recent + relevant entities. Internally, mentioned entities are appended to the query’s source list and bypass retrieval ranking (they’re guaranteed to land in context).
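
The bypass amounts to pinning: mentioned payloads enter the context first, and ranked retrieval fills only the remaining budget. Everything in the sketch below is hypothetical, including the names:

package kb

// contextItem is a hypothetical unit of model context.
type contextItem struct {
	Source  string
	Content string
	Pinned  bool // true for @-mentions; never displaced by ranking
}

// assembleContext places mentioned entities ahead of ranked chunks, so a
// mention always lands in context even when retrieval returns more than
// the limit allows.
func assembleContext(mentions, ranked []contextItem, limit int) []contextItem {
	out := make([]contextItem, 0, limit)
	for _, m := range mentions {
		m.Pinned = true
		out = append(out, m)
	}
	for _, c := range ranked {
		if len(out) >= limit {
			break
		}
		out = append(out, c)
	}
	return out
}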

The @proxifai mention on issue and PR comments has a different purpose — it triggers the comment-trigger workflow that dispatches the named agent.

Configuration

Env var               Default                   Effect
───────               ───────                   ──────
KB_ENABLED            false                     Master switch — when false, /api/v1/kb/* returns 404
QDRANT_URL            required if KB enabled    Qdrant gRPC/HTTP endpoint
QDRANT_API_KEY        optional                  Qdrant Cloud auth
MEILISEARCH_URL       required if KB enabled    Meilisearch endpoint
MEILISEARCH_API_KEY   optional                  Meilisearch master key
EMBEDDING_URL         required if KB enabled    TEI service URL
EMBEDDING_MODEL       BAAI/bge-small-en-v1.5    Model tag recorded in ingest metadata
VECTOR_SIZE           384                       Must match the embedding model
RERANKER_URL          optional                  TEI reranker endpoint; if unset, no reranking
KB_NATS_URL           inherits                  NATS endpoint for the ingestion stream

REST endpoints

Method · Path                      Purpose
─────────────                      ───────
POST /api/v1/kb/search             Search — {query, mode, sourceType, limit, offset}
POST /api/v1/kb/ingest             Manually queue a chunk for ingestion
GET /api/v1/kb/sources             List ingested source types and counts
GET /api/v1/kb/health              Embedding service + index health
DELETE /api/v1/kb/documents/{id}   Remove a document and its chunks

pfai kb search, pfai kb sources, and pfai kb ingest wrap the most common ones.

See also