
Documents

CRDT-based collaborative documents — Y.js over WebSockets with PostgreSQL persistence — for specs, RFCs, runbooks, and decision records that live alongside your code.

Documents are ProxifAI’s collaborative writing surface. They are not separate from the rest of the platform: every document belongs to a team, gets indexed into the knowledge base, and can be referenced by @-mention from chat. Real-time collaboration is implemented as a Y.js CRDT over WebSockets (internal/docs/collab/), with updates persisted in the document_yjs_update table.

How collaboration works

┌──────────┐  WebSocket: y-websocket frames  ┌──────────────────────┐
│ Editor 1 │ ◄──────────────────────────────► │ /api/v1/documents/   │
│ (TipTap) │                                  │   {id}/collab        │
└──────────┘                                  │                      │
                                              │  fan-out via NATS    │ ──► Other replicas
┌──────────┐                                  │  (cross-pod sync)    │
│ Editor 2 │ ◄──────────────────────────────► │                      │
│ (TipTap) │                                  │  persist updates     │
└──────────┘                                  │  to document_yjs_    │ ──► Postgres
                                              │  update on each frame│
                                              └──────────────────────┘

Every edit produces a Y.js update; the WebSocket handler logs each one to document_yjs_update, then re-broadcasts to every other connected client. Late joiners replay the log to catch up to the current state — no separate snapshot needed for correctness, though periodic compaction trims the log.

This means:

  • No conflict resolution dialogs. CRDT semantics resolve concurrent edits deterministically.
  • Offline edits replay cleanly. A client that disconnects, edits, and reconnects will merge its local changes with everything that happened while it was offline.
  • Cross-replica sync. Multiple ProxifAI replicas fan out updates via the embedded NATS server, so editors connected to different pods stay in sync.
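
The persist-then-broadcast loop and the late-joiner replay described above can be pictured with a small in-memory model. This is only a sketch: CollabRoom, Client, and handleUpdate are hypothetical names, the array stands in for rows in document_yjs_update, and the real handler in internal/docs/collab/ (including NATS fan-out and Y.js merging) is not shown here.

```typescript
// Toy model of the collaboration flow. All names are hypothetical.
type Update = Uint8Array;

interface Client {
  id: string;
  received: Update[];
}

class CollabRoom {
  // Stands in for rows in document_yjs_update, in insertion order.
  private log: Update[] = [];
  private clients = new Map<string, Client>();

  // A joining client first receives the whole log, then live updates.
  join(id: string): Client {
    const client: Client = { id, received: [...this.log] };
    this.clients.set(id, client);
    return client;
  }

  leave(id: string): void {
    this.clients.delete(id);
  }

  // Persist first, then re-broadcast to everyone except the sender.
  handleUpdate(senderId: string, update: Update): void {
    this.log.push(update);
    this.clients.forEach((client, id) => {
      if (id !== senderId) client.received.push(update);
    });
  }
}

const room = new CollabRoom();
const alice = room.join("alice");
const bob = room.join("bob");
room.handleUpdate("alice", new Uint8Array([1]));
room.handleUpdate("bob", new Uint8Array([2]));
// A late joiner replays both logged updates on connect.
const carol = room.join("carol");
```

In this model the sender never receives its own update back, and a client that joins late sees exactly what the log holds. Merging the updates into a document state is the job of the CRDT itself and is deliberately left out.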

The endpoint is tenant-scoped by the same RLS policy that protects every other table. Even if a Y.js client somehow learned a doc ID it shouldn’t have, the auth middleware rejects the WebSocket upgrade with a 403.

Editor surface

The frontend uses TipTap (built on ProseMirror) with the Yjs collaboration extension. What you can put in a document:

  • Rich text — headings, lists (bulleted, numbered, task), tables, blockquotes, callouts
  • Inline marks — bold, italic, underline, strike, subscript, superscript, highlight, color, links
  • Code blocks — syntax-highlighted via Shiki, language selectable per block
  • Diagrams — embedded Excalidraw (@excalidraw/excalidraw 0.18) for hand-drawn architecture diagrams that round-trip with the document
  • Code editor blocks — Monaco-powered code editing for technical content (@monaco-editor/react) with full IntelliSense for the languages you’d expect from VS Code
  • Embeds — paste a YouTube link to embed a video; reference issues, PRs, files inline with @-mentions
  • Images — paste-to-upload from clipboard
  • Tasks: [ ] task lists with checkboxes that round-trip to Markdown

Every collaborator’s cursor and selection are visible (the collaboration-caret extension paints them in the editor). The presence layer rides on Yjs awareness, not a separate channel.

Hierarchy and scoping

A document is owned by a team and optionally a parent document. The schema:

Field        Notes
teamId       Required: documents are team-scoped
parentId     Optional: nest documents into a tree (e.g. RFCs under “Engineering”)
creatorId    Who created it
icon         Optional emoji or icon URL for the sidebar
visibility   team (default) | private (creator-only) | org
isFavorite   Per-user star, surfaced in the sidebar

There’s no project-level scoping by default — link to a project via the document’s body (or, conventionally, drop project-scoped specs in a doc tree under Specs/{ProjectName}).
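
To make the ownership model concrete, here is a minimal sketch of the fields above as a TypeScript type, plus a helper that turns a flat list into the sidebar tree by following parentId. The field names come from the schema table; the id field, the concrete types, and the buildTree helper are assumptions made for illustration, not the platform’s actual definitions.

```typescript
// Field names are from the schema table; types and `id` are assumptions.
type Visibility = "team" | "private" | "org";

interface DocumentMeta {
  id: string; // assumed identifier field
  title: string;
  teamId: string;
  parentId?: string;
  creatorId: string;
  icon?: string;
  visibility: Visibility;
  isFavorite: boolean;
}

interface DocumentNode {
  doc: DocumentMeta;
  children: DocumentNode[];
}

// Build a tree from a flat list by following parentId (hypothetical helper).
function buildTree(docs: DocumentMeta[]): DocumentNode[] {
  const nodes = new Map<string, DocumentNode>();
  docs.forEach((doc) => nodes.set(doc.id, { doc, children: [] }));

  const roots: DocumentNode[] = [];
  nodes.forEach((node) => {
    const parent = node.doc.parentId ? nodes.get(node.doc.parentId) : undefined;
    if (parent) parent.children.push(node);
    else roots.push(node); // no parent, or parent missing from the list
  });
  return roots;
}

const base = {
  teamId: "team-1",
  creatorId: "user-1",
  visibility: "team" as const,
  isFavorite: false,
};
const tree = buildTree([
  { ...base, id: "specs", title: "Specs" },
  { ...base, id: "billing", title: "Billing", parentId: "specs" },
]);
```

Here the “Billing” document nests under “Specs”, which mirrors the Specs/{ProjectName} convention for project-scoped documents.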

Knowledge base integration

Documents are indexed by the knowledge base as soon as they’re saved, with heading-aware chunking. From AI chat, reference any document inline:

@OAuth Migration Plan — what's the current status?

The mention pulls the full document body into context (not just chunks), so the chat agent has the complete spec to work from. This is the core of “the AI knows what you’ve decided” — once a decision is in a document, every subsequent AI conversation grounds against it.
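
The page does not describe how heading-aware chunking is implemented. As one plausible reading, the sketch below splits Markdown at headings and tags each chunk with its heading path. Chunk and chunkByHeading are hypothetical names, and a real chunker would also need to handle size limits and heading-like lines inside code blocks.

```typescript
interface Chunk {
  headingPath: string[]; // e.g. ["OAuth Migration Plan", "Rollout"]
  text: string;
}

// Hypothetical sketch: one chunk per heading section, labeled with its ancestors.
// Limitation: lines starting with "#" inside code blocks are treated as headings.
function chunkByHeading(markdown: string): Chunk[] {
  const chunks: Chunk[] = [];
  const stack: { level: number; title: string }[] = [];
  let body: string[] = [];

  // Emit the text collected so far under the current heading path.
  const flush = (): void => {
    const text = body.join("\n").trim();
    if (text.length > 0) {
      chunks.push({ headingPath: stack.map((h) => h.title), text });
    }
    body = [];
  };

  for (const line of markdown.split("\n")) {
    const match = /^(#{1,6})\s+(.+)$/.exec(line);
    if (match === null) {
      body.push(line);
      continue;
    }
    flush();
    const level = match[1].length;
    // Drop headings at the same or deeper level before pushing the new one.
    while (stack.length > 0 && stack[stack.length - 1].level >= level) {
      stack.pop();
    }
    stack.push({ level, title: match[2].trim() });
  }
  flush();
  return chunks;
}

const chunks = chunkByHeading(
  "# Plan\nIntro\n## Rollout\nStep one\n## Risks\nNone"
);
```

Carrying the heading path with each chunk keeps a retrieved fragment tied to its section. The @-mention path is different: it pulls in the full document body instead of chunks.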

REST API

Method · Path                          Purpose
GET /api/v1/documents                  List with filters (team, parent, visibility)
POST /api/v1/documents                 Create: {title, teamId, parentId?, content?}
GET /api/v1/documents/{id}             Read with content
PATCH /api/v1/documents/{id}           Update title / icon / visibility / favorite
DELETE /api/v1/documents/{id}          Soft-delete
GET /api/v1/documents/{id}/collab      WebSocket: y-websocket protocol for edits

The content field on the REST surface holds the latest snapshot (rendered Markdown for the API consumer). Live edits go through the WebSocket — the REST PATCH is for metadata, not body content.
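
As a rough sketch of how a client might address these endpoints, the helpers below build a create request and the collab WebSocket URL. The paths and the create payload come from the table above. The base URL, the Bearer-token header, and the helper names are assumptions, since this page does not say how requests authenticate.

```typescript
// Payload shape from the POST row above: {title, teamId, parentId?, content?}.
interface CreateDocumentBody {
  title: string;
  teamId: string;
  parentId?: string;
  content?: string;
}

// Hypothetical helper. Bearer auth is an assumption, not documented here.
function createDocumentRequest(
  baseUrl: string,
  token: string,
  body: CreateDocumentBody
) {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/api/v1/documents`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${token}`,
      },
      body: JSON.stringify(body),
    },
  };
}

// Hypothetical helper: maps http(s) to ws(s) for the collab endpoint.
function collabUrl(baseUrl: string, documentId: string): string {
  const wsBase = baseUrl.replace(/^http/, "ws").replace(/\/+$/, "");
  return `${wsBase}/api/v1/documents/${encodeURIComponent(documentId)}/collab`;
}

const req = createDocumentRequest("https://proxifai.example", "TOKEN", {
  title: "OAuth Migration Plan",
  teamId: "team-1",
});
const liveUrl = collabUrl("https://proxifai.example/", "doc-1");
// Sending it would look like: await fetch(req.url, req.init)
```

Keeping the two helpers separate reflects the split in the API: metadata goes through REST, while body edits travel over the WebSocket.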

The pfai doc list/view/create/update/delete commands wrap the metadata endpoints. Body editing is browser-only.

Tradeoffs

  • No version history beyond the y-update log. The log preserves every change but isn’t presented as named “versions” you can restore; restoration today means reading older log rows directly. A user-facing version-history feature is on the roadmap.
  • No cross-document linking graph. You can paste links between docs, but there’s no automatic backlink panel.
  • No comments yet. Discussion happens on the issue or PR the document references.
