docstack-ocr

Dual-VLM OCR + tenant-authored extraction. Self-hosted, model-agnostic, audit-first.

Built for system integrators

docstack-ocr is a self-hosted document understanding platform designed to plug into a larger system — not to be the system. Submit a document, poll for completion, retrieve a rich CanonicalOcrDocument plus structured fields conforming to a tenant-authored template.
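The submit, poll, retrieve flow described above could be driven from the client side with a loop like the one below. This is a minimal sketch, not the documented client: the status names (`completed`, `failed`) and the `fetch_status` callable are assumptions made for illustration, and the real endpoints and states are defined in the integration guide.

```python
import time
from typing import Callable

# Hypothetical terminal states; the real API may use different names.
TERMINAL_STATES = {"completed", "failed"}


def poll_until_done(
    fetch_status: Callable[[], str],
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status() until it returns a terminal state or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"document still {status!r} after {timeout_s}s")
        sleep(interval_s)
```

In a real integration, `fetch_status` would wrap an HTTP GET against the document's status endpoint and return its status field. Injecting it as a callable keeps the loop independent of any particular HTTP library.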

Dual-VLM OCR

Every region is OCR’d by PaddleOCR-VL-1.6 and GLM-OCR in parallel. When the two agree, Paddle’s output takes priority; GLM serves as the fallback and cross-check. Per-block risk flags surface advisory data.
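To make the dual-engine idea concrete, here is a rough sketch of how two OCR readings of one block might be reconciled. It is an illustration only: the `BlockResult` type, the flag names, the whitespace and case normalization, and the choice to keep Paddle's text on disagreement are all assumptions, not the platform's actual merge logic.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BlockResult:
    text: str
    source: str  # which engine's text was kept
    risk_flags: List[str] = field(default_factory=list)


def _norm(text: str) -> str:
    # Compare readings ignoring case and runs of whitespace (assumption).
    return " ".join(text.split()).casefold()


def reconcile(paddle: Optional[str], glm: Optional[str]) -> BlockResult:
    """Pick one reading per block and attach advisory risk flags."""
    p = (paddle or "").strip()
    g = (glm or "").strip()
    if p and g:
        if _norm(p) == _norm(g):
            return BlockResult(p, "paddle")  # agreement: Paddle has priority
        # Assumption: on disagreement, keep Paddle's text but flag the block.
        return BlockResult(p, "paddle", ["engine_disagreement"])
    if p:
        return BlockResult(p, "paddle", ["single_engine"])
    if g:
        return BlockResult(g, "glm", ["paddle_missing"])  # GLM as fallback
    return BlockResult("", "none", ["no_text"])
```

The flags here are advisory: they do not block the result, they only mark blocks a reviewer or downstream rule may want to look at.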

Tenant-authored templates

Document types are defined at runtime. Each template pairs a JSON Schema (Draft 2020-12) with a declarative validation DSL (required, regex, arithmetic, date_plausible, enum, checksum, cross_agreement). There are zero hardcoded templates.
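As an illustration of what a tenant-authored template could look like, the sketch below pairs a Draft 2020-12 schema with a handful of rules and a tiny evaluator for four of the rule kinds. The template layout, the rule keys (`rule`, `field`, `sum_of`, `equals`, `tolerance`), and the invoice fields are hypothetical; the real DSL syntax is defined in the integration guide.

```python
import re
from typing import Any, Dict, List

# Hypothetical template: field names, rule keys, and layout are illustrative only.
INVOICE_TEMPLATE: Dict[str, Any] = {
    "schema": {
        "$schema": "https://json-schema.org/draft/2020-12/schema",
        "type": "object",
        "properties": {
            "invoice_number": {"type": "string"},
            "currency": {"type": "string"},
            "net": {"type": "number"},
            "tax": {"type": "number"},
            "total": {"type": "number"},
        },
    },
    "rules": [
        {"rule": "required", "field": "invoice_number"},
        {"rule": "regex", "field": "invoice_number", "pattern": r"INV-\d{6}"},
        {"rule": "enum", "field": "currency", "values": ["EUR", "USD"]},
        {"rule": "arithmetic", "sum_of": ["net", "tax"], "equals": "total", "tolerance": 0.01},
    ],
}


def run_rules(fields: Dict[str, Any], rules: List[Dict[str, Any]]) -> List[str]:
    """Return one 'kind:field' entry for every rule that fails."""
    failures = []
    for spec in rules:
        kind = spec["rule"]
        if kind == "required":
            ok = fields.get(spec["field"]) not in (None, "")
        elif kind == "regex":
            ok = re.fullmatch(spec["pattern"], str(fields.get(spec["field"], ""))) is not None
        elif kind == "enum":
            ok = fields.get(spec["field"]) in spec["values"]
        elif kind == "arithmetic":
            total = sum(fields.get(name, 0) for name in spec["sum_of"])
            ok = abs(total - fields.get(spec["equals"], 0)) <= spec["tolerance"]
        else:
            raise ValueError(f"rule kind not covered by this sketch: {kind}")
        if not ok:
            failures.append(f"{kind}:{spec.get('field', spec.get('equals'))}")
    return failures
```

Because both the schema and the rules are plain data, a new document type can be added by storing a new template rather than shipping code.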

Two auth modes

bearer (built-in users + API keys + sessions + CSRF) or trusted_headers (your gateway forwards X-Tenant-Id / X-Actor-Id / X-Actor-Roles).
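In trusted_headers mode the gateway is responsible for asserting identity. A gateway-side helper for building the three headers named above might look like this sketch. The function name and the comma-separated encoding of roles are assumptions; check the integration guide for the exact format the service expects.

```python
from typing import Dict, Iterable


def build_trusted_headers(tenant_id: str, actor_id: str, roles: Iterable[str]) -> Dict[str, str]:
    """Build the identity headers a gateway would forward upstream."""
    role_list = list(roles)
    for value in (tenant_id, actor_id, *role_list):
        # Reject empty values and CR/LF to avoid header injection.
        if not value or "\r" in value or "\n" in value:
            raise ValueError("header values must be non-empty and free of CR/LF")
    if any("," in role for role in role_list):
        raise ValueError("role names must not contain commas")
    return {
        "X-Tenant-Id": tenant_id,
        "X-Actor-Id": actor_id,
        # Assumption: roles travel as a single comma-separated value.
        "X-Actor-Roles": ",".join(role_list),
    }
```

With any trusted-header scheme, the gateway should also strip client-supplied copies of these headers before adding its own, so that callers cannot assert an identity themselves.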

Model-agnostic LLM

vLLM, Ollama, OpenAI, or native Gemini. Swap via runtime infrastructure config; no code changes.

Review-first

Validation failures route to a queue with approve / reprocess / field-level override. Every override re-runs validation and logs an audit row.
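The override behaviour described above (apply the change, re-run validation, write an audit row) can be modelled in a few lines. This is a conceptual sketch with hypothetical names (`ReviewItem`, `apply_override`, the audit row keys); it is not the service's data model.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List


@dataclass
class ReviewItem:
    fields: Dict[str, Any]
    failures: List[str] = field(default_factory=list)
    audit_log: List[Dict[str, Any]] = field(default_factory=list)


def apply_override(
    item: ReviewItem,
    field_name: str,
    new_value: Any,
    actor_id: str,
    validate: Callable[[Dict[str, Any]], List[str]],
) -> List[str]:
    """Apply a field-level override, re-validate, and record an audit row."""
    old_value = item.fields.get(field_name)
    item.fields[field_name] = new_value
    item.failures = validate(item.fields)  # every override re-runs validation
    item.audit_log.append({
        "action": "field_override",
        "actor_id": actor_id,
        "field": field_name,
        "old": old_value,
        "new": new_value,
        "failures_after": list(item.failures),
        "at": datetime.now(timezone.utc).isoformat(),
    })
    return item.failures
```

Recording both the old and new values, plus the validation outcome after the change, keeps each audit row meaningful on its own.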

Auditable + recoverable

Every mutation is audited. Soft-delete is the default; restore is one idempotent endpoint away.
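For readers unfamiliar with the pattern, the following in-memory sketch shows what soft-delete with an idempotent restore means in practice. The `DocumentStore` class is purely illustrative and says nothing about how the service stores data.

```python
from datetime import datetime, timezone
from typing import Dict, Optional


class DocumentStore:
    """Toy store: a document is hidden by a timestamp, never physically removed."""

    def __init__(self) -> None:
        self._deleted_at: Dict[str, Optional[datetime]] = {}

    def add(self, doc_id: str) -> None:
        self._deleted_at[doc_id] = None

    def soft_delete(self, doc_id: str) -> None:
        # Keep the first deletion timestamp if called repeatedly.
        if self._deleted_at[doc_id] is None:
            self._deleted_at[doc_id] = datetime.now(timezone.utc)

    def restore(self, doc_id: str) -> None:
        # Idempotent: restoring a live document leaves it unchanged.
        self._deleted_at[doc_id] = None

    def is_visible(self, doc_id: str) -> bool:
        return self._deleted_at[doc_id] is None
```

Calling `restore` twice has the same effect as calling it once, which makes retries after a network failure safe.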

Where to go next

Quickstart — boot the stack, mint a key, submit your first document. ~5 minutes.
Infrastructure setup — configure OCR and LLM endpoints across vLLM, Ollama, OpenAI, and Gemini.
Full integration guide — every endpoint, every parameter, every failure mode. The single canonical markdown reference.
API reference — interactive OpenAPI browser with a “Try it” panel.
Contract testing — keep your integration honest by fuzz-testing the API against its own OpenAPI spec.

For AI agents

Every page exposes a Copy as Markdown action and an Open in ChatGPT/Claude dropdown. The full docs surface is also exported in machine-friendly form:

/llms.txt — index following the llmstxt.org convention.
/llms-full.txt — the entire docs site flattened into one Markdown file.
/llms-small.txt — same as llms-full.txt but trimmed for smaller context windows.
/integration-guide.md — the canonical single-file integration reference, raw.
/openapi.json — OpenAPI 3.1 spec snapshot.
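An agent or script that wants to fetch these exports only needs the docs site's base URL. The sketch below resolves the paths listed above against a base; `https://docs.example.com/` is a placeholder, not the real host.

```python
from typing import List
from urllib.parse import urljoin

# Paths as listed above.
MACHINE_READABLE_PATHS = [
    "/llms.txt",
    "/llms-full.txt",
    "/llms-small.txt",
    "/integration-guide.md",
    "/openapi.json",
]


def docs_urls(base_url: str) -> List[str]:
    """Resolve each export path against the docs site's base URL."""
    # Leading slashes make these root-relative, so any path on base_url is replaced.
    return [urljoin(base_url, path) for path in MACHINE_READABLE_PATHS]
```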