Documentation

Introduction

Getting started

There are three ways to interact with the protocol: through the web UI, through any MCP-compatible agent, or programmatically with the reference compiler.

1. Web UI

The fastest path is the in-browser Compiler. Paste any human text (an article, a policy, a transcript), pick the output language and format, and get back a valid .ckf package compiled by the full v1.3.1 pipeline (preflight + chunking + reduce + promote + sanitize + coverage + numeric guards). Admins and allowlisted users run on the hosted Lovable AI Gateway for free; everyone else supplies their own API key (bring your own key, BYOK).

2. MCP

Any MCP-compatible agent (Claude Desktop, Cursor, Windsurf, AI SDK, OpenAI Agents SDK, custom GPT) can call the CKF compiler, parser, validator, viewer and news search through a single Streamable HTTP endpoint:

text
https://compiledknowledgeformat.org/api/mcp

The compiler tool ckf.compile_llm accepts a language argument ("en" or "pt") and a BYOK key passed inline. Keys are never stored or logged.
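
For agents that do not hide the wire format, a call to the compiler tool might look roughly like the sketch below. It only builds the JSON-RPC 2.0 body that MCP uses for a tools/call request; it does not send anything. The tool name and the language argument come from this page, while the argument names text and api_key are hypothetical placeholders, so check the tool's published input schema before relying on them.

```typescript
// Sketch: build the JSON-RPC 2.0 body for an MCP "tools/call" request.
// ASSUMPTION: the argument names `text` and `api_key` are hypothetical;
// only `language` ("en" | "pt") is named in this documentation.
type CompileLanguage = "en" | "pt";

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function buildCompileRequest(
  id: number,
  text: string,
  language: CompileLanguage,
  apiKey: string,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "ckf.compile_llm",
      // The BYOK key travels inline with the call arguments.
      arguments: { text, language, api_key: apiKey },
    },
  };
}

const request = buildCompileRequest(1, "Some source text.", "en", "sk-example");
console.log(JSON.stringify(request, null, 2));
```

In practice an MCP client library performs the initialize handshake and sends this body for you; the sketch is only meant to show where the language and key arguments sit.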

3. Programmatically

The reference compiler is published in this repo and runs in any Node.js / Edge environment. For small heuristic compiles (no LLM), use compileCkf():

ts
import { compileCkf } from "@/lib/ckf/compile";

const { pkg, warnings } = compileCkf(rawText, {
  sourceType: "article",
  compressionLevel: "standard",
  outputFormat: "markdown",
  language: "en",
});

For the full v1.3.1 pipeline (preflight → segment → chunk → reduce → promote → sanitize → coverage → numeric guards → quality), feed LLM partials to runCkfPipeline() — see Compiler pipeline v1.3.1.

Read a package

Every package is the same shape regardless of encoding. In TypeScript, the canonical type is CkfPackage exported from src/lib/ckf/types.ts. Iterate sections like any plain object; each item carries an id, a confidence score and a source_basis label.
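
As a rough illustration of iterating sections, the sketch below filters items by confidence. It uses a simplified stand-in type rather than the real CkfPackage, and it assumes every section is an array of items; the section names (entities, concepts) and the source_basis values are placeholders chosen for the example, not values defined by the protocol.

```typescript
// ASSUMPTION: a simplified stand-in for CkfPackage. The real type lives in
// src/lib/ckf/types.ts and has more fields than shown here.
interface SketchItem {
  id: string;
  confidence: number; // 0.00 to 1.00
  source_basis: string;
}
type SketchPackage = Record<string, SketchItem[]>;

// Collect every item at or above a confidence threshold,
// tagging each one with the section it came from.
function itemsAbove(pkg: SketchPackage, minConfidence: number) {
  const hits: Array<SketchItem & { section: string }> = [];
  for (const [section, items] of Object.entries(pkg)) {
    for (const item of items) {
      if (item.confidence >= minConfidence) hits.push({ ...item, section });
    }
  }
  return hits;
}

// Hypothetical sample data, for illustration only.
const samplePkg: SketchPackage = {
  entities: [{ id: "e1", confidence: 0.95, source_basis: "explicit" }],
  concepts: [{ id: "c1", confidence: 0.4, source_basis: "inferred" }],
};

const confident = itemsAbove(samplePkg, 0.8);
console.log(confident.map((i) => `${i.section}/${i.id}`));
```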

Validate a package

A package is valid when (a) all required sections are present, (b) cross-references resolve (related_entities[].entity_id points at a real entity, etc.), (c) every numeric field uses the protocol's confidence scale (0.00–1.00), and (d) every item respects the active language lock.
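
A minimal sketch of checks (a) to (c) follows; it is not the reference validator. The required-section list, the section names and the item layout are assumptions made for the example, only the confidence field is range-checked, and the language-lock check (d) is left out because it depends on language detection.

```typescript
// ASSUMPTION: section names, the required-section list and the item layout
// are placeholders for illustration, not the protocol's actual schema.
interface VItem {
  id: string;
  confidence: number;
  related_entities?: { entity_id: string }[];
}
type VPackage = Record<string, VItem[] | undefined>;

const REQUIRED_SECTIONS = ["entities", "concepts"]; // hypothetical subset

function validateSketch(pkg: VPackage): string[] {
  const errors: string[] = [];

  // (a) every required section is present
  for (const name of REQUIRED_SECTIONS) {
    if (!Array.isArray(pkg[name])) errors.push(`missing section: ${name}`);
  }

  const entityIds = new Set((pkg["entities"] ?? []).map((e) => e.id));
  for (const [section, items] of Object.entries(pkg)) {
    for (const item of items ?? []) {
      // (c) confidence must lie on the 0.00 to 1.00 scale (NaN fails too)
      if (!(item.confidence >= 0 && item.confidence <= 1)) {
        errors.push(`${section}/${item.id}: confidence out of range`);
      }
      // (b) every related_entities[].entity_id must point at a real entity
      for (const ref of item.related_entities ?? []) {
        if (!entityIds.has(ref.entity_id)) {
          errors.push(`${section}/${item.id}: unresolved entity_id ${ref.entity_id}`);
        }
      }
    }
  }
  return errors;
}

const problems = validateSketch({
  entities: [{ id: "e1", confidence: 0.9 }],
  concepts: [
    { id: "c1", confidence: 1.2, related_entities: [{ entity_id: "e2" }] },
  ],
});
console.log(problems);
```

Returning a list of messages rather than a boolean keeps every violation visible in one pass, which tends to be more useful when debugging a compiled package.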

Field-aware sanitizer + language recovery

v1.2 ships a field-aware global sanitizer that removes language drift, truncated artifacts and duplicates from the merged package — without falsely rejecting legitimate titles and labels — and automatically re-runs the affected sections when the output drifts from the detected source language. See Compiler pipeline v1.3.1.

Glossary

  • Package — a single .ckf file describing one knowledge unit.
  • Section — one of the 22 typed top-level fields of a package.
  • Item — one record inside a section (an entity, a concept, a rule…).
  • Source basis — how the item was derived from the source.
  • Confidence — a 0.00–1.00 score the extractor assigned to the item.
  • Pipeline — the ordered set of stages that turn a raw source into a final package: preflight, segmentation, chunking, reduce, promote, sanitize, coverage pass, numeric guards, quality (see Compiler pipeline v1.3.1).
