Protocol
Versioning
There are three independent version streams: the specification, the wire protocol, and the reference compiler. Every package declares the protocol_version it targets so that consumers can negotiate compatibility.
Current versions
- Specification — the 22-section schema and its semantics. Currently v0.2, marked experimental.
- Protocol — the wire format consumers parse. Currently ckf-1.0. Independent from the spec.
- Compiler — the reference implementation. Currently v1.3.1. Bumps when observable pipeline behavior changes.
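Because every package declares the protocol_version it targets, a consumer can gate parsing on that field. The Python sketch below shows one way to do this. It rests on several assumptions that the specification does not state: that the identifier has the shape `<family>-<major>.<minor>` (inferred from the current value `ckf-1.0`), that the package is already loaded as a dict, and that "compatible" means the same major version with a minor version no newer than the consumer's. The names `ProtocolVersion`, `parse_protocol_version`, and `can_consume` are hypothetical helpers, not part of any CKF tooling.

```python
import re
from typing import NamedTuple


class ProtocolVersion(NamedTuple):
    family: str
    major: int
    minor: int


# Assumed identifier shape: "<family>-<major>.<minor>", e.g. "ckf-1.0".
_PROTOCOL_RE = re.compile(r"^([a-z]+)-(\d+)\.(\d+)$")


def parse_protocol_version(value: str) -> ProtocolVersion:
    """Parse a protocol identifier such as "ckf-1.0" (hypothetical helper)."""
    match = _PROTOCOL_RE.match(value)
    if match is None:
        raise ValueError(f"unrecognized protocol_version: {value!r}")
    family, major, minor = match.groups()
    return ProtocolVersion(family, int(major), int(minor))


def can_consume(package: dict, supported: str = "ckf-1.0") -> bool:
    """Return True if the package's declared protocol_version is readable.

    Assumed rule (not taken from the spec): same family, same major, and a
    minor version no newer than the one this consumer supports.
    """
    declared = parse_protocol_version(package["protocol_version"])
    ours = parse_protocol_version(supported)
    return (
        declared.family == ours.family
        and declared.major == ours.major
        and declared.minor <= ours.minor
    )
```

Under these assumptions, a consumer built for `ckf-1.0` accepts a package declaring `ckf-1.0` and rejects one declaring `ckf-2.0`. A malformed identifier raises `ValueError` rather than being silently accepted.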
Semver policy (spec)
- MAJOR — incompatible schema changes (renamed fields, removed sections, changed types).
- MINOR — backward-compatible additions (new optional fields, new sections, new enum values).
- PATCH — clarifications and fixes that do not affect serialized packages.
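The policy above can be read as a rule for classifying the step between two specification versions. The following is a minimal Python sketch of that reading. It assumes spec versions are written as `vMAJOR.MINOR` or `vMAJOR.MINOR.PATCH` (the current `v0.2` omits the patch component, which is treated here as 0). The names `parse_spec_version`, `classify_bump`, and `serialized_packages_may_break` are hypothetical and do not come from the CKF tooling.

```python
from typing import Tuple

SpecVersion = Tuple[int, int, int]


def parse_spec_version(text: str) -> SpecVersion:
    """Parse "v0.2" or "v0.2.1" into (major, minor, patch); missing parts are 0."""
    parts = text.lstrip("v").split(".")
    if not 1 <= len(parts) <= 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"not a spec version: {text!r}")
    numbers = [int(p) for p in parts] + [0] * (3 - len(parts))
    return numbers[0], numbers[1], numbers[2]


def classify_bump(old: str, new: str) -> str:
    """Name the highest component that differs between two spec versions."""
    a, b = parse_spec_version(old), parse_spec_version(new)
    if b <= a:
        raise ValueError(f"{new} is not newer than {old}")
    if b[0] != a[0]:
        return "MAJOR"  # incompatible schema change
    if b[1] != a[1]:
        return "MINOR"  # backward-compatible addition
    return "PATCH"  # clarification or fix; serialized packages unaffected


def serialized_packages_may_break(old: str, new: str) -> bool:
    """Only a MAJOR bump signals incompatible schema changes under this policy."""
    return classify_bump(old, new) == "MAJOR"
```

For example, `classify_bump("v0.2", "v0.3")` returns `"MINOR"`, so a consumer following this policy would expect existing packages to keep parsing after such a step.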
Timeline
| Version | Date | Notes |
|---|---|---|
| CKF-0.1 | Sep 2025 | First working schema; heuristic compiler; markdown-only. |
| CKF-0.2 | Jan 2026 | 22-section canonical schema; multi-format encodings (md/yaml/json); MCP server first cut. |
| Compiler v1.0 | Mar 2026 | Reduce step formalized; chunked compile + reduce. |
| Compiler v1.02 | Apr 2026 | Promotion module: atomic_units → if_then_rules / playbooks / anti_patterns. |
| Compiler v1.03 | May 2026 | Global sanitizer (language + completeness + truncation). Regression on rich sections of short sources. |
| Compiler v1.03.1 | May 2026 | Field-aware sanitizer fixes the v1.03 regression; retrieval / procedures / playbooks preserved. |
| Compiler v1.1 | May 2026 | Unified pipeline (runCkfPipeline) used by /compiler, MCP, Lab, admin recompile. Language lock enforced end-to-end. |
| Compiler v1.2 | May 2026 | Source preflight (language/format/records, hash/empty-source guard), record-level segmentation with source_manifest, coverage modes (summary/balanced/complete), domain-agnostic numeric integrity guards (currencies, dates, durations, citations), language recovery. |
| Compiler v1.3.1 | May 2026 | Canonical PDF metadata extraction: title, subtitle, authors[], edition, publisher, year, and ISBN are derived from front and back matter via deterministic heuristics and override LLM-inferred values. Controlled source_type vocabulary ('PDF e-book' / 'PDF document'). Title sanitization removes section-suffix contamination. Every override is logged as an auditable warning. |
See the project review post for the long-form story.
Migration
When the protocol bumps a major version, this site publishes a migration guide alongside the new spec. There is no automatic upgrade: older packages remain valid against their declared protocol_version, and consumers decide whether to upgrade.
Proposing a change
Open a discussion on the GitHub repository. Substantive changes go through a short RFC describing the problem, the proposed schema delta, and at least one worked example.