Examples

Six full packages compiled live from curated source text. The same heuristic compiler runs across six domains; copy, download, or paste any package into your own pipeline.

Packages

Learning science

ckf_demo_1782847357665

8 entities · 6 concepts · 0 principles
# CKF — KNOWLEDGE CONTEXT PACKAGE

package_id: ckf_demo_1782847357665
protocol_version: ckf-0.1
source_type: article
source_title: Untitled source
source_author: Unknown
domain: education / learning science
subdomains: [sleep, memory, consolidation, study]
language: en
created_at: 2026-06-30T19:22:37.665Z
compression_level: standard
human_readability: 0.7
ai_utility_score: 0.81

---

## 1. CORE INTENT

```yaml
core_intent:
  primary_purpose: Capture and structure the knowledge expressed in the source.
  intended_user: Developers, researchers and agents consuming structured knowledge.
  intended_agent_use: Retrieval, reasoning, tutoring, decision support.
  transformation_goal: Convert prose into structured, agent-usable cognition.
  key_value: Portable, traceable, reusable knowledge package.
```

## 2. DOMAIN MAP

```yaml
domain_map:
  main_domain: education / learning science
  subdomains:
    - name: sleep
      relevance: 1
      related_concepts:
        - sleep
    - name: memory
      relevance: 0.85
      related_concepts:
        - memory
    - name: consolidation
      relevance: 0.7
      related_concepts:
        - consolidation
    - name: study
      relevance: 0.55
      related_concepts:
        - study
  adjacent_domains: []
  excluded_domains: []
```

## 3. ENTITY GRAPH

```yaml
entities:
  - id: ENT_001
    name: Learning
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities: []
    source_basis: explicit
  - id: ENT_002
    name: Retrieval
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_003
    name: Spaced
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_004
    name: Hermann Ebbinghaus
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_005
    name: Sleep
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_006
    name: Teachers
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_007
    name: First
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_008
    name: Second
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
```

## 4. CONCEPT GRAPH

```yaml
concepts:
  - id: CON_001
    label: Sleep
    definition: Learning improves when students actively retrieve information instead of passively rereading.
    domain: education / learning science
    depends_on: []
    contradicts: []
    supports:
      - CON_002
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
  - id: CON_002
    label: Memory
    definition: Retrieval practice strengthens memory traces by forcing the brain to reconstruct knowledge.
    domain: education / learning science
    depends_on:
      - CON_001
    contradicts: []
    supports:
      - CON_003
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
  - id: CON_003
    label: Consolidation
    definition: Spaced repetition is the deliberate scheduling of review at increasing intervals — typically 1 day, 3 days, 7 days, 21 days — to fight the forgetting curve described by Hermann Ebbinghaus.
    domain: education / learning science
    depends_on:
      - CON_002
    contradicts: []
    supports:
      - CON_004
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
  - id: CON_004
    label: Study
    definition: Sleep supports consolidation, while distraction reduces attention and weakens encoding.
    domain: education / learning science
    depends_on:
      - CON_003
    contradicts: []
    supports:
      - CON_005
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
  - id: CON_005
    label: Hours
    definition: The hippocampus replays daytime activations during slow-wave sleep, transferring memories into long-term cortical storage.
    domain: education / learning science
    depends_on:
      - CON_004
    contradicts: []
    supports:
      - CON_006
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
  - id: CON_006
    label: Without
    definition: Teachers should design study sessions in four steps.
    domain: education / learning science
    depends_on:
      - CON_005
    contradicts: []
    supports: []
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
```
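
The `depends_on` links above form a chain, and a consumer may want the prerequisites of a concept before reasoning with it. The following is a minimal sketch, assuming the graph has already been parsed into a plain dictionary; `DEPENDS_ON` and `prerequisites` are hypothetical names used for illustration and are not part of the CKF protocol.

```python
# Hypothetical in-memory copy of the depends_on links listed in the concept graph.
DEPENDS_ON = {
    "CON_001": [],
    "CON_002": ["CON_001"],
    "CON_003": ["CON_002"],
    "CON_004": ["CON_003"],
    "CON_005": ["CON_004"],
    "CON_006": ["CON_005"],
}


def prerequisites(concept_id, depends_on=DEPENDS_ON):
    """Return every transitive dependency of concept_id, earliest first."""
    ordered = []
    visiting = {concept_id}  # concepts on the current path, used to detect cycles

    def visit(cid):
        for dep in depends_on.get(cid, []):
            if dep in visiting:
                raise ValueError(f"dependency cycle through {dep}")
            if dep not in ordered:
                visiting.add(dep)
                visit(dep)
                visiting.discard(dep)
                ordered.append(dep)

    visit(concept_id)
    return ordered


if __name__ == "__main__":
    print(prerequisites("CON_004"))
```

For a linear chain like this one the result is simply the preceding concepts in order; the cycle check only matters if a package ever contains circular `depends_on` links.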

## 5. PRINCIPLES

```yaml
principles: []  # none extracted
```

## 6. HEURISTICS

```yaml
heuristics:
  - id: HEU_001
    trigger: relevant decision context detected
    interpretation: Teachers should design study sessions in four steps.
    recommended_action: Follow the recommended practice.
    avoid: —
    confidence: 0.76
  - id: HEU_002
    trigger: relevant decision context detected
    interpretation: If students confuse fluency with mastery, they overestimate their readiness and reduce study time prematurely.
    recommended_action: Mitigate the described risk.
    avoid: —
    confidence: 0.76
  - id: HEU_003
    trigger: context contains a known failure mode
    interpretation: Avoid cramming the night before exams, because consolidation requires sleep.
    recommended_action: Avoid the described action.
    avoid: Avoid cramming the night before exams, because consolidation requires sleep.
    confidence: 0.76
  - id: HEU_004
    trigger: context contains a known failure mode
    interpretation: Avoid passive rereading as the primary study technique — it produces fluency without retention.
    recommended_action: Avoid the described action.
    avoid: Avoid passive rereading as the primary study technique — it produces fluency without retention.
    confidence: 0.76
  - id: HEU_005
    trigger: context contains a known failure mode
    interpretation: Never replace retrieval practice with highlighting; highlighting feels productive but does not strengthen memory.
    recommended_action: Avoid the described action.
    avoid: Never replace retrieval practice with highlighting; highlighting feels productive but does not strengthen memory.
    confidence: 0.76
```

## 7. DECISION RULES

```yaml
decision_rules:
  - id: RULE_001
    condition: Operating context matches the rule's domain.
    decision: Teachers should design study sessions in four steps.
    reasoning: Derived directly from a normative statement in the source.
    required_context: Domain-specific context as described in the source.
    output_action: Apply the recommended decision.
    failure_mode: Recommendation applied outside its valid context.
    confidence: 0.74
  - id: RULE_002
    condition: Operating context matches the rule's domain.
    decision: "Question: How long should a spaced repetition interval be?"
    reasoning: Derived directly from a normative statement in the source.
    required_context: Domain-specific context as described in the source.
    output_action: Apply the recommended decision.
    failure_mode: Recommendation applied outside its valid context.
    confidence: 0.74
```

## 8. PROCEDURES

```yaml
procedures:
  - id: PROC_001
    name: Source-derived procedure
    objective: Apply the sequence implied by the source text.
    steps:
      - step: 1
        action: Retrieval practice strengthens memory traces by forcing the brain to reconstruct knowledge.
        input_required: —
        output_expected: —
      - step: 2
        action: First, set a clear learning objective.
        input_required: —
        output_expected: —
      - step: 3
        action: When students sleep fewer than six hours after learning, consolidation drops sharply and the next-day quiz score falls by roughly 20 percent.
        input_required: —
        output_expected: —
      - step: 4
        action: "Another heuristic: if a quiz score is below 70 percent, schedule the next review within 24 hours, not later."
        input_required: —
        output_expected: —
      - step: 5
        action: Never replace retrieval practice with highlighting; highlighting feels productive but does not strengthen memory.
        input_required: —
        output_expected: —
    success_criteria: All steps applied in order with expected outcomes.
    failure_criteria: Steps executed out of order or without prerequisites.
```

## 9. PATTERNS

```yaml
patterns:
  - id: PAT_001
    name: Recurring pattern 1
    observed_when: Source-described conditions are present.
    signal: If a concept is reviewed only once, it tends to fade within 48 hours, because memory traces decay without reactivation.
    underlying_mechanism: —
    response_strategy: Recognize and act according to source guidance.
    confidence: 0.7
```

## 10. ANTI-PATTERNS

```yaml
anti_patterns:
  - id: ANTI_001
    name: Anti-pattern 1
    description: Avoid cramming the night before exams, because consolidation requires sleep.
    why_it_fails: Identified by the source as ineffective or harmful.
    warning_signals: Behavior matches the described failure mode.
    replacement_behavior: Use the recommended alternative from the source.
  - id: ANTI_002
    name: Anti-pattern 2
    description: Avoid passive rereading as the primary study technique — it produces fluency without retention.
    why_it_fails: Identified by the source as ineffective or harmful.
    warning_signals: Behavior matches the described failure mode.
    replacement_behavior: Use the recommended alternative from the source.
```

## 11. CAUSAL CHAINS

```yaml
causal_chains:
  - id: CAU_001
    cause: If a concept is reviewed only once, it tends to fade within 48 hours,
    mechanism: —
    effect: memory traces decay without reactivation.
    secondary_effects: []
    intervention_points: []
    confidence: 0.7
  - id: CAU_002
    cause: Avoid cramming the night before exams,
    mechanism: —
    effect: consolidation requires sleep.
    secondary_effects: []
    intervention_points: []
    confidence: 0.7
  - id: CAU_003
    cause: "Answer:"
    mechanism: —
    effect: consolidation, not exposure, converts fragile traces into durable memory; without sleep, additional exposure yields diminishing returns.
    secondary_effects: []
    intervention_points: []
    confidence: 0.7
```

## 12. CONTEXTUAL TRIGGERS

```yaml
contextual_triggers:
  - id: TRG_001
    if_user_says_or_context_contains: Sleep
    activate_knowledge:
      - CON_001
      - CON_002
    agent_should: Recall the Sleep concept and apply related rules.
    agent_should_not: Make claims beyond what the source supports.
  - id: TRG_002
    if_user_says_or_context_contains: Memory
    activate_knowledge:
      - CON_002
      - CON_003
    agent_should: Recall the Memory concept and apply related rules.
    agent_should_not: Make claims beyond what the source supports.
  - id: TRG_003
    if_user_says_or_context_contains: Consolidation
    activate_knowledge:
      - CON_003
      - CON_004
    agent_should: Recall the Consolidation concept and apply related rules.
    agent_should_not: Make claims beyond what the source supports.
```
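
One way an agent could apply the triggers above is a case-insensitive keyword match that collects the listed concept IDs. This is a sketch under stated assumptions: the `Trigger` class and `activated_concepts` function are hypothetical, and substring matching is an illustrative choice rather than behavior the package specifies.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Trigger:
    trigger_id: str
    keyword: str               # value of if_user_says_or_context_contains
    activate: Tuple[str, ...]  # value of activate_knowledge


# Values copied from TRG_001 to TRG_003 above.
TRIGGERS = (
    Trigger("TRG_001", "Sleep", ("CON_001", "CON_002")),
    Trigger("TRG_002", "Memory", ("CON_002", "CON_003")),
    Trigger("TRG_003", "Consolidation", ("CON_003", "CON_004")),
)


def activated_concepts(text: str, triggers=TRIGGERS) -> List[str]:
    """Return concept IDs activated by text, in trigger order, without duplicates."""
    lowered = text.lower()
    activated: List[str] = []
    for trigger in triggers:
        # Assumption: a plain substring test stands in for real intent detection.
        if trigger.keyword.lower() in lowered:
            for concept_id in trigger.activate:
                if concept_id not in activated:
                    activated.append(concept_id)
    return activated


if __name__ == "__main__":
    print(activated_concepts("How does sleep affect memory?"))
```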

## 13. IF-THEN RULES

```yaml
if_then_rules:
  - id: IFTHEN_001
    if: students actively retrieve information instead of passively
    then: rereading
    because: Inferred from source context.
    confidence: 0.72
  - id: IFTHEN_002
    if: a concept is reviewed only once
    then: it tends to fade within 48 hours, because memory traces decay without reactivation
    because: Inferred from source context.
    confidence: 0.72
  - id: IFTHEN_003
    if: the spacing interval is too long
    then: retrieval becomes effortful and accuracy drops below 60 percent
    because: Inferred from source context.
    confidence: 0.72
  - id: IFTHEN_004
    if: students confuse fluency with mastery
    then: they overestimate their readiness and reduce study time prematurely
    because: Inferred from source context.
    confidence: 0.72
```

## 14. EXCEPTIONS AND EDGE CASES

```yaml
exceptions: []  # none extracted
```

## 15. MENTAL MODELS

```yaml
mental_models:
  - id: MM_001
    name: Sleep model
    description: Learning improves when students actively retrieve information instead of passively rereading.
    use_when: The reasoning context maps to this concept.
    do_not_use_when: Context lies outside the source's scope.
    input_needed: Relevant facts about the situation.
    output_generated: A reasoned recommendation aligned with the source.
  - id: MM_002
    name: Memory model
    description: Retrieval practice strengthens memory traces by forcing the brain to reconstruct knowledge.
    use_when: The reasoning context maps to this concept.
    do_not_use_when: Context lies outside the source's scope.
    input_needed: Relevant facts about the situation.
    output_generated: A reasoned recommendation aligned with the source.
```

## 16. OPERATIONAL PLAYBOOKS

```yaml
playbooks:
  - id: PLAY_001
    name: education / learning science response playbook
    objective: Apply the source's knowledge to a real interaction.
    activation_context: User asks about Sleep.
    steps:
      - Identify the concept the question maps to.
      - Recall related rules and heuristics.
      - Cite the source-derived principle.
      - Surface relevant exceptions or limits.
    agent_tone: Clear, sourced, non-overstating.
    tools_needed:
      - retrieval
      - memory
    expected_output: A grounded answer with traceable reasoning.
    failure_modes:
      - Hallucinating beyond source
      - Ignoring exceptions
```

## 17. QUESTION-ANSWER PAIRS FOR AGENTS

```yaml
qa_pairs:
  - id: QA_001
    question: What is sleep and when does it apply?
    ideal_answer: Learning improves when students actively retrieve information instead of passively rereading.
    source_concepts:
      - CON_001
    difficulty: easy
    answer_type: definition_with_context
  - id: QA_002
    question: What is memory and when does it apply?
    ideal_answer: Retrieval practice strengthens memory traces by forcing the brain to reconstruct knowledge.
    source_concepts:
      - CON_002
    difficulty: medium
    answer_type: definition_with_context
  - id: QA_003
    question: What is consolidation and when does it apply?
    ideal_answer: Spaced repetition is the deliberate scheduling of review at increasing intervals — typically 1 day, 3 days, 7 days, 21 days — to fight the forgetting curve described by Hermann Ebbinghaus.
    source_concepts:
      - CON_003
    difficulty: medium
    answer_type: definition_with_context
  - id: QA_004
    question: What is study and when does it apply?
    ideal_answer: Sleep supports consolidation, while distraction reduces attention and weakens encoding.
    source_concepts:
      - CON_004
    difficulty: medium
    answer_type: definition_with_context
```

## 18. RETRIEVAL CHUNKS

```yaml
retrieval_chunks:
  - id: CHUNK_001
    title: Chunk on days
    standalone_context: Self-contained passage extracted from the source.
    compressed_knowledge: Learning improves when students actively retrieve information instead of passively rereading. Retrieval practice strengthens memory traces by forcing the brain to reconstruct knowledge. Spaced repetition is the deliberate scheduling of review at increasing intervals — typically 1 day, 3 days, 7 days, 21 days — to fight the forgetting curve described by Hermann Ebbinghaus. Sleep supports consoli…
    activation_queries:
      - What does the source say about days?
      - What does the source say about learning?
      - What does the source say about sleep?
    related_rules: []
    related_entities: []
    related_concepts: []
  - id: CHUNK_002
    title: Chunk on quiz
    standalone_context: Self-contained passage extracted from the source.
    compressed_knowledge: Second, present new material in short blocks of 15 to 20 minutes. Third, interleave related topics instead of blocking them. Fourth, end every block with a low-stakes quiz that surfaces gaps. If a concept is reviewed only once, it tends to fade within 48 hours, because memory traces decay without reactivation. If the spacing interval is too long, retrieval becomes effortful and accuracy drops b…
    activation_queries:
      - What does the source say about quiz?
      - What does the source say about hours?
      - What does the source say about drops?
    related_rules: []
    related_entities: []
    related_concepts: []
  - id: CHUNK_003
    title: Chunk on heuristic
    standalone_context: Self-contained passage extracted from the source.
    compressed_knowledge: "A useful heuristic: if you can teach a concept aloud without notes in under two minutes, you have probably mastered the surface layer. Another heuristic: if a quiz score is below 70 percent, schedule the next review within 24 hours, not later. Avoid cramming the night before exams, because consolidation requires sleep. Avoid passive rereading as the primary study technique — it produces fluency…"
    activation_queries:
      - What does the source say about heuristic?
      - What does the source say about without?
      - What does the source say about avoid?
    related_rules: []
    related_entities: []
    related_concepts: []
  - id: CHUNK_004
    title: Chunk on question
    standalone_context: Self-contained passage extracted from the source.
    compressed_knowledge: "Question: How long should a spaced repetition interval be? Answer: Start at 1 day, then double the interval after each successful recall, capping at 6 months for stable knowledge. Question: Why is sleep more important than extra study hours? Answer: Because consolidation, not exposure, converts fragile traces into durable memory; without sleep, additional exposure yields diminishing returns."
    activation_queries:
      - What does the source say about question?
      - What does the source say about interval?
      - What does the source say about answer?
    related_rules: []
    related_entities: []
    related_concepts: []
```
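
The scheduling rule quoted in CHUNK_004 (start at 1 day, double after each successful recall, cap at 6 months) and the quiz heuristic in CHUNK_003 can be sketched in a few lines. The function names are hypothetical, and treating "6 months" as 180 days is an assumption made only for illustration. Note that this follows the doubling rule from the Q&A passage, not the 1, 3, 7, 21 day sequence mentioned elsewhere in the source.

```python
from typing import List

START_INTERVAL_DAYS = 1
MAX_INTERVAL_DAYS = 180  # assumption: "6 months" approximated as 180 days


def next_interval_days(current_days: int) -> int:
    """Interval after one successful recall: double it, up to the cap."""
    return min(current_days * 2, MAX_INTERVAL_DAYS)


def intervals_after_successes(successes: int) -> List[int]:
    """Starting interval followed by the interval after each successful recall."""
    days = START_INTERVAL_DAYS
    schedule = [days]
    for _ in range(successes):
        days = next_interval_days(days)
        schedule.append(days)
    return schedule


def needs_review_within_24h(quiz_score_percent: float) -> bool:
    """Quiz heuristic from the source: below 70 percent, review within 24 hours."""
    return quiz_score_percent < 70


if __name__ == "__main__":
    print(intervals_after_successes(9))
    print(needs_review_within_24h(65))
```

The source does not say what happens to the interval after a failed recall, so the sketch leaves that case out instead of guessing.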

## 19. EMBEDDING-READY ATOMIC UNITS

```yaml
atomic_units:
  - id: AU_001
    statement: Learning improves when students actively retrieve information instead of passively rereading.
    type: fact
    tags:
      - learning
      - improves
      - students
    dependencies: []
    confidence: 0.78
  - id: AU_002
    statement: Retrieval practice strengthens memory traces by forcing the brain to reconstruct knowledge.
    type: fact
    tags:
      - retrieval
      - practice
      - strengthens
    dependencies: []
    confidence: 0.78
  - id: AU_003
    statement: Spaced repetition is the deliberate scheduling of review at increasing intervals — typically 1 day, 3 days, 7 days, 21 days — to fight the forgetting curve described by Hermann Ebbinghaus.
    type: definition
    tags:
      - days
      - spaced
      - repetition
    dependencies: []
    confidence: 0.78
  - id: AU_004
    statement: Sleep supports consolidation, while distraction reduces attention and weakens encoding.
    type: fact
    tags:
      - sleep
      - supports
      - consolidation
    dependencies: []
    confidence: 0.78
  - id: AU_005
    statement: The hippocampus replays daytime activations during slow-wave sleep, transferring memories into long-term cortical storage.
    type: fact
    tags:
      - hippocampus
      - replays
      - daytime
    dependencies: []
    confidence: 0.78
  - id: AU_006
    statement: Teachers should design study sessions in four steps.
    type: rule
    tags:
      - teachers
      - should
      - design
    dependencies: []
    confidence: 0.78
  - id: AU_007
    statement: First, set a clear learning objective.
    type: fact
    tags:
      - first
      - clear
      - learning
    dependencies: []
    confidence: 0.78
```

## 20. AGENT INSTRUCTIONS

```yaml
agent_instructions:
  behavior_rules:
    - Stay within the package's scope.
    - Cite the source-derived chunk or rule when answering.
    - Encourage active retrieval over passive review.
    - Suggest spaced repetition where appropriate.
  reasoning_rules:
    - Use causal chains and IF-THEN rules before improvising.
    - Combine concepts only when supports/depends_on relationships allow it.
  response_rules:
    - Be concise unless the user asks for depth.
    - Surface confidence and source basis.
  forbidden_behaviors:
    - Fabricating sources.
    - Restating the source as personal opinion.
    - Ignoring decision rules in favor of fluency.
  preferred_questions:
    - What does the source say about …?
    - Which rule applies to this situation?
    - What are the limits of this knowledge?
  tool_usage_guidance:
    - Use retrieval before generation.
    - Use memory to track conversational context.
```

## 21. KNOWLEDGE LIMITS

```yaml
knowledge_limits:
  missing_context:
    - Source date and authorship are not always provided.
  weakly_supported_claims: []
  assumptions_detected:
    - Heuristic compilation assumes the input text is self-contained.
  possible_biases:
    - Single-source perspective.
  outdated_sections: []
  needs_human_review:
    - Decision rules and exceptions before production use.
```

## 22. SOURCE TRACEABILITY

```yaml
source_traceability:
  - extracted_item_id: CON_001
    source_location: user_input
    source_excerpt: Learning improves when students actively retrieve information instead of passively rereading.
    extraction_type: explicit
  - extracted_item_id: CON_002
    source_location: user_input
    source_excerpt: Retrieval practice strengthens memory traces by forcing the brain to reconstruct knowledge.
    extraction_type: explicit
  - extracted_item_id: CON_003
    source_location: user_input
    source_excerpt: Spaced repetition is the deliberate scheduling of review at increasing intervals — typically 1 day, 3 days, 7 days, 21 days — to fight the forgetting curve described by Hermann Ebbinghaus.
    extraction_type: explicit
  - extracted_item_id: CON_004
    source_location: user_input
    source_excerpt: Sleep supports consolidation, while distraction reduces attention and weakens encoding.
    extraction_type: explicit
  - extracted_item_id: CON_005
    source_location: user_input
    source_excerpt: The hippocampus replays daytime activations during slow-wave sleep, transferring memories into long-term cortical storage.
    extraction_type: explicit
  - extracted_item_id: CON_006
    source_location: user_input
    source_excerpt: Teachers should design study sessions in four steps.
    extraction_type: explicit
  - extracted_item_id: HEU_001
    source_location: user_input
    source_excerpt: Teachers should design study sessions in four steps.
    extraction_type: explicit
  - extracted_item_id: HEU_002
    source_location: user_input
    source_excerpt: If students confuse fluency with mastery, they overestimate their readiness and reduce study time prematurely.
    extraction_type: explicit
  - extracted_item_id: HEU_003
    source_location: user_input
    source_excerpt: Avoid cramming the night before exams, because consolidation requires sleep.
    extraction_type: explicit
  - extracted_item_id: HEU_004
    source_location: user_input
    source_excerpt: Avoid passive rereading as the primary study technique — it produces fluency without retention.
    extraction_type: explicit
  - extracted_item_id: HEU_005
    source_location: user_input
    source_excerpt: Never replace retrieval practice with highlighting; highlighting feels productive but does not strengthen memory.
    extraction_type: explicit
  - extracted_item_id: IFTHEN_001
    source_location: user_input
    source_excerpt: IF students actively retrieve information instead of passively THEN rereading
    extraction_type: explicit
  - extracted_item_id: IFTHEN_002
    source_location: user_input
    source_excerpt: IF a concept is reviewed only once THEN it tends to fade within 48 hours, because memory traces decay without reactivation
    extraction_type: explicit
  - extracted_item_id: IFTHEN_003
    source_location: user_input
    source_excerpt: IF the spacing interval is too long THEN retrieval becomes effortful and accuracy drops below 60 percent
    extraction_type: explicit
  - extracted_item_id: IFTHEN_004
    source_location: user_input
    source_excerpt: IF students confuse fluency with mastery THEN they overestimate their readiness and reduce study time prematurely
    extraction_type: explicit
```

Business strategy

ckf_demo_1782847357665

8 entities · 6 concepts · 0 principles
# CKF — KNOWLEDGE CONTEXT PACKAGE

package_id: ckf_demo_1782847357665
protocol_version: ckf-0.1
source_type: strategy
source_title: Untitled source
source_author: Unknown
domain: business
subdomains: [value, should, customers, acquisition]
language: en
created_at: 2026-06-30T19:22:37.665Z
compression_level: standard
human_readability: 0.7
ai_utility_score: 0.81

---

## 1. CORE INTENT

```yaml
core_intent:
  primary_purpose: Capture and structure the knowledge expressed in the source.
  intended_user: Developers, researchers and agents consuming structured knowledge.
  intended_agent_use: Retrieval, reasoning, tutoring, decision support.
  transformation_goal: Convert prose into structured, agent-usable cognition.
  key_value: Portable, traceable, reusable knowledge package.
```

## 2. DOMAIN MAP

```yaml
domain_map:
  main_domain: business
  subdomains:
    - name: value
      relevance: 1
      related_concepts:
        - value
    - name: should
      relevance: 0.85
      related_concepts:
        - should
    - name: customers
      relevance: 0.7
      related_concepts:
        - customers
    - name: acquisition
      relevance: 0.55
      related_concepts:
        - acquisition
  adjacent_domains: []
  excluded_domains: []
```

## 3. ENTITY GRAPH

```yaml
entities:
  - id: ENT_001
    name: Sustainable
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities: []
    source_basis: explicit
  - id: ENT_002
    name: JTBD
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_003
    name: The North Star Metric
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_004
    name: Airbnb
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_005
    name: Slack
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_006
    name: Unit
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_007
    name: SaaS
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_008
    name: Leaders
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
```

## 4. CONCEPT GRAPH

```yaml
concepts:
  - id: CON_001
    label: Value
    definition: Sustainable business growth depends on understanding the customer's underlying job to be done (JTBD), not only the product features.
    domain: business
    depends_on: []
    contradicts: []
    supports:
      - CON_002
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
  - id: CON_002
    label: Should
    definition: The North Star Metric is the single number that best captures the value delivered to customers — for Airbnb it is nights booked, for Slack it is messages sent per active team.
    domain: business
    depends_on:
      - CON_001
    contradicts: []
    supports:
      - CON_003
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
  - id: CON_003
    label: Customers
    definition: Unit economics, expressed as the ratio of customer lifetime value (LTV) to customer acquisition cost (CAC), must be calculated before scaling marketing spend, because growth without margin amplifies losses.
    domain: business
    depends_on:
      - CON_002
    contradicts: []
    supports:
      - CON_004
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
  - id: CON_004
    label: Acquisition
    definition: A healthy SaaS business targets an LTV / CAC ratio above three and a CAC payback period under twelve months.
    domain: business
    depends_on:
      - CON_003
    contradicts: []
    supports:
      - CON_005
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
  - id: CON_005
    label: Customer
    definition: Leaders should run quarterly business reviews in five steps.
    domain: business
    depends_on:
      - CON_004
    contradicts: []
    supports:
      - CON_006
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
  - id: CON_006
    label: First
    definition: First, restate the North Star Metric and the current value.
    domain: business
    depends_on:
      - CON_005
    contradicts: []
    supports: []
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
```

## 5. PRINCIPLES

```yaml
principles: []  # none extracted
```

## 6. HEURISTICS

```yaml
heuristics:
  - id: HEU_001
    trigger: relevant decision context detected
    interpretation: Unit economics, expressed as the ratio of customer lifetime value (LTV) to customer acquisition cost (CAC), must be calculated before scaling marketing spend, because growth without margin amplifies losses.
    recommended_action: Follow the recommended practice.
    avoid: —
    confidence: 0.76
  - id: HEU_002
    trigger: relevant decision context detected
    interpretation: Leaders should run quarterly business reviews in five steps.
    recommended_action: Follow the recommended practice.
    avoid: —
    confidence: 0.76
  - id: HEU_003
    trigger: relevant decision context detected
    interpretation: If CAC exceeds LTV, expansion destroys value and the team must pause paid acquisition.
    recommended_action: Follow the recommended practice.
    avoid: —
    confidence: 0.76
  - id: HEU_004
    trigger: relevant decision context detected
    interpretation: If churn is rising for two consecutive quarters, the company should revisit onboarding and activation before optimising acquisition.
    recommended_action: Follow the recommended practice.
    avoid: —
    confidence: 0.76
  - id: HEU_005
    trigger: relevant decision context detected
    interpretation: A feature requested by fewer than three paying customers in a quarter should not enter the roadmap.
    recommended_action: Follow the recommended practice.
    avoid: —
    confidence: 0.76
```

## 7. DECISION RULES

```yaml
decision_rules:
  - id: RULE_001
    condition: Operating context matches the rule's domain.
    decision: Unit economics, expressed as the ratio of customer lifetime value (LTV) to customer acquisition cost (CAC), must be calculated before scaling marketing spend, because growth without margin amplifies losses.
    reasoning: Derived directly from a normative statement in the source.
    required_context: Domain-specific context as described in the source.
    output_action: Apply the recommended decision.
    failure_mode: Recommendation applied outside its valid context.
    confidence: 0.74
  - id: RULE_002
    condition: Operating context matches the rule's domain.
    decision: Leaders should run quarterly business reviews in five steps.
    reasoning: Derived directly from a normative statement in the source.
    required_context: Domain-specific context as described in the source.
    output_action: Apply the recommended decision.
    failure_mode: Recommendation applied outside its valid context.
    confidence: 0.74
  - id: RULE_003
    condition: Operating context matches the rule's domain.
    decision: If CAC exceeds LTV, expansion destroys value and the team must pause paid acquisition.
    reasoning: Derived directly from a normative statement in the source.
    required_context: Domain-specific context as described in the source.
    output_action: Apply the recommended decision.
    failure_mode: Recommendation applied outside its valid context.
    confidence: 0.74
```

## 8. PROCEDURES

```yaml
procedures:
  - id: PROC_001
    name: Quarterly business review
    objective: Run the five-step quarterly business review described in the source.
    steps:
      - step: 1
        action: First, restate the North Star Metric and the current value.
        input_required: —
        output_expected: —
      - step: 2
        action: Second, review unit economics by cohort.
        input_required: —
        output_expected: —
      - step: 3
        action: Third, examine activation, retention and expansion funnels.
        input_required: —
        output_expected: —
      - step: 4
        action: Fourth, decide which two initiatives to double down on.
        input_required: —
        output_expected: —
      - step: 5
        action: Fifth, decide which initiative to kill.
        input_required: —
        output_expected: —
    success_criteria: All steps applied in order with expected outcomes.
    failure_criteria: Steps executed out of order or without prerequisites.
```

## 9. PATTERNS

```yaml
patterns: []  # none extracted
```

## 10. ANTI-PATTERNS

```yaml
anti_patterns:
  - id: ANTI_001
    name: Anti-pattern 1
    description: Avoid discounting as a default response to slow sales, because it trains the market to wait for promotions and erodes brand perception.
    why_it_fails: Identified by the source as ineffective or harmful.
    warning_signals: Behavior matches the described failure mode.
    replacement_behavior: Use the recommended alternative from the source.
  - id: ANTI_002
    name: Anti-pattern 2
    description: Never sacrifice gross margin to win logos that will not expand.
    why_it_fails: Identified by the source as ineffective or harmful.
    warning_signals: Behavior matches the described failure mode.
    replacement_behavior: Use the recommended alternative from the source.
```

## 11. CAUSAL CHAINS

```yaml
causal_chains:
  - id: CAU_001
    cause: Scaling marketing spend before unit economics (the ratio of LTV to CAC) have been calculated
    mechanism: —
    effect: growth without margin amplifies losses.
    secondary_effects: []
    intervention_points: []
    confidence: 0.7
  - id: CAU_002
    cause: Discounting as a default response to slow sales
    mechanism: —
    effect: it trains the market to wait for promotions and erodes brand perception.
    secondary_effects: []
    intervention_points: []
    confidence: 0.7
  - id: CAU_003
    cause: In network-effect businesses, each new customer increases the value of the existing base, and standard unit economics under-measure this.
    mechanism: —
    effect: Early-stage CAC may legitimately exceed LTV.
    secondary_effects: []
    intervention_points: []
    confidence: 0.7
```

## 12. CONTEXTUAL TRIGGERS

```yaml
contextual_triggers:
  - id: TRG_001
    if_user_says_or_context_contains: Value
    activate_knowledge:
      - CON_001
      - CON_002
    agent_should: Recall the Value concept and apply related rules.
    agent_should_not: Make claims beyond what the source supports.
  - id: TRG_002
    if_user_says_or_context_contains: North Star Metric
    activate_knowledge:
      - CON_002
      - CON_003
    agent_should: Recall the North Star Metric concept and apply related rules.
    agent_should_not: Make claims beyond what the source supports.
  - id: TRG_003
    if_user_says_or_context_contains: Unit economics
    activate_knowledge:
      - CON_003
      - CON_004
    agent_should: Recall the unit economics concept and apply related rules.
    agent_should_not: Make claims beyond what the source supports.
```

## 13. IF-THEN RULES

```yaml
if_then_rules:
  - id: IFTHEN_001
    if: CAC exceeds LTV
    then: expansion destroys value and the team must pause paid acquisition
    because: Inferred from source context.
    confidence: 0.72
  - id: IFTHEN_002
    if: churn is rising for two consecutive quarters
    then: the company should revisit onboarding and activation before optimising acquisition
    because: Inferred from source context.
    confidence: 0.72
  - id: IFTHEN_003
    if: a pricing change drops conversion by more than 15 percent
    then: roll back within seven days unless retention improves materially
    because: Inferred from source context.
    confidence: 0.72
  - id: IFTHEN_004
    if: a sales team consistently closes poor-fit customers
    then: support load rises and downstream churn follows within one to two renewal cycles
    because: Inferred from source context.
    confidence: 0.72
```
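
The IF-THEN rules above map naturally onto predicate and action pairs. The sketch below shows one possible encoding in Python. It is an illustration only: the `Rule` type, the `fired_rules` helper and the metric keys (`cac`, `ltv`, `quarters_of_rising_churn`, `conversion_drop_pct`, `retention_improved`) are hypothetical names, not part of the CKF format. IFTHEN_004 is left out because it describes a consequence rather than an action.

```python
from typing import Callable, NamedTuple


class Rule(NamedTuple):
    rule_id: str
    condition: Callable[[dict], bool]
    action: str


# Conditions read hypothetical metric keys; actions paraphrase the package.
RULES = [
    Rule("IFTHEN_001",
         lambda m: m["cac"] > m["ltv"],
         "pause paid acquisition"),
    Rule("IFTHEN_002",
         lambda m: m["quarters_of_rising_churn"] >= 2,
         "revisit onboarding and activation before optimising acquisition"),
    Rule("IFTHEN_003",
         lambda m: m["conversion_drop_pct"] > 15 and not m["retention_improved"],
         "roll back the pricing change within seven days"),
]


def fired_rules(metrics: dict) -> list[tuple[str, str]]:
    """Return (rule_id, action) for every rule whose condition holds."""
    return [(r.rule_id, r.action) for r in RULES if r.condition(metrics)]


if __name__ == "__main__":
    snapshot = {
        "cac": 1200, "ltv": 900,
        "quarters_of_rising_churn": 1,
        "conversion_drop_pct": 20, "retention_improved": False,
    }
    for rule_id, action in fired_rules(snapshot):
        print(rule_id, "->", action)
```

Keeping the rule ID next to each action lets an agent cite the rule it applied, which matches the package's instruction to cite the source-derived rule when answering.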

## 14. EXCEPTIONS AND EDGE CASES

```yaml
exceptions:
  - id: EXC_001
    general_rule: If a pricing change drops conversion by more than 15 percent, roll back within seven days.
    exception_case: Retention improves materially after the pricing change.
    modified_action: Keep the pricing change in place instead of rolling back.
    explanation: Source explicitly notes this edge case.
```

## 15. MENTAL MODELS

```yaml
mental_models:
  - id: MM_001
    name: Value model
    description: Sustainable business growth depends on understanding the customer's underlying job to be done (JTBD), not only the product features.
    use_when: The reasoning context maps to this concept.
    do_not_use_when: Context lies outside the source's scope.
    input_needed: Relevant facts about the situation.
    output_generated: A reasoned recommendation aligned with the source.
  - id: MM_002
    name: North Star Metric model
    description: The North Star Metric is the single number that best captures the value delivered to customers — for Airbnb it is nights booked, for Slack it is messages sent per active team.
    use_when: The reasoning context maps to this concept.
    do_not_use_when: Context lies outside the source's scope.
    input_needed: Relevant facts about the situation.
    output_generated: A reasoned recommendation aligned with the source.
```

## 16. OPERATIONAL PLAYBOOKS

```yaml
playbooks:
  - id: PLAY_001
    name: business response playbook
    objective: Apply the source's knowledge to a real interaction.
    activation_context: User asks about Value.
    steps:
      - Identify the concept the question maps to.
      - Recall related rules and heuristics.
      - Cite the source-derived principle.
      - Surface relevant exceptions or limits.
    agent_tone: Clear, sourced, non-overstating.
    tools_needed:
      - retrieval
      - memory
    expected_output: A grounded answer with traceable reasoning.
    failure_modes:
      - Hallucinating beyond source
      - Ignoring exceptions
```

## 17. QUESTION-ANSWER PAIRS FOR AGENTS

```yaml
qa_pairs:
  - id: QA_001
    question: What does sustainable business growth depend on?
    ideal_answer: Sustainable business growth depends on understanding the customer's underlying job to be done (JTBD), not only the product features.
    source_concepts:
      - CON_001
    difficulty: easy
    answer_type: definition_with_context
  - id: QA_002
    question: What is the North Star Metric and when does it apply?
    ideal_answer: The North Star Metric is the single number that best captures the value delivered to customers — for Airbnb it is nights booked, for Slack it is messages sent per active team.
    source_concepts:
      - CON_002
    difficulty: medium
    answer_type: definition_with_context
  - id: QA_003
    question: What are unit economics and when must they be calculated?
    ideal_answer: Unit economics, expressed as the ratio of customer lifetime value (LTV) to customer acquisition cost (CAC), must be calculated before scaling marketing spend, because growth without margin amplifies losses.
    source_concepts:
      - CON_003
    difficulty: medium
    answer_type: definition_with_context
  - id: QA_004
    question: What LTV / CAC ratio and CAC payback period should a healthy SaaS business target?
    ideal_answer: A healthy SaaS business targets an LTV / CAC ratio above three and a CAC payback period under twelve months.
    source_concepts:
      - CON_004
    difficulty: medium
    answer_type: definition_with_context
```

## 18. RETRIEVAL CHUNKS

```yaml
retrieval_chunks:
  - id: CHUNK_001
    title: Chunk on business
    standalone_context: Self-contained passage extracted from the source.
    compressed_knowledge: Sustainable business growth depends on understanding the customer's underlying job to be done (JTBD), not only the product features. The North Star Metric is the single number that best captures the value delivered to customers — for Airbnb it is nights booked, for Slack it is messages sent per active team. Unit economics, expressed as the ratio of customer lifetime value (LTV) to customer acqu…
    activation_queries:
      - What does the source say about business?
      - What does the source say about customer?
      - What does the source say about value?
    related_rules: []
    related_entities: []
    related_concepts: []
  - id: CHUNK_002
    title: Chunk on activation
    standalone_context: Self-contained passage extracted from the source.
    compressed_knowledge: Third, examine activation, retention and expansion funnels. Fourth, decide which two initiatives to double down on. Fifth, decide which initiative to kill. If CAC exceeds LTV, expansion destroys value and the team must pause paid acquisition. If churn is rising for two consecutive quarters, the company should revisit onboarding and activation before optimising acquisition. If a pricing change d…
    activation_queries:
      - What does the source say about activation?
      - What does the source say about retention?
      - What does the source say about expansion?
    related_rules: []
    related_entities: []
    related_concepts: []
  - id: CHUNK_003
    title: Chunk on heuristic
    standalone_context: Self-contained passage extracted from the source.
    compressed_knowledge: "A useful heuristic: if your top 10 percent of customers generate more than 50 percent of revenue, your pricing is probably under-segmented. Another heuristic: a feature requested by fewer than three paying customers in a quarter should not enter the roadmap. Pricing should reflect the value delivered, not internal costs alone. Avoid discounting as a default response to slow sales, because it tr…"
    activation_queries:
      - What does the source say about heuristic?
      - What does the source say about percent?
      - What does the source say about customers?
    related_rules: []
    related_entities: []
    related_concepts: []
  - id: CHUNK_004
    title: Chunk on first
    standalone_context: Self-contained passage extracted from the source.
    compressed_knowledge: "Contradiction worth flagging: classic growth playbooks tell founders to optimise acquisition first, while modern PLG playbooks insist activation and retention should be solved first — both can be right depending on stage. Question: When should a startup stop optimising acquisition? Answer: When weekly active retention plateaus below the category benchmark and onboarding completion is under 50 p…"
    activation_queries:
      - What does the source say about first?
      - What does the source say about playbooks?
      - What does the source say about acquisition?
    related_rules: []
    related_entities: []
    related_concepts: []
```
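
CHUNK_003 carries a revenue-concentration heuristic: if the top 10 percent of customers generate more than 50 percent of revenue, pricing is probably under-segmented. A minimal Python sketch of that check follows. The function names are hypothetical, and rounding the top-decile size up to at least one customer is an assumption, since the source does not say how to handle small customer counts.

```python
def top_decile_revenue_share(revenues: list[float]) -> float:
    """Share of total revenue generated by the top 10 percent of customers."""
    if not revenues:
        raise ValueError("need at least one customer")
    total = sum(revenues)
    if total <= 0:
        raise ValueError("total revenue must be positive")
    # Assumption: round the decile size up, with a minimum of one customer.
    n_top = max(1, (len(revenues) + 9) // 10)
    top = sorted(revenues, reverse=True)[:n_top]
    return sum(top) / total


def pricing_probably_under_segmented(revenues: list[float]) -> bool:
    """Apply the source's threshold: more than 50 percent of revenue."""
    return top_decile_revenue_share(revenues) > 0.50


if __name__ == "__main__":
    per_customer_revenue = [600.0] + [50.0] * 9
    print(pricing_probably_under_segmented(per_customer_revenue))
```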

## 19. EMBEDDING-READY ATOMIC UNITS

```yaml
atomic_units:
  - id: AU_001
    statement: Sustainable business growth depends on understanding the customer's underlying job to be done (JTBD), not only the product features.
    type: fact
    tags:
      - sustainable
      - business
      - growth
    dependencies: []
    confidence: 0.78
  - id: AU_002
    statement: The North Star Metric is the single number that best captures the value delivered to customers — for Airbnb it is nights booked, for Slack it is messages sent per active team.
    type: definition
    tags:
      - north
      - star
      - metric
    dependencies: []
    confidence: 0.78
  - id: AU_003
    statement: Unit economics, expressed as the ratio of customer lifetime value (LTV) to customer acquisition cost (CAC), must be calculated before scaling marketing spend, because growth without margin amplifies losses.
    type: rule
    tags:
      - customer
      - unit
      - economics
    dependencies: []
    confidence: 0.78
  - id: AU_004
    statement: A healthy SaaS business targets an LTV / CAC ratio above three and a CAC payback period under twelve months.
    type: fact
    tags:
      - healthy
      - saas
      - business
    dependencies: []
    confidence: 0.78
  - id: AU_005
    statement: Leaders should run quarterly business reviews in five steps.
    type: rule
    tags:
      - leaders
      - should
      - quarterly
    dependencies: []
    confidence: 0.78
  - id: AU_006
    statement: First, restate the North Star Metric and the current value.
    type: fact
    tags:
      - first
      - restate
      - north
    dependencies: []
    confidence: 0.78
  - id: AU_007
    statement: Second, review unit economics by cohort.
    type: fact
    tags:
      - second
      - review
      - unit
    dependencies: []
    confidence: 0.78
```

## 20. AGENT INSTRUCTIONS

```yaml
agent_instructions:
  behavior_rules:
    - Stay within the package's scope.
    - Cite the source-derived chunk or rule when answering.
    - Distinguish between strategy and tactics.
    - Surface assumptions about the market.
  reasoning_rules:
    - Use causal chains and IF-THEN rules before improvising.
    - Combine concepts only when supports/depends_on relationships allow it.
  response_rules:
    - Be concise unless the user asks for depth.
    - Surface confidence and source basis.
  forbidden_behaviors:
    - Fabricating sources.
    - Restating the source as personal opinion.
    - Ignoring decision rules in favor of fluency.
  preferred_questions:
    - What does the source say about …?
    - Which rule applies to this situation?
    - What are the limits of this knowledge?
  tool_usage_guidance:
    - Use retrieval before generation.
    - Use memory to track conversational context.
```

## 21. KNOWLEDGE LIMITS

```yaml
knowledge_limits:
  missing_context:
    - Source date and authorship are not always provided.
  weakly_supported_claims: []
  assumptions_detected:
    - Heuristic compilation assumes the input text is self-contained.
  possible_biases:
    - Single-source perspective.
  outdated_sections: []
  needs_human_review:
    - Decision rules and exceptions before production use.
```

## 22. SOURCE TRACEABILITY

```yaml
source_traceability:
  - extracted_item_id: CON_001
    source_location: user_input
    source_excerpt: Sustainable business growth depends on understanding the customer's underlying job to be done (JTBD), not only the product features.
    extraction_type: explicit
  - extracted_item_id: CON_002
    source_location: user_input
    source_excerpt: The North Star Metric is the single number that best captures the value delivered to customers — for Airbnb it is nights booked, for Slack it is messages sent per active team.
    extraction_type: explicit
  - extracted_item_id: CON_003
    source_location: user_input
    source_excerpt: Unit economics, expressed as the ratio of customer lifetime value (LTV) to customer acquisition cost (CAC), must be calculated before scaling marketing spend, because growth without margin amplifie…
    extraction_type: explicit
  - extracted_item_id: CON_004
    source_location: user_input
    source_excerpt: A healthy SaaS business targets an LTV / CAC ratio above three and a CAC payback period under twelve months.
    extraction_type: explicit
  - extracted_item_id: CON_005
    source_location: user_input
    source_excerpt: Leaders should run quarterly business reviews in five steps.
    extraction_type: explicit
  - extracted_item_id: CON_006
    source_location: user_input
    source_excerpt: First, restate the North Star Metric and the current value.
    extraction_type: explicit
  - extracted_item_id: HEU_001
    source_location: user_input
    source_excerpt: Unit economics, expressed as the ratio of customer lifetime value (LTV) to customer acquisition cost (CAC), must be calculated before scaling marketing spend, because growth without margin amplifie…
    extraction_type: explicit
  - extracted_item_id: HEU_002
    source_location: user_input
    source_excerpt: Leaders should run quarterly business reviews in five steps.
    extraction_type: explicit
  - extracted_item_id: HEU_003
    source_location: user_input
    source_excerpt: If CAC exceeds LTV, expansion destroys value and the team must pause paid acquisition.
    extraction_type: explicit
  - extracted_item_id: HEU_004
    source_location: user_input
    source_excerpt: If churn is rising for two consecutive quarters, the company should revisit onboarding and activation before optimising acquisition.
    extraction_type: explicit
  - extracted_item_id: HEU_005
    source_location: user_input
    source_excerpt: "Another heuristic: a feature requested by fewer than three paying customers in a quarter should not enter the roadmap."
    extraction_type: explicit
  - extracted_item_id: IFTHEN_001
    source_location: user_input
    source_excerpt: IF CAC exceeds LTV THEN expansion destroys value and the team must pause paid acquisition
    extraction_type: explicit
  - extracted_item_id: IFTHEN_002
    source_location: user_input
    source_excerpt: IF churn is rising for two consecutive quarters THEN the company should revisit onboarding and activation before optimising acquisition
    extraction_type: explicit
  - extracted_item_id: IFTHEN_003
    source_location: user_input
    source_excerpt: IF a pricing change drops conversion by more than 15 percent THEN roll back within seven days unless retention improves materially
    extraction_type: explicit
  - extracted_item_id: IFTHEN_004
    source_location: user_input
    source_excerpt: IF a sales team consistently closes poor-fit customers THEN support load rises and downstream churn follows within one to two renewal cycles
    extraction_type: explicit
```

Clinical protocol

ckf_demo_1782847357665

8 entities · 6 concepts · 0 principles
markdown
# CKF — KNOWLEDGE CONTEXT PACKAGE

package_id: ckf_demo_1782847357665
protocol_version: ckf-0.1
source_type: protocol
source_title: Untitled source
source_author: Unknown
domain: healthcare
subdomains: [sepsis, lactate, pressure, antibiotics]
language: en
created_at: 2026-06-30T19:22:37.665Z
compression_level: standard
human_readability: 0.7
ai_utility_score: 0.81

---

## 1. CORE INTENT

```yaml
core_intent:
  primary_purpose: Capture and structure the knowledge expressed in the source.
  intended_user: Developers, researchers and agents consuming structured knowledge.
  intended_agent_use: Retrieval, reasoning, tutoring, decision support.
  transformation_goal: Convert prose into structured, agent-usable cognition.
  key_value: Portable, traceable, reusable knowledge package.
```

## 2. DOMAIN MAP

```yaml
domain_map:
  main_domain: healthcare
  subdomains:
    - name: sepsis
      relevance: 1
      related_concepts:
        - sepsis
    - name: lactate
      relevance: 0.85
      related_concepts:
        - lactate
    - name: pressure
      relevance: 0.7
      related_concepts:
        - pressure
    - name: antibiotics
      relevance: 0.55
      related_concepts:
        - antibiotics
  adjacent_domains: []
  excluded_domains: []
```

## 3. ENTITY GRAPH

```yaml
entities:
  - id: ENT_001
    name: Sepsis
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities: []
    source_basis: explicit
  - id: ENT_002
    name: Glasgow Coma Scale
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_003
    name: Hour-1 sepsis bundle
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_004
    name: Surviving Sepsis Campaign
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_005
    name: First
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_006
    name: Second
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_007
    name: Third
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_008
    name: Fourth
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
```

## 4. CONCEPT GRAPH

```yaml
concepts:
  - id: CON_001
    label: Sepsis
    definition: Sepsis triage in the emergency department combines vital signs, the qSOFA score and lactate measurement.
    domain: healthcare
    depends_on: []
    contradicts: []
    supports:
      - CON_002
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
  - id: CON_002
    label: qSOFA score
    definition: "The qSOFA score adds one point for each of: systolic blood pressure below 100 mmHg, respiratory rate of 22 or above, and altered mental status (Glasgow Coma Scale under 15)."
    domain: healthcare
    depends_on:
      - CON_001
    contradicts: []
    supports:
      - CON_003
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
  - id: CON_003
    label: qSOFA risk threshold
    definition: A qSOFA of 2 or more in a patient with suspected infection identifies a high mortality risk.
    domain: healthcare
    depends_on:
      - CON_002
    contradicts: []
    supports:
      - CON_004
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
  - id: CON_004
    label: Hour-1 sepsis bundle
    definition: The Hour-1 sepsis bundle, defined by the Surviving Sepsis Campaign, has five steps.
    domain: healthcare
    depends_on:
      - CON_003
    contradicts: []
    supports:
      - CON_005
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
  - id: CON_005
    label: Serum lactate
    definition: First, measure serum lactate.
    domain: healthcare
    depends_on:
      - CON_004
    contradicts: []
    supports:
      - CON_006
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
  - id: CON_006
    label: Blood cultures
    definition: Second, obtain blood cultures before antibiotics.
    domain: healthcare
    depends_on:
      - CON_005
    contradicts: []
    supports: []
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
```
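
The qSOFA scoring described in the concept graph is a three-item sum, which a short function can make explicit. The Python sketch below uses the thresholds exactly as the package states them. The function and parameter names are hypothetical, and the code illustrates the package's wording rather than serving as a clinical tool.

```python
def qsofa_score(systolic_bp_mmhg: float, respiratory_rate: float, gcs: int) -> int:
    """One point for each criterion, using the thresholds quoted in the package."""
    score = 0
    if systolic_bp_mmhg < 100:   # systolic blood pressure below 100 mmHg
        score += 1
    if respiratory_rate >= 22:   # respiratory rate of 22 or above
        score += 1
    if gcs < 15:                 # altered mental status: Glasgow Coma Scale under 15
        score += 1
    return score


def high_mortality_risk(score: int, suspected_infection: bool) -> bool:
    """A qSOFA of 2 or more with suspected infection flags high mortality risk."""
    return suspected_infection and score >= 2


if __name__ == "__main__":
    score = qsofa_score(systolic_bp_mmhg=95, respiratory_rate=24, gcs=15)
    print(score, high_mortality_risk(score, suspected_infection=True))
```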

## 5. PRINCIPLES

```yaml
principles: []  # none extracted
```

## 6. HEURISTICS

```yaml
heuristics:
  - id: HEU_001
    trigger: relevant decision context detected
    interpretation: A qSOFA of 2 or more in a patient with suspected infection identifies a high mortality risk.
    recommended_action: Mitigate the described risk.
    avoid: —
    confidence: 0.76
  - id: HEU_002
    trigger: relevant decision context detected
    interpretation: When patients have a history of congestive heart failure, the 30 mL/kg fluid target should be reassessed at 500 mL increments to avoid pulmonary oedema.
    recommended_action: Follow the recommended practice.
    avoid: —
    confidence: 0.76
  - id: HEU_003
    trigger: context contains a known failure mode
    interpretation: Avoid delaying antibiotics while waiting for imaging in suspected septic shock.
    recommended_action: Avoid the described action.
    avoid: Avoid delaying antibiotics while waiting for imaging in suspected septic shock.
    confidence: 0.76
  - id: HEU_004
    trigger: context contains a known failure mode
    interpretation: Never use hydroxyethyl starch for resuscitation — it increases mortality and renal failure.
    recommended_action: Avoid the described action.
    avoid: Never use hydroxyethyl starch for resuscitation — it increases mortality and renal failure.
    confidence: 0.76
  - id: HEU_005
    trigger: relevant decision context detected
    interpretation: "Contradiction in the literature: aggressive early fluids improve some septic shock outcomes but worsen others in patients with ARDS — individualisation matters."
    recommended_action: Individualise fluid resuscitation to the patient.
    avoid: —
    confidence: 0.76
```

## 7. DECISION RULES

```yaml
decision_rules:
  - id: RULE_001
    condition: Operating context matches the rule's domain.
    decision: When patients have a history of congestive heart failure, the 30 mL/kg fluid target should be reassessed at 500 mL increments to avoid pulmonary oedema.
    reasoning: Derived directly from a normative statement in the source.
    required_context: Domain-specific context as described in the source.
    output_action: Apply the recommended decision.
    failure_mode: Recommendation applied outside its valid context.
    confidence: 0.74
```

## 8. PROCEDURES

```yaml
procedures:
  - id: PROC_001
    name: Hour-1 sepsis bundle
    objective: Apply the Hour-1 sepsis bundle steps extracted from the source.
    steps:
      - step: 1
        action: First, measure serum lactate.
        input_required: —
        output_expected: —
      - step: 2
        action: Second, obtain blood cultures before antibiotics.
        input_required: —
        output_expected: —
      - step: 5
        action: Fifth, start vasopressors if mean arterial pressure stays below 65 mmHg despite fluids; norepinephrine is first-line.
        input_required: —
        output_expected: —
    success_criteria: All steps applied in order with expected outcomes.
    failure_criteria: Steps executed out of order or without prerequisites.
```

## 9. PATTERNS

```yaml
patterns:
  - id: PAT_001
    name: Recurring pattern 1
    observed_when: Source-described conditions are present.
    signal: "Useful heuristic: if a patient looks worse than the numbers suggest, trust the bedside impression — early sepsis often outruns vital-sign abnormalities."
    underlying_mechanism: —
    response_strategy: Recognize and act according to source guidance.
    confidence: 0.7
```

## 10. ANTI-PATTERNS

```yaml
anti_patterns:
  - id: ANTI_001
    name: Anti-pattern 1
    description: Never use hydroxyethyl starch for resuscitation, because it increases mortality and renal failure.
    why_it_fails: Identified by the source as ineffective or harmful.
    warning_signals: Behavior matches the described failure mode.
    replacement_behavior: Use the recommended alternative from the source.
  - id: ANTI_002
    name: Anti-pattern 2
    description: Avoid delaying antibiotics while waiting for imaging in suspected septic shock.
    why_it_fails: Identified by the source as ineffective or harmful.
    warning_signals: Behavior matches the described failure mode.
    replacement_behavior: Use the recommended alternative from the source.
```

## 11. CAUSAL CHAINS

```yaml
causal_chains: []  # none extracted
```

## 12. CONTEXTUAL TRIGGERS

```yaml
contextual_triggers:
  - id: TRG_001
    if_user_says_or_context_contains: Sepsis
    activate_knowledge:
      - CON_001
      - CON_002
    agent_should: Recall the Sepsis concept and apply related rules.
    agent_should_not: Make claims beyond what the source supports.
  - id: TRG_002
    if_user_says_or_context_contains: qSOFA
    activate_knowledge:
      - CON_002
      - CON_003
    agent_should: Recall the qSOFA score concept and apply related rules.
    agent_should_not: Make claims beyond what the source supports.
  - id: TRG_003
    if_user_says_or_context_contains: Hour-1 bundle
    activate_knowledge:
      - CON_003
      - CON_004
    agent_should: Recall the qSOFA risk threshold and Hour-1 sepsis bundle concepts and apply related rules.
    agent_should_not: Make claims beyond what the source supports.
```

## 13. IF-THEN RULES

```yaml
if_then_rules:
  - id: IFTHEN_001
    if: mean arterial pressure stays below 65 mmHg despite fluids
    then: start vasopressors, with norepinephrine as first-line
    because: Inferred from source context.
    confidence: 0.72
  - id: IFTHEN_002
    if: serum lactate is above 2 mmol/L
    then: repeat lactate within 2 hours to confirm clearance
    because: Inferred from source context.
    confidence: 0.72
  - id: IFTHEN_003
    if: blood pressure remains low after the 30 mL/kg fluid bolus
    then: start norepinephrine at 0
    because: Inferred from source context.
    confidence: 0.72
  - id: IFTHEN_004
    if: procalcitonin trends down for 72 hours and cultures are negative
    then: consider de-escalating antibiotics
    because: Inferred from source context.
    confidence: 0.72
```

## 14. EXCEPTIONS AND EDGE CASES

```yaml
exceptions: []  # none extracted
```

## 15. MENTAL MODELS

```yaml
mental_models:
  - id: MM_001
    name: Sepsis model
    description: Sepsis triage in the emergency department combines vital signs, the qSOFA score and lactate measurement.
    use_when: The reasoning context maps to this concept.
    do_not_use_when: Context lies outside the source's scope.
    input_needed: Relevant facts about the situation.
    output_generated: A reasoned recommendation aligned with the source.
  - id: MM_002
    name: qSOFA model
    description: "The qSOFA score adds one point for each of: systolic blood pressure below 100 mmHg, respiratory rate of 22 or above, and altered mental status (Glasgow Coma Scale under 15)."
    use_when: The reasoning context maps to this concept.
    do_not_use_when: Context lies outside the source's scope.
    input_needed: Relevant facts about the situation.
    output_generated: A reasoned recommendation aligned with the source.
```
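
The qSOFA scoring described in the mental models above is simple arithmetic, so it can be illustrated in a few lines. This is a sketch of the scoring as the source states it, with hypothetical function and parameter names; it is not part of the package.

```python
def qsofa_score(systolic_bp_mmhg: float, respiratory_rate: float, gcs: int) -> int:
    """Add one point per criterion, as stated in the source text.

    - systolic blood pressure below 100 mmHg
    - respiratory rate of 22 or above
    - altered mental status (Glasgow Coma Scale under 15)
    """
    score = 0
    if systolic_bp_mmhg < 100:
        score += 1
    if respiratory_rate >= 22:
        score += 1
    if gcs < 15:
        score += 1
    return score


def flags_high_mortality_risk(score: int) -> bool:
    """The source treats a qSOFA of 2 or more, with suspected infection, as high risk."""
    return score >= 2
```

Note the boundary handling: a systolic pressure of exactly 100 mmHg does not score, while a respiratory rate of exactly 22 does.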

## 16. OPERATIONAL PLAYBOOKS

```yaml
playbooks:
  - id: PLAY_001
    name: healthcare response playbook
    objective: Apply the source's knowledge to a real interaction.
    activation_context: User asks about Sepsis.
    steps:
      - Identify the concept the question maps to.
      - Recall related rules and heuristics.
      - Cite the source-derived principle.
      - Surface relevant exceptions or limits.
    agent_tone: Clear, sourced, non-overstating.
    tools_needed:
      - retrieval
      - memory
    expected_output: A grounded answer with traceable reasoning.
    failure_modes:
      - Hallucinating beyond source
      - Ignoring exceptions
```

## 17. QUESTION-ANSWER PAIRS FOR AGENTS

```yaml
qa_pairs:
  - id: QA_001
    question: What does sepsis triage in the emergency department combine?
    ideal_answer: Sepsis triage in the emergency department combines vital signs, the qSOFA score and lactate measurement.
    source_concepts:
      - CON_001
    difficulty: easy
    answer_type: definition_with_context
  - id: QA_002
    question: How is the qSOFA score calculated?
    ideal_answer: "The qSOFA score adds one point for each of: systolic blood pressure below 100 mmHg, respiratory rate of 22 or above, and altered mental status (Glasgow Coma Scale under 15)."
    source_concepts:
      - CON_002
    difficulty: medium
    answer_type: definition_with_context
  - id: QA_003
    question: What does a qSOFA of 2 or more indicate?
    ideal_answer: A qSOFA of 2 or more in a patient with suspected infection identifies a high mortality risk.
    source_concepts:
      - CON_003
    difficulty: medium
    answer_type: definition_with_context
  - id: QA_004
    question: Who defines the Hour-1 sepsis bundle, and how many steps does it have?
    ideal_answer: The Hour-1 sepsis bundle, defined by the Surviving Sepsis Campaign, has five steps.
    source_concepts:
      - CON_004
    difficulty: medium
    answer_type: definition_with_context
```

## 18. RETRIEVAL CHUNKS

```yaml
retrieval_chunks:
  - id: CHUNK_001
    title: Chunk on sepsis
    standalone_context: Self-contained passage extracted from the source.
    compressed_knowledge: "Sepsis triage in the emergency department combines vital signs, the qSOFA score and lactate measurement. The qSOFA score adds one point for each of: systolic blood pressure below 100 mmHg, respiratory rate of 22 or above, and altered mental status (Glasgow Coma Scale under 15). A qSOFA of 2 or more in a patient with suspected infection identifies a high mortality risk. The Hour-1 sepsis bundle,…"
    activation_queries:
      - What does the source say about sepsis?
      - What does the source say about qsofa?
      - What does the source say about score?
    related_rules: []
    related_entities: []
    related_concepts: []
  - id: CHUNK_002
    title: Chunk on lactate
    standalone_context: Self-contained passage extracted from the source.
    compressed_knowledge: Third, administer broad-spectrum antibiotics within 60 minutes. Fourth, begin 30 mL/kg crystalloid for hypotension or lactate above 4 mmol/L. Fifth, start vasopressors if mean arterial pressure stays below 65 mmHg despite fluids — norepinephrine is first-line. If serum lactate is above 2 mmol/L, repeat lactate within 2 hours to confirm clearance. If blood pressure remains low after the 30 mL/kg…
    activation_queries:
      - What does the source say about lactate?
      - What does the source say about antibiotics?
      - What does the source say about within?
    related_rules: []
    related_entities: []
    related_concepts: []
  - id: CHUNK_003
    title: Chunk on failure
    standalone_context: Self-contained passage extracted from the source.
    compressed_knowledge: "When patients have a history of congestive heart failure, the 30 mL/kg fluid target should be reassessed at 500 mL increments to avoid pulmonary oedema. Useful heuristic: if a patient looks worse than the numbers suggest, trust the bedside impression — early sepsis often outruns vital-sign abnormalities. Another heuristic: any febrile, tachycardic patient on immunosuppression is sepsis until pr…"
    activation_queries:
      - What does the source say about failure?
      - What does the source say about avoid?
      - What does the source say about heuristic?
    related_rules: []
    related_entities: []
    related_concepts: []
  - id: CHUNK_004
    title: Chunk on patients
    standalone_context: Self-contained passage extracted from the source.
    compressed_knowledge: "Edge case: pregnant patients have physiologically higher heart rates and lower blood pressure, so the qSOFA threshold over-triggers; use the obstetric modified early warning score instead. Contradiction in the literature: aggressive early fluids improve some septic shock outcomes but worsen others in patients with ARDS — individualisation matters. Question: What is the first-line vasopressor in…"
    activation_queries:
      - What does the source say about patients?
      - What does the source say about pressure?
      - What does the source say about early?
    related_rules: []
    related_entities: []
    related_concepts: []
```

## 19. EMBEDDING-READY ATOMIC UNITS

```yaml
atomic_units:
  - id: AU_001
    statement: Sepsis triage in the emergency department combines vital signs, the qSOFA score and lactate measurement.
    type: fact
    tags:
      - sepsis
      - triage
      - emergency
    dependencies: []
    confidence: 0.78
  - id: AU_002
    statement: "The qSOFA score adds one point for each of: systolic blood pressure below 100 mmHg, respiratory rate of 22 or above, and altered mental status (Glasgow Coma Scale under 15)."
    type: fact
    tags:
      - qsofa
      - score
      - adds
    dependencies: []
    confidence: 0.78
  - id: AU_003
    statement: A qSOFA of 2 or more in a patient with suspected infection identifies a high mortality risk.
    type: fact
    tags:
      - qsofa
      - more
      - patient
    dependencies: []
    confidence: 0.78
  - id: AU_004
    statement: The Hour-1 sepsis bundle, defined by the Surviving Sepsis Campaign, has five steps.
    type: fact
    tags:
      - sepsis
      - hour-1
      - bundle
    dependencies: []
    confidence: 0.78
  - id: AU_005
    statement: First, measure serum lactate.
    type: fact
    tags:
      - first
      - measure
      - serum
    dependencies: []
    confidence: 0.78
  - id: AU_006
    statement: Second, obtain blood cultures before antibiotics.
    type: fact
    tags:
      - second
      - obtain
      - blood
    dependencies: []
    confidence: 0.78
  - id: AU_007
    statement: Third, administer broad-spectrum antibiotics within 60 minutes.
    type: fact
    tags:
      - third
      - administer
      - broad-spectrum
    dependencies: []
    confidence: 0.78
```

## 20. AGENT INSTRUCTIONS

```yaml
agent_instructions:
  behavior_rules:
    - Stay within the package's scope.
    - Cite the source-derived chunk or rule when answering.
    - Never diagnose; refer to qualified professionals.
    - Cite uncertainty explicitly.
  reasoning_rules:
    - Use causal chains and IF-THEN rules before improvising.
    - Combine concepts only when supports/depends_on relationships allow it.
  response_rules:
    - Be concise unless the user asks for depth.
    - Surface confidence and source basis.
  forbidden_behaviors:
    - Fabricating sources.
    - Restating the source as personal opinion.
    - Ignoring decision rules in favor of fluency.
  preferred_questions:
    - What does the source say about …?
    - Which rule applies to this situation?
    - What are the limits of this knowledge?
  tool_usage_guidance:
    - Use retrieval before generation.
    - Use memory to track conversational context.
```

## 21. KNOWLEDGE LIMITS

```yaml
knowledge_limits:
  missing_context:
    - Source date and authorship are not always provided.
  weakly_supported_claims: []
  assumptions_detected:
    - Heuristic compilation assumes the input text is self-contained.
  possible_biases:
    - Single-source perspective.
  outdated_sections: []
  needs_human_review:
    - Decision rules and exceptions before production use.
```

## 22. SOURCE TRACEABILITY

```yaml
source_traceability:
  - extracted_item_id: CON_001
    source_location: user_input
    source_excerpt: Sepsis triage in the emergency department combines vital signs, the qSOFA score and lactate measurement.
    extraction_type: explicit
  - extracted_item_id: CON_002
    source_location: user_input
    source_excerpt: "The qSOFA score adds one point for each of: systolic blood pressure below 100 mmHg, respiratory rate of 22 or above, and altered mental status (Glasgow Coma Scale under 15)."
    extraction_type: explicit
  - extracted_item_id: CON_003
    source_location: user_input
    source_excerpt: A qSOFA of 2 or more in a patient with suspected infection identifies a high mortality risk.
    extraction_type: explicit
  - extracted_item_id: CON_004
    source_location: user_input
    source_excerpt: The Hour-1 sepsis bundle, defined by the Surviving Sepsis Campaign, has five steps.
    extraction_type: explicit
  - extracted_item_id: CON_005
    source_location: user_input
    source_excerpt: First, measure serum lactate.
    extraction_type: explicit
  - extracted_item_id: CON_006
    source_location: user_input
    source_excerpt: Second, obtain blood cultures before antibiotics.
    extraction_type: explicit
  - extracted_item_id: HEU_001
    source_location: user_input
    source_excerpt: A qSOFA of 2 or more in a patient with suspected infection identifies a high mortality risk.
    extraction_type: explicit
  - extracted_item_id: HEU_002
    source_location: user_input
    source_excerpt: When patients have a history of congestive heart failure, the 30 mL/kg fluid target should be reassessed at 500 mL increments to avoid pulmonary oedema.
    extraction_type: explicit
  - extracted_item_id: HEU_003
    source_location: user_input
    source_excerpt: Avoid delaying antibiotics while waiting for imaging in suspected septic shock.
    extraction_type: explicit
  - extracted_item_id: HEU_004
    source_location: user_input
    source_excerpt: Never use hydroxyethyl starch for resuscitation — it increases mortality and renal failure.
    extraction_type: explicit
  - extracted_item_id: HEU_005
    source_location: user_input
    source_excerpt: "Contradiction in the literature: aggressive early fluids improve some septic shock outcomes but worsen others in patients with ARDS — individualisation matters."
    extraction_type: explicit
  - extracted_item_id: IFTHEN_001
    source_location: user_input
    source_excerpt: IF mean arterial pressure stays below 65 mmHg despite fluids THEN start vasopressors, with norepinephrine as first-line
    extraction_type: explicit
  - extracted_item_id: IFTHEN_002
    source_location: user_input
    source_excerpt: IF serum lactate is above 2 mmol/L THEN repeat lactate within 2 hours to confirm clearance
    extraction_type: explicit
  - extracted_item_id: IFTHEN_003
    source_location: user_input
    source_excerpt: IF blood pressure remains low after the 30 mL/kg fluid bolus THEN start norepinephrine at 0
    extraction_type: explicit
  - extracted_item_id: IFTHEN_004
    source_location: user_input
    source_excerpt: IF procalcitonin trends down for 72 hours and cultures are negative THEN consider de-escalating antibiotics
    extraction_type: explicit
```
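
Traceability entries can be checked mechanically against the original text. The sketch below assumes the source is available as a plain string and flags entries whose excerpt does not appear verbatim after whitespace normalisation. The `untraceable_items` helper is a hypothetical name, and the sample text is assembled from two sentences quoted elsewhere in this package.

```python
def untraceable_items(entries: list[dict], source_text: str) -> list[str]:
    """Return the IDs of entries whose excerpt is not found verbatim in the source.

    Whitespace is collapsed on both sides so that line wrapping does not
    cause false mismatches. Matching is case-sensitive.
    """
    normalized_source = " ".join(source_text.split())
    missing = []
    for entry in entries:
        excerpt = " ".join(entry["source_excerpt"].split())
        if excerpt not in normalized_source:
            missing.append(entry["extracted_item_id"])
    return missing


# Sample data copied from the section above.
sample_entries = [
    {"extracted_item_id": "CON_005",
     "source_excerpt": "First, measure serum lactate."},
    {"extracted_item_id": "IFTHEN_002",
     "source_excerpt": "IF serum lactate is above 2 mmol/L THEN repeat lactate "
                       "within 2 hours to confirm clearance"},
]
sample_source = (
    "First, measure serum lactate. If serum lactate is above 2 mmol/L, "
    "repeat lactate within 2 hours to confirm clearance."
)
print(untraceable_items(sample_entries, sample_source))
```

The IF/THEN excerpts are rewritten into a rule template rather than quoted, so a verbatim check of this kind reports them even though they are marked `explicit`.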

Legal / GDPR

ckf_demo_1782847357665

8 entities · 6 concepts · 0 principles
markdown
# CKF — KNOWLEDGE CONTEXT PACKAGE

package_id: ckf_demo_1782847357665
protocol_version: ckf-0.1
source_type: policy
source_title: Untitled source
source_author: Unknown
domain: legal
subdomains: [data, consent, controller, processing]
language: en
created_at: 2026-06-30T19:22:37.665Z
compression_level: standard
human_readability: 0.7
ai_utility_score: 0.81

---

## 1. CORE INTENT

```yaml
core_intent:
  primary_purpose: Capture and structure the knowledge expressed in the source.
  intended_user: Developers, researchers and agents consuming structured knowledge.
  intended_agent_use: Retrieval, reasoning, tutoring, decision support.
  transformation_goal: Convert prose into structured, agent-usable cognition.
  key_value: Portable, traceable, reusable knowledge package.
```

## 2. DOMAIN MAP

```yaml
domain_map:
  main_domain: legal
  subdomains:
    - name: data
      relevance: 1
      related_concepts:
        - data
    - name: consent
      relevance: 0.85
      related_concepts:
        - consent
    - name: controller
      relevance: 0.7
      related_concepts:
        - controller
    - name: processing
      relevance: 0.55
      related_concepts:
        - processing
  adjacent_domains: []
  excluded_domains: []
```

## 3. ENTITY GRAPH

```yaml
entities:
  - id: ENT_001
    name: Under
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities: []
    source_basis: explicit
  - id: ENT_002
    name: General Data Protection Regulation
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_003
    name: GDPR
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_004
    name: The Supervisory Authority
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_005
    name: Regulation
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_006
    name: Member State
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_007
    name: Article
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_008
    name: Consent
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
```

## 4. CONCEPT GRAPH

```yaml
concepts:
  - id: CON_001
    label: Data
    definition: Under the General Data Protection Regulation (GDPR), a "data controller" is the natural or legal person who determines the purposes and means of processing personal data.
    domain: legal
    depends_on: []
    contradicts: []
    supports:
      - CON_002
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
  - id: CON_002
    label: Consent
    definition: A "data processor" processes personal data on behalf of the controller.
    domain: legal
    depends_on:
      - CON_001
    contradicts: []
    supports:
      - CON_003
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
  - id: CON_003
    label: Controller
    definition: A "data subject" is the identified or identifiable person to whom the data relates.
    domain: legal
    depends_on:
      - CON_002
    contradicts: []
    supports:
      - CON_004
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
  - id: CON_004
    label: Processing
    definition: The Supervisory Authority is the independent public body responsible for monitoring application of the Regulation in each Member State.
    domain: legal
    depends_on:
      - CON_003
    contradicts: []
    supports:
      - CON_005
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
  - id: CON_005
    label: Must
    definition: "Article 6 establishes that processing is lawful only if at least one of six legal bases applies: consent, contract, legal obligation, vital interests, public task, or legitimate interest."
    domain: legal
    depends_on:
      - CON_004
    contradicts: []
    supports:
      - CON_006
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
  - id: CON_006
    label: Personal
    definition: Consent must be freely given, specific, informed and unambiguous.
    domain: legal
    depends_on:
      - CON_005
    contradicts: []
    supports: []
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
```

## 5. PRINCIPLES

```yaml
principles: []  # none extracted
```

## 6. HEURISTICS

```yaml
heuristics:
  - id: HEU_001
    trigger: relevant decision context detected
    interpretation: Consent must be freely given, specific, informed and unambiguous.
    recommended_action: Follow the recommended practice.
    avoid: —
    confidence: 0.76
  - id: HEU_002
    trigger: relevant decision context detected
    interpretation: Where processing is based on consent, the controller must be able to demonstrate that the data subject has consented.
    recommended_action: Follow the recommended practice.
    avoid: —
    confidence: 0.76
  - id: HEU_003
    trigger: relevant decision context detected
    interpretation: A controller must notify a personal data breach to the competent Supervisory Authority within 72 hours of becoming aware of it, unless the breach is unlikely to result in a risk to the rights and freedoms of natural persons.
    recommended_action: Follow the recommended practice.
    avoid: —
    confidence: 0.76
  - id: HEU_004
    trigger: relevant decision context detected
    interpretation: The controller must inform affected data subjects without undue delay when the breach is likely to result in a high risk.
    recommended_action: Follow the recommended practice.
    avoid: —
    confidence: 0.76
  - id: HEU_005
    trigger: relevant decision context detected
    interpretation: If a transfer of personal data outside the European Economic Area lacks an adequacy decision, the parties must implement appropriate safeguards such as Standard Contractual Clauses.
    recommended_action: Follow the recommended practice.
    avoid: —
    confidence: 0.76
```

## 7. DECISION RULES

```yaml
decision_rules:
  - id: RULE_001
    condition: Operating context matches the rule's domain.
    decision: Consent must be freely given, specific, informed and unambiguous.
    reasoning: Derived directly from a normative statement in the source.
    required_context: Domain-specific context as described in the source.
    output_action: Apply the recommended decision.
    failure_mode: Recommendation applied outside its valid context.
    confidence: 0.74
  - id: RULE_002
    condition: Operating context matches the rule's domain.
    decision: Where processing is based on consent, the controller must be able to demonstrate that the data subject has consented.
    reasoning: Derived directly from a normative statement in the source.
    required_context: Domain-specific context as described in the source.
    output_action: Apply the recommended decision.
    failure_mode: Recommendation applied outside its valid context.
    confidence: 0.74
  - id: RULE_003
    condition: Operating context matches the rule's domain.
    decision: A controller must notify a personal data breach to the competent Supervisory Authority within 72 hours of becoming aware of it, unless the breach is unlikely to result in a risk to the rights and freedoms of natural persons.
    reasoning: Derived directly from a normative statement in the source.
    required_context: Domain-specific context as described in the source.
    output_action: Apply the recommended decision.
    failure_mode: Recommendation applied outside its valid context.
    confidence: 0.74
```
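
The 72-hour notification window in the breach rule is a date calculation that is easy to get wrong by hand. A minimal sketch, assuming the moment of awareness is recorded as a timezone-aware timestamp; `notification_deadline` and `must_notify_authority` are hypothetical helper names, not part of the package.

```python
from datetime import datetime, timedelta, timezone

# Window stated in the source: 72 hours from becoming aware of the breach.
NOTIFICATION_WINDOW = timedelta(hours=72)


def notification_deadline(aware_at: datetime) -> datetime:
    """Latest time to notify the Supervisory Authority under the 72-hour rule."""
    return aware_at + NOTIFICATION_WINDOW


def must_notify_authority(risk_unlikely: bool) -> bool:
    """Notification is required unless the breach is unlikely to result in a risk."""
    return not risk_unlikely


aware = datetime(2026, 6, 30, 19, 0, tzinfo=timezone.utc)
print(notification_deadline(aware).isoformat())
```

Using timezone-aware timestamps keeps the deadline unambiguous when the controller and the authority sit in different time zones.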

## 8. PROCEDURES

```yaml
procedures: []  # none extracted
```

## 9. PATTERNS

```yaml
patterns:
  - id: PAT_001
    name: Recurring pattern 1
    observed_when: Source-described conditions are present.
    signal: "Another heuristic: when in doubt between consent and legitimate interest, prefer the legal basis that gives the data subject the most control — usually consent for marketing, legitimate interest for fraud prevention."
    underlying_mechanism: —
    response_strategy: Recognize and act according to source guidance.
    confidence: 0.7
```

## 10. ANTI-PATTERNS

```yaml
anti_patterns:
  - id: ANTI_001
    name: Anti-pattern 1
    description: Avoid bundling consent with the acceptance of terms of service; that bundled consent is generally not freely given.
    why_it_fails: Identified by the source as ineffective or harmful.
    warning_signals: Behavior matches the described failure mode.
    replacement_behavior: Use the recommended alternative from the source.
  - id: ANTI_002
    name: Anti-pattern 2
    description: Never retain personal data longer than necessary for the stated purpose; storage limitation is a core principle.
    why_it_fails: Identified by the source as ineffective or harmful.
    warning_signals: Behavior matches the described failure mode.
    replacement_behavior: Use the recommended alternative from the source.
```

## 11. CAUSAL CHAINS

```yaml
causal_chains: []  # none extracted
```

## 12. CONTEXTUAL TRIGGERS

```yaml
contextual_triggers:
  - id: TRG_001
    if_user_says_or_context_contains: Data
    activate_knowledge:
      - CON_001
      - CON_002
    agent_should: Recall the Data concept and apply related rules.
    agent_should_not: Make claims beyond what the source supports.
  - id: TRG_002
    if_user_says_or_context_contains: Consent
    activate_knowledge:
      - CON_002
      - CON_003
    agent_should: Recall the Consent concept and apply related rules.
    agent_should_not: Make claims beyond what the source supports.
  - id: TRG_003
    if_user_says_or_context_contains: Controller
    activate_knowledge:
      - CON_003
      - CON_004
    agent_should: Recall the Controller concept and apply related rules.
    agent_should_not: Make claims beyond what the source supports.
```

## 13. IF-THEN RULES

```yaml
if_then_rules:
  - id: IFTHEN_001
    if: none of the six legal bases (consent, contract, legal obligation, vital interests, public task, or legitimate interest) applies
    then: processing is not lawful under Article 6
    because: Inferred from source context.
    confidence: 0.72
  - id: IFTHEN_002
    if: the breach is likely to result in a high risk
    then: the controller must inform affected data subjects without undue delay
    because: Inferred from source context.
    confidence: 0.72
  - id: IFTHEN_003
    if: a processor engages another processor (a sub-processor) without prior written authorisation of the controller
    then: the original processor remains fully liable
    because: Inferred from source context.
    confidence: 0.72
  - id: IFTHEN_004
    if: a transfer of personal data outside the European Economic Area lacks an adequacy decision
    then: the parties must implement appropriate safeguards such as Standard Contractual Clauses
    because: Inferred from source context.
    confidence: 0.72
```

## 14. EXCEPTIONS AND EDGE CASES

```yaml
exceptions:
  - id: EXC_001
    general_rule: A controller must notify a personal data breach to the competent Supervisory Authority within 72 hours of becoming aware of it.
    exception_case: The breach is unlikely to result in a risk to the rights and freedoms of natural persons.
    modified_action: Adjust behavior according to the exception.
    explanation: Source explicitly notes this edge case.
  - id: EXC_002
    general_rule: —
    exception_case: "Answer: Within 72 hours of becoming aware, to the competent Supervisory Authority, unless the breach is unlikely to result in risk to data subjects."
    modified_action: Adjust behavior according to the exception.
    explanation: Source explicitly notes this edge case.
```

## 15. MENTAL MODELS

```yaml
mental_models:
  - id: MM_001
    name: Data controller model
    description: Under the General Data Protection Regulation (GDPR), a "data controller" is the natural or legal person who determines the purposes and means of processing personal data.
    use_when: The reasoning context maps to this concept.
    do_not_use_when: Context lies outside the source's scope.
    input_needed: Relevant facts about the situation.
    output_generated: A reasoned recommendation aligned with the source.
  - id: MM_002
    name: Data processor model
    description: A "data processor" processes personal data on behalf of the controller.
    use_when: The reasoning context maps to this concept.
    do_not_use_when: Context lies outside the source's scope.
    input_needed: Relevant facts about the situation.
    output_generated: A reasoned recommendation aligned with the source.
```

## 16. OPERATIONAL PLAYBOOKS

```yaml
playbooks:
  - id: PLAY_001
    name: legal response playbook
    objective: Apply the source's knowledge to a real interaction.
    activation_context: User asks about Data.
    steps:
      - Identify the concept the question maps to.
      - Recall related rules and heuristics.
      - Cite the source-derived principle.
      - Surface relevant exceptions or limits.
    agent_tone: Clear, sourced, non-overstating.
    tools_needed:
      - retrieval
      - memory
    expected_output: A grounded answer with traceable reasoning.
    failure_modes:
      - Hallucinating beyond source
      - Ignoring exceptions
```

## 17. QUESTION-ANSWER PAIRS FOR AGENTS

```yaml
qa_pairs:
  - id: QA_001
    question: What is a data controller under the GDPR?
    ideal_answer: Under the General Data Protection Regulation (GDPR), a "data controller" is the natural or legal person who determines the purposes and means of processing personal data.
    source_concepts:
      - CON_001
    difficulty: easy
    answer_type: definition_with_context
  - id: QA_002
    question: What is a data processor?
    ideal_answer: A "data processor" processes personal data on behalf of the controller.
    source_concepts:
      - CON_002
    difficulty: medium
    answer_type: definition_with_context
  - id: QA_003
    question: What is a data subject?
    ideal_answer: A "data subject" is the identified or identifiable person to whom the data relates.
    source_concepts:
      - CON_003
    difficulty: medium
    answer_type: definition_with_context
  - id: QA_004
    question: What is the Supervisory Authority?
    ideal_answer: The Supervisory Authority is the independent public body responsible for monitoring application of the Regulation in each Member State.
    source_concepts:
      - CON_004
    difficulty: medium
    answer_type: definition_with_context
```

## 18. RETRIEVAL CHUNKS

```yaml
retrieval_chunks:
  - id: CHUNK_001
    title: Chunk on data
    standalone_context: Self-contained passage extracted from the source.
    compressed_knowledge: Under the General Data Protection Regulation (GDPR), a "data controller" is the natural or legal person who determines the purposes and means of processing personal data. A "data processor" processes personal data on behalf of the controller. A "data subject" is the identified or identifiable person to whom the data relates. The Supervisory Authority is the independent public body responsible f…
    activation_queries:
      - What does the source say about data?
      - What does the source say about legal?
      - What does the source say about regulation?
    related_rules: []
    related_entities: []
    related_concepts: []
  - id: CHUNK_002
    title: Chunk on controller
    standalone_context: Self-contained passage extracted from the source.
    compressed_knowledge: Where processing is based on consent, the controller must be able to demonstrate that the data subject has consented. A controller must notify a personal data breach to the competent Supervisory Authority within 72 hours of becoming aware of it, unless the breach is unlikely to result in a risk to the rights and freedoms of natural persons. The controller must inform affected data subjects with…
    activation_queries:
      - What does the source say about controller?
      - What does the source say about must?
      - What does the source say about data?
    related_rules: []
    related_entities: []
    related_concepts: []
  - id: CHUNK_003
    title: Chunk on consent
    standalone_context: Self-contained passage extracted from the source.
    compressed_knowledge: "Useful heuristic: if you cannot articulate the specific purpose of a data collection in one sentence, that purpose is probably not specific enough to ground lawful consent. Another heuristic: when in doubt between consent and legitimate interest, prefer the legal basis that gives the data subject the most control — usually consent for marketing, legitimate interest for fraud prevention. Avoid b…"
    activation_queries:
      - What does the source say about consent?
      - What does the source say about purpose?
      - What does the source say about data?
    related_rules: []
    related_entities: []
    related_concepts: []
  - id: CHUNK_004
    title: Chunk on data
    standalone_context: Self-contained passage extracted from the source.
    compressed_knowledge: 'Contradiction worth flagging: Article 22 restricts solely automated decisions, but many AI-assisted decisions remain "automated in practice" while keeping a token human reviewer — courts disagree on whether this satisfies the safeguard. Question: Within how long must a controller report a personal data breach? Answer: Within 72 hours of becoming aware, to the competent Supervisory Authority, un…'
    activation_queries:
      - What does the source say about data?
      - What does the source say about automated?
      - What does the source say about decisions?
    related_rules: []
    related_entities: []
    related_concepts: []
```

## 19. EMBEDDING-READY ATOMIC UNITS

```yaml
atomic_units:
  - id: AU_001
    statement: Under the General Data Protection Regulation (GDPR), a "data controller" is the natural or legal person who determines the purposes and means of processing personal data.
    type: definition
    tags:
      - data
      - under
      - general
    dependencies: []
    confidence: 0.78
  - id: AU_002
    statement: A "data processor" processes personal data on behalf of the controller.
    type: fact
    tags:
      - data
      - processor
      - processes
    dependencies: []
    confidence: 0.78
  - id: AU_003
    statement: A "data subject" is the identified or identifiable person to whom the data relates.
    type: definition
    tags:
      - data
      - subject
      - identified
    dependencies: []
    confidence: 0.78
  - id: AU_004
    statement: The Supervisory Authority is the independent public body responsible for monitoring application of the Regulation in each Member State.
    type: definition
    tags:
      - supervisory
      - authority
      - independent
    dependencies: []
    confidence: 0.78
  - id: AU_005
    statement: "Article 6 establishes that processing is lawful only if at least one of six legal bases applies: consent, contract, legal obligation, vital interests, public task, or legitimate interest."
    type: definition
    tags:
      - legal
      - article
      - establishes
    dependencies: []
    confidence: 0.78
  - id: AU_006
    statement: Consent must be freely given, specific, informed and unambiguous.
    type: rule
    tags:
      - consent
      - must
      - freely
    dependencies: []
    confidence: 0.78
  - id: AU_007
    statement: Where processing is based on consent, the controller must be able to demonstrate that the data subject has consented.
    type: rule
    tags:
      - where
      - processing
      - based
    dependencies: []
    confidence: 0.78
```

## 20. AGENT INSTRUCTIONS

```yaml
agent_instructions:
  behavior_rules:
    - Stay within the package's scope.
    - Cite the source-derived chunk or rule when answering.
    - Never provide legal advice; cite the source.
    - Always flag jurisdiction-dependent claims.
  reasoning_rules:
    - Use causal chains and IF-THEN rules before improvising.
    - Combine concepts only when supports/depends_on relationships allow it.
  response_rules:
    - Be concise unless the user asks for depth.
    - Surface confidence and source basis.
  forbidden_behaviors:
    - Fabricating sources.
    - Restating the source as personal opinion.
    - Ignoring decision rules in favor of fluency.
  preferred_questions:
    - What does the source say about …?
    - Which rule applies to this situation?
    - What are the limits of this knowledge?
  tool_usage_guidance:
    - Use retrieval before generation.
    - Use memory to track conversational context.
```

## 21. KNOWLEDGE LIMITS

```yaml
knowledge_limits:
  missing_context:
    - Source date and authorship are not always provided.
  weakly_supported_claims: []
  assumptions_detected:
    - Heuristic compilation assumes the input text is self-contained.
  possible_biases:
    - Single-source perspective.
  outdated_sections: []
  needs_human_review:
    - Decision rules and exceptions before production use.
```

## 22. SOURCE TRACEABILITY

```yaml
source_traceability:
  - extracted_item_id: CON_001
    source_location: user_input
    source_excerpt: Under the General Data Protection Regulation (GDPR), a "data controller" is the natural or legal person who determines the purposes and means of processing personal data.
    extraction_type: explicit
  - extracted_item_id: CON_002
    source_location: user_input
    source_excerpt: A "data processor" processes personal data on behalf of the controller.
    extraction_type: explicit
  - extracted_item_id: CON_003
    source_location: user_input
    source_excerpt: A "data subject" is the identified or identifiable person to whom the data relates.
    extraction_type: explicit
  - extracted_item_id: CON_004
    source_location: user_input
    source_excerpt: The Supervisory Authority is the independent public body responsible for monitoring application of the Regulation in each Member State.
    extraction_type: explicit
  - extracted_item_id: CON_005
    source_location: user_input
    source_excerpt: "Article 6 establishes that processing is lawful only if at least one of six legal bases applies: consent, contract, legal obligation, vital interests, public task, or legitimate interest."
    extraction_type: explicit
  - extracted_item_id: CON_006
    source_location: user_input
    source_excerpt: Consent must be freely given, specific, informed and unambiguous.
    extraction_type: explicit
  - extracted_item_id: HEU_001
    source_location: user_input
    source_excerpt: Consent must be freely given, specific, informed and unambiguous.
    extraction_type: explicit
  - extracted_item_id: HEU_002
    source_location: user_input
    source_excerpt: Where processing is based on consent, the controller must be able to demonstrate that the data subject has consented.
    extraction_type: explicit
  - extracted_item_id: HEU_003
    source_location: user_input
    source_excerpt: A controller must notify a personal data breach to the competent Supervisory Authority within 72 hours of becoming aware of it, unless the breach is unlikely to result in a risk to the rights and f…
    extraction_type: explicit
  - extracted_item_id: HEU_004
    source_location: user_input
    source_excerpt: The controller must inform affected data subjects without undue delay when the breach is likely to result in a high risk.
    extraction_type: explicit
  - extracted_item_id: HEU_005
    source_location: user_input
    source_excerpt: If a transfer of personal data outside the European Economic Area lacks an adequacy decision, the parties must implement appropriate safeguards such as Standard Contractual Clauses.
    extraction_type: explicit
  - extracted_item_id: IFTHEN_001
    source_location: user_input
    source_excerpt: "IF none of the six legal bases (consent, contract, legal obligation, vital interests, public task, or legitimate interest) applies THEN the processing is not lawful"
    extraction_type: explicit
  - extracted_item_id: IFTHEN_002
    source_location: user_input
    source_excerpt: IF the breach is likely to result in a high risk THEN the controller must inform affected data subjects without undue delay
    extraction_type: explicit
  - extracted_item_id: IFTHEN_003
    source_location: user_input
    source_excerpt: IF a processor engages another processor (a sub-processor) without prior written authorisation of the controller THEN the original processor remains fully liable
    extraction_type: explicit
  - extracted_item_id: IFTHEN_004
    source_location: user_input
    source_excerpt: IF a transfer of personal data outside the European Economic Area lacks an adequacy decision THEN the parties must implement appropriate safeguards such as Standard Contractual Clauses
    extraction_type: explicit
```
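The breach-notification rules traced above (HEU_003, HEU_004) reduce to one deadline and two conditions. Below is a minimal Python sketch of that logic as the package states it. The function names and the boolean risk flags are hypothetical illustrations, not part of CKF or of any compliance tool, and deciding whether a risk is "unlikely" or "high" is left to a human reviewer.

```python
from datetime import datetime, timedelta, timezone

# Window taken from the package text: 72 hours from becoming aware of the breach.
NOTIFICATION_WINDOW = timedelta(hours=72)


def notification_deadline(aware_at: datetime) -> datetime:
    """Latest time to notify the competent Supervisory Authority."""
    return aware_at + NOTIFICATION_WINDOW


def must_notify_authority(risk_unlikely: bool) -> bool:
    """Notification is required unless the breach is unlikely to result in a risk."""
    return not risk_unlikely


def must_inform_data_subjects(high_risk_likely: bool) -> bool:
    """Data subjects are informed when the breach is likely to result in a high risk."""
    return high_risk_likely


aware = datetime(2026, 6, 30, 9, 0, tzinfo=timezone.utc)
print(notification_deadline(aware).isoformat())  # 2026-07-03T09:00:00+00:00
```

Keeping the deadline arithmetic separate from the risk judgement mirrors the package's own split between the rule and its exception.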

Engineering runbook

ckf_demo_1782847357665

8 entities · 6 concepts · 0 principles
# CKF — KNOWLEDGE CONTEXT PACKAGE

package_id: ckf_demo_1782847357665
protocol_version: ckf-0.1
source_type: runbook
source_title: Untitled source
source_author: Unknown
domain: business
subdomains: [incident, minutes, sev-, within]
language: en
created_at: 2026-06-30T19:22:37.665Z
compression_level: standard
human_readability: 0.7
ai_utility_score: 0.81

---

## 1. CORE INTENT

```yaml
core_intent:
  primary_purpose: Capture and structure the knowledge expressed in the source.
  intended_user: Developers, researchers and agents consuming structured knowledge.
  intended_agent_use: Retrieval, reasoning, tutoring, decision support.
  transformation_goal: Convert prose into structured, agent-usable cognition.
  key_value: Portable, traceable, reusable knowledge package.
```

## 2. DOMAIN MAP

```yaml
domain_map:
  main_domain: business
  subdomains:
    - name: incident
      relevance: 1
      related_concepts:
        - incident
    - name: minutes
      relevance: 0.85
      related_concepts:
        - minutes
    - name: sev-
      relevance: 0.7
      related_concepts:
        - sev-
    - name: within
      relevance: 0.55
      related_concepts:
        - within
  adjacent_domains: []
  excluded_domains: []
```

## 3. ENTITY GRAPH

```yaml
entities:
  - id: ENT_001
    name: Severity
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities: []
    source_basis: explicit
  - id: ENT_002
    name: The Incident Commander
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_003
    name: Communications Lead
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_004
    name: Scribe
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_005
    name: First
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_006
    name: Second
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_007
    name: Incident Commander
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_008
    name: Third
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
```

## 4. CONCEPT GRAPH

```yaml
concepts:
  - id: CON_001
    label: Incident
    definition: An incident is any unplanned event that degrades the availability, latency, error rate or correctness of a production service.
    domain: business
    depends_on: []
    contradicts: []
    supports:
      - CON_002
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
  - id: CON_002
    label: Minutes
    definition: Severity is classified into SEV-1 (full outage or data loss), SEV-2 (major degradation for many users), SEV-3 (partial degradation for some users) and SEV-4 (minor issue, no user impact).
    domain: business
    depends_on:
      - CON_001
    contradicts: []
    supports:
      - CON_003
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
  - id: CON_003
    label: Sev-
    definition: The Incident Commander coordinates response; the Communications Lead handles status updates; the Scribe records the timeline.
    domain: business
    depends_on:
      - CON_002
    contradicts: []
    supports:
      - CON_004
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
  - id: CON_004
    label: Within
    definition: The on-call response procedure has six steps.
    domain: business
    depends_on:
      - CON_003
    contradicts: []
    supports:
      - CON_005
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
  - id: CON_005
    label: First
    definition: First, acknowledge the page within 5 minutes.
    domain: business
    depends_on:
      - CON_004
    contradicts: []
    supports:
      - CON_006
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
  - id: CON_006
    label: Back
    definition: Second, open the incident channel and assign the Incident Commander.
    domain: business
    depends_on:
      - CON_005
    contradicts: []
    supports: []
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
```

## 5. PRINCIPLES

```yaml
principles: []  # none extracted
```

## 6. HEURISTICS

```yaml
heuristics:
  - id: HEU_001
    trigger: relevant decision context detected
    interpretation: Avoid pushing fixes directly to production during a SEV-1 — fixes must still go through canary deploys unless the alternative is data loss.
    recommended_action: Follow the recommended practice.
    avoid: —
    confidence: 0.76
  - id: HEU_002
    trigger: context contains a known failure mode
    interpretation: Never silence an alert during an incident; if the alert is noisy, file a follow-up to fix it after the incident.
    recommended_action: Avoid the described action.
    avoid: Never silence an alert during an incident; if the alert is noisy, file a follow-up to fix it after the incident.
    confidence: 0.76
  - id: HEU_003
    trigger: relevant decision context detected
    interpretation: "Question: When should you roll back versus patch forward?"
    recommended_action: Follow the recommended practice.
    avoid: —
    confidence: 0.76
  - id: HEU_004
    trigger: relevant decision context detected
    interpretation: "Answer: Roll back when the issue began within 30 minutes of a deploy and the previous release was healthy; patch forward only when rollback is impossible or would cause worse harm."
    recommended_action: Mitigate the described risk.
    avoid: —
    confidence: 0.76
```

## 7. DECISION RULES

```yaml
decision_rules:
  - id: RULE_001
    condition: Operating context matches the rule's domain.
    decision: Avoid pushing fixes directly to production during a SEV-1 — fixes must still go through canary deploys unless the alternative is data loss.
    reasoning: Derived directly from a normative statement in the source.
    required_context: Domain-specific context as described in the source.
    output_action: Apply the recommended decision.
    failure_mode: Recommendation applied outside its valid context.
    confidence: 0.74
  - id: RULE_002
    condition: Operating context matches the rule's domain.
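    decision: Roll back when the issue began within 30 minutes of a deploy and the previous release was healthy; patch forward only when rollback is impossible or would cause worse harm.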
    reasoning: Derived directly from a normative statement in the source.
    required_context: Domain-specific context as described in the source.
    output_action: Apply the recommended decision.
    failure_mode: Recommendation applied outside its valid context.
    confidence: 0.74
```

## 8. PROCEDURES

```yaml
procedures:
  - id: PROC_001
    name: Source-derived procedure
    objective: Apply the sequence implied by the source text.
    steps:
      - step: 1
        action: First, acknowledge the page within 5 minutes.
        input_required: —
        output_expected: —
      - step: 2
        action: Second, open the incident channel and assign the Incident Commander.
        input_required: —
        output_expected: —
      - step: 3
        action: Third, declare severity and post the first status update within 10 minutes.
        input_required: —
        output_expected: —
      - step: 4
        action: Fourth, stabilise by rolling back the most recent change or shedding load before deep diagnosis.
        input_required: —
        output_expected: —
      - step: 5
        action: Fifth, communicate every 30 minutes until resolution.
        input_required: —
        output_expected: —
      - step: 6
        action: Sixth, schedule a blameless postmortem within 5 business days.
        input_required: —
        output_expected: —
    success_criteria: All steps applied in order with expected outcomes.
    failure_criteria: Steps executed out of order or without prerequisites.
```

## 9. PATTERNS

```yaml
patterns: []  # none extracted
```

## 10. ANTI-PATTERNS

```yaml
anti_patterns:
  - id: ANTI_001
    name: Anti-pattern 1
    description: Avoid pushing fixes directly to production during a SEV-1 — fixes must still go through canary deploys unless the alternative is data loss.
    why_it_fails: Identified by the source as ineffective or harmful.
    warning_signals: Behavior matches the described failure mode.
    replacement_behavior: Use the recommended alternative from the source.
  - id: ANTI_002
    name: Anti-pattern 2
    description: Never silence an alert during an incident; if the alert is noisy, file a follow-up to fix it after the incident.
    why_it_fails: Identified by the source as ineffective or harmful.
    warning_signals: Behavior matches the described failure mode.
    replacement_behavior: Use the recommended alternative from the source.
```

## 11. CAUSAL CHAINS

```yaml
causal_chains: []  # none extracted
```

## 12. CONTEXTUAL TRIGGERS

```yaml
contextual_triggers:
  - id: TRG_001
    if_user_says_or_context_contains: Incident
    activate_knowledge:
      - CON_001
      - CON_002
    agent_should: Recall the Incident concept and apply related rules.
    agent_should_not: Make claims beyond what the source supports.
  - id: TRG_002
    if_user_says_or_context_contains: Minutes
    activate_knowledge:
      - CON_002
      - CON_003
    agent_should: Recall the Minutes concept and apply related rules.
    agent_should_not: Make claims beyond what the source supports.
  - id: TRG_003
    if_user_says_or_context_contains: Sev-
    activate_knowledge:
      - CON_003
      - CON_004
    agent_should: Recall the Sev- concept and apply related rules.
    agent_should_not: Make claims beyond what the source supports.
```

## 13. IF-THEN RULES

```yaml
if_then_rules:
  - id: IFTHEN_001
    if: p99 latency exceeds the SLO by 50 percent for 5 consecutive minutes
    then: page the on-call engineer for that service
    because: Inferred from source context.
    confidence: 0.72
  - id: IFTHEN_002
    if: error rate exceeds 1 percent of requests for 2 minutes
    then: escalate to SEV-2
    because: Inferred from source context.
    confidence: 0.72
  - id: IFTHEN_003
    if: the issue began within 30 minutes of a deploy
    then: the first hypothesis is a regression — roll back before debugging
    because: Inferred from source context.
    confidence: 0.72
  - id: IFTHEN_004
    if: a rollback does not resolve the issue within 10 minutes
    then: expand the suspect set to dependencies and infrastructure changes
    because: Inferred from source context.
    confidence: 0.72
```
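The IF-THEN rules above are thresholds held over a time window. Below is a minimal Python sketch of IFTHEN_001 to IFTHEN_003. It assumes one metric sample per minute with the newest sample last, and it reads "exceeds the SLO by 50 percent" as p99 greater than 1.5 times the SLO. The function names are hypothetical and not part of CKF.

```python
from typing import Sequence


def exceeded_for(samples: Sequence[float], threshold: float, minutes: int) -> bool:
    """True if each of the last `minutes` samples is strictly above `threshold`."""
    if minutes <= 0 or len(samples) < minutes:
        return False
    return all(value > threshold for value in samples[-minutes:])


def should_page_on_call(p99_ms: Sequence[float], slo_ms: float) -> bool:
    """IFTHEN_001: p99 latency exceeds the SLO by 50 percent for 5 consecutive minutes."""
    return exceeded_for(p99_ms, slo_ms * 1.5, minutes=5)


def should_escalate_to_sev2(error_rates: Sequence[float]) -> bool:
    """IFTHEN_002: error rate exceeds 1 percent of requests for 2 minutes."""
    return exceeded_for(error_rates, 0.01, minutes=2)


def regression_suspected(minutes_since_deploy: float) -> bool:
    """IFTHEN_003: the issue began within 30 minutes of a deploy."""
    return 0 <= minutes_since_deploy <= 30


print(should_page_on_call([310, 320, 305, 400, 301], slo_ms=200))  # True
print(should_escalate_to_sev2([0.005, 0.02, 0.03]))                # True
print(regression_suspected(45))                                    # False
```

A single dip below the threshold resets the window in this sketch, which is one reading of "consecutive"; a production alerting system may define it differently.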

## 14. EXCEPTIONS AND EDGE CASES

```yaml
exceptions:
  - id: EXC_001
    general_rule: Fixes pushed during a SEV-1 must still go through canary deploys.
    exception_case: The alternative to pushing a fix directly to production is data loss.
    modified_action: Push directly to production only when the exception applies.
    explanation: Source explicitly notes this edge case.
```

## 15. MENTAL MODELS

```yaml
mental_models:
  - id: MM_001
    name: Incident model
    description: An incident is any unplanned event that degrades the availability, latency, error rate or correctness of a production service.
    use_when: The reasoning context maps to this concept.
    do_not_use_when: Context lies outside the source's scope.
    input_needed: Relevant facts about the situation.
    output_generated: A reasoned recommendation aligned with the source.
  - id: MM_002
    name: Minutes model
    description: Severity is classified into SEV-1 (full outage or data loss), SEV-2 (major degradation for many users), SEV-3 (partial degradation for some users) and SEV-4 (minor issue, no user impact).
    use_when: The reasoning context maps to this concept.
    do_not_use_when: Context lies outside the source's scope.
    input_needed: Relevant facts about the situation.
    output_generated: A reasoned recommendation aligned with the source.
```

## 16. OPERATIONAL PLAYBOOKS

```yaml
playbooks:
  - id: PLAY_001
    name: business response playbook
    objective: Apply the source's knowledge to a real interaction.
    activation_context: User asks about Incident.
    steps:
      - Identify the concept the question maps to.
      - Recall related rules and heuristics.
      - Cite the source-derived principle.
      - Surface relevant exceptions or limits.
    agent_tone: Clear, sourced, non-overstating.
    tools_needed:
      - retrieval
      - memory
    expected_output: A grounded answer with traceable reasoning.
    failure_modes:
      - Hallucinating beyond source
      - Ignoring exceptions
```

## 17. QUESTION-ANSWER PAIRS FOR AGENTS

```yaml
qa_pairs:
  - id: QA_001
    question: What counts as an incident?
    ideal_answer: An incident is any unplanned event that degrades the availability, latency, error rate or correctness of a production service.
    source_concepts:
      - CON_001
    difficulty: easy
    answer_type: definition_with_context
  - id: QA_002
    question: How is incident severity classified?
    ideal_answer: Severity is classified into SEV-1 (full outage or data loss), SEV-2 (major degradation for many users), SEV-3 (partial degradation for some users) and SEV-4 (minor issue, no user impact).
    source_concepts:
      - CON_002
    difficulty: medium
    answer_type: definition_with_context
  - id: QA_003
    question: Who does what during an incident response?
    ideal_answer: The Incident Commander coordinates response; the Communications Lead handles status updates; the Scribe records the timeline.
    source_concepts:
      - CON_003
    difficulty: medium
    answer_type: definition_with_context
  - id: QA_004
    question: How many steps does the on-call response procedure have?
    ideal_answer: The on-call response procedure has six steps.
    source_concepts:
      - CON_004
    difficulty: medium
    answer_type: definition_with_context
```

## 18. RETRIEVAL CHUNKS

```yaml
retrieval_chunks:
  - id: CHUNK_001
    title: Chunk on incident
    standalone_context: Self-contained passage extracted from the source.
    compressed_knowledge: An incident is any unplanned event that degrades the availability, latency, error rate or correctness of a production service. Severity is classified into SEV-1 (full outage or data loss), SEV-2 (major degradation for many users), SEV-3 (partial degradation for some users) and SEV-4 (minor issue, no user impact). The Incident Commander coordinates response; the Communications Lead handles statu…
    activation_queries:
      - What does the source say about incident?
      - What does the source say about sev-?
      - What does the source say about severity?
    related_rules: []
    related_entities: []
    related_concepts: []
  - id: CHUNK_002
    title: Chunk on minutes
    standalone_context: Self-contained passage extracted from the source.
    compressed_knowledge: Fourth, stabilise — roll back the most recent change or shed load before deep diagnosis. Fifth, communicate every 30 minutes until resolution. Sixth, schedule a blameless postmortem within 5 business days. If p99 latency exceeds the SLO by 50 percent for 5 consecutive minutes, page the on-call engineer for that service. If error rate exceeds 1 percent of requests for 2 minutes, escalate to SEV-…
    activation_queries:
      - What does the source say about minutes?
      - What does the source say about within?
      - What does the source say about roll?
    related_rules: []
    related_entities: []
    related_concepts: []
  - id: CHUNK_003
    title: Chunk on incident
    standalone_context: Self-contained passage extracted from the source.
    compressed_knowledge: "Useful heuristic: prefer rolling back a recent change over root-causing during an active incident — mean time to recovery beats mean time to understand. Another heuristic: if three responders are debating the cause, you need a decision, not more data — the Incident Commander picks one path and runs it. Avoid pushing fixes directly to production during a SEV-1 — fixes must still go through canar…"
    activation_queries:
      - What does the source say about incident?
      - What does the source say about during?
      - What does the source say about heuristic?
    related_rules: []
    related_entities: []
    related_concepts: []
  - id: CHUNK_004
    title: Chunk on question
    standalone_context: Self-contained passage extracted from the source.
    compressed_knowledge: "Question: What is the first action after acknowledging an alert? Answer: Open the incident channel, assign an Incident Commander, and post the first status update within 10 minutes. Question: When should you roll back versus patch forward? Answer: Roll back when the issue began within 30 minutes of a deploy and the previous release was healthy; patch forward only when rollback is impossible or …"
    activation_queries:
      - What does the source say about question?
      - What does the source say about first?
      - What does the source say about answer?
    related_rules: []
    related_entities: []
    related_concepts: []
```

## 19. EMBEDDING-READY ATOMIC UNITS

```yaml
atomic_units:
  - id: AU_001
    statement: An incident is any unplanned event that degrades the availability, latency, error rate or correctness of a production service.
    type: definition
    tags:
      - incident
      - unplanned
      - event
    dependencies: []
    confidence: 0.78
  - id: AU_002
    statement: Severity is classified into SEV-1 (full outage or data loss), SEV-2 (major degradation for many users), SEV-3 (partial degradation for some users) and SEV-4 (minor issue, no user impact).
    type: definition
    tags:
      - sev-
      - degradation
      - users
    dependencies: []
    confidence: 0.78
  - id: AU_003
    statement: The Incident Commander coordinates response; the Communications Lead handles status updates; the Scribe records the timeline.
    type: fact
    tags:
      - incident
      - commander
      - coordinates
    dependencies: []
    confidence: 0.78
  - id: AU_004
    statement: The on-call response procedure has six steps.
    type: fact
    tags:
      - on-call
      - response
      - procedure
    dependencies: []
    confidence: 0.78
  - id: AU_005
    statement: First, acknowledge the page within 5 minutes.
    type: fact
    tags:
      - first
      - acknowledge
      - page
    dependencies: []
    confidence: 0.78
  - id: AU_006
    statement: Second, open the incident channel and assign the Incident Commander.
    type: fact
    tags:
      - incident
      - second
      - open
    dependencies: []
    confidence: 0.78
  - id: AU_007
    statement: Third, declare severity and post the first status update within 10 minutes.
    type: fact
    tags:
      - third
      - declare
      - severity
    dependencies: []
    confidence: 0.78
```

## 20. AGENT INSTRUCTIONS

```yaml
agent_instructions:
  behavior_rules:
    - Stay within the package's scope.
    - Cite the source-derived chunk or rule when answering.
    - Distinguish between strategy and tactics.
    - Surface assumptions about the market.
  reasoning_rules:
    - Use causal chains and IF-THEN rules before improvising.
    - Combine concepts only when supports/depends_on relationships allow it.
  response_rules:
    - Be concise unless the user asks for depth.
    - Surface confidence and source basis.
  forbidden_behaviors:
    - Fabricating sources.
    - Restating the source as personal opinion.
    - Ignoring decision rules in favor of fluency.
  preferred_questions:
    - What does the source say about …?
    - Which rule applies to this situation?
    - What are the limits of this knowledge?
  tool_usage_guidance:
    - Use retrieval before generation.
    - Use memory to track conversational context.
```

## 21. KNOWLEDGE LIMITS

```yaml
knowledge_limits:
  missing_context:
    - Source date and authorship are not always provided.
  weakly_supported_claims: []
  assumptions_detected:
    - Heuristic compilation assumes the input text is self-contained.
  possible_biases:
    - Single-source perspective.
  outdated_sections: []
  needs_human_review:
    - Decision rules and exceptions before production use.
```

## 22. SOURCE TRACEABILITY

```yaml
source_traceability:
  - extracted_item_id: CON_001
    source_location: user_input
    source_excerpt: An incident is any unplanned event that degrades the availability, latency, error rate or correctness of a production service.
    extraction_type: explicit
  - extracted_item_id: CON_002
    source_location: user_input
    source_excerpt: Severity is classified into SEV-1 (full outage or data loss), SEV-2 (major degradation for many users), SEV-3 (partial degradation for some users) and SEV-4 (minor issue, no user impact).
    extraction_type: explicit
  - extracted_item_id: CON_003
    source_location: user_input
    source_excerpt: The Incident Commander coordinates response; the Communications Lead handles status updates; the Scribe records the timeline.
    extraction_type: explicit
  - extracted_item_id: CON_004
    source_location: user_input
    source_excerpt: The on-call response procedure has six steps.
    extraction_type: explicit
  - extracted_item_id: CON_005
    source_location: user_input
    source_excerpt: First, acknowledge the page within 5 minutes.
    extraction_type: explicit
  - extracted_item_id: CON_006
    source_location: user_input
    source_excerpt: Second, open the incident channel and assign the Incident Commander.
    extraction_type: explicit
  - extracted_item_id: HEU_001
    source_location: user_input
    source_excerpt: Avoid pushing fixes directly to production during a SEV-1 — fixes must still go through canary deploys unless the alternative is data loss.
    extraction_type: explicit
  - extracted_item_id: HEU_002
    source_location: user_input
    source_excerpt: Never silence an alert during an incident; if the alert is noisy, file a follow-up to fix it after the incident.
    extraction_type: explicit
  - extracted_item_id: HEU_003
    source_location: user_input
    source_excerpt: "Question: When should you roll back versus patch forward?"
    extraction_type: explicit
  - extracted_item_id: HEU_004
    source_location: user_input
    source_excerpt: "Answer: Roll back when the issue began within 30 minutes of a deploy and the previous release was healthy; patch forward only when rollback is impossible or would cause worse harm."
    extraction_type: explicit
  - extracted_item_id: IFTHEN_001
    source_location: user_input
    source_excerpt: IF p99 latency exceeds the SLO by 50 percent for 5 consecutive minutes THEN page the on-call engineer for that service
    extraction_type: explicit
  - extracted_item_id: IFTHEN_002
    source_location: user_input
    source_excerpt: IF error rate exceeds 1 percent of requests for 2 minutes THEN escalate to SEV-2
    extraction_type: explicit
  - extracted_item_id: IFTHEN_003
    source_location: user_input
    source_excerpt: IF the issue began within 30 minutes of a deploy THEN the first hypothesis is a regression — roll back before debugging
    extraction_type: explicit
  - extracted_item_id: IFTHEN_004
    source_location: user_input
    source_excerpt: IF a rollback does not resolve the issue within 10 minutes THEN expand the suspect set to dependencies and infrastructure changes
    extraction_type: explicit
```
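Every package on this page follows the same layout: a numbered `## N. TITLE` heading followed by one fenced `yaml` block. A minimal Python sketch for splitting a pasted package into its sections using only the standard library is shown below. The name `split_sections` is hypothetical and not part of any CKF tooling, and the sketch returns raw YAML text rather than parsed data.

```python
import re

# Built at runtime so this example can itself sit inside a fenced block.
FENCE = "`" * 3

# A numbered heading, blank line(s), then a fenced yaml block.
SECTION_RE = re.compile(
    r"^## (\d+)\. ([^\n]+)\n+" + FENCE + r"yaml\n(.*?)\n" + FENCE,
    re.MULTILINE | re.DOTALL,
)


def split_sections(package_md: str) -> dict[str, str]:
    """Map each section title to the raw YAML text of its fenced block."""
    return {
        title.strip(): body
        for _number, title, body in SECTION_RE.findall(package_md)
    }


sample = "\n".join([
    "## 5. PRINCIPLES",
    "",
    FENCE + "yaml",
    "principles: []  # none extracted",
    FENCE,
    "",
    "## 9. PATTERNS",
    "",
    FENCE + "yaml",
    "patterns: []  # none extracted",
    FENCE,
])

sections = split_sections(sample)
print(sorted(sections))  # ['PATTERNS', 'PRINCIPLES']
```

Each returned string can then be handed to a YAML parser of your choice, for example PyYAML's `yaml.safe_load`, if that dependency is acceptable in your pipeline.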

Scientific paper

ckf_demo_1782847357665

8 entities · 6 concepts · 0 principles
# CKF — KNOWLEDGE CONTEXT PACKAGE

package_id: ckf_demo_1782847357665
protocol_version: ckf-0.1
source_type: paper
source_title: Untitled source
source_author: Unknown
domain: education / learning science
subdomains: [model, reward, policy, rlhf]
language: en
created_at: 2026-06-30T19:22:37.665Z
compression_level: standard
human_readability: 0.7
ai_utility_score: 0.81

---

## 1. CORE INTENT

```yaml
core_intent:
  primary_purpose: Capture and structure the knowledge expressed in the source.
  intended_user: Developers, researchers and agents consuming structured knowledge.
  intended_agent_use: Retrieval, reasoning, tutoring, decision support.
  transformation_goal: Convert prose into structured, agent-usable cognition.
  key_value: Portable, traceable, reusable knowledge package.
```

## 2. DOMAIN MAP

```yaml
domain_map:
  main_domain: education / learning science
  subdomains:
    - name: model
      relevance: 1
      related_concepts:
        - model
    - name: reward
      relevance: 0.85
      related_concepts:
        - reward
    - name: policy
      relevance: 0.7
      related_concepts:
        - policy
    - name: rlhf
      relevance: 0.55
      related_concepts:
        - rlhf
  adjacent_domains: []
  excluded_domains: []
```

## 3. ENTITY GRAPH

```yaml
entities:
  - id: ENT_001
    name: Reinforcement Learning
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities: []
    source_basis: explicit
  - id: ENT_002
    name: Human Feedback
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_003
    name: RLHF
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_004
    name: Proximal Policy Optimization
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_005
    name: First
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_006
    name: Second
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_007
    name: Third
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
  - id: ENT_008
    name: Fourth
    type: named_entity
    description: Recurring term/entity surfaced from the source.
    aliases: []
    attributes: []
    related_entities:
      - entity_id: ENT_001
        relation_type: co_occurs_with
        confidence: 0.6
    source_basis: explicit
```

## 4. CONCEPT GRAPH

```yaml
concepts:
  - id: CON_001
    label: Model
    definition: Reinforcement Learning from Human Feedback (RLHF) is a training pipeline in which a large language model is first pretrained on text, then fine-tuned on demonstrations, and finally optimised against a reward model tra…
    domain: education / learning science
    depends_on: []
    contradicts: []
    supports:
      - CON_002
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
  - id: CON_002
    label: Reward
    definition: The reward model is typically a transformer that takes a prompt and two candidate completions and predicts which one humans prefer.
    domain: education / learning science
    depends_on:
      - CON_001
    contradicts: []
    supports:
      - CON_003
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
  - id: CON_003
    label: Policy
    definition: Proximal Policy Optimization (PPO) is the most common reinforcement learning algorithm used in this stage.
    domain: education / learning science
    depends_on:
      - CON_002
    contradicts: []
    supports:
      - CON_004
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
  - id: CON_004
    label: RLHF
    definition: The canonical RLHF procedure proceeds in four steps.
    domain: education / learning science
    depends_on:
      - CON_003
    contradicts: []
    supports:
      - CON_005
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
  - id: CON_005
    label: Preference
    definition: First, collect a dataset of high-quality demonstrations from human writers.
    domain: education / learning science
    depends_on:
      - CON_004
    contradicts: []
    supports:
      - CON_006
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
  - id: CON_006
    label: Human
    definition: Second, supervise fine-tune the base model on these demonstrations.
    domain: education / learning science
    depends_on:
      - CON_005
    contradicts: []
    supports: []
    enables: []
    risks: []
    confidence: 0.78
    source_basis: explicit
```
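CON_002 describes a reward model that predicts which of two completions humans prefer. The source does not say how that prediction is scored; a common formulation, assumed here, is the Bradley-Terry model, where the preference probability is a sigmoid of the difference between two scalar reward scores. The function names are hypothetical.

```python
import math


def preference_probability(reward_a, reward_b):
    """Probability that completion A is preferred, from two scalar reward scores.

    Assumes the Bradley-Terry form: sigmoid(reward_a - reward_b).
    """
    return 1.0 / (1.0 + math.exp(-(reward_a - reward_b)))


def preference_loss(reward_chosen, reward_rejected):
    """Negative log-likelihood of the completion the annotators actually chose."""
    return -math.log(preference_probability(reward_chosen, reward_rejected))
```

Equal scores give a probability of 0.5, and the loss shrinks as the chosen completion's score pulls ahead of the rejected one. The sketch does not guard against overflow for very large score gaps.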

## 5. PRINCIPLES

```yaml
principles: []  # none extracted
```

## 6. HEURISTICS

```yaml
heuristics:
  - id: HEU_001
    trigger: context contains a known failure mode
    interpretation: Avoid using a single annotator per preference comparison; inter-annotator agreement under 70 percent is a red flag.
    recommended_action: Avoid the described action.
    avoid: Avoid using a single annotator per preference comparison; inter-annotator agreement under 70 percent is a red flag.
    confidence: 0.76
  - id: HEU_002
    trigger: context contains a known failure mode
    interpretation: Never train the reward model and the policy on the same prompts in the same iteration; the policy will simply memorise the reward model's quirks.
    recommended_action: Avoid the described action.
    avoid: Never train the reward model and the policy on the same prompts in the same iteration; the policy will simply memorise the reward model's quirks.
    confidence: 0.76
  - id: HEU_003
    trigger: context contains a known failure mode
    interpretation: "Limitation: human preferences over short completions do not reliably transfer to long-form outputs, so models tuned with RLHF tend to be sycophantic and verbose."
    recommended_action: Avoid the described action.
    avoid: "Limitation: human preferences over short completions do not reliably transfer to long-form outputs, so models tuned with RLHF tend to be sycophantic and verbose."
    confidence: 0.76
```
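HEU_001 treats inter-annotator agreement under 70 percent as a red flag. The source does not define the agreement metric; the sketch below assumes simple pairwise agreement (the share of annotator pairs that picked the same completion), and the function names are illustrative.

```python
from itertools import combinations


def pairwise_agreement(labels_per_item):
    """Fraction of annotator pairs that picked the same completion, over all items.

    `labels_per_item` holds one list of labels per preference comparison.
    Items with a single annotator contribute no pairs.
    """
    agree = 0
    total = 0
    for labels in labels_per_item:
        for first, second in combinations(labels, 2):
            agree += first == second
            total += 1
    if total == 0:
        raise ValueError("need at least two annotators on at least one comparison")
    return agree / total


def is_red_flag(labels_per_item, threshold=0.70):
    """True when agreement falls below the 70 percent line named in HEU_001."""
    return pairwise_agreement(labels_per_item) < threshold
```

A dataset where every comparison has one annotator raises an error rather than reporting a number, which matches the heuristic's advice against single-annotator labels.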

## 7. DECISION RULES

```yaml
decision_rules: []  # none extracted
```

## 8. PROCEDURES

```yaml
procedures:
  - id: PROC_001
    name: Source-derived procedure
    objective: Apply the sequence implied by the source text.
    steps:
      - step: 1
        action: Reinforcement Learning from Human Feedback (RLHF) is a training pipeline in which a large language model is first pretrained on text, then fine-tuned on demonstrations, and finally
        input_required: —
        output_expected: —
      - step: 2
        action: First, collect a dataset of high-quality demonstrations from human writers.
        input_required: —
        output_expected: —
    success_criteria: All steps applied in order with expected outcomes.
    failure_criteria: Steps executed out of order or without prerequisites.
```

## 9. PATTERNS

```yaml
patterns:
  - id: PAT_001
    name: Recurring pattern 1
    observed_when: Source-described conditions are present.
    signal: "Useful heuristic: monitor the KL divergence between policy and reference model continuously — sharp jumps usually precede reward hacking."
    underlying_mechanism: —
    response_strategy: Recognize and act according to source guidance.
    confidence: 0.7
```
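PAT_001 recommends watching the KL divergence between policy and reference model for sharp jumps. As a minimal sketch, assuming both models expose a discrete probability distribution over the same tokens and that a "sharp jump" means more than doubling between consecutive readings (the source gives no threshold), the check could look like this:

```python
import math


def kl_divergence(policy_probs, reference_probs):
    """KL(policy || reference) for two discrete distributions over the same tokens.

    Assumes every reference probability is non-zero wherever the policy's is.
    """
    return sum(
        p * math.log(p / q)
        for p, q in zip(policy_probs, reference_probs)
        if p > 0.0
    )


def sharp_jumps(kl_history, ratio=2.0):
    """Indices where KL grew by more than `ratio` times the previous reading."""
    return [
        i
        for i in range(1, len(kl_history))
        if kl_history[i - 1] > 0.0 and kl_history[i] > ratio * kl_history[i - 1]
    ]
```

The `ratio` default is a placeholder; a real run would tune it against the noise level of its own KL readings.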

## 10. ANTI-PATTERNS

```yaml
anti_patterns:
  - id: ANTI_001
    name: Anti-pattern 1
    description: Avoid using a single annotator per preference comparison; inter-annotator agreement under 70 percent is a red flag.
    why_it_fails: Identified by the source as ineffective or harmful.
    warning_signals: Behavior matches the described failure mode.
    replacement_behavior: Use the recommended alternative from the source.
  - id: ANTI_002
    name: Anti-pattern 2
    description: Never train the reward model and the policy on the same prompts in the same iteration; the policy will simply memorise the reward model's quirks.
    why_it_fails: Identified by the source as ineffective or harmful.
    warning_signals: Behavior matches the described failure mode.
    replacement_behavior: Use the recommended alternative from the source.
```

## 11. CAUSAL CHAINS

```yaml
causal_chains: []  # none extracted
```

## 12. CONTEXTUAL TRIGGERS

```yaml
contextual_triggers:
  - id: TRG_001
    if_user_says_or_context_contains: Model
    activate_knowledge:
      - CON_001
      - CON_002
    agent_should: Recall the Model concept and apply related rules.
    agent_should_not: Make claims beyond what the source supports.
  - id: TRG_002
    if_user_says_or_context_contains: Reward
    activate_knowledge:
      - CON_002
      - CON_003
    agent_should: Recall the Reward concept and apply related rules.
    agent_should_not: Make claims beyond what the source supports.
  - id: TRG_003
    if_user_says_or_context_contains: Policy
    activate_knowledge:
      - CON_003
      - CON_004
    agent_should: Recall the Policy concept and apply related rules.
    agent_should_not: Make claims beyond what the source supports.
```
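The triggers above map a keyword in the user's text to the concepts an agent should load. One way a consumer might apply them is sketched below; the whole-word, case-insensitive matching is an assumption, since the package does not specify how `if_user_says_or_context_contains` is evaluated.

```python
import re

# Trigger table copied from the YAML above; only the fields the lookup needs.
TRIGGERS = [
    {"id": "TRG_001", "keyword": "Model", "activate": ["CON_001", "CON_002"]},
    {"id": "TRG_002", "keyword": "Reward", "activate": ["CON_002", "CON_003"]},
    {"id": "TRG_003", "keyword": "Policy", "activate": ["CON_003", "CON_004"]},
]


def activated_concepts(user_text):
    """Concept IDs to load for `user_text`, in first-seen order without duplicates."""
    concepts = []
    for trigger in TRIGGERS:
        # Whole-word, case-insensitive match (an assumed matching policy).
        pattern = r"\b" + re.escape(trigger["keyword"]) + r"\b"
        if re.search(pattern, user_text, flags=re.IGNORECASE):
            for concept_id in trigger["activate"]:
                if concept_id not in concepts:
                    concepts.append(concept_id)
    return concepts
```

Because the match is whole-word, a plural such as "models" does not fire TRG_001; a consumer that wants that behavior would need stemming or a looser pattern.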

## 13. IF-THEN RULES

```yaml
if_then_rules:
  - id: IFTHEN_001
    if: the KL penalty is too low
    then: the policy diverges from the supervised model and produces high-reward but low-quality outputs — a failure mode known as reward hacking
    because: Inferred from source context.
    confidence: 0.72
  - id: IFTHEN_002
    if: the KL penalty is too high
    then: the policy barely moves and most reinforcement learning gains are lost
    because: Inferred from source context.
    confidence: 0.72
  - id: IFTHEN_003
    if: the preference dataset is small
    then: the reward model is high-variance and the policy overfits to its idiosyncrasies
    because: Inferred from source context.
    confidence: 0.72
  - id: IFTHEN_004
    if: the reward model is updated mid-training without recalibrating the reference policy
    then: the KL term becomes meaningless and training collapses
    because: Inferred from source context.
    confidence: 0.72
```
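IFTHEN_001 and IFTHEN_002 describe what happens when the KL penalty is too low or too high, but the package does not show where the penalty enters. A widely used form, assumed here, subtracts a coefficient `beta` times a per-sample KL estimate (the log-probability gap between policy and reference) from the reward-model score. The names `penalised_reward` and `beta` are illustrative.

```python
def penalised_reward(reward, logprob_policy, logprob_reference, beta):
    """Reward-model score minus beta times a per-sample KL estimate.

    The KL estimate is the log-probability the policy assigns to the sampled
    completion minus the log-probability the reference model assigns to it.
    """
    kl_estimate = logprob_policy - logprob_reference
    return reward - beta * kl_estimate


# Same completion, three penalty strengths: the further the policy has drifted
# (larger log-probability gap), the more a high beta cuts into the reward.
for beta in (0.0, 0.5, 2.0):
    print(beta, penalised_reward(1.0, -1.0, -3.0, beta))
```

With `beta` at zero the drift costs nothing, which is the setting IFTHEN_001 warns about; with a large `beta` the penalty dominates the reward, which is the regime IFTHEN_002 describes.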

## 14. EXCEPTIONS AND EDGE CASES

```yaml
exceptions: []  # none extracted
```

## 15. MENTAL MODELS

```yaml
mental_models:
  - id: MM_001
    name: Model model
    description: Reinforcement Learning from Human Feedback (RLHF) is a training pipeline in which a large language model is first pretrained on text, then fine-tuned on demonstrations, and finally optimised against a reward model tra…
    use_when: The reasoning context maps to this concept.
    do_not_use_when: Context lies outside the source's scope.
    input_needed: Relevant facts about the situation.
    output_generated: A reasoned recommendation aligned with the source.
  - id: MM_002
    name: Reward model
    description: The reward model is typically a transformer that takes a prompt and two candidate completions and predicts which one humans prefer.
    use_when: The reasoning context maps to this concept.
    do_not_use_when: Context lies outside the source's scope.
    input_needed: Relevant facts about the situation.
    output_generated: A reasoned recommendation aligned with the source.
```

## 16. OPERATIONAL PLAYBOOKS

```yaml
playbooks:
  - id: PLAY_001
    name: education / learning science response playbook
    objective: Apply the source's knowledge to a real interaction.
    activation_context: User asks about Model.
    steps:
      - Identify the concept the question maps to.
      - Recall related rules and heuristics.
      - Cite the source-derived principle.
      - Surface relevant exceptions or limits.
    agent_tone: Clear, sourced, non-overstating.
    tools_needed:
      - retrieval
      - memory
    expected_output: A grounded answer with traceable reasoning.
    failure_modes:
      - Hallucinating beyond source
      - Ignoring exceptions
```

## 17. QUESTION-ANSWER PAIRS FOR AGENTS

```yaml
qa_pairs:
  - id: QA_001
    question: What is model and when does it apply?
    ideal_answer: Reinforcement Learning from Human Feedback (RLHF) is a training pipeline in which a large language model is first pretrained on text, then fine-tuned on demonstrations, and finally optimised against a reward model tra…
    source_concepts:
      - CON_001
    difficulty: easy
    answer_type: definition_with_context
  - id: QA_002
    question: What is reward and when does it apply?
    ideal_answer: The reward model is typically a transformer that takes a prompt and two candidate completions and predicts which one humans prefer.
    source_concepts:
      - CON_002
    difficulty: medium
    answer_type: definition_with_context
  - id: QA_003
    question: What is policy and when does it apply?
    ideal_answer: Proximal Policy Optimization (PPO) is the most common reinforcement learning algorithm used in this stage.
    source_concepts:
      - CON_003
    difficulty: medium
    answer_type: definition_with_context
  - id: QA_004
    question: What is rlhf and when does it apply?
    ideal_answer: The canonical RLHF procedure proceeds in four steps.
    source_concepts:
      - CON_004
    difficulty: medium
    answer_type: definition_with_context
```

## 18. RETRIEVAL CHUNKS

```yaml
retrieval_chunks:
  - id: CHUNK_001
    title: Chunk on model
    standalone_context: Self-contained passage extracted from the source.
    compressed_knowledge: Reinforcement Learning from Human Feedback (RLHF) is a training pipeline in which a large language model is first pretrained on text, then fine-tuned on demonstrations, and finally optimised against a reward model trained on human preference comparisons. The reward model is typically a transformer that takes a prompt and two candidate completions and predicts which one humans prefer. Proximal P…
    activation_queries:
      - What does the source say about model?
      - What does the source say about human?
      - What does the source say about demonstrations?
    related_rules: []
    related_entities: []
    related_concepts: []
  - id: CHUNK_002
    title: Chunk on model
    standalone_context: Self-contained passage extracted from the source.
    compressed_knowledge: Third, collect pairwise preference comparisons over model outputs and train a reward model. Fourth, optimise the policy against the reward model with PPO while regularising against the supervised model using a KL divergence penalty. If the KL penalty is too low, the policy diverges from the supervised model and produces high-reward but low-quality outputs — a failure mode known as reward hackin…
    activation_queries:
      - What does the source say about model?
      - What does the source say about reward?
      - What does the source say about policy?
    related_rules: []
    related_entities: []
    related_concepts: []
  - id: CHUNK_003
    title: Chunk on reward
    standalone_context: Self-contained passage extracted from the source.
    compressed_knowledge: "Useful heuristic: monitor the KL divergence between policy and reference model continuously — sharp jumps usually precede reward hacking. Another heuristic: a reward model with calibration error above 10 percent on a held-out preference set is not yet reliable enough for RL fine-tuning. Avoid using a single annotator per preference comparison; inter-annotator agreement under 70 percent is a red…"
    activation_queries:
      - What does the source say about reward?
      - What does the source say about model?
      - What does the source say about policy?
    related_rules: []
    related_entities: []
    related_concepts: []
  - id: CHUNK_004
    title: Chunk on reward
    standalone_context: Self-contained passage extracted from the source.
    compressed_knowledge: "Contradiction worth flagging: some recent work shows Direct Preference Optimization (DPO) matches or beats RLHF without a separate reward model, but other work shows RLHF still wins on hard reasoning benchmarks — the field has not converged. Question: Why is the KL divergence penalty included in PPO during RLHF? Answer: To keep the optimised policy close to the supervised model, preventing rewa…"
    activation_queries:
      - What does the source say about reward?
      - What does the source say about rlhf?
      - What does the source say about model?
    related_rules: []
    related_entities: []
    related_concepts: []
```
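Each chunk above lists `activation_queries` that name its topics. A consumer could use those topics for a first-pass lookup before any embedding search. The sketch below assumes a plain word-overlap score, which the package itself does not prescribe, and `rank_chunks` is a hypothetical helper.

```python
import re

# Topic keywords taken from each chunk's activation_queries in the YAML above.
CHUNK_TOPICS = {
    "CHUNK_001": {"model", "human", "demonstrations"},
    "CHUNK_002": {"model", "reward", "policy"},
    "CHUNK_003": {"reward", "model", "policy"},
    "CHUNK_004": {"reward", "rlhf", "model"},
}


def rank_chunks(query):
    """Chunk IDs sharing at least one topic word with `query`, best match first.

    Ties are broken by chunk ID so the ordering is deterministic.
    """
    words = set(re.findall(r"[a-z]+", query.lower()))
    scored = [
        (len(topics & words), chunk_id)
        for chunk_id, topics in CHUNK_TOPICS.items()
    ]
    ranked = sorted(scored, key=lambda item: (-item[0], item[1]))
    return [chunk_id for score, chunk_id in ranked if score > 0]
```

CHUNK_002 and CHUNK_003 carry identical topic sets, so word overlap alone cannot separate them; that is a limit of this scoring, not of the package.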

## 19. EMBEDDING-READY ATOMIC UNITS

```yaml
atomic_units:
  - id: AU_001
    statement: Reinforcement Learning from Human Feedback (RLHF) is a training pipeline in which a large language model is first pretrained on text, then fine-tuned on demonstrations, and finally optimised against a reward model tra…
    type: definition
    tags:
      - human
      - model
      - reinforcement
    dependencies: []
    confidence: 0.78
  - id: AU_002
    statement: The reward model is typically a transformer that takes a prompt and two candidate completions and predicts which one humans prefer.
    type: definition
    tags:
      - reward
      - model
      - typically
    dependencies: []
    confidence: 0.78
  - id: AU_003
    statement: Proximal Policy Optimization (PPO) is the most common reinforcement learning algorithm used in this stage.
    type: definition
    tags:
      - proximal
      - policy
      - optimization
    dependencies: []
    confidence: 0.78
  - id: AU_004
    statement: The canonical RLHF procedure proceeds in four steps.
    type: fact
    tags:
      - canonical
      - rlhf
      - procedure
    dependencies: []
    confidence: 0.78
  - id: AU_005
    statement: First, collect a dataset of high-quality demonstrations from human writers.
    type: fact
    tags:
      - first
      - collect
      - dataset
    dependencies: []
    confidence: 0.78
  - id: AU_006
    statement: Second, supervise fine-tune the base model on these demonstrations.
    type: fact
    tags:
      - second
      - supervise
      - fine-tune
    dependencies: []
    confidence: 0.78
  - id: AU_007
    statement: Third, collect pairwise preference comparisons over model outputs and train a reward model.
    type: fact
    tags:
      - model
      - third
      - collect
    dependencies: []
    confidence: 0.78
```

## 20. AGENT INSTRUCTIONS

```yaml
agent_instructions:
  behavior_rules:
    - Stay within the package's scope.
    - Cite the source-derived chunk or rule when answering.
    - Encourage active retrieval over passive review.
    - Suggest spaced repetition where appropriate.
  reasoning_rules:
    - Use causal chains and IF-THEN rules before improvising.
    - Combine concepts only when supports/depends_on relationships allow it.
  response_rules:
    - Be concise unless the user asks for depth.
    - Surface confidence and source basis.
  forbidden_behaviors:
    - Fabricating sources.
    - Restating the source as personal opinion.
    - Overstating certainty.
  preferred_questions:
    - What does the source say about …?
    - Which rule applies to this situation?
    - What are the limits of this knowledge?
  tool_usage_guidance:
    - Use retrieval before generation.
    - Use memory to track conversational context.
```

## 21. KNOWLEDGE LIMITS

```yaml
knowledge_limits:
  missing_context:
    - Source date and authorship are not always provided.
  weakly_supported_claims: []
  assumptions_detected:
    - Heuristic compilation assumes the input text is self-contained.
  possible_biases:
    - Single-source perspective.
  outdated_sections: []
  needs_human_review:
    - Decision rules and exceptions before production use.
```

## 22. SOURCE TRACEABILITY

```yaml
source_traceability:
  - extracted_item_id: CON_001
    source_location: user_input
    source_excerpt: Reinforcement Learning from Human Feedback (RLHF) is a training pipeline in which a large language model is first pretrained on text, then fine-tuned on demonstrations, and finally optimised agains…
    extraction_type: explicit
  - extracted_item_id: CON_002
    source_location: user_input
    source_excerpt: The reward model is typically a transformer that takes a prompt and two candidate completions and predicts which one humans prefer.
    extraction_type: explicit
  - extracted_item_id: CON_003
    source_location: user_input
    source_excerpt: Proximal Policy Optimization (PPO) is the most common reinforcement learning algorithm used in this stage.
    extraction_type: explicit
  - extracted_item_id: CON_004
    source_location: user_input
    source_excerpt: The canonical RLHF procedure proceeds in four steps.
    extraction_type: explicit
  - extracted_item_id: CON_005
    source_location: user_input
    source_excerpt: First, collect a dataset of high-quality demonstrations from human writers.
    extraction_type: explicit
  - extracted_item_id: CON_006
    source_location: user_input
    source_excerpt: Second, supervise fine-tune the base model on these demonstrations.
    extraction_type: explicit
  - extracted_item_id: HEU_001
    source_location: user_input
    source_excerpt: Avoid using a single annotator per preference comparison; inter-annotator agreement under 70 percent is a red flag.
    extraction_type: explicit
  - extracted_item_id: HEU_002
    source_location: user_input
    source_excerpt: Never train the reward model and the policy on the same prompts in the same iteration; the policy will simply memorise the reward model's quirks.
    extraction_type: explicit
  - extracted_item_id: HEU_003
    source_location: user_input
    source_excerpt: "Limitation: human preferences over short completions do not reliably transfer to long-form outputs, so models tuned with RLHF tend to be sycophantic and verbose."
    extraction_type: explicit
  - extracted_item_id: IFTHEN_001
    source_location: user_input
    source_excerpt: IF the KL penalty is too low THEN the policy diverges from the supervised model and produces high-reward but low-quality outputs — a failure mode known as reward hacking
    extraction_type: explicit
  - extracted_item_id: IFTHEN_002
    source_location: user_input
    source_excerpt: IF the KL penalty is too high THEN the policy barely moves and most reinforcement learning gains are lost
    extraction_type: explicit
  - extracted_item_id: IFTHEN_003
    source_location: user_input
    source_excerpt: IF the preference dataset is small THEN the reward model is high-variance and the policy overfits to its idiosyncrasies
    extraction_type: explicit
  - extracted_item_id: IFTHEN_004
    source_location: user_input
    source_excerpt: IF the reward model is updated mid-training without recalibrating the reference policy THEN the KL term becomes meaningless and training collapses
    extraction_type: explicit
```

CKF v1.0 for this page has not been compiled yet. Downloads become available once an admin runs the compiler.