Examples
Six full packages compiled live from curated source text. Same heuristic compiler, six domains — copy, download, or paste into your own pipeline.
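For pasting a package into a pipeline, the following is a minimal sketch of splitting the markdown into its header fields and numbered sections using only the standard library. The function name `split_package` and the shape of the returned dictionary are illustrative assumptions, not part of CKF. Section bodies are returned as raw strings so that any YAML parser can take over from there.

```python
import re

FENCE = "`" * 3  # built at runtime so this example contains no literal fence


def split_package(text: str) -> dict:
    """Split a CKF package into header fields and per-section YAML bodies."""
    # Everything before the first '---' line is treated as the header.
    header_part, _, body = text.partition("\n---\n")
    header = {}
    for line in header_part.splitlines():
        if line.startswith("#") or ":" not in line:
            continue  # skip the title line and anything that is not key: value
        key, _, value = line.partition(":")
        header[key.strip()] = value.strip()

    # A section is a '## N. TITLE' heading followed directly by a yaml fence.
    pattern = re.compile(
        r"^## (\d+)\. ([^\n]+)\n" + FENCE + r"yaml\n(.*?)\n" + FENCE,
        re.MULTILINE | re.DOTALL,
    )
    sections = {
        int(number): {"title": title.strip(), "yaml": yaml_body}
        for number, title, yaml_body in pattern.findall(body)
    }
    return {"header": header, "sections": sections}
```

Calling `split_package(text)["sections"][4]["yaml"]` on a package would return the concept graph as a string, ready for a YAML loader of your choice.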
Packages
Learning science
ckf_demo_1782852657371
8 entities · 6 concepts · 0 principles
# CKF — KNOWLEDGE CONTEXT PACKAGE
package_id: ckf_demo_1782852657371
protocol_version: ckf-0.1
source_type: article
source_title: Untitled source
source_author: Unknown
domain: education / learning science
subdomains: [sleep, memory, consolidation, study]
language: en
created_at: 2026-06-30T20:50:57.371Z
compression_level: standard
human_readability: 0.7
ai_utility_score: 0.81
---
## 1. CORE INTENT
```yaml
core_intent:
  primary_purpose: Capture and structure the knowledge expressed in the source.
  intended_user: Developers, researchers and agents consuming structured knowledge.
  intended_agent_use: Retrieval, reasoning, tutoring, decision support.
  transformation_goal: Convert prose into structured, agent-usable cognition.
  key_value: Portable, traceable, reusable knowledge package.
```
## 2. DOMAIN MAP
```yaml
domain_map:
  main_domain: education / learning science
  subdomains:
    - name: sleep
      relevance: 1
      related_concepts:
        - sleep
    - name: memory
      relevance: 0.85
      related_concepts:
        - memory
    - name: consolidation
      relevance: 0.7
      related_concepts:
        - consolidation
    - name: study
      relevance: 0.55
      related_concepts:
        - study
  adjacent_domains: []
  excluded_domains: []
```
## 3. ENTITY GRAPH
```yaml
entities:
- id: ENT_001
name: Learning
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities: []
source_basis: explicit
- id: ENT_002
name: Retrieval
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_003
name: Spaced
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_004
name: Hermann Ebbinghaus
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_005
name: Sleep
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_006
name: Teachers
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_007
name: First
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_008
name: Second
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
```
## 4. CONCEPT GRAPH
```yaml
concepts:
- id: CON_001
label: Sleep
definition: Learning improves when students actively retrieve information instead of passively rereading.
domain: education / learning science
depends_on: []
contradicts: []
supports:
- CON_002
enables: []
risks: []
confidence: 0.78
source_basis: explicit
- id: CON_002
label: Memory
definition: Retrieval practice strengthens memory traces by forcing the brain to reconstruct knowledge.
domain: education / learning science
depends_on:
- CON_001
contradicts: []
supports:
- CON_003
enables: []
risks: []
confidence: 0.78
source_basis: explicit
- id: CON_003
label: Consolidation
definition: Spaced repetition is the deliberate scheduling of review at increasing intervals — typically 1 day, 3 days, 7 days, 21 days — to fight the forgetting curve described by Hermann Ebbinghaus.
domain: education / learning science
depends_on:
- CON_002
contradicts: []
supports:
- CON_004
enables: []
risks: []
confidence: 0.78
source_basis: explicit
- id: CON_004
label: Study
definition: Sleep supports consolidation, while distraction reduces attention and weakens encoding.
domain: education / learning science
depends_on:
- CON_003
contradicts: []
supports:
- CON_005
enables: []
risks: []
confidence: 0.78
source_basis: explicit
- id: CON_005
label: Hours
definition: The hippocampus replays daytime activations during slow-wave sleep, transferring memories into long-term cortical storage.
domain: education / learning science
depends_on:
- CON_004
contradicts: []
supports:
- CON_006
enables: []
risks: []
confidence: 0.78
source_basis: explicit
- id: CON_006
label: Without
definition: Teachers should design study sessions in four steps.
domain: education / learning science
depends_on:
- CON_005
contradicts: []
supports: []
enables: []
risks: []
confidence: 0.78
source_basis: explicit
```
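The concepts above form a single chain through their `supports` and `depends_on` fields. A minimal sketch of how a consumer might verify that the two fields mirror each other and derive a load order follows. The dictionary copies only the edges shown above; the function names are hypothetical, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Edges copied from the concept graph above.
CONCEPTS = {
    "CON_001": {"depends_on": [], "supports": ["CON_002"]},
    "CON_002": {"depends_on": ["CON_001"], "supports": ["CON_003"]},
    "CON_003": {"depends_on": ["CON_002"], "supports": ["CON_004"]},
    "CON_004": {"depends_on": ["CON_003"], "supports": ["CON_005"]},
    "CON_005": {"depends_on": ["CON_004"], "supports": ["CON_006"]},
    "CON_006": {"depends_on": ["CON_005"], "supports": []},
}


def unmatched_edges(concepts):
    """Return (source, target) pairs whose `supports` edge has no mirror `depends_on`."""
    return [
        (cid, target)
        for cid, concept in concepts.items()
        for target in concept["supports"]
        if cid not in concepts.get(target, {}).get("depends_on", [])
    ]


def load_order(concepts):
    """Order concepts so each one appears after everything it depends on."""
    graph = {cid: concept["depends_on"] for cid, concept in concepts.items()}
    return list(TopologicalSorter(graph).static_order())
```

For this package `unmatched_edges(CONCEPTS)` is empty and `load_order(CONCEPTS)` runs from `CON_001` to `CON_006`.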
## 5. PRINCIPLES
```yaml
principles: [] # none extracted
```
## 6. HEURISTICS
```yaml
heuristics:
- id: HEU_001
trigger: relevant decision context detected
interpretation: Teachers should design study sessions in four steps.
recommended_action: Follow the recommended practice.
avoid: —
confidence: 0.76
- id: HEU_002
trigger: relevant decision context detected
interpretation: If students confuse fluency with mastery, they overestimate their readiness and reduce study time prematurely.
recommended_action: Mitigate the described risk.
avoid: —
confidence: 0.76
- id: HEU_003
trigger: context contains a known failure mode
interpretation: Avoid cramming the night before exams, because consolidation requires sleep.
recommended_action: Avoid the described action.
avoid: Avoid cramming the night before exams, because consolidation requires sleep.
confidence: 0.76
- id: HEU_004
trigger: context contains a known failure mode
interpretation: Avoid passive rereading as the primary study technique — it produces fluency without retention.
recommended_action: Avoid the described action.
avoid: Avoid passive rereading as the primary study technique — it produces fluency without retention.
confidence: 0.76
- id: HEU_005
trigger: context contains a known failure mode
interpretation: Never replace retrieval practice with highlighting; highlighting feels productive but does not strengthen memory.
recommended_action: Avoid the described action.
avoid: Never replace retrieval practice with highlighting; highlighting feels productive but does not strengthen memory.
confidence: 0.76
```
## 7. DECISION RULES
```yaml
decision_rules:
- id: RULE_001
condition: Operating context matches the rule's domain.
decision: Teachers should design study sessions in four steps.
reasoning: Derived directly from a normative statement in the source.
required_context: Domain-specific context as described in the source.
output_action: Apply the recommended decision.
failure_mode: Recommendation applied outside its valid context.
confidence: 0.74
- id: RULE_002
condition: Operating context matches the rule's domain.
decision: "Question: How long should a spaced repetition interval be?"
reasoning: Derived directly from a normative statement in the source.
required_context: Domain-specific context as described in the source.
output_action: Apply the recommended decision.
failure_mode: Recommendation applied outside its valid context.
confidence: 0.74
```
## 8. PROCEDURES
```yaml
procedures:
- id: PROC_001
name: Source-derived procedure
objective: Apply the sequence implied by the source text.
steps:
- step: 1
action: Retrieval practice strengthens memory traces by forcing the brain to reconstruct knowledge.
input_required: —
output_expected: —
- step: 2
action: First, set a clear learning objective.
input_required: —
output_expected: —
- step: 3
action: When students sleep fewer than six hours after learning, consolidation drops sharply and the next-day quiz score falls by roughly 20 percent.
input_required: —
output_expected: —
- step: 4
action: "Another heuristic: if a quiz score is below 70 percent, schedule the next review within 24 hours, not later."
input_required: —
output_expected: —
- step: 5
action: Never replace retrieval practice with highlighting; highlighting feels productive but does not strengthen memory.
input_required: —
output_expected: —
success_criteria: All steps applied in order with expected outcomes.
failure_criteria: Steps executed out of order or without prerequisites.
```
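Step 4 of the procedure above, together with the interval rule quoted in the source's question-and-answer text (start at 1 day, double after each successful recall, cap at 6 months), can be sketched as a small scheduler. Two things are assumptions here rather than statements from the source: "6 months" is approximated as 180 days, and a quiz score of 70 percent or higher is treated as a successful recall. The function names are illustrative.

```python
MAX_INTERVAL_DAYS = 180  # assumption: "6 months" approximated as 180 days
PASS_THRESHOLD = 0.70    # from the source: below 70 percent triggers an early review


def next_interval_days(current_days: int, quiz_score: float) -> int:
    """Days until the next review, given the current interval and a score in [0, 1]."""
    if quiz_score < PASS_THRESHOLD:
        return 1  # below 70 percent: review again within 24 hours
    # Assumption: a passing score counts as a successful recall, so double and cap.
    return min(current_days * 2, MAX_INTERVAL_DAYS)


def schedule(scores, start_days: int = 1):
    """Intervals produced by a run of quiz scores, starting at 1 day."""
    intervals = [start_days]
    for score in scores:
        intervals.append(next_interval_days(intervals[-1], score))
    return intervals
```

Note that the doubling rule yields 1, 2, 4, 8 days, which differs from the 1, 3, 7, 21 day sequence in the spaced-repetition definition; both appear in the source text, and this sketch follows the doubling rule only.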
## 9. PATTERNS
```yaml
patterns:
  - id: PAT_001
    name: Recurring pattern 1
    observed_when: Source-described conditions are present.
    signal: If a concept is reviewed only once, it tends to fade within 48 hours, because memory traces decay without reactivation.
    underlying_mechanism: —
    response_strategy: Recognize and act according to source guidance.
    confidence: 0.7
```
## 10. ANTI-PATTERNS
```yaml
anti_patterns:
  - id: ANTI_001
    name: Anti-pattern 1
    description: Avoid cramming the night before exams, because consolidation requires sleep.
    why_it_fails: Identified by the source as ineffective or harmful.
    warning_signals: Behavior matches the described failure mode.
    replacement_behavior: Use the recommended alternative from the source.
  - id: ANTI_002
    name: Anti-pattern 2
    description: Avoid passive rereading as the primary study technique — it produces fluency without retention.
    why_it_fails: Identified by the source as ineffective or harmful.
    warning_signals: Behavior matches the described failure mode.
    replacement_behavior: Use the recommended alternative from the source.
```
## 11. CAUSAL CHAINS
```yaml
causal_chains:
- id: CAU_001
cause: If a concept is reviewed only once, it tends to fade within 48 hours,
mechanism: —
effect: memory traces decay without reactivation.
secondary_effects: []
intervention_points: []
confidence: 0.7
- id: CAU_002
cause: Avoid cramming the night before exams,
mechanism: —
effect: consolidation requires sleep.
secondary_effects: []
intervention_points: []
confidence: 0.7
- id: CAU_003
cause: "Answer:"
mechanism: —
effect: consolidation, not exposure, converts fragile traces into durable memory; without sleep, additional exposure yields diminishing returns.
secondary_effects: []
intervention_points: []
confidence: 0.7
```
## 12. CONTEXTUAL TRIGGERS
```yaml
contextual_triggers:
  - id: TRG_001
    if_user_says_or_context_contains: Sleep
    activate_knowledge:
      - CON_001
      - CON_002
    agent_should: Recall the Sleep concept and apply related rules.
    agent_should_not: Make claims beyond what the source supports.
  - id: TRG_002
    if_user_says_or_context_contains: Memory
    activate_knowledge:
      - CON_002
      - CON_003
    agent_should: Recall the Memory concept and apply related rules.
    agent_should_not: Make claims beyond what the source supports.
  - id: TRG_003
    if_user_says_or_context_contains: Consolidation
    activate_knowledge:
      - CON_003
      - CON_004
    agent_should: Recall the Consolidation concept and apply related rules.
    agent_should_not: Make claims beyond what the source supports.
```
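A minimal sketch of how an agent might evaluate the triggers above against a user message follows. The package does not specify matching semantics, so the case-insensitive substring match used here is an assumption, as are the function name and the simplified dictionary keys.

```python
# Trigger terms and concept ids copied from the contextual triggers above.
TRIGGERS = [
    {"id": "TRG_001", "contains": "Sleep", "activate": ["CON_001", "CON_002"]},
    {"id": "TRG_002", "contains": "Memory", "activate": ["CON_002", "CON_003"]},
    {"id": "TRG_003", "contains": "Consolidation", "activate": ["CON_003", "CON_004"]},
]


def activated_concepts(user_text: str, triggers=TRIGGERS):
    """Concept ids to load for a message, in first-seen order without duplicates."""
    text = user_text.lower()
    seen = []
    for trigger in triggers:
        # Assumption: a trigger fires on a case-insensitive substring match.
        if trigger["contains"].lower() in text:
            for concept_id in trigger["activate"]:
                if concept_id not in seen:
                    seen.append(concept_id)
    return seen
```

A message mentioning both sleep and memory fires `TRG_001` and `TRG_002`, and the shared `CON_002` is loaded once.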
## 13. IF-THEN RULES
```yaml
if_then_rules:
  - id: IFTHEN_001
    if: students actively retrieve information instead of passively
    then: rereading
    because: Inferred from source context.
    confidence: 0.72
  - id: IFTHEN_002
    if: a concept is reviewed only once
    then: it tends to fade within 48 hours, because memory traces decay without reactivation
    because: Inferred from source context.
    confidence: 0.72
  - id: IFTHEN_003
    if: the spacing interval is too long
    then: retrieval becomes effortful and accuracy drops below 60 percent
    because: Inferred from source context.
    confidence: 0.72
  - id: IFTHEN_004
    if: students confuse fluency with mastery
    then: they overestimate their readiness and reduce study time prematurely
    because: Inferred from source context.
    confidence: 0.72
```
## 14. EXCEPTIONS AND EDGE CASES
```yaml
exceptions: [] # none extracted
```
## 15. MENTAL MODELS
```yaml
mental_models:
  - id: MM_001
    name: Sleep model
    description: Learning improves when students actively retrieve information instead of passively rereading.
    use_when: The reasoning context maps to this concept.
    do_not_use_when: Context lies outside the source's scope.
    input_needed: Relevant facts about the situation.
    output_generated: A reasoned recommendation aligned with the source.
  - id: MM_002
    name: Memory model
    description: Retrieval practice strengthens memory traces by forcing the brain to reconstruct knowledge.
    use_when: The reasoning context maps to this concept.
    do_not_use_when: Context lies outside the source's scope.
    input_needed: Relevant facts about the situation.
    output_generated: A reasoned recommendation aligned with the source.
```
## 16. OPERATIONAL PLAYBOOKS
```yaml
playbooks:
  - id: PLAY_001
    name: education / learning science response playbook
    objective: Apply the source's knowledge to a real interaction.
    activation_context: User asks about Sleep.
    steps:
      - Identify the concept the question maps to.
      - Recall related rules and heuristics.
      - Cite the source-derived principle.
      - Surface relevant exceptions or limits.
    agent_tone: Clear, sourced, non-overstating.
    tools_needed:
      - retrieval
      - memory
    expected_output: A grounded answer with traceable reasoning.
    failure_modes:
      - Hallucinating beyond source
      - Ignoring exceptions
```
## 17. QUESTION-ANSWER PAIRS FOR AGENTS
```yaml
qa_pairs:
- id: QA_001
question: What is sleep and when does it apply?
ideal_answer: Learning improves when students actively retrieve information instead of passively rereading.
source_concepts:
- CON_001
difficulty: easy
answer_type: definition_with_context
- id: QA_002
question: What is memory and when does it apply?
ideal_answer: Retrieval practice strengthens memory traces by forcing the brain to reconstruct knowledge.
source_concepts:
- CON_002
difficulty: medium
answer_type: definition_with_context
- id: QA_003
question: What is consolidation and when does it apply?
ideal_answer: Spaced repetition is the deliberate scheduling of review at increasing intervals — typically 1 day, 3 days, 7 days, 21 days — to fight the forgetting curve described by Hermann Ebbinghaus.
source_concepts:
- CON_003
difficulty: medium
answer_type: definition_with_context
- id: QA_004
question: What is study and when does it apply?
ideal_answer: Sleep supports consolidation, while distraction reduces attention and weakens encoding.
source_concepts:
- CON_004
difficulty: medium
answer_type: definition_with_context
```
## 18. RETRIEVAL CHUNKS
```yaml
retrieval_chunks:
- id: CHUNK_001
title: Chunk on days
standalone_context: Self-contained passage extracted from the source.
compressed_knowledge: Learning improves when students actively retrieve information instead of passively rereading. Retrieval practice strengthens memory traces by forcing the brain to reconstruct knowledge. Spaced repetition is the deliberate scheduling of review at increasing intervals — typically 1 day, 3 days, 7 days, 21 days — to fight the forgetting curve described by Hermann Ebbinghaus. Sleep supports consoli…
activation_queries:
- What does the source say about days?
- What does the source say about learning?
- What does the source say about sleep?
related_rules: []
related_entities: []
related_concepts: []
- id: CHUNK_002
title: Chunk on quiz
standalone_context: Self-contained passage extracted from the source.
compressed_knowledge: Second, present new material in short blocks of 15 to 20 minutes. Third, interleave related topics instead of blocking them. Fourth, end every block with a low-stakes quiz that surfaces gaps. If a concept is reviewed only once, it tends to fade within 48 hours, because memory traces decay without reactivation. If the spacing interval is too long, retrieval becomes effortful and accuracy drops b…
activation_queries:
- What does the source say about quiz?
- What does the source say about hours?
- What does the source say about drops?
related_rules: []
related_entities: []
related_concepts: []
- id: CHUNK_003
title: Chunk on heuristic
standalone_context: Self-contained passage extracted from the source.
compressed_knowledge: "A useful heuristic: if you can teach a concept aloud without notes in under two minutes, you have probably mastered the surface layer. Another heuristic: if a quiz score is below 70 percent, schedule the next review within 24 hours, not later. Avoid cramming the night before exams, because consolidation requires sleep. Avoid passive rereading as the primary study technique — it produces fluency…"
activation_queries:
- What does the source say about heuristic?
- What does the source say about without?
- What does the source say about avoid?
related_rules: []
related_entities: []
related_concepts: []
- id: CHUNK_004
title: Chunk on question
standalone_context: Self-contained passage extracted from the source.
compressed_knowledge: "Question: How long should a spaced repetition interval be? Answer: Start at 1 day, then double the interval after each successful recall, capping at 6 months for stable knowledge. Question: Why is sleep more important than extra study hours? Answer: Because consolidation, not exposure, converts fragile traces into durable memory; without sleep, additional exposure yields diminishing returns."
activation_queries:
- What does the source say about question?
- What does the source say about interval?
- What does the source say about answer?
related_rules: []
related_entities: []
related_concepts: []
```
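Each chunk above carries `activation_queries` built from one keyword. A minimal sketch of ranking chunks for a user question by keyword overlap follows; the scoring scheme, the tie-break by chunk id, and the function names are assumptions for illustration, since the package does not say how the queries are meant to be matched.

```python
import re

PREFIX = "What does the source say about "

# Activation queries copied from the retrieval chunks above.
ACTIVATION_QUERIES = {
    "CHUNK_001": [f"{PREFIX}days?", f"{PREFIX}learning?", f"{PREFIX}sleep?"],
    "CHUNK_002": [f"{PREFIX}quiz?", f"{PREFIX}hours?", f"{PREFIX}drops?"],
    "CHUNK_003": [f"{PREFIX}heuristic?", f"{PREFIX}without?", f"{PREFIX}avoid?"],
    "CHUNK_004": [f"{PREFIX}question?", f"{PREFIX}interval?", f"{PREFIX}answer?"],
}


def keywords(queries):
    """Extract the keyword from each templated activation query."""
    return {q[len(PREFIX):].rstrip("?").lower() for q in queries if q.startswith(PREFIX)}


def rank_chunks(user_query: str):
    """Chunk ids whose keywords appear in the query, best match first."""
    tokens = set(re.findall(r"[a-z]+", user_query.lower()))
    scores = {cid: len(keywords(qs) & tokens) for cid, qs in ACTIVATION_QUERIES.items()}
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [cid for cid, score in ranked if score > 0]
```

In practice an embedding search would likely replace the token overlap; the sketch only shows how the `activation_queries` field can drive chunk selection.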
## 19. EMBEDDING-READY ATOMIC UNITS
```yaml
atomic_units:
- id: AU_001
statement: Learning improves when students actively retrieve information instead of passively rereading.
type: fact
tags:
- learning
- improves
- students
dependencies: []
confidence: 0.78
- id: AU_002
statement: Retrieval practice strengthens memory traces by forcing the brain to reconstruct knowledge.
type: fact
tags:
- retrieval
- practice
- strengthens
dependencies: []
confidence: 0.78
- id: AU_003
statement: Spaced repetition is the deliberate scheduling of review at increasing intervals — typically 1 day, 3 days, 7 days, 21 days — to fight the forgetting curve described by Hermann Ebbinghaus.
type: definition
tags:
- days
- spaced
- repetition
dependencies: []
confidence: 0.78
- id: AU_004
statement: Sleep supports consolidation, while distraction reduces attention and weakens encoding.
type: fact
tags:
- sleep
- supports
- consolidation
dependencies: []
confidence: 0.78
- id: AU_005
statement: The hippocampus replays daytime activations during slow-wave sleep, transferring memories into long-term cortical storage.
type: fact
tags:
- hippocampus
- replays
- daytime
dependencies: []
confidence: 0.78
- id: AU_006
statement: Teachers should design study sessions in four steps.
type: rule
tags:
- teachers
- should
- design
dependencies: []
confidence: 0.78
- id: AU_007
statement: First, set a clear learning objective.
type: fact
tags:
- first
- clear
- learning
dependencies: []
confidence: 0.78
```
## 20. AGENT INSTRUCTIONS
```yaml
agent_instructions:
  behavior_rules:
    - Stay within the package's scope.
    - Cite the source-derived chunk or rule when answering.
    - Encourage active retrieval over passive review.
    - Suggest spaced repetition where appropriate.
  reasoning_rules:
    - Use causal chains and IF-THEN rules before improvising.
    - Combine concepts only when supports/depends_on relationships allow it.
  response_rules:
    - Be concise unless the user asks for depth.
    - Surface confidence and source basis.
  forbidden_behaviors:
    - Fabricating sources.
    - Restating the source as personal opinion.
    - Ignoring decision rules in favor of fluency.
  preferred_questions:
    - What does the source say about …?
    - Which rule applies to this situation?
    - What are the limits of this knowledge?
  tool_usage_guidance:
    - Use retrieval before generation.
    - Use memory to track conversational context.
```
## 21. KNOWLEDGE LIMITS
```yaml
knowledge_limits:
  missing_context:
    - Source date and authorship are not always provided.
  weakly_supported_claims: []
  assumptions_detected:
    - Heuristic compilation assumes the input text is self-contained.
  possible_biases:
    - Single-source perspective.
  outdated_sections: []
  needs_human_review:
    - Decision rules and exceptions before production use.
```
## 22. SOURCE TRACEABILITY
```yaml
source_traceability:
- extracted_item_id: CON_001
source_location: user_input
source_excerpt: Learning improves when students actively retrieve information instead of passively rereading.
extraction_type: explicit
- extracted_item_id: CON_002
source_location: user_input
source_excerpt: Retrieval practice strengthens memory traces by forcing the brain to reconstruct knowledge.
extraction_type: explicit
- extracted_item_id: CON_003
source_location: user_input
source_excerpt: Spaced repetition is the deliberate scheduling of review at increasing intervals — typically 1 day, 3 days, 7 days, 21 days — to fight the forgetting curve described by Hermann Ebbinghaus.
extraction_type: explicit
- extracted_item_id: CON_004
source_location: user_input
source_excerpt: Sleep supports consolidation, while distraction reduces attention and weakens encoding.
extraction_type: explicit
- extracted_item_id: CON_005
source_location: user_input
source_excerpt: The hippocampus replays daytime activations during slow-wave sleep, transferring memories into long-term cortical storage.
extraction_type: explicit
- extracted_item_id: CON_006
source_location: user_input
source_excerpt: Teachers should design study sessions in four steps.
extraction_type: explicit
- extracted_item_id: HEU_001
source_location: user_input
source_excerpt: Teachers should design study sessions in four steps.
extraction_type: explicit
- extracted_item_id: HEU_002
source_location: user_input
source_excerpt: If students confuse fluency with mastery, they overestimate their readiness and reduce study time prematurely.
extraction_type: explicit
- extracted_item_id: HEU_003
source_location: user_input
source_excerpt: Avoid cramming the night before exams, because consolidation requires sleep.
extraction_type: explicit
- extracted_item_id: HEU_004
source_location: user_input
source_excerpt: Avoid passive rereading as the primary study technique — it produces fluency without retention.
extraction_type: explicit
- extracted_item_id: HEU_005
source_location: user_input
source_excerpt: Never replace retrieval practice with highlighting; highlighting feels productive but does not strengthen memory.
extraction_type: explicit
- extracted_item_id: IFTHEN_001
source_location: user_input
source_excerpt: IF students actively retrieve information instead of passively THEN rereading
extraction_type: explicit
- extracted_item_id: IFTHEN_002
source_location: user_input
source_excerpt: IF a concept is reviewed only once THEN it tends to fade within 48 hours, because memory traces decay without reactivation
extraction_type: explicit
- extracted_item_id: IFTHEN_003
source_location: user_input
source_excerpt: IF the spacing interval is too long THEN retrieval becomes effortful and accuracy drops below 60 percent
extraction_type: explicit
- extracted_item_id: IFTHEN_004
source_location: user_input
source_excerpt: IF students confuse fluency with mastery THEN they overestimate their readiness and reduce study time prematurely
extraction_type: explicit
```
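The traceability section above has entries for the concepts, heuristics and if-then rules, but none for the decision rules. A minimal sketch of a per-type coverage report that would surface such gaps before production use follows. The helper names `make_ids` and `trace_coverage` are hypothetical, and the id lists are limited to four of the item types in this package.

```python
def make_ids(prefix: str, count: int):
    """Build ids such as CON_001 .. CON_006 for a given prefix."""
    return [f"{prefix}_{i:03d}" for i in range(1, count + 1)]


def trace_coverage(defined_ids, traced_ids):
    """Map each id prefix to (items defined, items with a traceability entry)."""
    traced = set(traced_ids)
    report = {}
    for item_id in defined_ids:
        prefix = item_id.rsplit("_", 1)[0]
        total, covered = report.get(prefix, (0, 0))
        report[prefix] = (total + 1, covered + (item_id in traced))
    return report


# Counts taken from this package: 6 concepts, 5 heuristics, 2 decision rules, 4 if-then rules.
defined = make_ids("CON", 6) + make_ids("HEU", 5) + make_ids("RULE", 2) + make_ids("IFTHEN", 4)
traced = make_ids("CON", 6) + make_ids("HEU", 5) + make_ids("IFTHEN", 4)
coverage = trace_coverage(defined, traced)
```

Here `coverage["RULE"]` comes out as `(2, 0)`, which lines up with the package's own note that decision rules need human review.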
Business strategy
ckf_demo_1782852657371
8 entities · 6 concepts · 0 principles
# CKF — KNOWLEDGE CONTEXT PACKAGE
package_id: ckf_demo_1782852657371
protocol_version: ckf-0.1
source_type: strategy
source_title: Untitled source
source_author: Unknown
domain: business
subdomains: [value, should, customers, acquisition]
language: en
created_at: 2026-06-30T20:50:57.371Z
compression_level: standard
human_readability: 0.7
ai_utility_score: 0.81
---
## 1. CORE INTENT
```yaml
core_intent:
  primary_purpose: Capture and structure the knowledge expressed in the source.
  intended_user: Developers, researchers and agents consuming structured knowledge.
  intended_agent_use: Retrieval, reasoning, tutoring, decision support.
  transformation_goal: Convert prose into structured, agent-usable cognition.
  key_value: Portable, traceable, reusable knowledge package.
```
## 2. DOMAIN MAP
```yaml
domain_map:
  main_domain: business
  subdomains:
    - name: value
      relevance: 1
      related_concepts:
        - value
    - name: should
      relevance: 0.85
      related_concepts:
        - should
    - name: customers
      relevance: 0.7
      related_concepts:
        - customers
    - name: acquisition
      relevance: 0.55
      related_concepts:
        - acquisition
  adjacent_domains: []
  excluded_domains: []
```
## 3. ENTITY GRAPH
```yaml
entities:
- id: ENT_001
name: Sustainable
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities: []
source_basis: explicit
- id: ENT_002
name: JTBD
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_003
name: The North Star Metric
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_004
name: Airbnb
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_005
name: Slack
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_006
name: Unit
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_007
name: SaaS
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_008
name: Leaders
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
```
## 4. CONCEPT GRAPH
```yaml
concepts:
- id: CON_001
label: Value
definition: Sustainable business growth depends on understanding the customer's underlying job to be done (JTBD), not only the product features.
domain: business
depends_on: []
contradicts: []
supports:
- CON_002
enables: []
risks: []
confidence: 0.78
source_basis: explicit
- id: CON_002
label: Should
definition: The North Star Metric is the single number that best captures the value delivered to customers — for Airbnb it is nights booked, for Slack it is messages sent per active team.
domain: business
depends_on:
- CON_001
contradicts: []
supports:
- CON_003
enables: []
risks: []
confidence: 0.78
source_basis: explicit
- id: CON_003
label: Customers
definition: Unit economics, expressed as the ratio of customer lifetime value (LTV) to customer acquisition cost (CAC), must be calculated before scaling marketing spend, because growth without margin amplifies losses.
domain: business
depends_on:
- CON_002
contradicts: []
supports:
- CON_004
enables: []
risks: []
confidence: 0.78
source_basis: explicit
- id: CON_004
label: Acquisition
definition: A healthy SaaS business targets an LTV / CAC ratio above three and a CAC payback period under twelve months.
domain: business
depends_on:
- CON_003
contradicts: []
supports:
- CON_005
enables: []
risks: []
confidence: 0.78
source_basis: explicit
- id: CON_005
label: Customer
definition: Leaders should run quarterly business reviews in five steps.
domain: business
depends_on:
- CON_004
contradicts: []
supports:
- CON_006
enables: []
risks: []
confidence: 0.78
source_basis: explicit
- id: CON_006
label: First
definition: First, restate the North Star Metric and the current value.
domain: business
depends_on:
- CON_005
contradicts: []
supports: []
enables: []
risks: []
confidence: 0.78
source_basis: explicit
```
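The thresholds in `CON_003` and `CON_004` (an LTV / CAC ratio above three and a CAC payback period under twelve months) can be sketched as a simple check, combined with the source's rule that paid acquisition pauses when CAC exceeds LTV. The function name and the returned keys are illustrative assumptions. The payback period is taken as an input because the source does not say how to compute it.

```python
def unit_economics(ltv: float, cac: float, payback_months: float) -> dict:
    """Evaluate the source's SaaS health thresholds for one customer segment."""
    if cac <= 0:
        raise ValueError("cac must be positive")
    ratio = ltv / cac
    return {
        "ltv_cac_ratio": ratio,
        # Healthy: ratio above three AND payback under twelve months.
        "healthy": ratio > 3 and payback_months < 12,
        # From the source: if CAC exceeds LTV, pause paid acquisition.
        "pause_paid_acquisition": cac > ltv,
    }
```

For example, `unit_economics(3600, 900, 9)` gives a ratio of 4.0 and is flagged healthy. The sketch does not handle the network-effect edge case the source mentions, where early-stage CAC may legitimately exceed LTV.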
## 5. PRINCIPLES
```yaml
principles: [] # none extracted
```
## 6. HEURISTICS
```yaml
heuristics:
- id: HEU_001
trigger: relevant decision context detected
interpretation: Unit economics, expressed as the ratio of customer lifetime value (LTV) to customer acquisition cost (CAC), must be calculated before scaling marketing spend, because growth without margin amplifies losses.
recommended_action: Follow the recommended practice.
avoid: —
confidence: 0.76
- id: HEU_002
trigger: relevant decision context detected
interpretation: Leaders should run quarterly business reviews in five steps.
recommended_action: Follow the recommended practice.
avoid: —
confidence: 0.76
- id: HEU_003
trigger: relevant decision context detected
interpretation: If CAC exceeds LTV, expansion destroys value and the team must pause paid acquisition.
recommended_action: Follow the recommended practice.
avoid: —
confidence: 0.76
- id: HEU_004
trigger: relevant decision context detected
interpretation: If churn is rising for two consecutive quarters, the company should revisit onboarding and activation before optimising acquisition.
recommended_action: Follow the recommended practice.
avoid: —
confidence: 0.76
- id: HEU_005
trigger: relevant decision context detected
interpretation: "Another heuristic: a feature requested by fewer than three paying customers in a quarter should not enter the roadmap."
recommended_action: Follow the recommended practice.
avoid: —
confidence: 0.76
```
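Two of the heuristics above, `HEU_004` and `HEU_005`, are concrete enough to sketch as predicates. The reading of "rising for two consecutive quarters" as two back-to-back increases, which needs three data points, is an interpretation rather than something the source spells out, and the function names are hypothetical.

```python
def churn_rising_two_quarters(quarterly_churn) -> bool:
    """True when the last two quarter-over-quarter changes are both increases."""
    if len(quarterly_churn) < 3:
        return False  # not enough history to see two consecutive rises
    first, second, third = quarterly_churn[-3:]
    return first < second < third


def roadmap_eligible(paying_requesters_this_quarter: int) -> bool:
    """HEU_005: fewer than three paying requesters in a quarter keeps a feature off the roadmap."""
    return paying_requesters_this_quarter >= 3
```

When `churn_rising_two_quarters` returns true, the source's guidance is to revisit onboarding and activation before optimising acquisition.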
## 7. DECISION RULES
```yaml
decision_rules:
- id: RULE_001
condition: Operating context matches the rule's domain.
decision: Unit economics, expressed as the ratio of customer lifetime value (LTV) to customer acquisition cost (CAC), must be calculated before scaling marketing spend, because growth without margin amplifies l
reasoning: Derived directly from a normative statement in the source.
required_context: Domain-specific context as described in the source.
output_action: Apply the recommended decision.
failure_mode: Recommendation applied outside its valid context.
confidence: 0.74
- id: RULE_002
condition: Operating context matches the rule's domain.
decision: Leaders should run quarterly business reviews in five steps.
reasoning: Derived directly from a normative statement in the source.
required_context: Domain-specific context as described in the source.
output_action: Apply the recommended decision.
failure_mode: Recommendation applied outside its valid context.
confidence: 0.74
- id: RULE_003
condition: Operating context matches the rule's domain.
decision: If CAC exceeds LTV, expansion destroys value and the team must pause paid acquisition.
reasoning: Derived directly from a normative statement in the source.
required_context: Domain-specific context as described in the source.
output_action: Apply the recommended decision.
failure_mode: Recommendation applied outside its valid context.
confidence: 0.74
```
## 8. PROCEDURES
```yaml
procedures:
- id: PROC_001
name: Quarterly business review
objective: Run the five-step quarterly business review described in the source.
steps:
- step: 1
action: First, restate the North Star Metric and the current value.
input_required: —
output_expected: —
- step: 2
action: Second, review unit economics by cohort.
input_required: —
output_expected: —
- step: 3
action: Third, examine activation, retention and expansion funnels.
input_required: —
output_expected: —
- step: 4
action: Fourth, decide which two initiatives to double down on.
input_required: —
output_expected: —
- step: 5
action: Fifth, decide which initiative to kill.
input_required: —
output_expected: —
success_criteria: All steps applied in order with expected outcomes.
failure_criteria: Steps executed out of order or without prerequisites.
```
## 9. PATTERNS
```yaml
patterns: [] # none extracted
```
## 10. ANTI-PATTERNS
```yaml
anti_patterns:
- id: ANTI_001
name: Anti-pattern 1
description: Avoid discounting as a default response to slow sales, because it trains the market to wait for promotions and erodes brand perception.
why_it_fails: Identified by the source as ineffective or harmful.
warning_signals: Behavior matches the described failure mode.
replacement_behavior: Use the recommended alternative from the source.
- id: ANTI_002
name: Anti-pattern 2
description: Never sacrifice gross margin to win logos that will not expand.
why_it_fails: Identified by the source as ineffective or harmful.
warning_signals: Behavior matches the described failure mode.
replacement_behavior: Use the recommended alternative from the source.
```
## 11. CAUSAL CHAINS
```yaml
causal_chains:
- id: CAU_001
cause: Unit economics, expressed as the ratio of customer lifetime value (LTV) to customer acquisition cost (CAC), must be calculated before scaling marketing spend
mechanism: —
effect: growth without margin amplifies losses.
secondary_effects: []
intervention_points: []
confidence: 0.7
- id: CAU_002
cause: Avoid discounting as a default response to slow sales,
mechanism: —
effect: it trains the market to wait for promotions and erodes brand perception.
secondary_effects: []
intervention_points: []
confidence: 0.7
- id: CAU_003
cause: "Edge case: in network-effect businesses, early-stage CAC may legitimately exceed LTV"
mechanism: —
effect: each new customer increases the value of the existing base; standard unit economics under-measure this.
secondary_effects: []
intervention_points: []
confidence: 0.7
```
## 12. CONTEXTUAL TRIGGERS
```yaml
contextual_triggers:
- id: TRG_001
if_user_says_or_context_contains: Value
activate_knowledge:
- CON_001
- CON_002
agent_should: Recall the Value concept and apply related rules.
agent_should_not: Make claims beyond what the source supports.
- id: TRG_002
if_user_says_or_context_contains: Should
activate_knowledge:
- CON_002
- CON_003
agent_should: Recall the Should concept and apply related rules.
agent_should_not: Make claims beyond what the source supports.
- id: TRG_003
if_user_says_or_context_contains: Customers
activate_knowledge:
- CON_003
- CON_004
agent_should: Recall the Customers concept and apply related rules.
agent_should_not: Make claims beyond what the source supports.
```
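The package does not say how `if_user_says_or_context_contains` is matched against a message. One possible reading, sketched below in Python, treats the keyword as a case-insensitive whole-word match and returns the union of the `activate_knowledge` IDs. The `TRIGGERS` table layout and the function name `activated_concepts` are illustrative assumptions, not part of CKF.

```python
import re

# Trigger keywords and concept IDs copied from section 12 above.
# The dictionary layout is an assumption made for this sketch.
TRIGGERS = [
    {"id": "TRG_001", "keyword": "Value", "activate": ["CON_001", "CON_002"]},
    {"id": "TRG_002", "keyword": "Should", "activate": ["CON_002", "CON_003"]},
    {"id": "TRG_003", "keyword": "Customers", "activate": ["CON_003", "CON_004"]},
]


def activated_concepts(message, triggers=TRIGGERS):
    """Return concept IDs whose trigger keyword appears as a whole word."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    activated = []
    for trigger in triggers:
        if trigger["keyword"].lower() in words:
            for concept_id in trigger["activate"]:
                if concept_id not in activated:  # keep first-seen order
                    activated.append(concept_id)
    return activated


print(activated_concepts("How should we price for value?"))
```

Under this reading, a keyword such as `Should` fires on almost any normative sentence, which is one reason the extracted triggers are worth reviewing before use.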
## 13. IF-THEN RULES
```yaml
if_then_rules:
- id: IFTHEN_001
if: CAC exceeds LTV
then: expansion destroys value and the team must pause paid acquisition
because: Inferred from source context.
confidence: 0.72
- id: IFTHEN_002
if: churn is rising for two consecutive quarters
then: the company should revisit onboarding and activation before optimising acquisition
because: Inferred from source context.
confidence: 0.72
- id: IFTHEN_003
if: a pricing change drops conversion by more than 15 percent
then: roll back within seven days unless retention improves materially
because: Inferred from source context.
confidence: 0.72
- id: IFTHEN_004
if: a sales team consistently closes poor-fit customers
then: support load rises and downstream churn follows within one to two renewal cycles
because: Inferred from source context.
confidence: 0.72
```
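As a rough illustration of how an agent might evaluate the first three rules above, the sketch below encodes them as plain Python conditions. The `Snapshot` fields and the function name `triggered_rules` are hypothetical, and "rising for two consecutive quarters" is interpreted here as two successive quarter-over-quarter increases, which is an assumption the source does not spell out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Snapshot:
    """Hypothetical business metrics; field names are not defined by CKF."""
    ltv: float                    # customer lifetime value
    cac: float                    # customer acquisition cost
    quarterly_churn: List[float]  # churn rate per quarter, oldest first
    conversion_drop_pct: float    # drop in conversion after a pricing change
    retention_improved: bool      # whether retention improved materially


def triggered_rules(s: Snapshot) -> List[str]:
    """Return the IDs of the IF-THEN rules whose condition holds."""
    fired = []
    if s.cac > s.ltv:
        fired.append("IFTHEN_001")  # pause paid acquisition
    q = s.quarterly_churn
    if len(q) >= 3 and q[-1] > q[-2] > q[-3]:
        fired.append("IFTHEN_002")  # revisit onboarding and activation
    if s.conversion_drop_pct > 15 and not s.retention_improved:
        fired.append("IFTHEN_003")  # roll back within seven days
    return fired


snapshot = Snapshot(ltv=900, cac=1200, quarterly_churn=[0.03, 0.04, 0.05],
                    conversion_drop_pct=20, retention_improved=False)
print(triggered_rules(snapshot))
```

IFTHEN_004 is left out because it describes a consequence rather than an action the agent can check from metrics alone.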
## 14. EXCEPTIONS AND EDGE CASES
```yaml
exceptions:
- id: EXC_001
general_rule: If a pricing change drops conversion by more than 15 percent, roll back within seven days.
exception_case: Retention improves materially after the pricing change.
modified_action: The seven-day rollback does not apply.
explanation: Source explicitly notes this edge case.
```
## 15. MENTAL MODELS
```yaml
mental_models:
- id: MM_001
name: Job-to-be-done model
description: Sustainable business growth depends on understanding the customer's underlying job to be done (JTBD), not only the product features.
use_when: The reasoning context maps to this concept.
do_not_use_when: Context lies outside the source's scope.
input_needed: Relevant facts about the situation.
output_generated: A reasoned recommendation aligned with the source.
- id: MM_002
name: North Star Metric model
description: The North Star Metric is the single number that best captures the value delivered to customers — for Airbnb it is nights booked, for Slack it is messages sent per active team.
use_when: The reasoning context maps to this concept.
do_not_use_when: Context lies outside the source's scope.
input_needed: Relevant facts about the situation.
output_generated: A reasoned recommendation aligned with the source.
```
## 16. OPERATIONAL PLAYBOOKS
```yaml
playbooks:
- id: PLAY_001
name: business response playbook
objective: Apply the source's knowledge to a real interaction.
activation_context: User asks about Value.
steps:
- Identify the concept the question maps to.
- Recall related rules and heuristics.
- Cite the source-derived principle.
- Surface relevant exceptions or limits.
agent_tone: Clear, sourced, non-overstating.
tools_needed:
- retrieval
- memory
expected_output: A grounded answer with traceable reasoning.
failure_modes:
- Hallucinating beyond source
- Ignoring exceptions
```
## 17. QUESTION-ANSWER PAIRS FOR AGENTS
```yaml
qa_pairs:
- id: QA_001
question: What does sustainable business growth depend on?
ideal_answer: Sustainable business growth depends on understanding the customer's underlying job to be done (JTBD), not only the product features.
source_concepts:
- CON_001
difficulty: easy
answer_type: definition_with_context
- id: QA_002
question: What is the North Star Metric?
ideal_answer: The North Star Metric is the single number that best captures the value delivered to customers — for Airbnb it is nights booked, for Slack it is messages sent per active team.
source_concepts:
- CON_002
difficulty: medium
answer_type: definition_with_context
- id: QA_003
question: When must unit economics be calculated, and why?
ideal_answer: Unit economics, expressed as the ratio of customer lifetime value (LTV) to customer acquisition cost (CAC), must be calculated before scaling marketing spend, because growth without margin amplifies losses.
source_concepts:
- CON_003
difficulty: medium
answer_type: definition_with_context
- id: QA_004
question: What LTV / CAC ratio and CAC payback period does a healthy SaaS business target?
ideal_answer: A healthy SaaS business targets an LTV / CAC ratio above three and a CAC payback period under twelve months.
source_concepts:
- CON_004
difficulty: medium
answer_type: definition_with_context
```
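The thresholds in QA_004 translate directly into a two-part check. The sketch below is a minimal Python version; the function name `is_healthy_saas` and its parameters are illustrative, and "above three" and "under twelve months" are read as strict inequalities.

```python
def is_healthy_saas(ltv, cac, payback_months):
    """Check the two targets stated in QA_004.

    ltv and cac must use the same currency; payback_months is the CAC
    payback period in months. Names are hypothetical, not part of CKF.
    """
    if cac <= 0:
        raise ValueError("cac must be positive")
    return (ltv / cac) > 3 and payback_months < 12


print(is_healthy_saas(ltv=3600, cac=1000, payback_months=10))
```

A ratio of exactly three, or a payback of exactly twelve months, fails the check under this strict reading.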
## 18. RETRIEVAL CHUNKS
```yaml
retrieval_chunks:
- id: CHUNK_001
title: Chunk on business
standalone_context: Self-contained passage extracted from the source.
compressed_knowledge: Sustainable business growth depends on understanding the customer's underlying job to be done (JTBD), not only the product features. The North Star Metric is the single number that best captures the value delivered to customers — for Airbnb it is nights booked, for Slack it is messages sent per active team. Unit economics, expressed as the ratio of customer lifetime value (LTV) to customer acqu…
activation_queries:
- What does the source say about business?
- What does the source say about customer?
- What does the source say about value?
related_rules: []
related_entities: []
related_concepts: []
- id: CHUNK_002
title: Chunk on activation
standalone_context: Self-contained passage extracted from the source.
compressed_knowledge: Third, examine activation, retention and expansion funnels. Fourth, decide which two initiatives to double down on. Fifth, decide which initiative to kill. If CAC exceeds LTV, expansion destroys value and the team must pause paid acquisition. If churn is rising for two consecutive quarters, the company should revisit onboarding and activation before optimising acquisition. If a pricing change d…
activation_queries:
- What does the source say about activation?
- What does the source say about retention?
- What does the source say about expansion?
related_rules: []
related_entities: []
related_concepts: []
- id: CHUNK_003
title: Chunk on heuristic
standalone_context: Self-contained passage extracted from the source.
compressed_knowledge: "A useful heuristic: if your top 10 percent of customers generate more than 50 percent of revenue, your pricing is probably under-segmented. Another heuristic: a feature requested by fewer than three paying customers in a quarter should not enter the roadmap. Pricing should reflect the value delivered, not internal costs alone. Avoid discounting as a default response to slow sales, because it tr…"
activation_queries:
- What does the source say about heuristic?
- What does the source say about percent?
- What does the source say about customers?
related_rules: []
related_entities: []
related_concepts: []
- id: CHUNK_004
title: Chunk on first
standalone_context: Self-contained passage extracted from the source.
compressed_knowledge: "Contradiction worth flagging: classic growth playbooks tell founders to optimise acquisition first, while modern PLG playbooks insist activation and retention should be solved first — both can be right depending on stage. Question: When should a startup stop optimising acquisition? Answer: When weekly active retention plateaus below the category benchmark and onboarding completion is under 50 p…"
activation_queries:
- What does the source say about first?
- What does the source say about playbooks?
- What does the source say about acquisition?
related_rules: []
related_entities: []
related_concepts: []
```
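The `activation_queries` above all follow the template "What does the source say about X?", so the topic word X can act as a crude retrieval key. The sketch below ranks chunks by how many topic words appear in a user query. It is a keyword-overlap toy, not the retrieval method CKF prescribes (the package does not specify one), and `CHUNK_TOPICS` and `rank_chunks` are hypothetical names.

```python
import re

# Topic words taken from each chunk's activation_queries in section 18.
CHUNK_TOPICS = {
    "CHUNK_001": ["business", "customer", "value"],
    "CHUNK_002": ["activation", "retention", "expansion"],
    "CHUNK_003": ["heuristic", "percent", "customers"],
    "CHUNK_004": ["first", "playbooks", "acquisition"],
}


def rank_chunks(query, chunk_topics=CHUNK_TOPICS):
    """Return chunk IDs with at least one topic hit, best match first."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    scores = {cid: len(words & set(topics)) for cid, topics in chunk_topics.items()}
    hits = [cid for cid, score in scores.items() if score > 0]
    # Sort by descending score, then by ID so ties are deterministic.
    return sorted(hits, key=lambda cid: (-scores[cid], cid))


print(rank_chunks("How do retention and expansion relate to acquisition?"))
```

In practice an embedding index over `compressed_knowledge` would likely replace the word overlap; the ranking interface could stay the same.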
## 19. EMBEDDING-READY ATOMIC UNITS
```yaml
atomic_units:
- id: AU_001
statement: Sustainable business growth depends on understanding the customer's underlying job to be done (JTBD), not only the product features.
type: fact
tags:
- sustainable
- business
- growth
dependencies: []
confidence: 0.78
- id: AU_002
statement: The North Star Metric is the single number that best captures the value delivered to customers — for Airbnb it is nights booked, for Slack it is messages sent per active team.
type: definition
tags:
- north
- star
- metric
dependencies: []
confidence: 0.78
- id: AU_003
statement: Unit economics, expressed as the ratio of customer lifetime value (LTV) to customer acquisition cost (CAC), must be calculated before scaling marketing spend, because growth without margin amplifies losses.
type: rule
tags:
- customer
- unit
- economics
dependencies: []
confidence: 0.78
- id: AU_004
statement: A healthy SaaS business targets an LTV / CAC ratio above three and a CAC payback period under twelve months.
type: fact
tags:
- healthy
- saas
- business
dependencies: []
confidence: 0.78
- id: AU_005
statement: Leaders should run quarterly business reviews in five steps.
type: rule
tags:
- leaders
- should
- quarterly
dependencies: []
confidence: 0.78
- id: AU_006
statement: First, restate the North Star Metric and the current value.
type: fact
tags:
- first
- restate
- north
dependencies: []
confidence: 0.78
- id: AU_007
statement: Second, review unit economics by cohort.
type: fact
tags:
- second
- review
- unit
dependencies: []
confidence: 0.78
```
## 20. AGENT INSTRUCTIONS
```yaml
agent_instructions:
behavior_rules:
- Stay within the package's scope.
- Cite the source-derived chunk or rule when answering.
- Distinguish between strategy and tactics.
- Surface assumptions about the market.
reasoning_rules:
- Use causal chains and IF-THEN rules before improvising.
- Combine concepts only when supports/depends_on relationships allow it.
response_rules:
- Be concise unless the user asks for depth.
- Surface confidence and source basis.
forbidden_behaviors:
- Fabricating sources.
- Restating the source as personal opinion.
- Ignoring decision rules in favor of fluency.
preferred_questions:
- What does the source say about …?
- Which rule applies to this situation?
- What are the limits of this knowledge?
tool_usage_guidance:
- Use retrieval before generation.
- Use memory to track conversational context.
```
## 21. KNOWLEDGE LIMITS
```yaml
knowledge_limits:
missing_context:
- Source date and authorship are not always provided.
weakly_supported_claims: []
assumptions_detected:
- Heuristic compilation assumes the input text is self-contained.
possible_biases:
- Single-source perspective.
outdated_sections: []
needs_human_review:
- Decision rules and exceptions before production use.
```
## 22. SOURCE TRACEABILITY
```yaml
source_traceability:
- extracted_item_id: CON_001
source_location: user_input
source_excerpt: Sustainable business growth depends on understanding the customer's underlying job to be done (JTBD), not only the product features.
extraction_type: explicit
- extracted_item_id: CON_002
source_location: user_input
source_excerpt: The North Star Metric is the single number that best captures the value delivered to customers — for Airbnb it is nights booked, for Slack it is messages sent per active team.
extraction_type: explicit
- extracted_item_id: CON_003
source_location: user_input
source_excerpt: Unit economics, expressed as the ratio of customer lifetime value (LTV) to customer acquisition cost (CAC), must be calculated before scaling marketing spend, because growth without margin amplifie…
extraction_type: explicit
- extracted_item_id: CON_004
source_location: user_input
source_excerpt: A healthy SaaS business targets an LTV / CAC ratio above three and a CAC payback period under twelve months.
extraction_type: explicit
- extracted_item_id: CON_005
source_location: user_input
source_excerpt: Leaders should run quarterly business reviews in five steps.
extraction_type: explicit
- extracted_item_id: CON_006
source_location: user_input
source_excerpt: First, restate the North Star Metric and the current value.
extraction_type: explicit
- extracted_item_id: HEU_001
source_location: user_input
source_excerpt: Unit economics, expressed as the ratio of customer lifetime value (LTV) to customer acquisition cost (CAC), must be calculated before scaling marketing spend, because growth without margin amplifie…
extraction_type: explicit
- extracted_item_id: HEU_002
source_location: user_input
source_excerpt: Leaders should run quarterly business reviews in five steps.
extraction_type: explicit
- extracted_item_id: HEU_003
source_location: user_input
source_excerpt: If CAC exceeds LTV, expansion destroys value and the team must pause paid acquisition.
extraction_type: explicit
- extracted_item_id: HEU_004
source_location: user_input
source_excerpt: If churn is rising for two consecutive quarters, the company should revisit onboarding and activation before optimising acquisition.
extraction_type: explicit
- extracted_item_id: HEU_005
source_location: user_input
source_excerpt: "Another heuristic: a feature requested by fewer than three paying customers in a quarter should not enter the roadmap."
extraction_type: explicit
- extracted_item_id: IFTHEN_001
source_location: user_input
source_excerpt: IF CAC exceeds LTV THEN expansion destroys value and the team must pause paid acquisition
extraction_type: explicit
- extracted_item_id: IFTHEN_002
source_location: user_input
source_excerpt: IF churn is rising for two consecutive quarters THEN the company should revisit onboarding and activation before optimising acquisition
extraction_type: explicit
- extracted_item_id: IFTHEN_003
source_location: user_input
source_excerpt: IF a pricing change drops conversion by more than 15 percent THEN roll back within seven days unless retention improves materially
extraction_type: explicit
- extracted_item_id: IFTHEN_004
source_location: user_input
source_excerpt: IF a sales team consistently closes poor-fit customers THEN support load rises and downstream churn follows within one to two renewal cycles
extraction_type: explicit
```
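One way to sanity-check the traceability entries above is to confirm that each `source_excerpt` really occurs in the original text. The Python sketch below assumes the raw source is available as a string and that a trailing "…" marks a truncated excerpt; both the function name `excerpt_is_traceable` and this handling are illustrative assumptions.

```python
def excerpt_is_traceable(excerpt, source_text):
    """Return True if the excerpt occurs verbatim in the source.

    A trailing ellipsis is treated as a truncation marker and dropped,
    so a truncated excerpt only has to match as a prefix fragment.
    """
    fragment = excerpt.strip()
    if fragment.endswith("…"):
        fragment = fragment[:-1].rstrip()
    return bool(fragment) and fragment in source_text


source = ("If CAC exceeds LTV, expansion destroys value and the team "
          "must pause paid acquisition.")
print(excerpt_is_traceable("If CAC exceeds LTV, expansion destroys value…", source))
print(excerpt_is_traceable("IF CAC exceeds LTV THEN expansion destroys value", source))
```

The second call shows a limitation: the `IFTHEN_*` excerpts are stored in a normalised IF/THEN form, so a verbatim check would flag them even though they are marked `explicit`.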
Clinical protocol
ckf_demo_1782852657371
8 entities · 6 concepts · 0 principles
markdown
# CKF — KNOWLEDGE CONTEXT PACKAGE
package_id: ckf_demo_1782852657371
protocol_version: ckf-0.1
source_type: protocol
source_title: Untitled source
source_author: Unknown
domain: healthcare
subdomains: [sepsis, lactate, pressure, antibiotics]
language: en
created_at: 2026-06-30T20:50:57.371Z
compression_level: standard
human_readability: 0.7
ai_utility_score: 0.81
---
## 1. CORE INTENT
```yaml
core_intent:
primary_purpose: Capture and structure the knowledge expressed in the source.
intended_user: Developers, researchers and agents consuming structured knowledge.
intended_agent_use: Retrieval, reasoning, tutoring, decision support.
transformation_goal: Convert prose into structured, agent-usable cognition.
key_value: Portable, traceable, reusable knowledge package.
```
## 2. DOMAIN MAP
```yaml
domain_map:
main_domain: healthcare
subdomains:
- name: sepsis
relevance: 1
related_concepts:
- sepsis
- name: lactate
relevance: 0.85
related_concepts:
- lactate
- name: pressure
relevance: 0.7
related_concepts:
- pressure
- name: antibiotics
relevance: 0.55
related_concepts:
- antibiotics
adjacent_domains: []
excluded_domains: []
```
## 3. ENTITY GRAPH
```yaml
entities:
- id: ENT_001
name: Sepsis
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities: []
source_basis: explicit
- id: ENT_002
name: Glasgow Coma Scale
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_003
name: Hour-1 sepsis bundle
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_004
name: Surviving Sepsis Campaign
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_005
name: First
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_006
name: Second
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_007
name: Third
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_008
name: Fourth
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
```
## 4. CONCEPT GRAPH
```yaml
concepts:
- id: CON_001
label: Sepsis
definition: Sepsis triage in the emergency department combines vital signs, the qSOFA score and lactate measurement.
domain: healthcare
depends_on: []
contradicts: []
supports:
- CON_002
enables: []
risks: []
confidence: 0.78
source_basis: explicit
- id: CON_002
label: Lactate
definition: "The qSOFA score adds one point for each of: systolic blood pressure below 100 mmHg, respiratory rate of 22 or above, and altered mental status (Glasgow Coma Scale under 15)."
domain: healthcare
depends_on:
- CON_001
contradicts: []
supports:
- CON_003
enables: []
risks: []
confidence: 0.78
source_basis: explicit
- id: CON_003
label: Pressure
definition: A qSOFA of 2 or more in a patient with suspected infection identifies a high mortality risk.
domain: healthcare
depends_on:
- CON_002
contradicts: []
supports:
- CON_004
enables: []
risks: []
confidence: 0.78
source_basis: explicit
- id: CON_004
label: Antibiotics
definition: The Hour-1 sepsis bundle, defined by the Surviving Sepsis Campaign, has five steps.
domain: healthcare
depends_on:
- CON_003
contradicts: []
supports:
- CON_005
enables: []
risks: []
confidence: 0.78
source_basis: explicit
- id: CON_005
label: qSOFA
definition: First, measure serum lactate.
domain: healthcare
depends_on:
- CON_004
contradicts: []
supports:
- CON_006
enables: []
risks: []
confidence: 0.78
source_basis: explicit
- id: CON_006
label: Blood
definition: Second, obtain blood cultures before antibiotics.
domain: healthcare
depends_on:
- CON_005
contradicts: []
supports: []
enables: []
risks: []
confidence: 0.78
source_basis: explicit
```
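The qSOFA definition in CON_002 and the threshold in CON_003 are simple enough to express as arithmetic. The Python sketch below follows the wording of this package exactly (published definitions may state the cut-offs slightly differently); the function and parameter names are hypothetical, and the code is an illustration of the extracted rule, not a clinical tool.

```python
def qsofa_score(systolic_bp_mmhg, respiratory_rate, gcs):
    """One point per criterion, as worded in CON_002."""
    return (
        int(systolic_bp_mmhg < 100)   # systolic blood pressure below 100 mmHg
        + int(respiratory_rate >= 22)  # respiratory rate of 22 or above
        + int(gcs < 15)                # Glasgow Coma Scale under 15
    )


def is_high_mortality_risk(score, suspected_infection):
    """CON_003: a score of 2 or more with suspected infection."""
    return suspected_infection and score >= 2


score = qsofa_score(systolic_bp_mmhg=95, respiratory_rate=24, gcs=15)
print(score, is_high_mortality_risk(score, suspected_infection=True))
```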
## 5. PRINCIPLES
```yaml
principles: [] # none extracted
```
## 6. HEURISTICS
```yaml
heuristics:
- id: HEU_001
trigger: relevant decision context detected
interpretation: A qSOFA of 2 or more in a patient with suspected infection identifies a high mortality risk.
recommended_action: Mitigate the described risk.
avoid: —
confidence: 0.76
- id: HEU_002
trigger: relevant decision context detected
interpretation: When patients have a history of congestive heart failure, the 30 mL/kg fluid target should be reassessed at 500 mL increments to avoid pulmonary oedema.
recommended_action: Follow the recommended practice.
avoid: —
confidence: 0.76
- id: HEU_003
trigger: context contains a known failure mode
interpretation: Avoid delaying antibiotics while waiting for imaging in suspected septic shock.
recommended_action: Avoid the described action.
avoid: Avoid delaying antibiotics while waiting for imaging in suspected septic shock.
confidence: 0.76
- id: HEU_004
trigger: context contains a known failure mode
interpretation: Never use hydroxyethyl starch for resuscitation — it increases mortality and renal failure.
recommended_action: Avoid the described action.
avoid: Never use hydroxyethyl starch for resuscitation — it increases mortality and renal failure.
confidence: 0.76
- id: HEU_005
trigger: relevant decision context detected
interpretation: "Contradiction in the literature: aggressive early fluids improve some septic shock outcomes but worsen others in patients with ARDS — individualisation matters."
recommended_action: Individualise fluid resuscitation, particularly in patients with ARDS.
avoid: —
confidence: 0.76
```
## 7. DECISION RULES
```yaml
decision_rules:
- id: RULE_001
condition: Operating context matches the rule's domain.
decision: When patients have a history of congestive heart failure, the 30 mL/kg fluid target should be reassessed at 500 mL increments to avoid pulmonary oedema.
reasoning: Derived directly from a normative statement in the source.
required_context: Domain-specific context as described in the source.
output_action: Apply the recommended decision.
failure_mode: Recommendation applied outside its valid context.
confidence: 0.74
```
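To make the arithmetic in RULE_001 concrete, the sketch below splits a 30 mL/kg target into 500 mL increments, with each boundary standing for a reassessment point. It is illustrative arithmetic only; the function name `fluid_increments` and the rounding of the total to a whole millilitre are assumptions, and nothing here is clinical guidance.

```python
def fluid_increments(weight_kg, ml_per_kg=30, step_ml=500):
    """Split the weight-based fluid target into step-sized increments.

    Returns a list of volumes in mL; the patient would be reassessed
    after each one, per RULE_001. The last increment may be smaller.
    """
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    remaining = round(weight_kg * ml_per_kg)  # total target in mL
    increments = []
    while remaining > 0:
        step = min(step_ml, remaining)
        increments.append(step)
        remaining -= step
    return increments


print(fluid_increments(70))
```

For a 70 kg patient the target is 2100 mL, which yields four 500 mL increments followed by a final 100 mL.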
## 8. PROCEDURES
```yaml
procedures:
- id: PROC_001
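name: Hour-1 sepsis bundle
objective: Complete the five steps of the Hour-1 sepsis bundle described in the source.
steps:
- step: 1
action: First, measure serum lactate.
input_required: —
output_expected: —
- step: 2
action: Second, obtain blood cultures before antibiotics.
input_required: —
output_expected: —
- step: 3
action: Third, administer broad-spectrum antibiotics within 60 minutes.
input_required: —
output_expected: —
- step: 4
action: Fourth, begin 30 mL/kg crystalloid for hypotension or lactate above 4 mmol/L.
input_required: —
output_expected: —
- step: 5
action: Fifth, start vasopressors if mean arterial pressure stays below 65 mmHg despite fluids; norepinephrine is first-line.
input_required: —
output_expected: —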
success_criteria: All steps applied in order with expected outcomes.
failure_criteria: Steps executed out of order or without prerequisites.
```
## 9. PATTERNS
```yaml
patterns:
- id: PAT_001
name: Recurring pattern 1
observed_when: Source-described conditions are present.
signal: "Useful heuristic: if a patient looks worse than the numbers suggest, trust the bedside impression — early sepsis often outruns vital-sign abnormalities."
underlying_mechanism: —
response_strategy: Recognize and act according to source guidance.
confidence: 0.7
```
## 10. ANTI-PATTERNS
```yaml
anti_patterns:
- id: ANTI_001
name: Unreassessed fluid bolus in heart failure
description: Giving the full 30 mL/kg fluid target without reassessment to patients with a history of congestive heart failure, which risks pulmonary oedema.
why_it_fails: Identified by the source as ineffective or harmful.
warning_signals: Behavior matches the described failure mode.
replacement_behavior: Reassess the 30 mL/kg fluid target at 500 mL increments.
- id: ANTI_002
name: Anti-pattern 2
description: Avoid delaying antibiotics while waiting for imaging in suspected septic shock.
why_it_fails: Identified by the source as ineffective or harmful.
warning_signals: Behavior matches the described failure mode.
replacement_behavior: Use the recommended alternative from the source.
```
## 11. CAUSAL CHAINS
```yaml
causal_chains: [] # none extracted
```
## 12. CONTEXTUAL TRIGGERS
```yaml
contextual_triggers:
- id: TRG_001
if_user_says_or_context_contains: Sepsis
activate_knowledge:
- CON_001
- CON_002
agent_should: Recall the Sepsis concept and apply related rules.
agent_should_not: Make claims beyond what the source supports.
- id: TRG_002
if_user_says_or_context_contains: Lactate
activate_knowledge:
- CON_002
- CON_003
agent_should: Recall the Lactate concept and apply related rules.
agent_should_not: Make claims beyond what the source supports.
- id: TRG_003
if_user_says_or_context_contains: Pressure
activate_knowledge:
- CON_003
- CON_004
agent_should: Recall the Pressure concept and apply related rules.
agent_should_not: Make claims beyond what the source supports.
```
## 13. IF-THEN RULES
```yaml
if_then_rules:
- id: IFTHEN_001
if: mean arterial pressure stays below 65 mmHg despite fluids
then: start vasopressors, with norepinephrine as first-line
because: Inferred from source context.
confidence: 0.72
- id: IFTHEN_002
if: serum lactate is above 2 mmol/L
then: repeat lactate within 2 hours to confirm clearance
because: Inferred from source context.
confidence: 0.72
- id: IFTHEN_003
if: blood pressure remains low after the 30 mL/kg fluid bolus
then: start norepinephrine at 0
because: Inferred from source context.
confidence: 0.72
- id: IFTHEN_004
if: procalcitonin trends down for 72 hours and cultures are negative
then: consider de-escalating antibiotics
because: Inferred from source context.
confidence: 0.72
```
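The rules above can be read as a small condition table. The Python sketch below evaluates three of them against a set of observations and returns the matching rule IDs with their actions. Parameter names are hypothetical, IFTHEN_003 is omitted because its action is cut off in the package, and the wording for IFTHEN_001 follows the fifth bundle step in the source. The code shows how extracted rules might be encoded, not how care decisions should be made.

```python
def applicable_actions(lactate_mmol_l, map_mmhg, fluids_given,
                       procalcitonin_falling_hours, cultures_negative):
    """Return (rule_id, action) pairs whose condition holds."""
    actions = []
    if fluids_given and map_mmhg < 65:
        actions.append(("IFTHEN_001",
                        "start vasopressors; norepinephrine is first-line"))
    if lactate_mmol_l > 2:
        actions.append(("IFTHEN_002",
                        "repeat lactate within 2 hours to confirm clearance"))
    if procalcitonin_falling_hours >= 72 and cultures_negative:
        actions.append(("IFTHEN_004",
                        "consider de-escalating antibiotics"))
    return actions


for rule_id, action in applicable_actions(
        lactate_mmol_l=3.1, map_mmhg=60, fluids_given=True,
        procalcitonin_falling_hours=0, cultures_negative=False):
    print(rule_id, "->", action)
```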
## 14. EXCEPTIONS AND EDGE CASES
```yaml
exceptions: [] # none extracted
```
## 15. MENTAL MODELS
```yaml
mental_models:
- id: MM_001
name: Sepsis model
description: Sepsis triage in the emergency department combines vital signs, the qSOFA score and lactate measurement.
use_when: The reasoning context maps to this concept.
do_not_use_when: Context lies outside the source's scope.
input_needed: Relevant facts about the situation.
output_generated: A reasoned recommendation aligned with the source.
- id: MM_002
name: qSOFA model
description: "The qSOFA score adds one point for each of: systolic blood pressure below 100 mmHg, respiratory rate of 22 or above, and altered mental status (Glasgow Coma Scale under 15)."
use_when: The reasoning context maps to this concept.
do_not_use_when: Context lies outside the source's scope.
input_needed: Relevant facts about the situation.
output_generated: A reasoned recommendation aligned with the source.
```
## 16. OPERATIONAL PLAYBOOKS
```yaml
playbooks:
- id: PLAY_001
name: healthcare response playbook
objective: Apply the source's knowledge to a real interaction.
activation_context: User asks about Sepsis.
steps:
- Identify the concept the question maps to.
- Recall related rules and heuristics.
- Cite the source-derived principle.
- Surface relevant exceptions or limits.
agent_tone: Clear, sourced, non-overstating.
tools_needed:
- retrieval
- memory
expected_output: A grounded answer with traceable reasoning.
failure_modes:
- Hallucinating beyond source
- Ignoring exceptions
```
## 17. QUESTION-ANSWER PAIRS FOR AGENTS
```yaml
qa_pairs:
- id: QA_001
question: What does sepsis triage in the emergency department combine?
ideal_answer: Sepsis triage in the emergency department combines vital signs, the qSOFA score and lactate measurement.
source_concepts:
- CON_001
difficulty: easy
answer_type: definition_with_context
- id: QA_002
question: How is the qSOFA score calculated?
ideal_answer: "The qSOFA score adds one point for each of: systolic blood pressure below 100 mmHg, respiratory rate of 22 or above, and altered mental status (Glasgow Coma Scale under 15)."
source_concepts:
- CON_002
difficulty: medium
answer_type: definition_with_context
- id: QA_003
question: What does a qSOFA of 2 or more indicate in a patient with suspected infection?
ideal_answer: A qSOFA of 2 or more in a patient with suspected infection identifies a high mortality risk.
source_concepts:
- CON_003
difficulty: medium
answer_type: definition_with_context
- id: QA_004
question: What is the Hour-1 sepsis bundle?
ideal_answer: The Hour-1 sepsis bundle, defined by the Surviving Sepsis Campaign, has five steps.
source_concepts:
- CON_004
difficulty: medium
answer_type: definition_with_context
```
## 18. RETRIEVAL CHUNKS
```yaml
retrieval_chunks:
- id: CHUNK_001
title: Chunk on sepsis
standalone_context: Self-contained passage extracted from the source.
compressed_knowledge: "Sepsis triage in the emergency department combines vital signs, the qSOFA score and lactate measurement. The qSOFA score adds one point for each of: systolic blood pressure below 100 mmHg, respiratory rate of 22 or above, and altered mental status (Glasgow Coma Scale under 15). A qSOFA of 2 or more in a patient with suspected infection identifies a high mortality risk. The Hour-1 sepsis bundle,…"
activation_queries:
- What does the source say about sepsis?
- What does the source say about qsofa?
- What does the source say about score?
related_rules: []
related_entities: []
related_concepts: []
- id: CHUNK_002
title: Chunk on lactate
standalone_context: Self-contained passage extracted from the source.
compressed_knowledge: Third, administer broad-spectrum antibiotics within 60 minutes. Fourth, begin 30 mL/kg crystalloid for hypotension or lactate above 4 mmol/L. Fifth, start vasopressors if mean arterial pressure stays below 65 mmHg despite fluids — norepinephrine is first-line. If serum lactate is above 2 mmol/L, repeat lactate within 2 hours to confirm clearance. If blood pressure remains low after the 30 mL/kg…
activation_queries:
- What does the source say about lactate?
- What does the source say about antibiotics?
- What does the source say about within?
related_rules: []
related_entities: []
related_concepts: []
- id: CHUNK_003
title: Chunk on failure
standalone_context: Self-contained passage extracted from the source.
compressed_knowledge: "When patients have a history of congestive heart failure, the 30 mL/kg fluid target should be reassessed at 500 mL increments to avoid pulmonary oedema. Useful heuristic: if a patient looks worse than the numbers suggest, trust the bedside impression — early sepsis often outruns vital-sign abnormalities. Another heuristic: any febrile, tachycardic patient on immunosuppression is sepsis until pr…"
activation_queries:
- What does the source say about failure?
- What does the source say about avoid?
- What does the source say about heuristic?
related_rules: []
related_entities: []
related_concepts: []
- id: CHUNK_004
title: Chunk on patients
standalone_context: Self-contained passage extracted from the source.
compressed_knowledge: "Edge case: pregnant patients have physiologically higher heart rates and lower blood pressure, so the qSOFA threshold over-triggers; use the obstetric modified early warning score instead. Contradiction in the literature: aggressive early fluids improve some septic shock outcomes but worsen others in patients with ARDS — individualisation matters. Question: What is the first-line vasopressor in…"
activation_queries:
- What does the source say about patients?
- What does the source say about pressure?
- What does the source say about early?
related_rules: []
related_entities: []
related_concepts: []
```
## 19. EMBEDDING-READY ATOMIC UNITS
```yaml
atomic_units:
- id: AU_001
statement: Sepsis triage in the emergency department combines vital signs, the qSOFA score and lactate measurement.
type: fact
tags:
- sepsis
- triage
- emergency
dependencies: []
confidence: 0.78
- id: AU_002
statement: "The qSOFA score adds one point for each of: systolic blood pressure below 100 mmHg, respiratory rate of 22 or above, and altered mental status (Glasgow Coma Scale under 15)."
type: fact
tags:
- qsofa
- score
- adds
dependencies: []
confidence: 0.78
- id: AU_003
statement: A qSOFA of 2 or more in a patient with suspected infection identifies a high mortality risk.
type: fact
tags:
- qsofa
- more
- patient
dependencies: []
confidence: 0.78
- id: AU_004
statement: The Hour-1 sepsis bundle, defined by the Surviving Sepsis Campaign, has five steps.
type: fact
tags:
- sepsis
- hour-1
- bundle
dependencies: []
confidence: 0.78
- id: AU_005
statement: First, measure serum lactate.
type: fact
tags:
- first
- measure
- serum
dependencies: []
confidence: 0.78
- id: AU_006
statement: Second, obtain blood cultures before antibiotics.
type: fact
tags:
- second
- obtain
- blood
dependencies: []
confidence: 0.78
- id: AU_007
statement: Third, administer broad-spectrum antibiotics within 60 minutes.
type: fact
tags:
- third
- administer
- broad-spectrum
dependencies: []
confidence: 0.78
```
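AU_002 and AU_003 together describe a small scoring rule, which can be sketched as follows. The thresholds are copied from the wording of the atomic units above; the `Vitals` class and function names are assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Vitals:
    # Hypothetical container; field names are not defined by the package.
    systolic_bp_mmhg: float
    respiratory_rate: float
    glasgow_coma_scale: int


def qsofa_score(v: Vitals) -> int:
    """One point per criterion, as worded in AU_002."""
    return sum([
        v.systolic_bp_mmhg < 100,    # systolic blood pressure below 100 mmHg
        v.respiratory_rate >= 22,    # respiratory rate of 22 or above
        v.glasgow_coma_scale < 15,   # altered mental status (GCS under 15)
    ])


def is_high_risk(v: Vitals, suspected_infection: bool) -> bool:
    """AU_003: qSOFA of 2 or more in a patient with suspected infection."""
    return suspected_infection and qsofa_score(v) >= 2


if __name__ == "__main__":
    patient = Vitals(systolic_bp_mmhg=95, respiratory_rate=24, glasgow_coma_scale=15)
    print(qsofa_score(patient), is_high_risk(patient, suspected_infection=True))
```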
## 20. AGENT INSTRUCTIONS
```yaml
agent_instructions:
behavior_rules:
- Stay within the package's scope.
- Cite the source-derived chunk or rule when answering.
- Never diagnose; refer to qualified professionals.
- Cite uncertainty explicitly.
reasoning_rules:
- Use causal chains and IF-THEN rules before improvising.
- Combine concepts only when supports/depends_on relationships allow it.
response_rules:
- Be concise unless the user asks for depth.
- Surface confidence and source basis.
forbidden_behaviors:
- Fabricating sources.
- Restating the source as personal opinion.
- Ignoring decision rules in favor of fluency.
preferred_questions:
- What does the source say about …?
- Which rule applies to this situation?
- What are the limits of this knowledge?
tool_usage_guidance:
- Use retrieval before generation.
- Use memory to track conversational context.
```
## 21. KNOWLEDGE LIMITS
```yaml
knowledge_limits:
missing_context:
- Source date and authorship are not always provided.
weakly_supported_claims: []
assumptions_detected:
- Heuristic compilation assumes the input text is self-contained.
possible_biases:
- Single-source perspective.
outdated_sections: []
needs_human_review:
- Decision rules and exceptions before production use.
```
## 22. SOURCE TRACEABILITY
```yaml
source_traceability:
- extracted_item_id: CON_001
source_location: user_input
source_excerpt: Sepsis triage in the emergency department combines vital signs, the qSOFA score and lactate measurement.
extraction_type: explicit
- extracted_item_id: CON_002
source_location: user_input
source_excerpt: "The qSOFA score adds one point for each of: systolic blood pressure below 100 mmHg, respiratory rate of 22 or above, and altered mental status (Glasgow Coma Scale under 15)."
extraction_type: explicit
- extracted_item_id: CON_003
source_location: user_input
source_excerpt: A qSOFA of 2 or more in a patient with suspected infection identifies a high mortality risk.
extraction_type: explicit
- extracted_item_id: CON_004
source_location: user_input
source_excerpt: The Hour-1 sepsis bundle, defined by the Surviving Sepsis Campaign, has five steps.
extraction_type: explicit
- extracted_item_id: CON_005
source_location: user_input
source_excerpt: First, measure serum lactate.
extraction_type: explicit
- extracted_item_id: CON_006
source_location: user_input
source_excerpt: Second, obtain blood cultures before antibiotics.
extraction_type: explicit
- extracted_item_id: HEU_001
source_location: user_input
source_excerpt: A qSOFA of 2 or more in a patient with suspected infection identifies a high mortality risk.
extraction_type: explicit
- extracted_item_id: HEU_002
source_location: user_input
source_excerpt: When patients have a history of congestive heart failure, the 30 mL/kg fluid target should be reassessed at 500 mL increments to avoid pulmonary oedema.
extraction_type: explicit
- extracted_item_id: HEU_003
source_location: user_input
source_excerpt: Avoid delaying antibiotics while waiting for imaging in suspected septic shock.
extraction_type: explicit
- extracted_item_id: HEU_004
source_location: user_input
source_excerpt: Never use hydroxyethyl starch for resuscitation — it increases mortality and renal failure.
extraction_type: explicit
- extracted_item_id: HEU_005
source_location: user_input
source_excerpt: "Contradiction in the literature: aggressive early fluids improve some septic shock outcomes but worsen others in patients with ARDS — individualisation matters."
extraction_type: explicit
- extracted_item_id: IFTHEN_001
source_location: user_input
source_excerpt: IF mean arterial pressure stays below 65 mmHg despite fluids THEN start vasopressors (norepinephrine is first-line)
extraction_type: explicit
- extracted_item_id: IFTHEN_002
source_location: user_input
source_excerpt: IF serum lactate is above 2 mmol/L THEN repeat lactate within 2 hours to confirm clearance
extraction_type: explicit
- extracted_item_id: IFTHEN_003
source_location: user_input
source_excerpt: IF blood pressure remains low after the 30 mL/kg fluid bolus THEN start norepinephrine at 0
extraction_type: explicit
- extracted_item_id: IFTHEN_004
source_location: user_input
source_excerpt: IF procalcitonin trends down for 72 hours and cultures are negative THEN consider de-escalating antibiotics
extraction_type: explicit
```
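The IF-THEN entries traced above can be read as a small rule table. The sketch below encodes three of them under stated assumptions: the function name, parameter names and returned strings are hypothetical, and the rule with the truncated dose is left out because the source does not give the full value.

```python
def follow_up_actions(
    lactate_mmol_l: float,
    map_mmhg_after_fluids: float,
    procalcitonin_falling_72h: bool = False,
    cultures_negative: bool = False,
) -> list[str]:
    """Return the actions whose IF-condition holds, in rule order."""
    actions = []
    if lactate_mmol_l > 2:
        # Serum lactate above 2 mmol/L
        actions.append("repeat lactate within 2 hours to confirm clearance")
    if map_mmhg_after_fluids < 65:
        # Mean arterial pressure stays below 65 mmHg despite fluids
        actions.append("start vasopressors; norepinephrine is first-line")
    if procalcitonin_falling_72h and cultures_negative:
        # Procalcitonin trends down for 72 hours and cultures are negative
        actions.append("consider de-escalating antibiotics")
    return actions


if __name__ == "__main__":
    for action in follow_up_actions(lactate_mmol_l=3.1, map_mmhg_after_fluids=60):
        print(action)
```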
Legal / GDPR
ckf_demo_1782852657371
8 entities · 6 concepts · 0 principles
# CKF — KNOWLEDGE CONTEXT PACKAGE
package_id: ckf_demo_1782852657371
protocol_version: ckf-0.1
source_type: policy
source_title: Untitled source
source_author: Unknown
domain: legal
subdomains: [data, consent, controller, processing]
language: en
created_at: 2026-06-30T20:50:57.371Z
compression_level: standard
human_readability: 0.7
ai_utility_score: 0.81
---
## 1. CORE INTENT
```yaml
core_intent:
primary_purpose: Capture and structure the knowledge expressed in the source.
intended_user: Developers, researchers and agents consuming structured knowledge.
intended_agent_use: Retrieval, reasoning, tutoring, decision support.
transformation_goal: Convert prose into structured, agent-usable cognition.
key_value: Portable, traceable, reusable knowledge package.
```
## 2. DOMAIN MAP
```yaml
domain_map:
main_domain: legal
subdomains:
- name: data
relevance: 1
related_concepts:
- data
- name: consent
relevance: 0.85
related_concepts:
- consent
- name: controller
relevance: 0.7
related_concepts:
- controller
- name: processing
relevance: 0.55
related_concepts:
- processing
adjacent_domains: []
excluded_domains: []
```
## 3. ENTITY GRAPH
```yaml
entities:
- id: ENT_001
name: Under
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities: []
source_basis: explicit
- id: ENT_002
name: General Data Protection Regulation
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_003
name: GDPR
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_004
name: The Supervisory Authority
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_005
name: Regulation
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_006
name: Member State
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_007
name: Article
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_008
name: Consent
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
```
## 4. CONCEPT GRAPH
```yaml
concepts:
- id: CON_001
label: Data
definition: Under the General Data Protection Regulation (GDPR), a "data controller" is the natural or legal person who determines the purposes and means of processing personal data.
domain: legal
depends_on: []
contradicts: []
supports:
- CON_002
enables: []
risks: []
confidence: 0.78
source_basis: explicit
- id: CON_002
label: Consent
definition: A "data processor" processes personal data on behalf of the controller.
domain: legal
depends_on:
- CON_001
contradicts: []
supports:
- CON_003
enables: []
risks: []
confidence: 0.78
source_basis: explicit
- id: CON_003
label: Controller
definition: A "data subject" is the identified or identifiable person to whom the data relates.
domain: legal
depends_on:
- CON_002
contradicts: []
supports:
- CON_004
enables: []
risks: []
confidence: 0.78
source_basis: explicit
- id: CON_004
label: Processing
definition: The Supervisory Authority is the independent public body responsible for monitoring application of the Regulation in each Member State.
domain: legal
depends_on:
- CON_003
contradicts: []
supports:
- CON_005
enables: []
risks: []
confidence: 0.78
source_basis: explicit
- id: CON_005
label: Must
definition: "Article 6 establishes that processing is lawful only if at least one of six legal bases applies: consent, contract, legal obligation, vital interests, public task, or legitimate interest."
domain: legal
depends_on:
- CON_004
contradicts: []
supports:
- CON_006
enables: []
risks: []
confidence: 0.78
source_basis: explicit
- id: CON_006
label: Personal
definition: Consent must be freely given, specific, informed and unambiguous.
domain: legal
depends_on:
- CON_005
contradicts: []
supports: []
enables: []
risks: []
confidence: 0.78
source_basis: explicit
```
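The `depends_on` fields above form a chain from CON_006 back to CON_001. A consumer that wants every prerequisite of a concept can walk that chain; the sketch below does so over a hand-transcribed copy of the graph. The `DEPENDS_ON` dictionary and the `prerequisites` function are assumptions for illustration, not part of the CKF format.

```python
# Transcribed by hand from the depends_on fields of the concept graph.
DEPENDS_ON = {
    "CON_001": [],
    "CON_002": ["CON_001"],
    "CON_003": ["CON_002"],
    "CON_004": ["CON_003"],
    "CON_005": ["CON_004"],
    "CON_006": ["CON_005"],
}


def prerequisites(concept_id: str, graph: dict[str, list[str]]) -> list[str]:
    """Every concept reachable through depends_on, nearest first."""
    seen: list[str] = []
    queue = list(graph.get(concept_id, []))
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue  # guards against cycles and repeated visits
        seen.append(current)
        queue.extend(graph.get(current, []))
    return seen


if __name__ == "__main__":
    print(prerequisites("CON_004", DEPENDS_ON))
```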
## 5. PRINCIPLES
```yaml
principles: [] # none extracted
```
## 6. HEURISTICS
```yaml
heuristics:
- id: HEU_001
trigger: relevant decision context detected
interpretation: Consent must be freely given, specific, informed and unambiguous.
recommended_action: Follow the recommended practice.
avoid: —
confidence: 0.76
- id: HEU_002
trigger: relevant decision context detected
interpretation: Where processing is based on consent, the controller must be able to demonstrate that the data subject has consented.
recommended_action: Follow the recommended practice.
avoid: —
confidence: 0.76
- id: HEU_003
trigger: relevant decision context detected
interpretation: A controller must notify a personal data breach to the competent Supervisory Authority within 72 hours of becoming aware of it, unless the breach is unlikely to result in a risk to the rights and freedoms of natural persons.
recommended_action: Follow the recommended practice.
avoid: —
confidence: 0.76
- id: HEU_004
trigger: relevant decision context detected
interpretation: The controller must inform affected data subjects without undue delay when the breach is likely to result in a high risk.
recommended_action: Follow the recommended practice.
avoid: —
confidence: 0.76
- id: HEU_005
trigger: relevant decision context detected
interpretation: If a transfer of personal data outside the European Economic Area lacks an adequacy decision, the parties must implement appropriate safeguards such as Standard Contractual Clauses.
recommended_action: Follow the recommended practice.
avoid: —
confidence: 0.76
```
## 7. DECISION RULES
```yaml
decision_rules:
- id: RULE_001
condition: Operating context matches the rule's domain.
decision: Consent must be freely given, specific, informed and unambiguous.
reasoning: Derived directly from a normative statement in the source.
required_context: Domain-specific context as described in the source.
output_action: Apply the recommended decision.
failure_mode: Recommendation applied outside its valid context.
confidence: 0.74
- id: RULE_002
condition: Operating context matches the rule's domain.
decision: Where processing is based on consent, the controller must be able to demonstrate that the data subject has consented.
reasoning: Derived directly from a normative statement in the source.
required_context: Domain-specific context as described in the source.
output_action: Apply the recommended decision.
failure_mode: Recommendation applied outside its valid context.
confidence: 0.74
- id: RULE_003
condition: Operating context matches the rule's domain.
decision: A controller must notify a personal data breach to the competent Supervisory Authority within 72 hours of becoming aware of it, unless the breach is unlikely to result in a risk to the rights and freedoms of natural persons.
reasoning: Derived directly from a normative statement in the source.
required_context: Domain-specific context as described in the source.
output_action: Apply the recommended decision.
failure_mode: Recommendation applied outside its valid context.
confidence: 0.74
```
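RULE_003 sets a 72-hour window that starts when the controller becomes aware of a breach. The deadline arithmetic can be sketched as below. The function names and the boolean `risk_unlikely` flag are assumptions, and deciding whether a risk is unlikely remains a human judgement that this sketch does not attempt.

```python
from datetime import datetime, timedelta, timezone

NOTIFICATION_WINDOW = timedelta(hours=72)  # figure quoted in RULE_003


def notification_deadline(aware_at: datetime) -> datetime:
    """Latest time to notify the Supervisory Authority."""
    return aware_at + NOTIFICATION_WINDOW


def must_notify_authority(risk_unlikely: bool) -> bool:
    """The exception in RULE_003: no notification if a risk is unlikely."""
    return not risk_unlikely


if __name__ == "__main__":
    aware = datetime(2026, 6, 30, 20, 50, tzinfo=timezone.utc)
    print(notification_deadline(aware).isoformat())
```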
## 8. PROCEDURES
```yaml
procedures: [] # none extracted
```
## 9. PATTERNS
```yaml
patterns:
- id: PAT_001
name: Recurring pattern 1
observed_when: Source-described conditions are present.
signal: "Another heuristic: when in doubt between consent and legitimate interest, prefer the legal basis that gives the data subject the most control: usually consent for marketing, legitimate interest for fraud prevention."
underlying_mechanism: —
response_strategy: Recognize and act according to source guidance.
confidence: 0.7
```
## 10. ANTI-PATTERNS
```yaml
anti_patterns:
- id: ANTI_001
name: Anti-pattern 1
description: Avoid bundling consent with the acceptance of terms of service; that bundled consent is generally not freely given.
why_it_fails: Identified by the source as ineffective or harmful.
warning_signals: Behavior matches the described failure mode.
replacement_behavior: Use the recommended alternative from the source.
- id: ANTI_002
name: Anti-pattern 2
description: Never retain personal data longer than necessary for the stated purpose; storage limitation is a core principle.
why_it_fails: Identified by the source as ineffective or harmful.
warning_signals: Behavior matches the described failure mode.
replacement_behavior: Use the recommended alternative from the source.
```
## 11. CAUSAL CHAINS
```yaml
causal_chains: [] # none extracted
```
## 12. CONTEXTUAL TRIGGERS
```yaml
contextual_triggers:
- id: TRG_001
if_user_says_or_context_contains: Data
activate_knowledge:
- CON_001
- CON_002
agent_should: Recall the Data concept and apply related rules.
agent_should_not: Make claims beyond what the source supports.
- id: TRG_002
if_user_says_or_context_contains: Consent
activate_knowledge:
- CON_002
- CON_003
agent_should: Recall the Consent concept and apply related rules.
agent_should_not: Make claims beyond what the source supports.
- id: TRG_003
if_user_says_or_context_contains: Controller
activate_knowledge:
- CON_003
- CON_004
agent_should: Recall the Controller concept and apply related rules.
agent_should_not: Make claims beyond what the source supports.
```
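One way an agent might evaluate the triggers above is a keyword match against the user's message. The sketch below assumes case-insensitive substring matching, which the package does not specify; the `TRIGGERS` list is transcribed by hand and `activated_concepts` is a hypothetical helper.

```python
# Transcribed from TRG_001 to TRG_003.
TRIGGERS = [
    {"id": "TRG_001", "keyword": "Data", "activate": ["CON_001", "CON_002"]},
    {"id": "TRG_002", "keyword": "Consent", "activate": ["CON_002", "CON_003"]},
    {"id": "TRG_003", "keyword": "Controller", "activate": ["CON_003", "CON_004"]},
]


def activated_concepts(user_text: str) -> list[str]:
    """Concept ids to load for a message, in trigger order, without repeats."""
    text = user_text.lower()
    activated: list[str] = []
    for trigger in TRIGGERS:
        # Assumption: a trigger fires on a case-insensitive substring match.
        if trigger["keyword"].lower() in text:
            for concept_id in trigger["activate"]:
                if concept_id not in activated:
                    activated.append(concept_id)
    return activated


if __name__ == "__main__":
    print(activated_concepts("Is consent needed for this data?"))
```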
## 13. IF-THEN RULES
```yaml
if_then_rules:
- id: IFTHEN_001
if: "at least one of six legal bases applies: consent, contract, legal obligation, vital interests, public task, or legitimate interest"
then: processing is lawful
because: Inferred from source context.
confidence: 0.72
- id: IFTHEN_002
if: the breach is likely to result in a high risk
then: the controller must inform affected data subjects without undue delay
because: Inferred from source context.
confidence: 0.72
- id: IFTHEN_003
if: a processor engages another processor (a sub-processor) without prior written authorisation of the controller
then: the original processor remains fully liable
because: Inferred from source context.
confidence: 0.72
- id: IFTHEN_004
if: a transfer of personal data outside the European Economic Area lacks an adequacy decision
then: the parties must implement appropriate safeguards such as Standard Contractual Clauses
because: Inferred from source context.
confidence: 0.72
```
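Two of the conditions in this package are simple enough to express as predicates: the Article 6 test quoted in CON_005 (at least one of six legal bases) and the transfer condition in IFTHEN_004. The sketch below is a minimal illustration with hypothetical function names; it checks only whether a named basis is claimed, not whether that basis is actually valid.

```python
# The six legal bases as listed in the source's summary of Article 6.
LEGAL_BASES = {
    "consent",
    "contract",
    "legal obligation",
    "vital interests",
    "public task",
    "legitimate interest",
}


def processing_is_lawful(claimed_bases: set[str]) -> bool:
    """True if at least one recognised legal basis is claimed."""
    return bool(LEGAL_BASES & {basis.lower() for basis in claimed_bases})


def transfer_needs_safeguards(outside_eea: bool, adequacy_decision: bool) -> bool:
    """IFTHEN_004: a transfer outside the EEA without an adequacy decision."""
    return outside_eea and not adequacy_decision


if __name__ == "__main__":
    print(processing_is_lawful({"Consent"}))
    print(transfer_needs_safeguards(outside_eea=True, adequacy_decision=False))
```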
## 14. EXCEPTIONS AND EDGE CASES
```yaml
exceptions:
- id: EXC_001
general_rule: —
exception_case: A controller must notify a personal data breach to the competent Supervisory Authority within 72 hours of becoming aware of it, unless the breach is unlikely to result in a risk to the rights and freedoms of natural persons.
modified_action: Adjust behavior according to the exception.
explanation: Source explicitly notes this edge case.
- id: EXC_002
general_rule: —
exception_case: A personal data breach must be reported within 72 hours of becoming aware, to the competent Supervisory Authority, unless the breach is unlikely to result in risk to data subjects.
modified_action: Adjust behavior according to the exception.
explanation: Source explicitly notes this edge case.
```
## 15. MENTAL MODELS
```yaml
mental_models:
- id: MM_001
name: Data model
description: Under the General Data Protection Regulation (GDPR), a "data controller" is the natural or legal person who determines the purposes and means of processing personal data.
use_when: The reasoning context maps to this concept.
do_not_use_when: Context lies outside the source's scope.
input_needed: Relevant facts about the situation.
output_generated: A reasoned recommendation aligned with the source.
- id: MM_002
name: Consent model
description: A "data processor" processes personal data on behalf of the controller.
use_when: The reasoning context maps to this concept.
do_not_use_when: Context lies outside the source's scope.
input_needed: Relevant facts about the situation.
output_generated: A reasoned recommendation aligned with the source.
```
## 16. OPERATIONAL PLAYBOOKS
```yaml
playbooks:
- id: PLAY_001
name: legal response playbook
objective: Apply the source's knowledge to a real interaction.
activation_context: User asks about Data.
steps:
- Identify the concept the question maps to.
- Recall related rules and heuristics.
- Cite the source-derived principle.
- Surface relevant exceptions or limits.
agent_tone: Clear, sourced, non-overstating.
tools_needed:
- retrieval
- memory
expected_output: A grounded answer with traceable reasoning.
failure_modes:
- Hallucinating beyond source
- Ignoring exceptions
```
## 17. QUESTION-ANSWER PAIRS FOR AGENTS
```yaml
qa_pairs:
- id: QA_001
question: What is a data controller under the GDPR?
ideal_answer: Under the General Data Protection Regulation (GDPR), a "data controller" is the natural or legal person who determines the purposes and means of processing personal data.
source_concepts:
- CON_001
difficulty: easy
answer_type: definition_with_context
- id: QA_002
question: What does a data processor do?
ideal_answer: A "data processor" processes personal data on behalf of the controller.
source_concepts:
- CON_002
difficulty: medium
answer_type: definition_with_context
- id: QA_003
question: What is a data subject?
ideal_answer: A "data subject" is the identified or identifiable person to whom the data relates.
source_concepts:
- CON_003
difficulty: medium
answer_type: definition_with_context
- id: QA_004
question: What is the Supervisory Authority?
ideal_answer: The Supervisory Authority is the independent public body responsible for monitoring application of the Regulation in each Member State.
source_concepts:
- CON_004
difficulty: medium
answer_type: definition_with_context
```
## 18. RETRIEVAL CHUNKS
```yaml
retrieval_chunks:
- id: CHUNK_001
title: Chunk on data
standalone_context: Self-contained passage extracted from the source.
compressed_knowledge: Under the General Data Protection Regulation (GDPR), a "data controller" is the natural or legal person who determines the purposes and means of processing personal data. A "data processor" processes personal data on behalf of the controller. A "data subject" is the identified or identifiable person to whom the data relates. The Supervisory Authority is the independent public body responsible f…
activation_queries:
- What does the source say about data?
- What does the source say about legal?
- What does the source say about regulation?
related_rules: []
related_entities: []
related_concepts: []
- id: CHUNK_002
title: Chunk on controller
standalone_context: Self-contained passage extracted from the source.
compressed_knowledge: Where processing is based on consent, the controller must be able to demonstrate that the data subject has consented. A controller must notify a personal data breach to the competent Supervisory Authority within 72 hours of becoming aware of it, unless the breach is unlikely to result in a risk to the rights and freedoms of natural persons. The controller must inform affected data subjects with…
activation_queries:
- What does the source say about controller?
- What does the source say about must?
- What does the source say about data?
related_rules: []
related_entities: []
related_concepts: []
- id: CHUNK_003
title: Chunk on consent
standalone_context: Self-contained passage extracted from the source.
compressed_knowledge: "Useful heuristic: if you cannot articulate the specific purpose of a data collection in one sentence, that purpose is probably not specific enough to ground lawful consent. Another heuristic: when in doubt between consent and legitimate interest, prefer the legal basis that gives the data subject the most control — usually consent for marketing, legitimate interest for fraud prevention. Avoid b…"
activation_queries:
- What does the source say about consent?
- What does the source say about purpose?
- What does the source say about data?
related_rules: []
related_entities: []
related_concepts: []
- id: CHUNK_004
title: Chunk on data
standalone_context: Self-contained passage extracted from the source.
compressed_knowledge: 'Contradiction worth flagging: Article 22 restricts solely automated decisions, but many AI-assisted decisions remain "automated in practice" while keeping a token human reviewer — courts disagree on whether this satisfies the safeguard. Question: Within how long must a controller report a personal data breach? Answer: Within 72 hours of becoming aware, to the competent Supervisory Authority, un…'
activation_queries:
- What does the source say about data?
- What does the source say about automated?
- What does the source say about decisions?
related_rules: []
related_entities: []
related_concepts: []
```
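Each chunk above carries three activation queries of the form "What does the source say about X?". A naive retriever could score chunks by how many of those X keywords appear in a question. The sketch below assumes that scoring rule, which is an illustration and not the method the compiler or any particular retriever uses; `CHUNK_TOPICS` is transcribed by hand.

```python
import re

# Keywords taken from the activation_queries of CHUNK_001 to CHUNK_004.
CHUNK_TOPICS = {
    "CHUNK_001": ["data", "legal", "regulation"],
    "CHUNK_002": ["controller", "must", "data"],
    "CHUNK_003": ["consent", "purpose", "data"],
    "CHUNK_004": ["data", "automated", "decisions"],
}


def rank_chunks(question: str) -> list[str]:
    """Chunk ids with at least one keyword hit, best match first."""
    words = set(re.findall(r"[a-z]+", question.lower()))
    scores = {cid: len(words & set(topics)) for cid, topics in CHUNK_TOPICS.items()}
    # Sort by descending score, then by id so ties are deterministic.
    ranked = sorted(scores, key=lambda cid: (-scores[cid], cid))
    return [cid for cid in ranked if scores[cid] > 0]


if __name__ == "__main__":
    print(rank_chunks("What must the controller do about data?"))
```

Because three of the four chunks share the keyword "data", a question that mentions only "data" cannot separate them; an embedding-based retriever over `compressed_knowledge` would likely discriminate better.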
## 19. EMBEDDING-READY ATOMIC UNITS
```yaml
atomic_units:
- id: AU_001
statement: Under the General Data Protection Regulation (GDPR), a "data controller" is the natural or legal person who determines the purposes and means of processing personal data.
type: definition
tags:
- data
- under
- general
dependencies: []
confidence: 0.78
- id: AU_002
statement: A "data processor" processes personal data on behalf of the controller.
type: fact
tags:
- data
- processor
- processes
dependencies: []
confidence: 0.78
- id: AU_003
statement: A "data subject" is the identified or identifiable person to whom the data relates.
type: definition
tags:
- data
- subject
- identified
dependencies: []
confidence: 0.78
- id: AU_004
statement: The Supervisory Authority is the independent public body responsible for monitoring application of the Regulation in each Member State.
type: definition
tags:
- supervisory
- authority
- independent
dependencies: []
confidence: 0.78
- id: AU_005
statement: "Article 6 establishes that processing is lawful only if at least one of six legal bases applies: consent, contract, legal obligation, vital interests, public task, or legitimate interest."
type: definition
tags:
- legal
- article
- establishes
dependencies: []
confidence: 0.78
- id: AU_006
statement: Consent must be freely given, specific, informed and unambiguous.
type: rule
tags:
- consent
- must
- freely
dependencies: []
confidence: 0.78
- id: AU_007
statement: Where processing is based on consent, the controller must be able to demonstrate that the data subject has consented.
type: rule
tags:
- where
- processing
- based
dependencies: []
confidence: 0.78
```
## 20. AGENT INSTRUCTIONS
```yaml
agent_instructions:
behavior_rules:
- Stay within the package's scope.
- Cite the source-derived chunk or rule when answering.
- Never provide legal advice; cite the source.
- Always flag jurisdiction-dependent claims.
reasoning_rules:
- Use causal chains and IF-THEN rules before improvising.
- Combine concepts only when supports/depends_on relationships allow it.
response_rules:
- Be concise unless the user asks for depth.
- Surface confidence and source basis.
forbidden_behaviors:
- Fabricating sources.
- Restating the source as personal opinion.
- Ignoring decision rules in favor of fluency.
preferred_questions:
- What does the source say about …?
- Which rule applies to this situation?
- What are the limits of this knowledge?
tool_usage_guidance:
- Use retrieval before generation.
- Use memory to track conversational context.
```
## 21. KNOWLEDGE LIMITS
```yaml
knowledge_limits:
missing_context:
- Source date and authorship are not always provided.
weakly_supported_claims: []
assumptions_detected:
- Heuristic compilation assumes the input text is self-contained.
possible_biases:
- Single-source perspective.
outdated_sections: []
needs_human_review:
- Decision rules and exceptions before production use.
```
## 22. SOURCE TRACEABILITY
```yaml
source_traceability:
- extracted_item_id: CON_001
source_location: user_input
source_excerpt: Under the General Data Protection Regulation (GDPR), a "data controller" is the natural or legal person who determines the purposes and means of processing personal data.
extraction_type: explicit
- extracted_item_id: CON_002
source_location: user_input
source_excerpt: A "data processor" processes personal data on behalf of the controller.
extraction_type: explicit
- extracted_item_id: CON_003
source_location: user_input
source_excerpt: A "data subject" is the identified or identifiable person to whom the data relates.
extraction_type: explicit
- extracted_item_id: CON_004
source_location: user_input
source_excerpt: The Supervisory Authority is the independent public body responsible for monitoring application of the Regulation in each Member State.
extraction_type: explicit
- extracted_item_id: CON_005
source_location: user_input
source_excerpt: "Article 6 establishes that processing is lawful only if at least one of six legal bases applies: consent, contract, legal obligation, vital interests, public task, or legitimate interest."
extraction_type: explicit
- extracted_item_id: CON_006
source_location: user_input
source_excerpt: Consent must be freely given, specific, informed and unambiguous.
extraction_type: explicit
- extracted_item_id: HEU_001
source_location: user_input
source_excerpt: Consent must be freely given, specific, informed and unambiguous.
extraction_type: explicit
- extracted_item_id: HEU_002
source_location: user_input
source_excerpt: Where processing is based on consent, the controller must be able to demonstrate that the data subject has consented.
extraction_type: explicit
- extracted_item_id: HEU_003
source_location: user_input
source_excerpt: A controller must notify a personal data breach to the competent Supervisory Authority within 72 hours of becoming aware of it, unless the breach is unlikely to result in a risk to the rights and f…
extraction_type: explicit
- extracted_item_id: HEU_004
source_location: user_input
source_excerpt: The controller must inform affected data subjects without undue delay when the breach is likely to result in a high risk.
extraction_type: explicit
- extracted_item_id: HEU_005
source_location: user_input
source_excerpt: If a transfer of personal data outside the European Economic Area lacks an adequacy decision, the parties must implement appropriate safeguards such as Standard Contractual Clauses.
extraction_type: explicit
- extracted_item_id: IFTHEN_001
source_location: user_input
source_excerpt: "IF at least one of six legal bases applies: consent, contract, legal obligation, vital interests, public task, or legitimate interest THEN processing is lawful"
extraction_type: explicit
- extracted_item_id: IFTHEN_002
source_location: user_input
source_excerpt: IF the breach is likely to result in a high risk THEN the controller must inform affected data subjects without undue delay
extraction_type: explicit
- extracted_item_id: IFTHEN_003
source_location: user_input
source_excerpt: IF a processor engages another processor (a sub-processor) without prior written authorisation of the controller THEN the original processor remains fully liable
extraction_type: explicit
- extracted_item_id: IFTHEN_004
source_location: user_input
source_excerpt: IF a transfer of personal data outside the European Economic Area lacks an adequacy decision THEN the parties must implement appropriate safeguards such as Standard Contractual Clauses
extraction_type: explicit
```
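Traceability entries are only useful if the excerpt can be found in the input. The sketch below shows one possible check: a whitespace-normalised substring test. The function name and the sample `SOURCE_TEXT` (two sentences quoted from this package) are assumptions for illustration. Note that entries rewritten into IF ... THEN form are marked `explicit` but are not verbatim, so a check like this flags them.

```python
def untraceable_items(source_text: str, traceability: list[dict]) -> list[str]:
    """Ids whose excerpt does not occur verbatim in the source text."""
    normalized_source = " ".join(source_text.split())
    missing = []
    for entry in traceability:
        excerpt = " ".join(entry["source_excerpt"].split())
        if excerpt not in normalized_source:
            missing.append(entry["extracted_item_id"])
    return missing


# Two sentences quoted from the package, standing in for the full input.
SOURCE_TEXT = (
    "Consent must be freely given, specific, informed and unambiguous. "
    "If a transfer of personal data outside the European Economic Area lacks "
    "an adequacy decision, the parties must implement appropriate safeguards "
    "such as Standard Contractual Clauses."
)

ENTRIES = [
    {
        "extracted_item_id": "CON_006",
        "source_excerpt": "Consent must be freely given, specific, informed and unambiguous.",
    },
    {
        "extracted_item_id": "IFTHEN_004",
        "source_excerpt": (
            "IF a transfer of personal data outside the European Economic Area "
            "lacks an adequacy decision THEN the parties must implement "
            "appropriate safeguards such as Standard Contractual Clauses"
        ),
    },
]

if __name__ == "__main__":
    print(untraceable_items(SOURCE_TEXT, ENTRIES))
```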
Engineering runbook
ckf_demo_1782852657371
8 entities · 6 concepts · 0 principles
# CKF — KNOWLEDGE CONTEXT PACKAGE
package_id: ckf_demo_1782852657371
protocol_version: ckf-0.1
source_type: runbook
source_title: Untitled source
source_author: Unknown
domain: business
subdomains: [incident, minutes, sev-, within]
language: en
created_at: 2026-06-30T20:50:57.371Z
compression_level: standard
human_readability: 0.7
ai_utility_score: 0.81
---
## 1. CORE INTENT
```yaml
core_intent:
primary_purpose: Capture and structure the knowledge expressed in the source.
intended_user: Developers, researchers and agents consuming structured knowledge.
intended_agent_use: Retrieval, reasoning, tutoring, decision support.
transformation_goal: Convert prose into structured, agent-usable cognition.
key_value: Portable, traceable, reusable knowledge package.
```
## 2. DOMAIN MAP
```yaml
domain_map:
main_domain: business
subdomains:
- name: incident
relevance: 1
related_concepts:
- incident
- name: minutes
relevance: 0.85
related_concepts:
- minutes
- name: sev-
relevance: 0.7
related_concepts:
- sev-
- name: within
relevance: 0.55
related_concepts:
- within
adjacent_domains: []
excluded_domains: []
```
## 3. ENTITY GRAPH
```yaml
entities:
- id: ENT_001
name: Severity
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities: []
source_basis: explicit
- id: ENT_002
name: The Incident Commander
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_003
name: Communications Lead
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_004
name: Scribe
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_005
name: First
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_006
name: Second
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_007
name: Incident Commander
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_008
name: Third
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
```
## 4. CONCEPT GRAPH
```yaml
concepts:
- id: CON_001
label: Incident
definition: An incident is any unplanned event that degrades the availability, latency, error rate or correctness of a production service.
domain: business
depends_on: []
contradicts: []
supports:
- CON_002
enables: []
risks: []
confidence: 0.78
source_basis: explicit
- id: CON_002
label: Minutes
definition: Severity is classified into SEV-1 (full outage or data loss), SEV-2 (major degradation for many users), SEV-3 (partial degradation for some users) and SEV-4 (minor issue, no user impact).
domain: business
depends_on:
- CON_001
contradicts: []
supports:
- CON_003
enables: []
risks: []
confidence: 0.78
source_basis: explicit
- id: CON_003
label: Sev-
definition: The Incident Commander coordinates response; the Communications Lead handles status updates; the Scribe records the timeline.
domain: business
depends_on:
- CON_002
contradicts: []
supports:
- CON_004
enables: []
risks: []
confidence: 0.78
source_basis: explicit
- id: CON_004
label: Within
definition: The on-call response procedure has six steps.
domain: business
depends_on:
- CON_003
contradicts: []
supports:
- CON_005
enables: []
risks: []
confidence: 0.78
source_basis: explicit
- id: CON_005
label: First
definition: First, acknowledge the page within 5 minutes.
domain: business
depends_on:
- CON_004
contradicts: []
supports:
- CON_006
enables: []
risks: []
confidence: 0.78
source_basis: explicit
- id: CON_006
label: Back
definition: Second, open the incident channel and assign the Incident Commander.
domain: business
depends_on:
- CON_005
contradicts: []
supports: []
enables: []
risks: []
confidence: 0.78
source_basis: explicit
```
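The `depends_on` links above form a simple chain, so an agent can load the concepts in dependency order. A minimal sketch, assuming the edges are copied by hand into a dictionary (the `DEPENDS_ON` name and the `load_order` helper are illustrative, not part of the CKF protocol):

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# depends_on edges copied from the concept graph above:
# each key maps to the list of concepts it depends on.
DEPENDS_ON = {
    "CON_001": [],
    "CON_002": ["CON_001"],
    "CON_003": ["CON_002"],
    "CON_004": ["CON_003"],
    "CON_005": ["CON_004"],
    "CON_006": ["CON_005"],
}


def load_order(depends_on):
    """Return concept ids ordered so every concept follows its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


print(load_order(DEPENDS_ON))
```

`TopologicalSorter` raises `CycleError` if the edges ever form a loop, which gives a cheap consistency check on a compiled package.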
## 5. PRINCIPLES
```yaml
principles: [] # none extracted
```
## 6. HEURISTICS
```yaml
heuristics:
- id: HEU_001
trigger: relevant decision context detected
interpretation: Avoid pushing fixes directly to production during a SEV-1 — fixes must still go through canary deploys unless the alternative is data loss.
recommended_action: Ship the fix through a canary deploy unless the alternative is data loss.
avoid: Pushing fixes directly to production during a SEV-1.
confidence: 0.76
- id: HEU_002
trigger: context contains a known failure mode
interpretation: Never silence an alert during an incident; if the alert is noisy, file a follow-up to fix it after the incident.
recommended_action: Avoid the described action.
avoid: Never silence an alert during an incident; if the alert is noisy, file a follow-up to fix it after the incident.
confidence: 0.76
- id: HEU_003
trigger: responders must choose between rolling back and patching forward
interpretation: "Question: When should you roll back versus patch forward?"
recommended_action: Apply the answer recorded in HEU_004.
avoid: —
confidence: 0.76
- id: HEU_004
trigger: relevant decision context detected
interpretation: "Answer: Roll back when the issue began within 30 minutes of a deploy and the previous release was healthy; patch forward only when rollback is impossible or would cause worse harm."
recommended_action: Mitigate the described risk.
avoid: —
confidence: 0.76
```
## 7. DECISION RULES
```yaml
decision_rules:
- id: RULE_001
condition: A SEV-1 is in progress and a fix is ready to ship.
decision: Send the fix through a canary deploy rather than pushing it directly to production, unless the alternative is data loss.
reasoning: Derived directly from a normative statement in the source.
required_context: Domain-specific context as described in the source.
output_action: Apply the recommended decision.
failure_mode: Recommendation applied outside its valid context.
confidence: 0.74
- id: RULE_002
condition: The issue began within 30 minutes of a deploy and the previous release was healthy.
decision: Roll back; patch forward only when rollback is impossible or would cause worse harm.
reasoning: Derived directly from a normative statement in the source.
required_context: Domain-specific context as described in the source.
output_action: Apply the recommended decision.
failure_mode: Recommendation applied outside its valid context.
confidence: 0.74
```
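The roll-back-versus-patch-forward guidance can be sketched as a small function. This is an illustration under stated assumptions: the function name, its parameters and the `"no_rule_applies"` fallback are hypothetical, and the source does not say what to do when the issue is not tied to a recent deploy.

```python
from datetime import datetime, timedelta

# Window taken from the source: "within 30 minutes of a deploy".
ROLLBACK_WINDOW = timedelta(minutes=30)


def choose_remediation(issue_start, last_deploy, previous_release_healthy,
                       rollback_possible=True, rollback_causes_worse_harm=False):
    """Return 'roll_back', 'patch_forward' or 'no_rule_applies'."""
    since_deploy = issue_start - last_deploy
    recent_deploy = timedelta(0) <= since_deploy <= ROLLBACK_WINDOW
    if not (recent_deploy and previous_release_healthy):
        # Assumption: outside the source's stated condition, nothing is decided here.
        return "no_rule_applies"
    if not rollback_possible or rollback_causes_worse_harm:
        # Patch forward only when rollback is impossible or would cause worse harm.
        return "patch_forward"
    return "roll_back"


deploy_time = datetime(2026, 1, 1, 12, 0)
print(choose_remediation(deploy_time + timedelta(minutes=10), deploy_time, True))
```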
## 8. PROCEDURES
```yaml
procedures:
- id: PROC_001
name: On-call response procedure
objective: Respond to a page using the six-step on-call procedure described in the source.
steps:
- step: 1
action: First, acknowledge the page within 5 minutes.
input_required: —
output_expected: —
- step: 2
action: Second, open the incident channel and assign the Incident Commander.
input_required: —
output_expected: —
- step: 3
action: Third, declare severity and post the first status update within 10 minutes.
input_required: —
output_expected: —
- step: 4
action: "Fourth, stabilise: roll back the most recent change or shed load before deep diagnosis."
input_required: —
output_expected: —
- step: 5
action: Fifth, communicate every 30 minutes until resolution.
input_required: —
output_expected: —
- step: 6
action: Sixth, schedule a blameless postmortem within 5 business days.
input_required: —
output_expected: —
success_criteria: All steps applied in order with expected outcomes.
failure_criteria: Steps executed out of order or without prerequisites.
```
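The time limits in the procedure can be turned into concrete deadlines. A minimal sketch, assuming both the 5-minute and the 10-minute limits are measured from the page time and that business days are Monday to Friday with no holiday calendar (the source states neither); the helper names are illustrative.

```python
from datetime import datetime, timedelta


def response_deadlines(paged_at):
    """Deadlines for the early steps, measured from the page time (assumption)."""
    return {
        "acknowledge_by": paged_at + timedelta(minutes=5),
        "first_status_update_by": paged_at + timedelta(minutes=10),
    }


def next_update_due(last_update_at):
    """Status updates are due every 30 minutes until resolution."""
    return last_update_at + timedelta(minutes=30)


def add_business_days(start, days):
    """Step forward `days` business days, skipping Saturday and Sunday."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 .. Friday=4
            remaining -= 1
    return current


paged = datetime(2026, 6, 30, 20, 0)  # a Tuesday
print(response_deadlines(paged))
print(add_business_days(paged, 5))  # latest postmortem date under these assumptions
```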
## 9. PATTERNS
```yaml
patterns: [] # none extracted
```
## 10. ANTI-PATTERNS
```yaml
anti_patterns:
- id: ANTI_001
name: Direct-to-production fixes during a SEV-1
description: Pushing a fix directly to production during a SEV-1 instead of sending it through a canary deploy.
why_it_fails: Identified by the source as ineffective or harmful.
warning_signals: Behavior matches the described failure mode.
replacement_behavior: Send fixes through canary deploys unless the alternative is data loss.
- id: ANTI_002
name: Silencing alerts during an incident
description: Silencing an alert while an incident is in progress, even when the alert is noisy.
why_it_fails: Identified by the source as ineffective or harmful.
warning_signals: Behavior matches the described failure mode.
replacement_behavior: Leave the alert active and file a follow-up to fix the noise after the incident.
```
## 11. CAUSAL CHAINS
```yaml
causal_chains: [] # none extracted
```
## 12. CONTEXTUAL TRIGGERS
```yaml
contextual_triggers:
- id: TRG_001
if_user_says_or_context_contains: Incident
activate_knowledge:
- CON_001
- CON_002
agent_should: Recall the Incident concept and apply related rules.
agent_should_not: Make claims beyond what the source supports.
- id: TRG_002
if_user_says_or_context_contains: Minutes
activate_knowledge:
- CON_002
- CON_003
agent_should: Recall the Minutes concept and apply related rules.
agent_should_not: Make claims beyond what the source supports.
- id: TRG_003
if_user_says_or_context_contains: Sev-
activate_knowledge:
- CON_003
- CON_004
agent_should: Recall the Sev- concept and apply related rules.
agent_should_not: Make claims beyond what the source supports.
```
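How an agent might apply these triggers can be sketched as a lookup over the user's text. Case-insensitive substring matching is an assumption; the package does not specify a matching rule, and the `TRIGGERS` list and `activated_concepts` function are illustrative names.

```python
# Trigger phrases and concept ids copied from the contextual triggers above.
TRIGGERS = [
    {"id": "TRG_001", "contains": "Incident", "activate": ["CON_001", "CON_002"]},
    {"id": "TRG_002", "contains": "Minutes", "activate": ["CON_002", "CON_003"]},
    {"id": "TRG_003", "contains": "Sev-", "activate": ["CON_003", "CON_004"]},
]


def activated_concepts(text, triggers=TRIGGERS):
    """Return concept ids activated by `text`, in first-seen order, without duplicates."""
    lowered = text.lower()
    activated = []
    for trigger in triggers:
        if trigger["contains"].lower() in lowered:
            for concept_id in trigger["activate"]:
                if concept_id not in activated:
                    activated.append(concept_id)
    return activated


print(activated_concepts("We have a SEV-1 incident"))
```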
## 13. IF-THEN RULES
```yaml
if_then_rules:
- id: IFTHEN_001
if: p99 latency exceeds the SLO by 50 percent for 5 consecutive minutes
then: page the on-call engineer for that service
because: Inferred from source context.
confidence: 0.72
- id: IFTHEN_002
if: error rate exceeds 1 percent of requests for 2 minutes
then: escalate to SEV-2
because: Inferred from source context.
confidence: 0.72
- id: IFTHEN_003
if: the issue began within 30 minutes of a deploy
then: the first hypothesis is a regression — roll back before debugging
because: Inferred from source context.
confidence: 0.72
- id: IFTHEN_004
if: a rollback does not resolve the issue within 10 minutes
then: expand the suspect set to dependencies and infrastructure changes
because: Inferred from source context.
confidence: 0.72
```
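The four rules above translate directly into threshold checks. The sketch below assumes a hypothetical `Snapshot` record whose field names are invented for illustration; only the thresholds come from the source.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Snapshot:
    """Hypothetical view of the current state of a service."""
    p99_over_slo_pct: float            # how far p99 latency exceeds the SLO, in percent
    p99_breach_minutes: float          # consecutive minutes of that latency breach
    error_rate_pct: float              # share of requests failing, in percent
    error_breach_minutes: float        # minutes the error rate has stayed that high
    issue_minutes_after_deploy: Optional[float] = None  # None if no recent deploy
    minutes_since_rollback: Optional[float] = None       # None if no rollback yet
    resolved: bool = False


def fired_rules(s: Snapshot) -> List[str]:
    """Return the ids of the IF-THEN rules whose conditions hold."""
    fired = []
    if s.p99_over_slo_pct > 50 and s.p99_breach_minutes >= 5:
        fired.append("IFTHEN_001")  # page the on-call engineer for that service
    if s.error_rate_pct > 1 and s.error_breach_minutes >= 2:
        fired.append("IFTHEN_002")  # escalate to SEV-2
    if s.issue_minutes_after_deploy is not None and 0 <= s.issue_minutes_after_deploy <= 30:
        fired.append("IFTHEN_003")  # suspect a regression, roll back before debugging
    if s.minutes_since_rollback is not None and s.minutes_since_rollback >= 10 and not s.resolved:
        fired.append("IFTHEN_004")  # widen the suspect set to dependencies and infrastructure
    return fired


print(fired_rules(Snapshot(60, 5, 0.5, 0, issue_minutes_after_deploy=12)))
```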
## 14. EXCEPTIONS AND EDGE CASES
```yaml
exceptions:
- id: EXC_001
general_rule: During a SEV-1, fixes must still go through canary deploys rather than being pushed directly to production.
exception_case: The alternative to a direct push is data loss.
modified_action: The canary requirement may be bypassed.
explanation: Source explicitly notes this edge case.
```
## 15. MENTAL MODELS
```yaml
mental_models:
- id: MM_001
name: Incident model
description: An incident is any unplanned event that degrades the availability, latency, error rate or correctness of a production service.
use_when: The reasoning context maps to this concept.
do_not_use_when: Context lies outside the source's scope.
input_needed: Relevant facts about the situation.
output_generated: A reasoned recommendation aligned with the source.
- id: MM_002
name: Severity model
description: Severity is classified into SEV-1 (full outage or data loss), SEV-2 (major degradation for many users), SEV-3 (partial degradation for some users) and SEV-4 (minor issue, no user impact).
use_when: The reasoning context maps to this concept.
do_not_use_when: Context lies outside the source's scope.
input_needed: Relevant facts about the situation.
output_generated: A reasoned recommendation aligned with the source.
```
## 16. OPERATIONAL PLAYBOOKS
```yaml
playbooks:
- id: PLAY_001
name: business response playbook
objective: Apply the source's knowledge to a real interaction.
activation_context: User asks about Incident.
steps:
- Identify the concept the question maps to.
- Recall related rules and heuristics.
- Cite the source-derived principle.
- Surface relevant exceptions or limits.
agent_tone: Clear, sourced, non-overstating.
tools_needed:
- retrieval
- memory
expected_output: A grounded answer with traceable reasoning.
failure_modes:
- Hallucinating beyond source
- Ignoring exceptions
```
## 17. QUESTION-ANSWER PAIRS FOR AGENTS
```yaml
qa_pairs:
- id: QA_001
question: What is an incident?
ideal_answer: An incident is any unplanned event that degrades the availability, latency, error rate or correctness of a production service.
source_concepts:
- CON_001
difficulty: easy
answer_type: definition_with_context
- id: QA_002
question: How is incident severity classified?
ideal_answer: Severity is classified into SEV-1 (full outage or data loss), SEV-2 (major degradation for many users), SEV-3 (partial degradation for some users) and SEV-4 (minor issue, no user impact).
source_concepts:
- CON_002
difficulty: medium
answer_type: definition_with_context
- id: QA_003
question: Which roles are involved in incident response, and what does each one do?
ideal_answer: The Incident Commander coordinates response; the Communications Lead handles status updates; the Scribe records the timeline.
source_concepts:
- CON_003
difficulty: medium
answer_type: definition_with_context
- id: QA_004
question: How many steps does the on-call response procedure have?
ideal_answer: The on-call response procedure has six steps.
source_concepts:
- CON_004
difficulty: medium
answer_type: definition_with_context
```
## 18. RETRIEVAL CHUNKS
```yaml
retrieval_chunks:
- id: CHUNK_001
title: Chunk on incident
standalone_context: Self-contained passage extracted from the source.
compressed_knowledge: An incident is any unplanned event that degrades the availability, latency, error rate or correctness of a production service. Severity is classified into SEV-1 (full outage or data loss), SEV-2 (major degradation for many users), SEV-3 (partial degradation for some users) and SEV-4 (minor issue, no user impact). The Incident Commander coordinates response; the Communications Lead handles statu…
activation_queries:
- What does the source say about incident?
- What does the source say about sev-?
- What does the source say about severity?
related_rules: []
related_entities: []
related_concepts: []
- id: CHUNK_002
title: Chunk on minutes
standalone_context: Self-contained passage extracted from the source.
compressed_knowledge: Fourth, stabilise — roll back the most recent change or shed load before deep diagnosis. Fifth, communicate every 30 minutes until resolution. Sixth, schedule a blameless postmortem within 5 business days. If p99 latency exceeds the SLO by 50 percent for 5 consecutive minutes, page the on-call engineer for that service. If error rate exceeds 1 percent of requests for 2 minutes, escalate to SEV-…
activation_queries:
- What does the source say about minutes?
- What does the source say about within?
- What does the source say about roll?
related_rules: []
related_entities: []
related_concepts: []
- id: CHUNK_003
title: Chunk on incident
standalone_context: Self-contained passage extracted from the source.
compressed_knowledge: "Useful heuristic: prefer rolling back a recent change over root-causing during an active incident — mean time to recovery beats mean time to understand. Another heuristic: if three responders are debating the cause, you need a decision, not more data — the Incident Commander picks one path and runs it. Avoid pushing fixes directly to production during a SEV-1 — fixes must still go through canar…"
activation_queries:
- What does the source say about incident?
- What does the source say about during?
- What does the source say about heuristic?
related_rules: []
related_entities: []
related_concepts: []
- id: CHUNK_004
title: Chunk on question
standalone_context: Self-contained passage extracted from the source.
compressed_knowledge: "Question: What is the first action after acknowledging an alert? Answer: Open the incident channel, assign an Incident Commander, and post the first status update within 10 minutes. Question: When should you roll back versus patch forward? Answer: Roll back when the issue began within 30 minutes of a deploy and the previous release was healthy; patch forward only when rollback is impossible or …"
activation_queries:
- What does the source say about question?
- What does the source say about first?
- What does the source say about answer?
related_rules: []
related_entities: []
related_concepts: []
```
## 19. EMBEDDING-READY ATOMIC UNITS
```yaml
atomic_units:
- id: AU_001
statement: An incident is any unplanned event that degrades the availability, latency, error rate or correctness of a production service.
type: definition
tags:
- incident
- unplanned
- event
dependencies: []
confidence: 0.78
- id: AU_002
statement: Severity is classified into SEV-1 (full outage or data loss), SEV-2 (major degradation for many users), SEV-3 (partial degradation for some users) and SEV-4 (minor issue, no user impact).
type: definition
tags:
- sev-
- degradation
- users
dependencies: []
confidence: 0.78
- id: AU_003
statement: The Incident Commander coordinates response; the Communications Lead handles status updates; the Scribe records the timeline.
type: fact
tags:
- incident
- commander
- coordinates
dependencies: []
confidence: 0.78
- id: AU_004
statement: The on-call response procedure has six steps.
type: fact
tags:
- on-call
- response
- procedure
dependencies: []
confidence: 0.78
- id: AU_005
statement: First, acknowledge the page within 5 minutes.
type: fact
tags:
- first
- acknowledge
- page
dependencies: []
confidence: 0.78
- id: AU_006
statement: Second, open the incident channel and assign the Incident Commander.
type: fact
tags:
- incident
- second
- open
dependencies: []
confidence: 0.78
- id: AU_007
statement: Third, declare severity and post the first status update within 10 minutes.
type: fact
tags:
- third
- declare
- severity
dependencies: []
confidence: 0.78
```
## 20. AGENT INSTRUCTIONS
```yaml
agent_instructions:
behavior_rules:
- Stay within the package's scope.
- Cite the source-derived chunk or rule when answering.
- Distinguish between strategy and tactics.
- Surface assumptions about the market.
reasoning_rules:
- Use causal chains and IF-THEN rules before improvising.
- Combine concepts only when supports/depends_on relationships allow it.
response_rules:
- Be concise unless the user asks for depth.
- Surface confidence and source basis.
forbidden_behaviors:
- Fabricating sources.
- Restating the source as personal opinion.
- Ignoring decision rules in favor of fluency.
preferred_questions:
- What does the source say about …?
- Which rule applies to this situation?
- What are the limits of this knowledge?
tool_usage_guidance:
- Use retrieval before generation.
- Use memory to track conversational context.
```
## 21. KNOWLEDGE LIMITS
```yaml
knowledge_limits:
missing_context:
- Source date and authorship are not always provided.
weakly_supported_claims: []
assumptions_detected:
- Heuristic compilation assumes the input text is self-contained.
possible_biases:
- Single-source perspective.
outdated_sections: []
needs_human_review:
- Decision rules and exceptions before production use.
```
## 22. SOURCE TRACEABILITY
```yaml
source_traceability:
- extracted_item_id: CON_001
source_location: user_input
source_excerpt: An incident is any unplanned event that degrades the availability, latency, error rate or correctness of a production service.
extraction_type: explicit
- extracted_item_id: CON_002
source_location: user_input
source_excerpt: Severity is classified into SEV-1 (full outage or data loss), SEV-2 (major degradation for many users), SEV-3 (partial degradation for some users) and SEV-4 (minor issue, no user impact).
extraction_type: explicit
- extracted_item_id: CON_003
source_location: user_input
source_excerpt: The Incident Commander coordinates response; the Communications Lead handles status updates; the Scribe records the timeline.
extraction_type: explicit
- extracted_item_id: CON_004
source_location: user_input
source_excerpt: The on-call response procedure has six steps.
extraction_type: explicit
- extracted_item_id: CON_005
source_location: user_input
source_excerpt: First, acknowledge the page within 5 minutes.
extraction_type: explicit
- extracted_item_id: CON_006
source_location: user_input
source_excerpt: Second, open the incident channel and assign the Incident Commander.
extraction_type: explicit
- extracted_item_id: HEU_001
source_location: user_input
source_excerpt: Avoid pushing fixes directly to production during a SEV-1 — fixes must still go through canary deploys unless the alternative is data loss.
extraction_type: explicit
- extracted_item_id: HEU_002
source_location: user_input
source_excerpt: Never silence an alert during an incident; if the alert is noisy, file a follow-up to fix it after the incident.
extraction_type: explicit
- extracted_item_id: HEU_003
source_location: user_input
source_excerpt: "Question: When should you roll back versus patch forward?"
extraction_type: explicit
- extracted_item_id: HEU_004
source_location: user_input
source_excerpt: "Answer: Roll back when the issue began within 30 minutes of a deploy and the previous release was healthy; patch forward only when rollback is impossible or would cause worse harm."
extraction_type: explicit
- extracted_item_id: IFTHEN_001
source_location: user_input
source_excerpt: IF p99 latency exceeds the SLO by 50 percent for 5 consecutive minutes THEN page the on-call engineer for that service
extraction_type: explicit
- extracted_item_id: IFTHEN_002
source_location: user_input
source_excerpt: IF error rate exceeds 1 percent of requests for 2 minutes THEN escalate to SEV-2
extraction_type: explicit
- extracted_item_id: IFTHEN_003
source_location: user_input
source_excerpt: IF the issue began within 30 minutes of a deploy THEN the first hypothesis is a regression — roll back before debugging
extraction_type: explicit
- extracted_item_id: IFTHEN_004
source_location: user_input
source_excerpt: IF a rollback does not resolve the issue within 10 minutes THEN expand the suspect set to dependencies and infrastructure changes
extraction_type: explicit
```
Scientific paper
ckf_demo_1782852657371
8 entities · 6 concepts · 0 principles
markdown
# CKF — KNOWLEDGE CONTEXT PACKAGE
package_id: ckf_demo_1782852657371
protocol_version: ckf-0.1
source_type: paper
source_title: Untitled source
source_author: Unknown
domain: education / learning science
subdomains: [model, reward, policy, rlhf]
language: en
created_at: 2026-06-30T20:50:57.371Z
compression_level: standard
human_readability: 0.7
ai_utility_score: 0.81
---
## 1. CORE INTENT
```yaml
core_intent:
primary_purpose: Capture and structure the knowledge expressed in the source.
intended_user: Developers, researchers and agents consuming structured knowledge.
intended_agent_use: Retrieval, reasoning, tutoring, decision support.
transformation_goal: Convert prose into structured, agent-usable cognition.
key_value: Portable, traceable, reusable knowledge package.
```
## 2. DOMAIN MAP
```yaml
domain_map:
main_domain: education / learning science
subdomains:
- name: model
relevance: 1
related_concepts:
- model
- name: reward
relevance: 0.85
related_concepts:
- reward
- name: policy
relevance: 0.7
related_concepts:
- policy
- name: rlhf
relevance: 0.55
related_concepts:
- rlhf
adjacent_domains: []
excluded_domains: []
```
## 3. ENTITY GRAPH
```yaml
entities:
- id: ENT_001
name: Reinforcement Learning
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities: []
source_basis: explicit
- id: ENT_002
name: Human Feedback
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_003
name: RLHF
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_004
name: Proximal Policy Optimization
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_005
name: First
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_006
name: Second
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_007
name: Third
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
- id: ENT_008
name: Fourth
type: named_entity
description: Recurring term/entity surfaced from the source.
aliases: []
attributes: []
related_entities:
- entity_id: ENT_001
relation_type: co_occurs_with
confidence: 0.6
source_basis: explicit
```
## 4. CONCEPT GRAPH
```yaml
concepts:
- id: CON_001
label: Model
definition: Reinforcement Learning from Human Feedback (RLHF) is a training pipeline in which a large language model is first pretrained on text, then fine-tuned on demonstrations, and finally optimised against a reward model tra…
domain: education / learning science
depends_on: []
contradicts: []
supports:
- CON_002
enables: []
risks: []
confidence: 0.78
source_basis: explicit
- id: CON_002
label: Reward
definition: The reward model is typically a transformer that takes a prompt and two candidate completions and predicts which one humans prefer.
domain: education / learning science
depends_on:
- CON_001
contradicts: []
supports:
- CON_003
enables: []
risks: []
confidence: 0.78
source_basis: explicit
- id: CON_003
label: Policy
definition: Proximal Policy Optimization (PPO) is the most common reinforcement learning algorithm used in this stage.
domain: education / learning science
depends_on:
- CON_002
contradicts: []
supports:
- CON_004
enables: []
risks: []
confidence: 0.78
source_basis: explicit
- id: CON_004
label: Rlhf
definition: The canonical RLHF procedure proceeds in four steps.
domain: education / learning science
depends_on:
- CON_003
contradicts: []
supports:
- CON_005
enables: []
risks: []
confidence: 0.78
source_basis: explicit
- id: CON_005
label: Preference
definition: First, collect a dataset of high-quality demonstrations from human writers.
domain: education / learning science
depends_on:
- CON_004
contradicts: []
supports:
- CON_006
enables: []
risks: []
confidence: 0.78
source_basis: explicit
- id: CON_006
label: Human
definition: Second, supervise fine-tune the base model on these demonstrations.
domain: education / learning science
depends_on:
- CON_005
contradicts: []
supports: []
enables: []
risks: []
confidence: 0.78
source_basis: explicit
```
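The reward model in CON_002 predicts which of two completions humans prefer. One common way to express that prediction is the Bradley-Terry form, in which each completion gets a scalar score and the preference probability is a sigmoid of the score difference. The source does not name a loss, so treat this as an assumed formulation; the function names are illustrative.

```python
import math


def preference_probability(score_a, score_b):
    """P(A preferred over B) under Bradley-Terry: sigmoid(score_a - score_b)."""
    return 1.0 / (1.0 + math.exp(-(score_a - score_b)))


def pairwise_loss(score_chosen, score_rejected):
    """Negative log-likelihood of the human choice; lower is better."""
    return -math.log(preference_probability(score_chosen, score_rejected))


# Equal scores give a 50/50 prediction; a larger margin for the chosen
# completion lowers the loss.
print(preference_probability(0.0, 0.0))
print(pairwise_loss(2.0, 0.0), pairwise_loss(0.0, 2.0))
```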
## 5. PRINCIPLES
```yaml
principles: [] # none extracted
```
## 6. HEURISTICS
```yaml
heuristics:
- id: HEU_001
trigger: context contains a known failure mode
interpretation: Avoid using a single annotator per preference comparison; inter-annotator agreement under 70 percent is a red flag.
recommended_action: Avoid the described action.
avoid: Avoid using a single annotator per preference comparison; inter-annotator agreement under 70 percent is a red flag.
confidence: 0.76
- id: HEU_002
trigger: context contains a known failure mode
interpretation: Never train the reward model and the policy on the same prompts in the same iteration; the policy will simply memorise the reward model's quirks.
recommended_action: Avoid the described action.
avoid: Never train the reward model and the policy on the same prompts in the same iteration; the policy will simply memorise the reward model's quirks.
confidence: 0.76
- id: HEU_003
trigger: context contains a known failure mode
interpretation: "Limitation: human preferences over short completions do not reliably transfer to long-form outputs, so models tuned with RLHF tend to be sycophantic and verbose."
recommended_action: Avoid the described action.
avoid: "Limitation: human preferences over short completions do not reliably transfer to long-form outputs, so models tuned with RLHF tend to be sycophantic and verbose."
confidence: 0.76
```
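The 70 percent threshold in HEU_001 needs an agreement measure. The source does not define one, so the sketch below assumes the simplest option: the share of annotator pairs that gave the same label, averaged over comparisons. All names are illustrative.

```python
from itertools import combinations

RED_FLAG_THRESHOLD = 0.70  # from the source: agreement under 70 percent is a red flag


def pairwise_agreement(labels):
    """Share of annotator pairs that chose the same completion for one comparison."""
    pairs = list(combinations(labels, 2))
    if not pairs:
        # A single annotator cannot be checked for agreement at all.
        raise ValueError("need at least two annotators per comparison")
    return sum(a == b for a, b in pairs) / len(pairs)


def dataset_agreement(comparisons):
    """Mean pairwise agreement across all comparisons."""
    return sum(pairwise_agreement(labels) for labels in comparisons) / len(comparisons)


def is_red_flag(comparisons):
    return dataset_agreement(comparisons) < RED_FLAG_THRESHOLD


# Two comparisons, three annotators each; "A" and "B" name the preferred completion.
sample = [["A", "A", "A"], ["A", "B", "A"]]
print(dataset_agreement(sample), is_red_flag(sample))
```

Chance-corrected measures such as Cohen's or Fleiss' kappa are stricter alternatives, and the threshold would need recalibrating if one of them were used instead.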
## 7. DECISION RULES
```yaml
decision_rules: [] # none extracted
```
## 8. PROCEDURES
```yaml
procedures:
- id: PROC_001
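name: Canonical RLHF procedure
objective: Train a policy with the four-step RLHF procedure described in the source.
steps:
- step: 1
action: First, collect a dataset of high-quality demonstrations from human writers.
input_required: —
output_expected: —
- step: 2
action: Second, fine-tune the base model on these demonstrations with supervised learning.
input_required: —
output_expected: —
- step: 3
action: Third, collect pairwise preference comparisons over model outputs and train a reward model.
input_required: —
output_expected: —
- step: 4
action: Fourth, optimise the policy against the reward model with PPO while regularising against the supervised model using a KL divergence penalty.
input_required: —
output_expected: —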
success_criteria: All steps applied in order with expected outcomes.
failure_criteria: Steps executed out of order or without prerequisites.
```
## 9. PATTERNS
```yaml
patterns:
- id: PAT_001
name: Recurring pattern 1
observed_when: Source-described conditions are present.
signal: "Useful heuristic: monitor the KL divergence between policy and reference model continuously — sharp jumps usually precede reward hacking."
underlying_mechanism: —
response_strategy: Recognize and act according to source guidance.
confidence: 0.7
```
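Monitoring for sharp KL jumps can be sketched as a comparison against a trailing average. The window size and the 2x ratio are arbitrary illustrative choices, not values from the source, and `kl_jumps` is a hypothetical helper.

```python
def kl_jumps(kl_history, window=5, ratio=2.0):
    """Return indices where KL exceeds `ratio` times the mean of the previous `window` values."""
    flagged = []
    for i in range(window, len(kl_history)):
        baseline = sum(kl_history[i - window:i]) / window
        if baseline > 0 and kl_history[i] > ratio * baseline:
            flagged.append(i)
    return flagged


# Per-step KL between policy and reference model (made-up numbers for illustration).
history = [0.10, 0.11, 0.10, 0.12, 0.11, 0.12, 0.40, 0.45]
print(kl_jumps(history))
```

A flagged step is a prompt to inspect samples for reward hacking, not proof of it.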
## 10. ANTI-PATTERNS
```yaml
anti_patterns:
- id: ANTI_001
name: Single-annotator preference labels
description: Using a single annotator per preference comparison.
why_it_fails: Identified by the source as ineffective or harmful.
warning_signals: Inter-annotator agreement under 70 percent.
replacement_behavior: Use more than one annotator per preference comparison.
- id: ANTI_002
name: Shared prompts for reward model and policy
description: Training the reward model and the policy on the same prompts in the same iteration.
why_it_fails: The policy simply memorises the reward model's quirks.
warning_signals: Behavior matches the described failure mode.
replacement_behavior: Use different prompts for the reward model and the policy within an iteration.
```
## 11. CAUSAL CHAINS
```yaml
causal_chains: [] # none extracted
```
## 12. CONTEXTUAL TRIGGERS
```yaml
contextual_triggers:
- id: TRG_001
if_user_says_or_context_contains: Model
activate_knowledge:
- CON_001
- CON_002
agent_should: Recall the Model concept and apply related rules.
agent_should_not: Make claims beyond what the source supports.
- id: TRG_002
if_user_says_or_context_contains: Reward
activate_knowledge:
- CON_002
- CON_003
agent_should: Recall the Reward concept and apply related rules.
agent_should_not: Make claims beyond what the source supports.
- id: TRG_003
if_user_says_or_context_contains: Policy
activate_knowledge:
- CON_003
- CON_004
agent_should: Recall the Policy concept and apply related rules.
agent_should_not: Make claims beyond what the source supports.
```
## 13. IF-THEN RULES
```yaml
if_then_rules:
- id: IFTHEN_001
if: the KL penalty is too low
then: the policy diverges from the supervised model and produces high-reward but low-quality outputs — a failure mode known as reward hacking
because: Inferred from source context.
confidence: 0.72
- id: IFTHEN_002
if: the KL penalty is too high
then: the policy barely moves and most reinforcement learning gains are lost
because: Inferred from source context.
confidence: 0.72
- id: IFTHEN_003
if: the preference dataset is small
then: the reward model is high-variance and the policy overfits to its idiosyncrasies
because: Inferred from source context.
confidence: 0.72
- id: IFTHEN_004
if: the reward model is updated mid-training without recalibrating the reference policy
then: the KL term becomes meaningless and training collapses
because: Inferred from source context.
confidence: 0.72
```
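The first two rules describe a trade-off controlled by the strength of the KL penalty. A widely used way to apply the penalty, assumed here because the source gives no formula, is to subtract a scaled per-sample KL estimate from the reward model's score. The coefficient name `beta` and the function name are illustrative.

```python
def penalised_reward(reward, logp_policy, logp_reference, beta):
    """Reward-model score minus beta times a per-sample KL estimate.

    The KL estimate is log pi_policy(y|x) - log pi_reference(y|x) for the sampled y.
    """
    kl_estimate = logp_policy - logp_reference
    return reward - beta * kl_estimate


# Same sample, three penalty strengths: the policy assigns the completion a much
# higher log-probability than the reference model does.
for beta in (0.0, 0.1, 1.0):
    print(beta, penalised_reward(1.0, logp_policy=-2.0, logp_reference=-5.0, beta=beta))
```

With `beta` at zero nothing discourages drift away from the reference model, which matches IFTHEN_001. A large `beta` lets the penalty dominate the reward, which matches IFTHEN_002.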
## 14. EXCEPTIONS AND EDGE CASES
```yaml
exceptions: [] # none extracted
```
## 15. MENTAL MODELS
```yaml
mental_models:
- id: MM_001
name: RLHF pipeline model
description: Reinforcement Learning from Human Feedback (RLHF) is a training pipeline in which a large language model is first pretrained on text, then fine-tuned on demonstrations, and finally optimised against a reward model tra…
use_when: The reasoning context maps to this concept.
do_not_use_when: Context lies outside the source's scope.
input_needed: Relevant facts about the situation.
output_generated: A reasoned recommendation aligned with the source.
- id: MM_002
name: Reward model
description: The reward model is typically a transformer that takes a prompt and two candidate completions and predicts which one humans prefer.
use_when: The reasoning context maps to this concept.
do_not_use_when: Context lies outside the source's scope.
input_needed: Relevant facts about the situation.
output_generated: A reasoned recommendation aligned with the source.
```
## 16. OPERATIONAL PLAYBOOKS
```yaml
playbooks:
- id: PLAY_001
name: education / learning science response playbook
objective: Apply the source's knowledge to a real interaction.
activation_context: User asks about Model.
steps:
- Identify the concept the question maps to.
- Recall related rules and heuristics.
- Cite the source-derived principle.
- Surface relevant exceptions or limits.
agent_tone: Clear, sourced, non-overstating.
tools_needed:
- retrieval
- memory
expected_output: A grounded answer with traceable reasoning.
failure_modes:
- Hallucinating beyond source
- Ignoring exceptions
```
## 17. QUESTION-ANSWER PAIRS FOR AGENTS
```yaml
qa_pairs:
- id: QA_001
question: What is Reinforcement Learning from Human Feedback (RLHF)?
ideal_answer: Reinforcement Learning from Human Feedback (RLHF) is a training pipeline in which a large language model is first pretrained on text, then fine-tuned on demonstrations, and finally optimised against a reward model tra…
source_concepts:
- CON_001
difficulty: easy
answer_type: definition_with_context
- id: QA_002
question: What is the reward model and what does it predict?
ideal_answer: The reward model is typically a transformer that takes a prompt and two candidate completions and predicts which one humans prefer.
source_concepts:
- CON_002
difficulty: medium
answer_type: definition_with_context
- id: QA_003
question: What is Proximal Policy Optimization (PPO) and when does it apply?
ideal_answer: Proximal Policy Optimization (PPO) is the most common reinforcement learning algorithm used in this stage.
source_concepts:
- CON_003
difficulty: medium
answer_type: definition_with_context
- id: QA_004
question: How many steps does the canonical RLHF procedure have?
ideal_answer: The canonical RLHF procedure proceeds in four steps.
source_concepts:
- CON_004
difficulty: medium
answer_type: definition_with_context
```
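QA_002 describes a reward model that compares two candidate completions. The following is a minimal sketch of one common way to express that comparison as a probability. It assumes, beyond what the source states, that the model assigns each completion a scalar score and that preference follows a Bradley-Terry form (a sigmoid of the score difference). The names `preference_probability` and `pairwise_loss` are hypothetical.

```python
import math


def preference_probability(score_a: float, score_b: float) -> float:
    """Probability that completion A is preferred over completion B.

    Assumption: Bradley-Terry form, i.e. a sigmoid of the score difference.
    """
    return 1.0 / (1.0 + math.exp(-(score_a - score_b)))


def pairwise_loss(score_chosen: float, score_rejected: float) -> float:
    """Negative log-likelihood of the human-preferred completion."""
    return -math.log(preference_probability(score_chosen, score_rejected))


if __name__ == "__main__":
    # Equal scores mean the model has no preference.
    print(preference_probability(1.0, 1.0))  # 0.5
    # The loss is smaller when the chosen completion scores higher.
    print(pairwise_loss(2.0, 0.5) < pairwise_loss(0.5, 2.0))  # True
```

Under these assumptions, training the reward model amounts to lowering `pairwise_loss` over the collected preference comparisons.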
## 18. RETRIEVAL CHUNKS
```yaml
retrieval_chunks:
- id: CHUNK_001
title: Chunk on the RLHF pipeline and reward model
standalone_context: Self-contained passage extracted from the source.
compressed_knowledge: Reinforcement Learning from Human Feedback (RLHF) is a training pipeline in which a large language model is first pretrained on text, then fine-tuned on demonstrations, and finally optimised against a reward model trained on human preference comparisons. The reward model is typically a transformer that takes a prompt and two candidate completions and predicts which one humans prefer. Proximal P…
activation_queries:
- What does the source say about model?
- What does the source say about human?
- What does the source say about demonstrations?
related_rules: []
related_entities: []
related_concepts: []
- id: CHUNK_002
title: Chunk on reward model training and the KL penalty
standalone_context: Self-contained passage extracted from the source.
compressed_knowledge: Third, collect pairwise preference comparisons over model outputs and train a reward model. Fourth, optimise the policy against the reward model with PPO while regularising against the supervised model using a KL divergence penalty. If the KL penalty is too low, the policy diverges from the supervised model and produces high-reward but low-quality outputs — a failure mode known as reward hackin…
activation_queries:
- What does the source say about model?
- What does the source say about reward?
- What does the source say about policy?
related_rules: []
related_entities: []
related_concepts: []
- id: CHUNK_003
title: Chunk on monitoring and data-quality heuristics
standalone_context: Self-contained passage extracted from the source.
compressed_knowledge: "Useful heuristic: monitor the KL divergence between policy and reference model continuously — sharp jumps usually precede reward hacking. Another heuristic: a reward model with calibration error above 10 percent on a held-out preference set is not yet reliable enough for RL fine-tuning. Avoid using a single annotator per preference comparison; inter-annotator agreement under 70 percent is a red…"
activation_queries:
- What does the source say about reward?
- What does the source say about model?
- What does the source say about policy?
related_rules: []
related_entities: []
related_concepts: []
- id: CHUNK_004
title: Chunk on DPO versus RLHF and the purpose of the KL penalty
standalone_context: Self-contained passage extracted from the source.
compressed_knowledge: "Contradiction worth flagging: some recent work shows Direct Preference Optimization (DPO) matches or beats RLHF without a separate reward model, but other work shows RLHF still wins on hard reasoning benchmarks — the field has not converged. Question: Why is the KL divergence penalty included in PPO during RLHF? Answer: To keep the optimised policy close to the supervised model, preventing rewa…"
activation_queries:
- What does the source say about reward?
- What does the source say about rlhf?
- What does the source say about model?
related_rules: []
related_entities: []
related_concepts: []
```
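The heuristics in CHUNK_003 can be read as simple threshold checks. The sketch below is illustrative only: the 10 percent and 70 percent thresholds come from the source, while the function names, the discrete-distribution form of the KL computation, and the `ratio` used to define a "sharp jump" are assumptions made for this example.

```python
import math
from typing import List, Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) for two discrete distributions over the same support.

    Assumes q is non-zero wherever p is non-zero.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def sharp_kl_jumps(kl_history: Sequence[float], ratio: float = 2.0) -> List[int]:
    """Indices where KL grew by more than `ratio` times the previous value.

    `ratio` is a hypothetical threshold; the source only says "sharp jumps".
    """
    return [
        i
        for i in range(1, len(kl_history))
        if kl_history[i - 1] > 0 and kl_history[i] > ratio * kl_history[i - 1]
    ]


def reward_model_ready(calibration_error: float, max_error: float = 0.10) -> bool:
    """Source heuristic: calibration error above 10 percent is not yet reliable."""
    return calibration_error <= max_error


def agreement_is_red_flag(agreement: float, min_agreement: float = 0.70) -> bool:
    """Source heuristic: inter-annotator agreement under 70 percent is a red flag."""
    return agreement < min_agreement


if __name__ == "__main__":
    print(sharp_kl_jumps([0.01, 0.012, 0.05, 0.055]))  # [2]
    print(reward_model_ready(0.12))                    # False
    print(agreement_is_red_flag(0.65))                 # True
```

How the calibration error and agreement figures are measured is left open here; the source does not specify a metric for either.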
## 19. EMBEDDING-READY ATOMIC UNITS
```yaml
atomic_units:
- id: AU_001
statement: Reinforcement Learning from Human Feedback (RLHF) is a training pipeline in which a large language model is first pretrained on text, then fine-tuned on demonstrations, and finally optimised against a reward model tra…
type: definition
tags:
- human
- model
- reinforcement
dependencies: []
confidence: 0.78
- id: AU_002
statement: The reward model is typically a transformer that takes a prompt and two candidate completions and predicts which one humans prefer.
type: definition
tags:
- reward
- model
- preference
dependencies: []
confidence: 0.78
- id: AU_003
statement: Proximal Policy Optimization (PPO) is the most common reinforcement learning algorithm used in this stage.
type: definition
tags:
- proximal
- policy
- optimization
dependencies: []
confidence: 0.78
- id: AU_004
statement: The canonical RLHF procedure proceeds in four steps.
type: fact
tags:
- canonical
- rlhf
- procedure
dependencies: []
confidence: 0.78
- id: AU_005
statement: First, collect a dataset of high-quality demonstrations from human writers.
type: fact
tags:
- demonstrations
- human
- dataset
dependencies: []
confidence: 0.78
- id: AU_006
statement: Second, fine-tune the base model on these demonstrations with supervised learning.
type: fact
tags:
- supervised
- demonstrations
- fine-tune
dependencies: []
confidence: 0.78
- id: AU_007
statement: Third, collect pairwise preference comparisons over model outputs and train a reward model.
type: fact
tags:
- model
- reward
- preference
dependencies: []
confidence: 0.78
```
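One way an agent might consume these atomic units is to index them by tag before embedding or lookup. The sketch below assumes the units have already been parsed into plain dictionaries; `build_tag_index` is a hypothetical helper, not part of the CKF protocol, and only three units are copied in for brevity.

```python
from collections import defaultdict
from typing import Dict, List

# Three units copied from the package above, written as plain dicts.
# The AU_001 statement is truncated in the package and is omitted here.
ATOMIC_UNITS = [
    {"id": "AU_001", "tags": ["human", "model", "reinforcement"]},
    {
        "id": "AU_003",
        "statement": "Proximal Policy Optimization (PPO) is the most common "
        "reinforcement learning algorithm used in this stage.",
        "tags": ["proximal", "policy", "optimization"],
    },
    {
        "id": "AU_004",
        "statement": "The canonical RLHF procedure proceeds in four steps.",
        "tags": ["canonical", "rlhf", "procedure"],
    },
]


def build_tag_index(units: List[dict]) -> Dict[str, List[str]]:
    """Map each tag to the ids of the atomic units that carry it."""
    index: Dict[str, List[str]] = defaultdict(list)
    for unit in units:
        for tag in unit["tags"]:
            index[tag].append(unit["id"])
    return dict(index)


if __name__ == "__main__":
    index = build_tag_index(ATOMIC_UNITS)
    print(index["policy"])  # ['AU_003']
    print(index["rlhf"])    # ['AU_004']
```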
## 20. AGENT INSTRUCTIONS
```yaml
agent_instructions:
behavior_rules:
- Stay within the package's scope.
- Cite the source-derived chunk or rule when answering.
- Encourage active retrieval over passive review.
- Suggest spaced repetition where appropriate.
reasoning_rules:
- Use causal chains and IF-THEN rules before improvising.
- Combine concepts only when supports/depends_on relationships allow it.
response_rules:
- Be concise unless the user asks for depth.
- Surface confidence and source basis.
forbidden_behaviors:
- Fabricating sources.
- Restating the source as personal opinion.
- Overstating certainty.
preferred_questions:
- What does the source say about …?
- Which rule applies to this situation?
- What are the limits of this knowledge?
tool_usage_guidance:
- Use retrieval before generation.
- Use memory to track conversational context.
```
## 21. KNOWLEDGE LIMITS
```yaml
knowledge_limits:
missing_context:
- Source date and authorship are not always provided.
weakly_supported_claims: []
assumptions_detected:
- Heuristic compilation assumes the input text is self-contained.
possible_biases:
- Single-source perspective.
outdated_sections: []
needs_human_review:
- Decision rules and exceptions before production use.
```
## 22. SOURCE TRACEABILITY
```yaml
source_traceability:
- extracted_item_id: CON_001
source_location: user_input
source_excerpt: Reinforcement Learning from Human Feedback (RLHF) is a training pipeline in which a large language model is first pretrained on text, then fine-tuned on demonstrations, and finally optimised agains…
extraction_type: explicit
- extracted_item_id: CON_002
source_location: user_input
source_excerpt: The reward model is typically a transformer that takes a prompt and two candidate completions and predicts which one humans prefer.
extraction_type: explicit
- extracted_item_id: CON_003
source_location: user_input
source_excerpt: Proximal Policy Optimization (PPO) is the most common reinforcement learning algorithm used in this stage.
extraction_type: explicit
- extracted_item_id: CON_004
source_location: user_input
source_excerpt: The canonical RLHF procedure proceeds in four steps.
extraction_type: explicit
- extracted_item_id: CON_005
source_location: user_input
source_excerpt: First, collect a dataset of high-quality demonstrations from human writers.
extraction_type: explicit
- extracted_item_id: CON_006
source_location: user_input
source_excerpt: Second, supervise fine-tune the base model on these demonstrations.
extraction_type: explicit
- extracted_item_id: HEU_001
source_location: user_input
source_excerpt: Avoid using a single annotator per preference comparison; inter-annotator agreement under 70 percent is a red flag.
extraction_type: explicit
- extracted_item_id: HEU_002
source_location: user_input
source_excerpt: Never train the reward model and the policy on the same prompts in the same iteration; the policy will simply memorise the reward model's quirks.
extraction_type: explicit
- extracted_item_id: HEU_003
source_location: user_input
source_excerpt: "Limitation: human preferences over short completions do not reliably transfer to long-form outputs, so models tuned with RLHF tend to be sycophantic and verbose."
extraction_type: explicit
- extracted_item_id: IFTHEN_001
source_location: user_input
source_excerpt: IF the KL penalty is too low THEN the policy diverges from the supervised model and produces high-reward but low-quality outputs — a failure mode known as reward hacking
extraction_type: explicit
- extracted_item_id: IFTHEN_002
source_location: user_input
source_excerpt: IF the KL penalty is too high THEN the policy barely moves and most reinforcement learning gains are lost
extraction_type: explicit
- extracted_item_id: IFTHEN_003
source_location: user_input
source_excerpt: IF the preference dataset is small THEN the reward model is high-variance and the policy overfits to its idiosyncrasies
extraction_type: explicit
- extracted_item_id: IFTHEN_004
source_location: user_input
source_excerpt: IF the reward model is updated mid-training without recalibrating the reference policy THEN the KL term becomes meaningless and training collapses
extraction_type: explicit
```
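IFTHEN_001 and IFTHEN_002 describe opposite failure modes of the KL penalty. The sketch below illustrates the trade-off with made-up numbers. It assumes the commonly used form in which the penalty is subtracted from the reward-model score with a coefficient `beta`; the source does not give a formula, and the candidate values are hypothetical.

```python
def penalised_reward(reward: float, kl: float, beta: float) -> float:
    """Reward-model score minus a KL penalty scaled by beta (assumed form)."""
    return reward - beta * kl


# Hypothetical candidate updates: one gains reward by drifting far from the
# supervised model, the other stays close and gains less.
CANDIDATES = {
    "drifted": {"reward": 2.0, "kl": 8.0},
    "conservative": {"reward": 1.2, "kl": 0.5},
}


def preferred_update(beta: float) -> str:
    """Name of the candidate with the highest penalised reward for this beta."""
    return max(
        CANDIDATES,
        key=lambda name: penalised_reward(
            CANDIDATES[name]["reward"], CANDIDATES[name]["kl"], beta
        ),
    )


if __name__ == "__main__":
    # Low beta: drift is barely penalised, so the high-reward update wins.
    print(preferred_update(0.01))  # drifted
    # Higher beta: drift is costly, so the policy stays near the reference.
    print(preferred_update(0.5))   # conservative
```

With a very low `beta` the optimiser favours the drifted update, which is the setting in which the source warns of reward hacking. Raising `beta` shifts the preference toward small moves, and at the extreme the policy barely moves at all.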