
Knowledge Snapshots

Use this directory to record versioned knowledge states that feed training runs.

Why snapshots matter

Your raw documents, curated knowledge, prompt profiles, and adapters will evolve over time. Without snapshots, you will not know why one run behaved better than another.

Name each snapshot with its date plus a short label:

  • 2026-03-11-v1-baseline
  • 2026-03-18-v2-guardrail-relaxed
  • 2026-04-02-v3-finance-refresh
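
A minimal sketch of how ids in this style could be built and checked. The `v<number>` segment is inferred from the examples above rather than stated as a rule, and `make_snapshot_id` and `is_valid_snapshot_id` are hypothetical helper names, not part of this repository.

```python
import re
from datetime import date

# Assumed pattern: YYYY-MM-DD, then v<number>, then a lowercase hyphenated label.
SNAPSHOT_ID_RE = re.compile(r"^\d{4}-\d{2}-\d{2}-v\d+-[a-z0-9]+(?:-[a-z0-9]+)*$")


def make_snapshot_id(day: date, version: int, label: str) -> str:
    """Build an id such as 2026-03-11-v1-baseline (hypothetical helper)."""
    # Lowercase the label and collapse anything non-alphanumeric into single hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", label.lower()).strip("-")
    if not slug:
        raise ValueError("label must contain at least one letter or digit")
    return f"{day.isoformat()}-v{version}-{slug}"


def is_valid_snapshot_id(snapshot_id: str) -> bool:
    """Check the shape of the id and that its date prefix is a real calendar date."""
    if not SNAPSHOT_ID_RE.match(snapshot_id):
        return False
    try:
        date.fromisoformat(snapshot_id[:10])
    except ValueError:
        return False
    return True
```

Putting the date first means a plain alphabetical listing of the directory is also chronological.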

What a snapshot note should include

  • snapshot id
  • date
  • persona version or commit hash
  • domain files changed
  • guardrail profile
  • training dataset paths and hashes
  • source snapshot references from data/raw/
  • model profile
  • adapter output path
  • evaluation report path
  • short explanation of what changed
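
The dataset hashes in the list above can be produced with the Python standard library. This is a sketch that assumes SHA-256 is an acceptable hash (the list does not name one); `sha256_of_file` and `hash_training_files` are hypothetical helper names.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large training files are not loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_training_files(paths):
    """Pair each dataset path with its hash, ready to paste into a snapshot note."""
    return [{"path": str(p), "sha256": sha256_of_file(p)} for p in paths]
```

For example, `hash_training_files(["data/training/generated/train.jsonl"])` returns one entry with the path and its digest. If a later run reports a different digest for the same path, the training data changed between snapshots.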

Suggested file structure

Each snapshot can be a small Markdown note or a YAML manifest. Example YAML manifest:

snapshot_id: 2026-03-11-v1-baseline
model_profile: qwen2.5-7b-instruct
guardrail_profile: standard
knowledge_files:
  - knowledge/persona.md
  - knowledge/domains/01_infrastructure_platform.md
training_data:
  - data/training/generated/train.jsonl
evaluation_report: evaluation/reports/2026-03-11-v1-baseline.json
notes: First baseline after persona and infra domain curation.
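
A manifest like the one above can be checked for completeness before a run. The sketch below works on an already-parsed dictionary, so it does not depend on any particular YAML library. Treating every field of the example as required is an assumption, and `missing_fields` is a hypothetical helper name.

```python
# Field names copied from the example manifest; requiring all of them is an assumption.
REQUIRED_FIELDS = (
    "snapshot_id",
    "model_profile",
    "guardrail_profile",
    "knowledge_files",
    "training_data",
    "evaluation_report",
    "notes",
)


def missing_fields(manifest: dict) -> list:
    """Return the required field names that are absent or empty in the manifest."""
    return [name for name in REQUIRED_FIELDS if not manifest.get(name)]


example = {
    "snapshot_id": "2026-03-11-v1-baseline",
    "model_profile": "qwen2.5-7b-instruct",
    "guardrail_profile": "standard",
    "knowledge_files": [
        "knowledge/persona.md",
        "knowledge/domains/01_infrastructure_platform.md",
    ],
    "training_data": ["data/training/generated/train.jsonl"],
    "evaluation_report": "evaluation/reports/2026-03-11-v1-baseline.json",
    "notes": "First baseline after persona and infra domain curation.",
}
```

Calling `missing_fields(example)` returns an empty list. Dropping `notes` from the dictionary makes it return `["notes"]`.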

Best practice

  • snapshot before every meaningful LoRA run
  • snapshot before and after major guardrail changes
  • snapshot after large knowledge refreshes
  • keep the note small but complete enough to reproduce the run