Evaluation

Evaluation reports and benchmark artifacts belong here. The scoring implementation lives in src/personal_llm/evaluation/.

  • cases/: benchmark inputs for automated and manual evaluation
  • reports/: generated outputs from evaluation runs

Start with docs/07_evaluation.md.