Evaluation
Evaluation reports and benchmark artifacts belong here. The scoring implementation lives in src/personal_llm/evaluation/.
cases/: benchmark inputs for automated and manual evaluation
reports/: generated outputs from evaluation runs
Start with docs/07_evaluation.md.