Evaluation Case Suites
This folder contains the benchmark files used to measure baseline behavior, guardrail behavior, and post-training changes.
Case files
- core_eval_cases.jsonl: the main benchmark suite for first-run baselines and post-LoRA comparison
- guardrail_profile_matrix.jsonl: a smaller repeated-comparison set for guardrail profile selection
- sample_request.json: a sample API request payload for manual end-to-end checks
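Both suites are JSONL files, so each non-empty line should parse as one JSON object. The sketch below shows one way to load such a file for inspection. The helper name `load_cases` is hypothetical, and no assumption is made about which fields each case contains, since this page does not describe the case schema.

```python
import json
from pathlib import Path


def load_cases(path):
    """Read one JSON object per non-empty line from a JSONL file.

    Hypothetical helper: the field layout of each case is not assumed.
    """
    cases = []
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            try:
                cases.append(json.loads(line))
            except json.JSONDecodeError as exc:
                # Report the offending line so a broken case is easy to find.
                raise ValueError(
                    f"{path}:{line_number}: invalid JSON ({exc.msg})"
                ) from exc
    return cases
```

For example, `len(load_cases("core_eval_cases.jsonl"))` gives the number of cases in the main suite, assuming the script runs from this folder.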
When to use each file
- use core_eval_cases.jsonl when you want a serious baseline or release-candidate comparison
- use guardrail_profile_matrix.jsonl when you want a faster local comparison between original_model, relaxed, standard, and strict
- use sample_request.json when you want to send a prompt through the live API and inspect the full response shape
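As a rough sketch of the manual end-to-end check, the snippet below builds a JSON POST request from sample_request.json using only the Python standard library. The endpoint URL and the helper names `build_request` and `send_and_print` are assumptions made for illustration. This page does not state the real API address, so substitute the one your deployment uses.

```python
import json
import urllib.request
from pathlib import Path

# Hypothetical endpoint: the real URL is not given in this document.
API_URL = "http://localhost:8000/generate"


def build_request(payload_path, url=API_URL):
    """Build (but do not send) a JSON POST request from a payload file."""
    payload = json.loads(Path(payload_path).read_text(encoding="utf-8"))
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_and_print(request):
    """Send the request and pretty-print the full JSON response."""
    with urllib.request.urlopen(request) as response:
        print(json.dumps(json.load(response), indent=2))
```

Calling `send_and_print(build_request("sample_request.json"))` against a running server prints the whole response body, which makes it easier to inspect the response shape than reading a single field.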