Baseline Evaluation Template

Run metadata

date:
model profile:
guardrail profile:
backend:
adapter:
eval case file:

Summary

overall impression:
strongest domain:
weakest domain:
refusal behavior:
boundary-case behavior:
persona fit:

Quantitative notes

hard refusal rate:
boundary-case pass rate:
citation coverage:
unsupported-claim proxy:
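
The template does not define how these four numbers are computed, so the following is only a minimal sketch under stated assumptions. It assumes each eval case has been graded into a record with the hypothetical fields `refused`, `is_boundary`, `passed`, `claims`, and `cited_claims`; none of these names come from the project. It also assumes that "citation coverage" means the share of claims carrying a citation and that the "unsupported-claim proxy" is the share of claims without one. Adjust the definitions to match whatever the eval harness actually records.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class CaseResult:
    """One graded eval case. All field names are hypothetical."""
    refused: bool       # the model gave a hard refusal
    is_boundary: bool   # the case is tagged as a boundary case
    passed: bool        # the grader judged the response acceptable
    claims: int         # factual claims counted in the response
    cited_claims: int   # claims that carry a citation


def rate(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator / denominator, or None when there is nothing to divide by."""
    return numerator / denominator if denominator else None


def quantitative_notes(results: List[CaseResult]) -> Dict[str, Optional[float]]:
    """Compute the four template fields under the assumed definitions above."""
    boundary = [r for r in results if r.is_boundary]
    total_claims = sum(r.claims for r in results)
    total_cited = sum(r.cited_claims for r in results)
    return {
        # Assumed: refusals over all cases.
        "hard refusal rate": rate(sum(r.refused for r in results), len(results)),
        # Assumed: passes over boundary-tagged cases only.
        "boundary-case pass rate": rate(sum(r.passed for r in boundary), len(boundary)),
        # Assumed: cited claims over all claims.
        "citation coverage": rate(total_cited, total_claims),
        # Assumed: uncited claims over all claims (a proxy, not a fact check).
        "unsupported-claim proxy": rate(total_claims - total_cited, total_claims),
    }
```

Returning `None` for an empty denominator keeps "no boundary cases in this run" distinct from "0% of boundary cases passed", which matters when comparing runs against the baseline.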

Qualitative examples

Good response

prompt:
why it was good:

Weak response

prompt:
why it was weak:

Decision

keep as baseline:
changes to make before LoRA: