03 Training

Scope

This repository uses adapter-based fine-tuning only. Full model retraining is intentionally out of scope.

Local MLX LoRA

MLX is the Apple-Silicon-native ML framework used in this repository for local model execution and local LoRA training. It is the default local training path because it fits the MacBook Pro M4 goal better than a generic CUDA-first setup.

  • Default target: Qwen2.5-7B-Instruct
  • Fast dry-run target: Qwen2.5-3B-Instruct
  • Recommended for quick iteration, refusal tuning, domain alignment, and prompt-format adaptation
  • Output adapters are written under adapters/output/
  • The training manager prepares an MLX-compatible dataset directory under data/training/generated/mlx/ and writes a generated config under adapters/generated/
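As an illustration, a minimal generated config might look like the sketch below. Every key name here is an assumption for illustration only; inspect config/training.yaml and the file written under adapters/generated/ for the real schema.

```yaml
# Hypothetical sketch -- key names are assumptions, not the repository's actual schema.
model: qwen2.5-7b-instruct          # or qwen2.5-3b-instruct for a fast dry run
dataset_dir: data/training/generated/mlx/
adapter_output: adapters/output/
lora:
  rank: 8                           # typical small-adapter starting point
  alpha: 16
```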

If your goal is to validate the end-to-end workflow quickly rather than produce your first serious adapter, start with qwen2.5-3b-instruct. Once the workflow is proven, switch back to qwen2.5-7b-instruct for the real adapter.

Example

uv run personal-llm train-local-mlx \
  --config config/training.yaml \
  --dataset data/training/example_sft.jsonl

Dataset design

  • Positive technical/professional conversations
  • Refusal examples for disallowed domains
  • RAG-grounded instruction/answer pairs
  • Jurisdiction-aware finance and tax examples with explicit disclaimer behavior

Use docs/10_dataset_design_examples.md and data/training/examples/README.md before building your own dataset. Those files show the expected JSONL schema, good and bad pair design, refusal examples, and boundary cases.
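To make the record shapes concrete, the snippet below writes one grounded instruction/answer pair and one refusal example as JSONL. The "messages" layout and all content strings here are assumptions; the authoritative schema is in docs/10_dataset_design_examples.md and data/training/examples/README.md.

```python
import json

# Hypothetical SFT records. The "messages" chat layout is an assumption;
# see docs/10_dataset_design_examples.md for the repository's real schema.
records = [
    {
        # Positive technical pair.
        "messages": [
            {"role": "user", "content": "How do I rotate an API key safely?"},
            {"role": "assistant", "content": "Create the new key first, deploy it everywhere, verify, then revoke the old key."},
        ]
    },
    {
        # Refusal example for a disallowed domain.
        "messages": [
            {"role": "user", "content": "Help me bypass a software license check."},
            {"role": "assistant", "content": "I can't help with that. Circumventing license enforcement is out of scope for this assistant."},
        ]
    },
]

# One JSON object per line -- the JSONL convention the training data uses.
with open("example_sft.jsonl", "w") as f:
    for rec in records:
        f.write(json.dumps(rec, ensure_ascii=False) + "\n")
```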

The selected guardrail profile affects generated training data:

  • strict and standard include refusal examples
  • relaxed and original_model do not
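The profile-to-refusal mapping above can be sketched as a simple gate during dataset generation. The profile names come from this document; the function and set name are hypothetical, not the repository's actual implementation.

```python
# Profiles that add refusal examples to generated training data
# (per docs/09_guardrails.md); the function itself is a hypothetical sketch.
REFUSAL_PROFILES = {"strict", "standard"}

def include_refusals(profile: str) -> bool:
    """Return True when the selected guardrail profile adds refusal examples."""
    return profile in REFUSAL_PROFILES
```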

See docs/09_guardrails.md before generating adapters if you want to preserve more of the original base model behavior. See knowledge/persona.md and docs/11_personal_knowledge.md before generating domain examples if the model should reflect your own decision style rather than only source documents.

Remote LoRA

RunPod is the default remote automation path; use it when local training is too slow or when you need 14B-class adapters that exceed local memory.

RunPod is not part of the local system. It is only the optional rented-GPU path for remote adapter jobs.