# 03 Training
## Scope
This repository uses adapter-based fine-tuning only. Full model retraining is intentionally out of scope.
## Local MLX LoRA
MLX is the Apple-Silicon-native ML framework used in this repository for local model execution and local LoRA training. It is the default local training path because it fits the MacBook Pro M4 goal better than a generic CUDA-first setup.
- Default target: `Qwen2.5-7B-Instruct`
- Fast dry-run target: `Qwen2.5-3B-Instruct`, recommended for quick iteration, refusal tuning, domain alignment, and prompt-format adaptation
- Output adapters are written under `adapters/output/`
- The training manager prepares an MLX-compatible dataset directory under `data/training/generated/mlx/` and writes a generated config under `adapters/generated/`
If your goal is to validate the full process quickly rather than produce the first serious model, start with `Qwen2.5-3B-Instruct`. Once the workflow is proven, move back to `Qwen2.5-7B-Instruct` for the real adapter.
### Example
```shell
uv run personal-llm train-local-mlx \
  --config config/training.yaml \
  --dataset data/training/example_sft.jsonl
```
## Dataset design
- Positive technical/professional conversations
- Refusal examples for disallowed domains
- RAG-grounded instruction/answer pairs
- Jurisdiction-aware finance and tax examples with explicit disclaimer behavior
Use `docs/10_dataset_design_examples.md` and `data/training/examples/README.md` before building your own dataset. Those files show the expected JSONL schema, good and bad pair design, refusal examples, and boundary cases.
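The authoritative schema lives in the files above. As a rough sketch only (the field names here are assumptions modeled on common chat-style SFT JSONL, not copied from this repository), a positive pair and a refusal pair might look like the following, with a quick parse to confirm every line is standalone JSON:

```shell
# Hypothetical chat-style SFT rows: one grounded pair, one refusal pair.
# Field names are assumptions; see data/training/examples/README.md for the real schema.
cat > /tmp/sft_sketch.jsonl <<'EOF'
{"messages": [{"role": "user", "content": "Summarize the retry policy in our API docs."}, {"role": "assistant", "content": "The docs specify three retries with exponential backoff starting at 500 ms."}]}
{"messages": [{"role": "user", "content": "Give me specific tax advice for my situation."}, {"role": "assistant", "content": "I can explain general concepts, but I can't give personal tax advice. Please consult a qualified professional in your jurisdiction."}]}
EOF

# JSONL tooling requires each line to parse as its own JSON document.
python3 - <<'EOF'
import json
with open("/tmp/sft_sketch.jsonl") as f:
    rows = [json.loads(line) for line in f]
print(f"parsed {len(rows)} rows")  # prints: parsed 2 rows
EOF
```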
The selected guardrail profile affects generated training data:
- `strict` and `standard` include refusal examples
- `relaxed` and `original_model` do not
See `docs/09_guardrails.md` before generating adapters if you want to preserve more of the original base model behavior.
See `knowledge/persona.md` and `docs/11_personal_knowledge.md` before generating domain examples if the model should reflect your own decision style rather than only source documents.
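As a purely illustrative sketch (the key names below are assumptions, not taken from `config/training.yaml`; check that file and `docs/09_guardrails.md` for the real structure), the profile might be selected in the training config like:

```yaml
# Hypothetical fragment; real key names live in config/training.yaml
guardrails:
  profile: strict   # one of: strict, standard, relaxed, original_model
```

With `strict` or `standard`, refusal rows are emitted into the generated dataset; with `relaxed` or `original_model` they are left out.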
## Remote LoRA
RunPod is the default remote automation path. Use it when local training is too slow or you need 14B-class adapters.
RunPod is not part of the local system. It is only the optional rented-GPU path for remote adapter jobs.