Fine-Tuning & RLHF Intuition · Week 6 · Day 5/7
DAY 40 / 210

Unsloth Setup + First 7B QLoRA Run

This ship day opens the fine-tuning phase by running the first real training loop on a rented NVIDIA L4 GPU. Getting an Unsloth + QLoRA pipeline working now produces a measurable baseline, the training loss curve, against which all later optimization and scaling work will be judged.
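Before renting the GPU, it can help to sanity-check that a 4-bit 7B model plus LoRA adapters plausibly fits in the L4's 24 GB. The sketch below is back-of-the-envelope arithmetic, not a measurement. Its assumptions are not from the lesson: Llama-2-7B-style layer shapes (hidden size 4096, MLP size 11008, 32 layers, no grouped-query attention), adapters on all seven linear projections at rank 16, and roughly 16 bytes per trainable parameter for weights, gradients, and Adam state. The function names are hypothetical.

```python
# Rough memory budget for a 7B QLoRA run. Every constant here is an assumption.

GB = 1e9  # decimal gigabytes, to keep the arithmetic easy to follow


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def adapter_params(hidden: int, intermediate: int, layers: int, rank: int) -> int:
    """Adapters on q/k/v/o plus gate/up/down, assuming square attention projections."""
    attn = 4 * lora_param_count(hidden, hidden, rank)
    mlp = (2 * lora_param_count(hidden, intermediate, rank)
           + lora_param_count(intermediate, hidden, rank))
    return layers * (attn + mlp)


def memory_budget_gb(base_params: float, trainable_params: int,
                     weight_bits: int = 4, bytes_per_trainable: int = 16) -> dict:
    """Frozen quantized weights plus trainable state; activations are NOT included."""
    weights_gb = base_params * weight_bits / 8 / GB
    trainable_gb = trainable_params * bytes_per_trainable / GB
    return {
        "weights_gb": weights_gb,
        "trainable_gb": trainable_gb,
        "total_gb": weights_gb + trainable_gb,
    }


# Assumed Llama-2-7B-style shapes; swap in the real config of the chosen model.
trainable = adapter_params(hidden=4096, intermediate=11008, layers=32, rank=16)
budget = memory_budget_gb(base_params=7e9, trainable_params=trainable)

print(trainable)  # 39976960 trainable parameters under these assumptions
print(budget)
```

The estimate leaves out activations, quantization constants, and CUDA overhead, so the real footprint will be higher and depends heavily on sequence length and batch size. It only shows that the frozen weights and adapter state are a small fraction of 24 GB, which is why the remaining headroom question is mostly about activations.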

50 min target · 📝 2 quiz Qs

Resources

Deliverable

A committed unsloth-7b-qlora.py that completes one training run on an L4 and logs the train/loss curve
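One way to turn a finished run into a committed loss curve is to post-process the trainer's log. The sketch below assumes the script trains with a Hugging Face Trainer-style object, whose `trainer.state.log_history` is a list of dicts where training entries carry `loss` and `step` keys. The helper names (`extract_loss_curve`, `loss_curve_to_csv`, `loss_is_decreasing`) are hypothetical and not part of Unsloth or Transformers.

```python
import csv
import io


def extract_loss_curve(log_history):
    """Keep only training-loss entries; eval and summary entries lack a 'loss' key."""
    return [
        (int(entry["step"]), float(entry["loss"]))
        for entry in log_history
        if "loss" in entry and "step" in entry
    ]


def loss_curve_to_csv(curve):
    """Render (step, loss) pairs as CSV text that can be written to a file and committed."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["step", "train/loss"])
    writer.writerows(curve)
    return buffer.getvalue()


def loss_is_decreasing(curve, window=5):
    """Crude health check: mean loss over the last `window` points is below the first."""
    if len(curve) < 2 * window:
        raise ValueError("need at least 2 * window logged points")
    first = sum(loss for _, loss in curve[:window]) / window
    last = sum(loss for _, loss in curve[-window:]) / window
    return last < first
```

In the training script this might be used as `curve = extract_loss_curve(trainer.state.log_history)` after `trainer.train()` returns, followed by writing `loss_curve_to_csv(curve)` to a file next to the script. The decreasing-loss check is deliberately coarse: it confirms the pipeline is learning something, not that the run is well tuned.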

Quiz · 2 questions

1. Why does Unsloth recommend using torch.compile and gradient checkpointing together for a 7B QLoRA run?

2. What single metric from the first run should be recorded to confirm the pipeline is working?
