Inference Economics at Scale · Week 14 · Day 1/7
DAY 92 / 210

Foundations of Production LLM Inference

This opening day of Phase 3 establishes how inference differs from training and why inference dominates real-world costs. It builds the mental model you need before starting any optimization or serving work.
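One way to make the inference-versus-training difference concrete is a back-of-envelope roofline estimate for a single decode step. The sketch below is illustrative only. The hardware figures and the 7B-parameter model are hypothetical, the "2 FLOPs per parameter per token" rule is a common approximation, and KV-cache and activation traffic are ignored.

```typescript
// Back-of-envelope roofline check for one autoregressive decode step.
// All numbers and names here are hypothetical, chosen for illustration.

interface Hardware {
  peakFlops: number;     // peak compute, FLOP/s
  memBandwidth: number;  // peak memory bandwidth, bytes/s
}

interface DecodeStep {
  paramCount: number;    // number of model parameters
  bytesPerParam: number; // e.g. 2 for fp16/bf16 weights
  batchSize: number;     // sequences decoded together in this step
}

// FLOPs performed per byte of weights read in one decode step.
// Assumes ~2 FLOPs per parameter per generated token and that every
// weight is read from memory once per step (KV-cache traffic ignored).
function arithmeticIntensity(step: DecodeStep): number {
  const flops = 2 * step.paramCount * step.batchSize;
  const bytesRead = step.paramCount * step.bytesPerParam;
  return flops / bytesRead;
}

// Intensity at which compute time and memory time are equal.
function ridgePoint(hw: Hardware): number {
  return hw.peakFlops / hw.memBandwidth;
}

// Below the ridge point the step waits on memory, not on arithmetic.
function isMemoryBound(step: DecodeStep, hw: Hardware): boolean {
  return arithmeticIntensity(step) < ridgePoint(hw);
}

// Hypothetical accelerator: 300 TFLOP/s peak, 2 TB/s memory bandwidth.
const exampleHw: Hardware = { peakFlops: 300e12, memBandwidth: 2e12 };

// Hypothetical 7B-parameter model in 16-bit precision.
const singleRequest: DecodeStep = { paramCount: 7e9, bytesPerParam: 2, batchSize: 1 };
const largeBatch: DecodeStep = { ...singleRequest, batchSize: 256 };

console.log("ridge point:", ridgePoint(exampleHw));
console.log("batch 1 intensity:", arithmeticIntensity(singleRequest), "memory-bound:", isMemoryBound(singleRequest, exampleHw));
console.log("batch 256 intensity:", arithmeticIntensity(largeBatch), "memory-bound:", isMemoryBound(largeBatch, exampleHw));
```

Under these assumptions a single-request decode step performs about 1 FLOP per byte of weights read, far below the hypothetical ridge point of 150, so the step is limited by memory bandwidth. Batching raises the intensity roughly in proportion to batch size, which is one reason serving systems batch requests. Training, by contrast, processes many tokens per weight read, which is part of why its cost profile looks different.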

45 min target · 📝 3 quiz Qs

Resources

Deliverable

Journal entry listing three inference bottlenecks observed in the current app/maku routes, plus one candidate fix.

Quiz · 3 questions

1. Why is LLM inference typically memory-bound rather than compute-bound?

2. Name one concrete difference between training and inference memory access patterns.

3. How might the rate limiter in lib/rate-limiter.ts interact with an inference queue under bursty traffic?

Journal