Inference Economics at Scale · Week 12 · Day 2/7
DAY 79 / 210

Core Concepts of LLM Inference Serving

This day opens Phase 3 by grounding learners in the realities of production inference rather than training. It directly supports Maku's StartupTribunal work by clarifying how model outputs reach users at scale. Focusing on measurable trade-offs, such as latency versus throughput, guards against the common over-optimism of judging a system by raw model quality alone.

45 min target · 2 quiz Qs

Resources

Deliverable

300-word journal entry on inference metrics relevant to StartupTribunal

Quiz · 2 questions

1. Which factor most directly limits throughput when batch size increases?

2. Explain in two sentences why latency and throughput are not always improved by the same technique.
