Why run a huge, costly LLM when a smaller, distilled one can do the job faster, cheaper and with fewer hallucinations?