
A New Framework for AI Evaluation
Mallika Rao, drawing on her experience at Twitter and Netflix, presents a new framework for evaluating production AI systems. She argues that traditional metrics are outdated, and she introduces a five-layer evaluation stack and a maturity model to prevent silent failures and manage "evaluation debt."
