A new software benchmark provides a standardized way to measure the performance of AI applications. It focuses on two metrics, latency and throughput, to reduce guesswork during model deployment, and it lets developers compare efficiency across different environments. The update is incremental, but it offers a more reliable baseline for optimizing inference costs in production.
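
To make the two metrics concrete, here is a minimal sketch of how latency percentiles and throughput could be measured for a single inference call. It is not the benchmark's own code or API: the names `run_benchmark`, `summarize`, and `percentile`, the warmup count, and the choice of p50/p95 with a nearest-rank percentile are all assumptions made for illustration.

```python
import math
import time
from typing import Callable, Dict, List


def percentile(sorted_values: List[float], pct: float) -> float:
    """Nearest-rank percentile of an already sorted, non-empty list."""
    if not sorted_values:
        raise ValueError("need at least one sample")
    rank = max(1, math.ceil(pct / 100.0 * len(sorted_values)))
    return sorted_values[rank - 1]


def summarize(latencies_s: List[float], wall_time_s: float) -> Dict[str, float]:
    """Reduce per-request latencies (in seconds) to a few headline numbers."""
    ordered = sorted(latencies_s)
    return {
        "p50_ms": percentile(ordered, 50) * 1000.0,
        "p95_ms": percentile(ordered, 95) * 1000.0,
        # Requests completed per second of wall-clock time.
        "throughput_rps": len(ordered) / wall_time_s,
    }


def run_benchmark(
    infer: Callable[[], object],  # hypothetical stand-in for one model call
    requests: int = 100,
    warmup: int = 10,
) -> Dict[str, float]:
    """Time sequential calls to `infer` and summarize the results."""
    # Warmup calls are discarded so one-time setup costs do not skew results.
    for _ in range(warmup):
        infer()

    latencies: List[float] = []
    start = time.perf_counter()
    for _ in range(requests):
        t0 = time.perf_counter()
        infer()
        latencies.append(time.perf_counter() - t0)
    wall_time = time.perf_counter() - start
    return summarize(latencies, wall_time)
```

Calling `run_benchmark(my_model_call)` with the same request count in each environment gives numbers that can be compared side by side. This sketch issues requests one at a time, so its throughput figure reflects a single sequential client; measuring a server under concurrent load would need a different harness.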