The evalgrid-framework now supports over 100 metrics for LLM evaluation. It runs evaluations in parallel with async execution and batches calls to the judge, which reduces both cost and latency. Native pytest integration lets developers run model tests inside existing CI/CD pipelines. This is an incremental addition to a crowded field of LLM evaluation tools.
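The source does not show the framework's API, so the following is only a generic sketch of the pattern described: grouping items into batches for a judge, running those batches concurrently with asyncio, and checking the result in a pytest-style test. Every name here (`toy_judge`, `evaluate`, `chunk`, `batch_size`, `max_concurrency`) is hypothetical and not taken from evalgrid-framework; the judge is a stand-in that only checks for non-empty answers.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence, Tuple

# Hypothetical types: a "pair" is (prompt, answer); a judge scores a whole batch at once.
Pair = Tuple[str, str]
Judge = Callable[[Sequence[Pair]], Awaitable[List[float]]]


def chunk(items: Sequence, size: int) -> List[list]:
    """Split items into consecutive batches of at most `size` elements."""
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


async def toy_judge(batch: Sequence[Pair]) -> List[float]:
    """Stand-in for a real judge call: scores 1.0 for a non-empty answer, else 0.0."""
    await asyncio.sleep(0)  # placeholder for network latency
    return [1.0 if answer.strip() else 0.0 for _, answer in batch]


async def evaluate(
    pairs: Sequence[Pair],
    judge: Judge,
    batch_size: int = 8,
    max_concurrency: int = 4,
) -> List[float]:
    """Score all pairs, one judge call per batch, with bounded concurrency."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run(batch: Sequence[Pair]) -> List[float]:
        async with semaphore:
            return await judge(batch)

    # asyncio.gather returns results in the order the awaitables were passed,
    # so flattening the batches restores the original item order.
    per_batch = await asyncio.gather(*(run(b) for b in chunk(pairs, batch_size)))
    return [score for scores in per_batch for score in scores]


def test_non_empty_answers_pass():
    """A plain pytest-style test; pytest would collect it by its `test_` prefix."""
    pairs = [("q1", "a1"), ("q2", ""), ("q3", "a3")]
    scores = asyncio.run(evaluate(pairs, toy_judge, batch_size=2))
    assert scores == [1.0, 0.0, 1.0]
```

Batching cuts the number of judge calls from one per item to one per batch, while the semaphore caps how many calls are in flight at once. Because the test is an ordinary function, a CI job that already runs pytest would pick it up without extra configuration.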