A score of 1,618 on the GDPval-AA v2 test lets Claude Sonnet 5 edge past the larger Opus 4.8. Anthropic designed the model to beat its predecessor across all benchmarks, though it intentionally scores low on cybersecurity tasks to avoid US government blocks. The GDPval-AA v2 result narrows the performance gap between mid-tier and premium models.