A cluster of flagship releases, including Gemma 4, DeepSeek V4, and GLM-5.1, has just landed in the open-source ecosystem. This wave of high-performance weights accelerates the move toward parity between open-weight models and proprietary systems. Researchers can now benchmark these models against the CAISI V4 assessment to check how well their reasoning holds up across diverse real-world tasks.