Apple researchers developed ParaRNN, a framework that enables parallel training of nonlinear recurrent neural networks (RNNs). It removes the sequential bottleneck that previously prevented RNNs from scaling to billions of parameters, so practitioners can train memory-efficient recurrent architectures without sacrificing training speed. This positions nonlinear RNNs as a viable, lower-compute alternative to attention-based Transformers for large language models.
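
To make the idea concrete, below is a minimal JAX sketch of the general approach this line of work builds on: rather than stepping through time, all hidden states are solved for jointly by applying Newton's method to the recurrence, with each linearized step solved in parallel via an associative scan. The toy cell h_t = tanh(a·h_{t-1} + x_t), the scalar gain a, the fixed Newton iteration count, and the function names are illustrative assumptions for this sketch, not ParaRNN's actual architecture or implementation.

```python
import jax
import jax.numpy as jnp


def sequential_rnn(a, x, h0):
    # Reference: ordinary step-by-step evaluation of h_t = tanh(a*h_{t-1} + x_t).
    def step(h, xt):
        h_next = jnp.tanh(a * h + xt)
        return h_next, h_next

    _, hs = jax.lax.scan(step, h0, x)
    return hs


def parallel_newton_rnn(a, x, h0, num_newton_iters=10):
    # Solve for all T hidden states at once: each Newton iteration linearizes the
    # recurrence, and the resulting linear recurrence over t is solved with a
    # parallel associative scan instead of a sequential loop.
    h = jnp.zeros_like(x)  # initial guess for h_1 .. h_T

    def combine(left, right):
        # Compose two affine maps d -> A*d + b (elementwise/diagonal Jacobians).
        A1, b1 = left
        A2, b2 = right
        return A2 * A1, A2 * b1 + b2

    for _ in range(num_newton_iters):
        h_prev = jnp.concatenate([jnp.reshape(h0, (1,)), h[:-1]])  # h_{t-1} for every t
        f = jnp.tanh(a * h_prev + x)
        residual = f - h                  # zero exactly when h satisfies the recurrence
        jac = a * (1.0 - f ** 2)          # d tanh(a*h_{t-1} + x_t) / d h_{t-1}
        # Newton correction: delta_t = jac_t * delta_{t-1} + residual_t, delta_0 = 0.
        _, delta = jax.lax.associative_scan(combine, (jac, residual))
        h = h + delta
    return h


if __name__ == "__main__":
    x = jax.random.normal(jax.random.PRNGKey(0), (64,))
    a, h0 = 0.5, jnp.float32(0.0)
    h_seq = sequential_rnn(a, x, h0)
    h_par = parallel_newton_rnn(a, x, h0)
    print("max abs difference:", float(jnp.max(jnp.abs(h_seq - h_par))))
```

In this sketch, each Newton iteration costs only a log-depth scan over the sequence rather than T sequential steps, which is the sense in which the sequential bottleneck disappears; the final comparison checks that the jointly solved states match the ordinary step-by-step evaluation.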