A Natural Language Autoencoder (NLA) pairs two LLM modules: an activation verbalizer, which maps an internal activation to a natural-language description, and an activation reconstructor, which maps that description back to an activation. Researchers trained NLAs with reinforcement learning to reconstruct residual stream activations. This unsupervised method helped diagnose safety-relevant behaviors during a pre-deployment audit of Claude Opus 4.6, and it offers a concrete path for auditing model internals.
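To make the verbalizer and reconstructor round trip concrete, here is a minimal toy sketch. It is not the researchers' implementation: the function names, the toy stand-ins for the two modules, and the choice of negative mean squared error as the reconstruction score are all assumptions made for illustration. The source states only that the modules are trained with reinforcement learning to reconstruct residual stream activations.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def reconstruction_reward(original: Sequence[float],
                          reconstructed: Sequence[float]) -> float:
    """Score a reconstruction as negative mean squared error.

    Assumption: the real training signal may be defined differently.
    Higher is better; a perfect reconstruction scores zero.
    """
    if not original or len(original) != len(reconstructed):
        raise ValueError("vectors must be non-empty and of equal length")
    squared_error = sum((a - b) ** 2 for a, b in zip(original, reconstructed))
    return -squared_error / len(original)


def round_trip(activation: Vector,
               verbalize: Callable[[Vector], str],
               reconstruct: Callable[[str], Vector]) -> Tuple[str, float]:
    """Activation -> text -> activation, scored by reconstruction quality."""
    explanation = verbalize(activation)      # verbalizer: activation to text
    reconstructed = reconstruct(explanation)  # reconstructor: text to activation
    return explanation, reconstruction_reward(activation, reconstructed)


# Hypothetical stand-ins for the two LLM modules. A real verbalizer would
# produce a natural-language description, not a list of rounded numbers.
def toy_verbalize(activation: Vector) -> str:
    return " ".join(f"{x:.1f}" for x in activation)


def toy_reconstruct(text: str) -> Vector:
    return [float(token) for token in text.split()]


if __name__ == "__main__":
    text, reward = round_trip([0.5, -1.0, 2.0], toy_verbalize, toy_reconstruct)
    print(text, reward)
```

In this toy setup the text is the only channel between the two modules, so any information the verbalizer drops (here, digits lost to rounding) lowers the score. That bottleneck is what would push a trained verbalizer toward descriptions that carry the content of the activation.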