Natural Language Autoencoders (NLAs) pair two LLM modules: one maps an internal activation to a text description, and the other reconstructs the activation from that text. Trained via reinforcement learning to reconstruct residual stream activations, NLAs produce human-readable interpretations of model internals. Researchers used this method during a pre-deployment audit of Claude Opus 4.6. The method provides a concrete tool for diagnosing safety-relevant behaviors before release.
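
The round trip described above (activation to text, then text back to an activation) can be illustrated with a toy sketch. Everything here is a hypothetical stand-in: in the actual method both modules are LLMs trained with reinforcement learning, whereas `verbalize`, `reconstruct`, and `reconstruction_reward` below are hand-written functions invented for illustration, and using cosine similarity as the reward is an assumption rather than something the text specifies.

```python
import math


def verbalize(activation, k=2):
    """Toy stand-in for the activation-to-text module (hypothetical).

    Describes the k largest-magnitude dimensions as plain text.
    """
    top = sorted(range(len(activation)),
                 key=lambda i: abs(activation[i]),
                 reverse=True)[:k]
    return " ".join(f"dim{i}={activation[i]:+.2f}" for i in sorted(top))


def reconstruct(text, dim):
    """Toy stand-in for the text-to-activation module (hypothetical).

    Parses the description back into a vector; unmentioned dims stay 0.
    """
    vec = [0.0] * dim
    for token in text.split():
        name, value = token.split("=")
        vec[int(name[3:])] = float(value)  # strip the "dim" prefix
    return vec


def reconstruction_reward(original, rebuilt):
    """Assumed reward: cosine similarity between original and rebuilt vectors."""
    dot = sum(a * b for a, b in zip(original, rebuilt))
    norm = (math.sqrt(sum(a * a for a in original))
            * math.sqrt(sum(b * b for b in rebuilt)))
    return dot / norm if norm else 0.0


# A made-up 4-dimensional "residual stream activation".
activation = [0.1, -2.0, 0.05, 1.5]
text = verbalize(activation)
rebuilt = reconstruct(text, len(activation))
reward = reconstruction_reward(activation, rebuilt)
print(text, rebuilt, round(reward, 4))
```

The point of the sketch is the bottleneck: the text is the only channel between the two modules, so a high reconstruction score implies the description carried most of the information in the activation.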