Natural Language Autoencoders use two LLM modules to map activations into text descriptions and back again. Researchers trained these modules via reinforcement learning to reconstruct residual stream activations. This unsupervised method helped diagnose safety-relevant behaviors during a pre-deployment audit of Claude Opus 4.6. It gives model-auditing practitioners a concrete way to interpret internal model states.
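
The round trip described above (activation to text, text back to activation, scored by how well the original is reconstructed) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: `verbalize` and `reconstruct` are hypothetical rule-based stand-ins for the two LLM modules, the activation is a four-element list rather than a real residual stream vector, and negative mean squared error is just one plausible reconstruction score, not necessarily the reward the researchers used.

```python
def verbalize(activation, top_k=2):
    """Hypothetical stand-in for the activation-to-text module.

    Describes only the top_k largest-magnitude dimensions, so the
    description is lossy, much as a short text summary would be.
    """
    ranked = sorted(range(len(activation)),
                    key=lambda i: abs(activation[i]),
                    reverse=True)[:top_k]
    return "; ".join(f"dim {i} = {activation[i]:.2f}" for i in sorted(ranked))


def reconstruct(description, dim):
    """Hypothetical stand-in for the text-to-activation module.

    Parses the description back into a vector; dimensions that the
    text does not mention are filled with zeros.
    """
    vec = [0.0] * dim
    if not description:
        return vec
    for part in description.split("; "):
        name, value = part.split(" = ")
        vec[int(name.split()[1])] = float(value)
    return vec


def reconstruction_reward(original, rebuilt):
    """Negative mean squared error (assumed score): 0 is a perfect round trip."""
    squared_error = sum((a - b) ** 2 for a, b in zip(original, rebuilt))
    return -squared_error / len(original)


# Toy "residual stream activation" (made-up numbers, for illustration only).
activation = [0.1, -2.0, 0.05, 1.5]

text = verbalize(activation)                   # lossy text description
rebuilt = reconstruct(text, len(activation))   # map the text back to a vector
reward = reconstruction_reward(activation, rebuilt)

print(text)
print(rebuilt)
print(reward)
```

In this sketch, a richer description (a larger `top_k`) yields a higher reward, which is the kind of pressure a reconstruction objective would place on the text: descriptions are only rewarded to the extent that they carry enough information to rebuild the activation.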