Two LLM modules, an activation verbalizer and an activation reconstructor, map a model's internal activations to human-readable text. The method is unsupervised: it uses reinforcement learning with the objective of reconstructing residual stream activations. Researchers applied the tool to audit Claude Opus 4.6 before deployment. The system identifies safety-relevant behaviors by translating opaque model internals into plausible, informative natural-language explanations.
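
The round trip described above (activation to text, then text back to an activation) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the source does not specify the reward, so cosine similarity is used as a stand-in reconstruction score, and `toy_verbalize` and `toy_reconstruct` are hypothetical placeholders for the two LLM modules, not the actual method.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def reconstruction_reward(
    activation: Vector,
    verbalize: Callable[[Vector], str],
    reconstruct: Callable[[str], Vector],
) -> float:
    """Score one round trip: activation -> explanation text -> rebuilt activation.

    ASSUMPTION: cosine similarity is an illustrative reward, not the one
    used in the work being summarized.
    """
    explanation = verbalize(activation)   # stand-in for the verbalizer module
    rebuilt = reconstruct(explanation)    # stand-in for the reconstructor module
    return cosine_similarity(activation, rebuilt)


# Toy stand-ins (hypothetical, not real LLMs): the "explanation" only names
# the coordinate with the largest magnitude, so information is lost.
def toy_verbalize(activation: Vector) -> str:
    top = max(range(len(activation)), key=lambda i: abs(activation[i]))
    return f"feature {top} is most active"


def toy_reconstruct(text: str, dim: int = 3) -> List[float]:
    index = int(text.split()[1])
    return [1.0 if i == index else 0.0 for i in range(dim)]
```

In this toy setup, an activation that lies along a single coordinate survives the round trip perfectly, while one spread across several coordinates scores lower because the text drops information. That is the intuition behind rewarding explanations by how well they allow reconstruction: a more informative explanation should permit a closer rebuild.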