Training on behavioral traits like truthfulness improved scores on 44 of 53 benchmarks. OpenAI researchers found that reinforcement learning on specific desired traits generalizes across domains, including health data. This method enhances deception detection more effectively than traditional approaches. Practitioners can now harden models against manipulation using smaller, targeted datasets.