The Refactor Arena package allows researchers to test if AI agents inject security vulnerabilities during code refactoring. This configurable platform uses YAML files to define complex software engineering scenarios. It provides a controlled environment for safety evaluations. Practitioners can now quantify the risk of autonomous agents introducing bugs into production codebases via pip install.