Various augmentation strategies were tested on a dataset of 1,466 high school physics responses to improve automated scoring. Researchers used SciBERT to classify each response against 11 binary rubric categories, focusing on rare markers of advanced reasoning. The study identifies which resampling methods best mitigate class imbalance, offering a blueprint for building more accurate educational assessment tools.
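
To make the class-imbalance problem concrete, here is a minimal sketch of random oversampling for a single binary rubric category. It illustrates one common resampling technique and is not necessarily among the methods the study evaluated. The function name `oversample_minority` and the toy responses are hypothetical.

```python
import random
from collections import Counter


def oversample_minority(texts, labels, seed=0):
    """Duplicate randomly chosen minority-class examples until both classes match in size.

    Hypothetical helper for illustration; works on one binary rubric category.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    if len(counts) < 2:
        # Only one class present: nothing to balance.
        return list(texts), list(labels)
    minority = min(counts, key=counts.get)
    deficit = max(counts.values()) - counts[minority]
    minority_texts = [t for t, y in zip(texts, labels) if y == minority]
    # Sample with replacement from the minority class to fill the gap.
    extra = [rng.choice(minority_texts) for _ in range(deficit)]
    return list(texts) + extra, list(labels) + [minority] * deficit


# Toy data (invented): label 1 marks a rare advanced-reasoning category.
texts = [
    "The ball falls because of gravity.",
    "It speeds up as it goes down.",
    "Heavier things fall faster.",
    "The force pulls it to the ground.",
    "It drops when you let go.",
    "Net force is constant, so acceleration is constant and velocity grows linearly.",
]
labels = [0, 0, 0, 0, 0, 1]

bal_texts, bal_labels = oversample_minority(texts, labels)
print(Counter(bal_labels))
```

In practice, resampling of this kind is applied only to the training split, so that evaluation still reflects the original class distribution.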