A dataset of 1,466 high school science responses reveals a persistent class imbalance in automated scoring. Researchers used SciBERT to test data augmentation and resampling strategies for assessments aligned with the Next Generation Science Standards (NGSS). The study targets the rare, advanced reasoning categories that classifiers typically struggle with because so few examples exist. The approach offers a blueprint for improving accuracy in specialized educational text classification tasks.
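
To make the resampling idea concrete, here is a minimal sketch of random oversampling, one common resampling strategy. It is not taken from the study: the function name `random_oversample`, the toy responses, and the label names `basic` and `advanced` are all hypothetical, and the study's actual procedure may differ.

```python
import random
from collections import Counter, defaultdict


def random_oversample(texts, labels, seed=0):
    """Duplicate minority-class examples until every class matches the largest one.

    Hypothetical helper for illustration; not the study's implementation.
    """
    rng = random.Random(seed)

    # Group the responses by their scoring category.
    by_label = defaultdict(list)
    for text, label in zip(texts, labels):
        by_label[label].append(text)

    # Every class is grown to the size of the most frequent class.
    target = max(len(items) for items in by_label.values())

    out_texts, out_labels = [], []
    for label, items in by_label.items():
        # Sample with replacement to fill the gap for under-represented classes.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Toy data: four common responses and one rare, advanced one (made up).
texts = ["resp 1", "resp 2", "resp 3", "resp 4", "resp 5"]
labels = ["basic", "basic", "basic", "basic", "advanced"]

balanced_texts, balanced_labels = random_oversample(texts, labels)
print(Counter(balanced_labels))
```

Oversampling of this kind is normally applied to the training split only, so that duplicated responses do not leak into the evaluation data.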