Inference efficiency on benchmarks varies by 100x depending on the scaffold, that is, the software environment and contextual documents provided to a model. Hans Gundlach and colleagues found that scaffolds often influence price-performance more than the choice of underlying model. Scaffold and model also interact: a single scaffold can improve one model's results while hindering another's.
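
To make the interaction concrete, here is a minimal sketch of how price-performance might be compared across model and scaffold pairs. Every name and number in it is a hypothetical placeholder chosen for illustration, not a figure from Gundlach and colleagues, and cost per solved task is only one assumed way to measure price-performance.

```python
# Illustrative only: model names, scaffold names, and all numbers are
# made-up placeholders, not results from the research described above.

# (model, scaffold) -> (benchmark solve rate, dollars spent per attempted task)
runs = {
    ("model_a", "scaffold_x"): (0.60, 0.30),
    ("model_a", "scaffold_y"): (0.30, 0.30),
    ("model_b", "scaffold_x"): (0.20, 0.40),
    ("model_b", "scaffold_y"): (0.50, 0.10),
}


def cost_per_solved_task(solve_rate, cost_per_attempt):
    """Assumed price-performance metric: dollars spent per task solved."""
    return cost_per_attempt / solve_rate


# Price-performance for every model and scaffold pair.
costs = {pair: cost_per_solved_task(*stats) for pair, stats in runs.items()}

# For each model, pick the scaffold with the lowest cost per solved task.
best_scaffold = {}
for model in sorted({m for m, _ in runs}):
    scaffolds = [s for m, s in runs if m == model]
    best_scaffold[model] = min(scaffolds, key=lambda s: costs[(model, s)])

# Ratio between the worst and best pair across the whole grid.
spread = max(costs.values()) / min(costs.values())

for model, scaffold in best_scaffold.items():
    print(f"{model}: cheapest with {scaffold}")
print(f"worst/best cost ratio: {spread:.1f}x")
```

With these placeholder values the two models prefer different scaffolds, which is the kind of interaction the paragraph describes: ranking scaffolds on one model alone would pick the wrong one for the other.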