Inference efficiency on benchmarks can vary by a factor of 100 depending on the scaffold, that is, the software environment and contextual documents provided to a model. Hans Gundlach and colleagues found that these scaffolds often influence price-performance more than the choice of underlying model does. Because scaffold-model interactions differ by task, practitioners need to optimize the scaffold separately for each model and task rather than reuse one configuration across all of them.
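
To make the per-model optimization concrete, here is a minimal sketch of choosing a scaffold for each (model, task) pair by price-performance. Everything in it is an illustrative assumption rather than the researchers' method: the `Run` record, the model and scaffold names, the numbers, and the metric itself (dollars spent divided by accuracy) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One benchmark run. All field values used below are made up."""
    model: str
    scaffold: str
    task: str
    accuracy: float  # fraction of benchmark items solved, between 0 and 1
    cost_usd: float  # total inference spend for the run


def cost_per_accuracy(run: Run) -> float:
    """Assumed price-performance metric: dollars per unit of accuracy (lower is better)."""
    if run.accuracy <= 0:
        return float("inf")  # a run that solves nothing is never preferred
    return run.cost_usd / run.accuracy


def best_scaffold_per_pair(runs: list[Run]) -> dict[tuple[str, str], str]:
    """Return the cheapest-per-accuracy scaffold for every (model, task) pair."""
    best: dict[tuple[str, str], Run] = {}
    for run in runs:
        key = (run.model, run.task)
        if key not in best or cost_per_accuracy(run) < cost_per_accuracy(best[key]):
            best[key] = run
    return {key: run.scaffold for key, run in best.items()}


# Hypothetical measurements: the same two scaffolds rank differently per model.
RUNS = [
    Run("model_a", "bare_prompt", "coding", accuracy=0.40, cost_usd=2.00),
    Run("model_a", "tool_agent", "coding", accuracy=0.60, cost_usd=1.50),
    Run("model_b", "bare_prompt", "coding", accuracy=0.50, cost_usd=1.00),
    Run("model_b", "tool_agent", "coding", accuracy=0.55, cost_usd=4.40),
]

CHOICES = best_scaffold_per_pair(RUNS)
for (model, task), scaffold in sorted(CHOICES.items()):
    print(f"{model} on {task}: {scaffold}")
```

In this made-up data the scaffold that is best for one model is worst for the other, which is the situation that makes a single shared configuration wasteful. A real comparison would need measured costs and accuracies, and possibly a different metric.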