Apple developed SpecMD to standardize how Mixture-of-Experts (MoE) models handle expert prefetching across different hardware. The framework benchmarks ad-hoc caching policies to address the gap between the compute savings promised by sparse activation and the inference speed actually observed. With these benchmarks, researchers can optimize how expert parameters are loaded. The framework thereby provides a technical baseline for reducing latency in sparse model deployments.
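
To make the idea of benchmarking caching policies concrete, here is a minimal sketch that replays a sequence of expert requests through two simple caches and compares their hit rates. It does not use SpecMD's actual API or policies: the function names `lru_hit_rate` and `fifo_hit_rate`, the toy trace, and the cache capacity are all hypothetical, chosen only for illustration.

```python
from collections import OrderedDict, deque


def lru_hit_rate(trace, capacity):
    """Fraction of expert requests served by an LRU cache (illustrative only)."""
    if capacity <= 0 or not trace:
        return 0.0
    cache = OrderedDict()
    hits = 0
    for expert_id in trace:
        if expert_id in cache:
            hits += 1
            cache.move_to_end(expert_id)  # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict least recently used expert
            cache[expert_id] = True
    return hits / len(trace)


def fifo_hit_rate(trace, capacity):
    """Fraction of expert requests served by a FIFO cache (illustrative only)."""
    if capacity <= 0 or not trace:
        return 0.0
    order = deque()   # insertion order of cached experts
    cached = set()
    hits = 0
    for expert_id in trace:
        if expert_id in cached:
            hits += 1  # FIFO does not reorder on a hit
        else:
            if len(order) >= capacity:
                cached.discard(order.popleft())  # evict oldest inserted expert
            order.append(expert_id)
            cached.add(expert_id)
    return hits / len(trace)


# Hypothetical trace of expert IDs requested by a router, one per step.
toy_trace = [0, 1, 0, 2, 0, 3]
results = {
    "lru": lru_hit_rate(toy_trace, capacity=2),
    "fifo": fifo_hit_rate(toy_trace, capacity=2),
}
```

On this toy trace the repeatedly requested expert 0 stays resident under LRU but is evicted under FIFO, so the two policies report different hit rates for the same workload. A real benchmark would replay traces recorded from an actual model and would measure latency on the target hardware, not hit rate alone.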