Apple researchers created a motion embedding system that models scene dynamics as trajectories at scale rather than synthesizing full video. The approach generates long, realistic motion from text prompts or spatial pokes, and it cuts the computational cost of predicting scene dynamics by orders of magnitude. Practitioners can now simulate complex futures without expensive video rendering.
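
To make the core idea concrete, here is a minimal, hypothetical sketch of what "predicting motion instead of pixels" can look like: a small model that encodes observed point trajectories, conditions on a text or poke embedding, and decodes future trajectory coordinates directly. Every name, shape, and architecture choice below (`TrajectoryDynamicsModel`, the transformer backbone, the dimensions) is an illustrative assumption, not the researchers' actual design.

```python
# Hypothetical sketch: predict future point trajectories in embedding
# space instead of rendering video frames. Names and shapes are
# illustrative assumptions, not the actual Apple model.

import torch
import torch.nn as nn


class TrajectoryDynamicsModel(nn.Module):
    """Encodes observed 2D point tracks, conditions on a text or 'poke'
    embedding, and decodes future track positions -- no pixels involved."""

    def __init__(self, num_points=256, steps_in=16, steps_out=48,
                 d_model=256, cond_dim=256):
        super().__init__()
        self.steps_out = steps_out
        # Embed each point's observed (x, y) history as one token.
        self.track_encoder = nn.Linear(steps_in * 2, d_model)
        # Project the conditioning signal (text embedding or poke vector).
        self.cond_proj = nn.Linear(cond_dim, d_model)
        # Lightweight transformer over point tokens plus a condition token.
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=4)
        # Decode each token to a future (x, y) trajectory.
        self.head = nn.Linear(d_model, steps_out * 2)

    def forward(self, tracks, cond):
        # tracks: (B, num_points, steps_in, 2), cond: (B, cond_dim)
        B, N, T, _ = tracks.shape
        tokens = self.track_encoder(tracks.reshape(B, N, T * 2))
        cond_tok = self.cond_proj(cond).unsqueeze(1)   # (B, 1, d_model)
        fused = self.backbone(torch.cat([cond_tok, tokens], dim=1))
        future = self.head(fused[:, 1:])               # drop condition token
        return future.reshape(B, N, self.steps_out, 2)


if __name__ == "__main__":
    model = TrajectoryDynamicsModel()
    tracks = torch.randn(2, 256, 16, 2)  # observed point tracks
    poke = torch.randn(2, 256)           # e.g. an encoded click-and-drag
    future = model(tracks, poke)         # predicted motion, no rendering
    print(future.shape)                  # torch.Size([2, 256, 48, 2])
```

The cost argument falls out of the output shape: predicting a few hundred (x, y) coordinates per timestep is vastly cheaper than generating millions of pixels per frame, which is the essence of the orders-of-magnitude savings claimed for trajectory-level prediction.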