Apple has introduced a motion embedding system trained on large-scale point trajectories produced by tracker models. It generates long, realistic motion sequences from text prompts or spatial "pokes" without the overhead of full video synthesis, running orders of magnitude more efficiently than current video models. For practitioners, this offers a faster way to predict complex scene dynamics.