The VegAS framework introduces a test-time verification step to stabilize multimodal LLMs (MLLMs) in embodied tasks. Instead of trusting a single output, it samples multiple candidate actions and uses a generative verifier to select the best one. This approach reduces brittleness in out-of-distribution scenarios and gives practitioners a more reliable way to select actions for MLLM-based robots.
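
The sample-then-verify idea described above can be illustrated with a minimal best-of-N sketch. This is not the VegAS implementation: `select_action`, `sample_action`, `score_action`, and `num_candidates` are hypothetical names introduced here, and the verifier is assumed to reduce to a function that returns a numeric score per candidate, which may differ from how the framework's generative verifier actually ranks actions.

```python
from typing import Callable, TypeVar

A = TypeVar("A")  # action type, e.g. a string command or a structured action


def select_action(
    sample_action: Callable[[], A],
    score_action: Callable[[A], float],
    num_candidates: int = 8,
) -> A:
    """Sample several candidate actions and return the highest-scoring one.

    sample_action: hypothetical stand-in for one stochastic MLLM policy call.
    score_action:  hypothetical stand-in for the verifier (higher is better).
    """
    if num_candidates < 1:
        raise ValueError("num_candidates must be at least 1")
    # Draw candidates independently from the policy.
    candidates = [sample_action() for _ in range(num_candidates)]
    # max() keeps the first candidate in sampling order when scores tie.
    return max(candidates, key=score_action)


if __name__ == "__main__":
    import random

    # Toy policy and verifier, for illustration only.
    toy_scores = {"move_forward": 0.2, "turn_left": 0.1, "pick_up_cup": 0.9}

    def toy_policy() -> str:
        return random.choice(list(toy_scores))

    def toy_verifier(action: str) -> float:
        return toy_scores[action]

    print(select_action(toy_policy, toy_verifier, num_candidates=4))
```

With `num_candidates=1` this reduces to trusting a single sample. Larger values trade extra policy and verifier calls for a better chance that at least one good candidate is available to select.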