The VegAS framework replaces single-action decoding with a sample-then-verify process. At inference time, the system samples an ensemble of candidate actions and uses a generative verifier to select the most reliable one, rather than committing to a single decoded action. This reduces brittleness in out-of-distribution scenarios and gives practitioners a more robust way to deploy MLLM-based agents in complex environments.
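The selection step described above can be illustrated with a minimal sketch. It is not the VegAS implementation: the names `select_action`, `sample_action`, `score_action`, and `num_candidates` are hypothetical, and the sketch assumes the verifier's judgment can be reduced to one scalar score per candidate, which the text does not specify. The policy and verifier are passed in as plain callables so that toy stand-ins can replace the actual models.

```python
import random
from typing import Callable, List, Tuple


def select_action(
    observation: str,
    sample_action: Callable[[str, random.Random], str],  # hypothetical policy sampler
    score_action: Callable[[str, str], float],           # hypothetical verifier score
    num_candidates: int = 8,                             # assumed ensemble size
    seed: int = 0,
) -> Tuple[str, List[Tuple[str, float]]]:
    """Sample several candidate actions and return the one the verifier rates highest."""
    rng = random.Random(seed)

    # 1. Ensemble sampling: draw several candidates instead of decoding a single action.
    candidates = [sample_action(observation, rng) for _ in range(num_candidates)]

    # 2. Drop duplicates (keeping first-seen order) so each action is verified once.
    unique = list(dict.fromkeys(candidates))

    # 3. Verification: score every distinct candidate against the observation.
    scored = [(action, score_action(observation, action)) for action in unique]

    # 4. Selection: keep the highest-scoring candidate (first one wins on ties).
    best_action, _ = max(scored, key=lambda pair: pair[1])
    return best_action, scored


# Toy stand-ins for the MLLM policy and the verifier (illustrative only).
def toy_policy(observation: str, rng: random.Random) -> str:
    return rng.choice(["move_left", "move_right", "grasp", "wait"])


def toy_verifier(observation: str, action: str) -> float:
    # Pretend the verifier prefers actions that are mentioned in the observation.
    return 1.0 if action in observation else 0.1


if __name__ == "__main__":
    best, scored = select_action("target in reach: grasp", toy_policy, toy_verifier)
    print(best, scored)
```

Keeping sampling and scoring behind separate callables mirrors the split between the action policy and the verifier in the description; with `num_candidates=1` the sketch reduces to ordinary single-action decoding.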