Experiments with 40 volunteers show that robots using visual language models can read human emotions by analyzing facial expressions and context. The systems adjust their behavior in real-time based on these cues. This improves human perception of robot capability during collaborative tasks. It is an incremental step toward more intuitive human-robot interaction.