The latest Sentence Transformers update introduces streamlined workflows for training multimodal embedding and reranker models. Users can now integrate images and text into a shared vector space using a unified API. This removes the need for custom training loops. Practitioners can deploy more accurate retrieval systems by finetuning these models on domain-specific image-text pairs.