The latest Sentence Transformers update introduces multimodal embedding and reranker models. This allows developers to map images and text into a shared vector space for improved retrieval. It simplifies the pipeline for visual question answering and image search. Practitioners can now deploy these unified models using a single, consistent API across different modalities.