The latest Sentence Transformers update adds native support for training multimodal embedding and reranker models. This integration streamlines the process of aligning text and image representations into a shared vector space. Developers can now fine-tune these models with fewer lines of code. It lowers the barrier for building high-accuracy multimodal retrieval systems.