NEW!
VIRAL: Visual Representation Alignment for Multimodal Large Language Models
arXiv, 2025