2024.06.26 [23’ NeurIPS] LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day Medical Dataset Multimodal
2024.06.25 [24’ CVPR] VIVL: Towards Better Vision-Inspired Vision-Language Models Multimodal Adapter Prefix Tuning
2024.06.25 [24’] Visual CoT: Unleashing Chain-of-Thought Reasoning in Multi-Modal Language Models Multimodal Chain-of-Thought