2024.07.05 [24’ ICML] Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models | Tags: Multimodal, Analysis, Instruction Tuning, Visual Encoder
2024.07.05 [24’ CVPR] Osprey: Pixel Understanding with Visual Instruction Tuning | Tags: Multimodal, Instruction Tuning, Visual Encoder, Visual Perception
2024.07.05 [24’] ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models | Tags: Multimodal, Visual Encoder
2024.07.05 [24’] Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs | Tags: Multimodal, Analysis, Visual Encoder
2024.07.03 [24’] M3D: Advancing 3D Medical Image Analysis with Multi-Modal Large Language Models | Tags: Medical, Multimodal, Segmentation