2024.07.03 [24’] M3D: Advancing 3D Medical Image Analysis with Multi-Modal Large Language Models | Medical Multimodal Segmentation
2024.07.02 [24’ CVPR] PerceptionGPT: Effectively Fusing Visual Perception into LLM | Multimodal Analysis Detection Segmentation Visual Perception
2024.07.02 [23’] LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models | Multimodal Segmentation Visual Grounding
2024.07.02 [24’] F-LMM: Grounding Frozen Large Multimodal Models | Multimodal Chain-of-Thought Segmentation Visual Grounding
2024.07.01 [24’ ICLR] KOSMOS-2: Grounding Multimodal Large Language Models to the World | Multimodal Visual Grounding