2024.07.03 [24’] LaSagnA: Language-based Segmentation Assistant for Complex Queries | Multimodal Referring Segmentation
2024.07.03 [24’ CVPR] Compositional Chain-of-Thought Prompting for Large Multimodal Models | Multimodal Chain-of-Thought
2024.07.03 [24’ CVPR] AnyRef: Multi-modal Instruction Tuned LLMs with Fine-grained Visual Perception | Multimodal Referring Segmentation Visual Grounding
2024.07.02 [24’ CVPR] PerceptionGPT: Effectively Fusing Visual Perception into LLM | Multimodal Analysis Detection Segmentation Visual Perception
2024.07.02 [23’] LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models | Multimodal Segmentation Visual Grounding