2024.06.07 [24’ CVPR] Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs Multimodal Analysis Visual Encoder
2024.06.04 [24’ CVPR] Honeybee: Locality-enhanced Projector for Multimodal LLM Multimodal Adapter Analysis
2024.06.03 [24’] Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding Language Meta-Prompting
2024.06.02 [23’] LISA++: An Improved Baseline for Reasoning Segmentation with Large Language Model Multimodal Reasoning Segmentation