| Towards a Multimodal Large Language Model with Pixel-Level Insight for Biomedicine | Dec 12, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Efficient and Comprehensive Feature Extraction in Large Vision-Language Model for Pathology Analysis | Dec 12, 2024 | Diagnostic, Language Modeling | Unverified | 0 |
| Foundation Models and Adaptive Feature Selection: A Synergistic Approach to Video Question Answering | Dec 12, 2024 | feature selection, Language Modeling | Unverified | 0 |
| Learning Novel Skills from Language-Generated Demonstrations | Dec 12, 2024 | Imitation Learning, Language Modeling | Unverified | 0 |
| When Text Embedding Meets Large Language Model: A Comprehensive Survey | Dec 12, 2024 | Information Retrieval, Language Modeling | Unverified | 0 |
| COEF-VQ: Cost-Efficient Video Quality Understanding through a Cascaded Multimodal LLM Framework | Dec 11, 2024 | GPU, Language Modeling | Unverified | 0 |
| Large Concept Models: Language Modeling in a Sentence Representation Space | Dec 11, 2024 | Language Modeling, Language Modelling | Code Available | 7 |
| LatentQA: Teaching LLMs to Decode Activations Into Natural Language | Dec 11, 2024 | Decoder, Language Modeling | Unverified | 0 |
| Template Matters: Understanding the Role of Instruction Templates in Multimodal Language Model Evaluation and Training | Dec 11, 2024 | Language Model Evaluation, Language Modeling | Code Available | 1 |
| Position-aware Guided Point Cloud Completion with CLIP Model | Dec 11, 2024 | Language Modeling, Language Modelling | Unverified | 0 |