| AdaptVision: Dynamic Input Scaling in MLLMs for Versatile Scene Understanding | Aug 30, 2024 | Language Modelling, Large Language Model | Code Available | 0 |
| Vintern-1B: An Efficient Multimodal Large Language Model for Vietnamese | Aug 22, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| MaVEn: An Effective Multi-granularity Hybrid Visual Encoding Framework for Multimodal Large Language Model | Aug 22, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| EE-MLLM: A Data-Efficient and Compute-Efficient Multimodal Large Language Model | Aug 21, 2024 | Computational Efficiency, Language Modeling | Unverified | 0 |
| Video Emotion Open-vocabulary Recognition Based on Multimodal Large Language Model | Aug 21, 2024 | Emotion Recognition, Language Modeling | Unverified | 0 |
| CaRDiff: Video Salient Object Ranking Chain of Thought Reasoning for Saliency Prediction with Diffusion | Aug 21, 2024 | Language Modelling, Large Language Model | Unverified | 0 |
| PanoSent: A Panoptic Sextuple Extraction Benchmark for Multimodal Conversational Aspect-based Sentiment Analysis | Aug 18, 2024 | Aspect-Based Sentiment Analysis, Aspect-Based Sentiment Analysis (ABSA) | Unverified | 0 |
| ChatGPT Meets Iris Biometrics | Aug 9, 2024 | Face Recognition, Iris Recognition | Unverified | 0 |
| VideoQA in the Era of LLMs: An Empirical Study | Aug 8, 2024 | Multimodal Large Language Model, Video Question Answering | Code Available | 0 |
| VolDoGer: LLM-assisted Datasets for Domain Generalization in Vision-Language Tasks | Jul 29, 2024 | Deep Learning, Domain Generalization | Unverified | 0 |