| SynTab-LLaVA: Enhancing Multimodal Table Understanding with Decoupled Synthesis | Jan 1, 2025 | Large Language Model | Code Available | 1 |
| Enhancing Video-LLM Reasoning via Agent-of-Thoughts Distillation | Jan 1, 2025 | Language Modeling | Unverified | 0 |
| GroundingFace: Fine-grained Face Understanding via Pixel Grounding Multimodal Large Language Model | Jan 1, 2025 | Attribute, Language Modeling | Unverified | 0 |
| Chain of Semantics Programming in 3D Gaussian Splatting Representation for 3D Vision Grounding | Jan 1, 2025 | 3DGS, Large Language Model | Unverified | 0 |
| HOIGPT: Learning Long-Sequence Hand-Object Interaction with Language Models | Jan 1, 2025 | Language Modeling | Unverified | 0 |
| DriveGPT4-V2: Harnessing Large Language Model Capabilities for Enhanced Closed-Loop Autonomous Driving | Jan 1, 2025 | Autonomous Driving, CARLA longest6 | Unverified | 0 |
| ROD-MLLM: Towards More Reliable Object Detection in Multimodal Large Language Models | Jan 1, 2025 | Large Language Model, Object | Unverified | 0 |
| VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos | Jan 1, 2025 | Large Language Model, Video Segmentation | Unverified | 0 |
| Notes-guided MLLM Reasoning: Enhancing MLLM with Knowledge and Visual Notes for Visual Question Answering | Jan 1, 2025 | Large Language Model, Multimodal Large Language Model | Code Available | 1 |
| ChatHuman: Chatting about 3D Humans with Tools | Jan 1, 2025 | Human-Object Interaction Detection, In-Context Learning | Unverified | 0 |
| Video-Bench: Human-Aligned Video Generation Benchmark | Jan 1, 2025 | Large Language Model, Video Generation | Unverified | 0 |
| S4-Driver: Scalable Self-Supervised Driving Multimodal Large Language Model with Spatio-Temporal Visual Representation | Jan 1, 2025 | Autonomous Driving, Autonomous Vehicles | Unverified | 0 |
| Classifier-to-Bias: Toward Unsupervised Automatic Bias Detection for Visual Classifiers | Jan 1, 2025 | Bias Detection, Large Language Model | Unverified | 0 |
| Labels Generated by Large Language Model Helps Measuring People's Empathy in Vitro | Jan 1, 2025 | Data Augmentation, Language Modeling | Code Available | 0 |
| Dynamics of Adversarial Attacks on Large Language Model-Based Search Engines | Jan 1, 2025 | Information Retrieval, Language Modeling | Unverified | 0 |
| Beyond Text: Implementing Multimodal Large Language Model-Powered Multi-Agent Systems Using a No-Code Platform | Jan 1, 2025 | Code Generation, Image Generation | Unverified | 0 |
| Large Language Model Based Multi-Agent System Augmented Complex Event Processing Pipeline for Internet of Multimedia Things | Jan 1, 2025 | Language Modeling | Unverified | 0 |
| Adjoint sharding for very long context training of state space models | Jan 1, 2025 | GPU, Large Language Model | Unverified | 0 |
| Towards Sustainable Large Language Model Serving | Dec 31, 2024 | GPU, Language Modeling | Unverified | 0 |
| CancerKG.ORG A Web-scale, Interactive, Verifiable Knowledge Graph-LLM Hybrid for Assisting with Optimal Cancer Treatment and Care | Dec 31, 2024 | Information Retrieval, Language Modeling | Unverified | 0 |
| Efficient Standardization of Clinical Notes using Large Language Models | Dec 31, 2024 | Language Modeling | Unverified | 0 |
| Setting Standards in Turkish NLP: TR-MMLU for Large Language Model Evaluation | Dec 31, 2024 | Language Model Evaluation, Language Modeling | Unverified | 0 |
| Generative Emergent Communication: Large Language Model is a Collective World Model | Dec 31, 2024 | Bayesian Inference, Language Modeling | Unverified | 0 |
| LLM-Rubric: A Multidimensional, Calibrated Approach to Automated Evaluation of Natural Language Texts | Dec 31, 2024 | Language Modeling | Code Available | 1 |
| DropMicroFluidAgents (DMFAs): Autonomous Droplet Microfluidic Research Framework Through Large Language Model Agents | Dec 30, 2024 | Language Modeling | Code Available | 0 |