| Phi-4 Technical Report | Dec 12, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Granite Guardian | Dec 10, 2024 | Hallucination, Language Modeling | Code Available | 2 |
| LinVT: Empower Your Image-level Large Language Model to Understand Videos | Dec 6, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| C^2LEVA: Toward Comprehensive and Contamination-Free Language Model Evaluation | Dec 6, 2024 | Language Model Evaluation, Language Modeling | Code Available | 2 |
| FLAIR: VLM with Fine-grained Language-informed Image Representations | Dec 4, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Beyond Text-Visual Attention: Exploiting Visual Cues for Effective Token Pruning in VLMs | Dec 2, 2024 | All, Language Modeling | Code Available | 2 |
| KV Shifting Attention Enhances Language Modeling | Nov 29, 2024 | In-Context Learning, Language Modeling | Code Available | 2 |
| OpenAD: Open-World Autonomous Driving Benchmark for 3D Object Detection | Nov 26, 2024 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| MotionLLaMA: A Unified Framework for Motion Synthesis and Comprehension | Nov 26, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| HyperSeg: Towards Universal Visual Segmentation with Large Language Model | Nov 26, 2024 | Language Modeling, Large Language Model | Code Available | 2 |