| OpenHuEval: Evaluating Large Language Model on Hungarian Specifics | Mar 27, 2025 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| BOLT: Boost Large Vision-Language Model Without Training for Long-form Video Understanding | Mar 27, 2025 | FormLanguage Modeling | CodeCode Available | 1 |
| CoLLM: A Large Language Model for Composed Image Retrieval | Mar 25, 2025 | Image RetrievalLanguage Modeling | CodeCode Available | 1 |
| LogQuant: Log-Distributed 2-Bit Quantization of KV Cache with Superior Accuracy Preservation | Mar 25, 2025 | Code CompletionLanguage Modeling | CodeCode Available | 1 |
| CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning | Mar 25, 2025 | HallucinationLanguage Modeling | CodeCode Available | 1 |
| PM4Bench: A Parallel Multilingual Multi-Modal Multi-task Benchmark for Large Vision Language Model | Mar 24, 2025 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| Language Model Uncertainty Quantification with Attention Chain | Mar 24, 2025 | Computational EfficiencyLanguage Modeling | CodeCode Available | 1 |
| Sun-Shine: A Large Language Model for Tibetan Culture | Mar 24, 2025 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| What Makes a Reward Model a Good Teacher? An Optimization Perspective | Mar 19, 2025 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| Does Your Vision-Language Model Get Lost in the Long Video Sampling Dilemma? | Mar 16, 2025 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| CoLLMLight: Cooperative Large Language Model Agents for Network-Wide Traffic Signal Control | Mar 14, 2025 | Computational EfficiencyLanguage Modeling | CodeCode Available | 1 |
| MMS-LLaMA: Efficient LLM-based Audio-Visual Speech Recognition with Minimal Multimodal Speech Tokens | Mar 14, 2025 | Audio-Visual Speech RecognitionComputational Efficiency | CodeCode Available | 1 |
| Open3DVQA: A Benchmark for Comprehensive Spatial Reasoning with Multimodal Large Language Model in Open Space | Mar 14, 2025 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| BiasEdit: Debiasing Stereotyped Language Models via Model Editing | Mar 11, 2025 | counterfactualLanguage Modeling | CodeCode Available | 1 |
| EvalTree: Profiling Language Model Weaknesses via Hierarchical Capability Trees | Mar 11, 2025 | ChatbotLanguage Modeling | CodeCode Available | 1 |
| GRITHopper: Decomposition-Free Multi-Hop Dense Retrieval | Mar 10, 2025 | Causal Language ModelingLanguage Modeling | CodeCode Available | 1 |
| V2Flow: Unifying Visual Tokenization and Large Language Model Vocabularies for Autoregressive Image Generation | Mar 10, 2025 | DecoderImage Generation | CodeCode Available | 1 |
| VLScene: Vision-Language Guidance Distillation for Camera-Based 3D Semantic Scene Completion | Mar 8, 2025 | 3D Semantic Scene CompletionAutonomous Driving | CodeCode Available | 1 |
| Multi-Layer Visual Feature Fusion in Multimodal LLMs: Methods, Analysis, and Best Practices | Mar 8, 2025 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| L^2M: Mutual Information Scaling Law for Long-Context Language Modeling | Mar 6, 2025 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| InfiniSST: Simultaneous Translation of Unbounded Speech with Large Language Model | Mar 4, 2025 | es-enLanguage Modeling | CodeCode Available | 1 |
| Words or Vision: Do Vision-Language Models Have Blind Faith in Text? | Mar 4, 2025 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| Superscopes: Amplifying Internal Feature Representations for Language Model Interpretation | Mar 3, 2025 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| Enhancing Monocular 3D Scene Completion with Diffusion Model | Mar 2, 2025 | 3D Reconstruction3D Scene Reconstruction | CodeCode Available | 1 |
| Safety Tax: Safety Alignment Makes Your Large Reasoning Models Less Reasonable | Mar 1, 2025 | Language ModelingLanguage Modelling | CodeCode Available | 1 |