| CITER: Collaborative Inference for Efficient Large Language Model Decoding with Token-Level Routing | Feb 4, 2025 | Collaborative Inference, Language Modeling | Code Available | 1 |
| Simulating Rumor Spreading in Social Networks using LLM Agents | Feb 3, 2025 | Language Modeling | Code Available | 1 |
| Fine-Tuning Discrete Diffusion Models with Policy Gradient Methods | Feb 3, 2025 | Language Modeling | Code Available | 1 |
| Speculative Ensemble: Fast Large Language Model Ensemble via Speculation | Feb 1, 2025 | Language Modeling | Code Available | 1 |
| Low-Rank Adapting Models for Sparse Autoencoders | Jan 31, 2025 | Language Modeling | Code Available | 1 |
| Scalable-Softmax Is Superior for Attention | Jan 31, 2025 | Information Retrieval, Language Modeling | Code Available | 1 |
| WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training | Jan 30, 2025 | Language Modeling | Code Available | 1 |
| 2SSP: A Two-Stage Framework for Structured Pruning of LLMs | Jan 29, 2025 | Language Modeling | Code Available | 1 |
| RadioLLM: Introducing Large Language Model into Cognitive Radio via Hybrid Prompt and Token Reprogrammings | Jan 28, 2025 | Denoising, Domain Generalization | Code Available | 1 |
| Atla Selene Mini: A General Purpose Evaluation Model | Jan 27, 2025 | Language Modeling | Code Available | 1 |
| Ocean-OCR: Towards General OCR Application via a Vision-Language Model | Jan 26, 2025 | Document Understanding, Language Modeling | Code Available | 1 |
| ARWKV: Pretrain is not what we need, an RNN-Attention-Based Language Model Born from Transformer | Jan 26, 2025 | Language Modeling | Code Available | 1 |
| DRESSing Up LLM: Efficient Stylized Question-Answering via Style Subspace Editing | Jan 24, 2025 | Language Modeling | Code Available | 1 |
| RealCritic: Towards Effectiveness-Driven Evaluation of Language Model Critiques | Jan 24, 2025 | Language Modeling | Code Available | 1 |
| Enhancing Biomedical Relation Extraction with Directionality | Jan 23, 2025 | Benchmarking, Document-level Relation Extraction | Code Available | 1 |
| PAINT: Paying Attention to INformed Tokens to Mitigate Hallucination in Large Vision-Language Model | Jan 21, 2025 | Hallucination, Image Captioning | Code Available | 1 |
| Glinthawk: A Two-Tiered Architecture for Offline LLM Inference | Jan 20, 2025 | CPU, Language Modeling | Code Available | 1 |
| EndoChat: Grounded Multimodal Large Language Model for Endoscopic Surgery | Jan 20, 2025 | Language Modeling | Code Available | 1 |
| AdaptiveLog: An Adaptive Log Analysis Framework with the Collaboration of Large and Small Language Model | Jan 19, 2025 | In-Context Learning, Language Modeling | Code Available | 1 |
| LAVCap: LLM-based Audio-Visual Captioning using Optimal Transport | Jan 16, 2025 | AudioCaps, Audio Captioning | Code Available | 1 |
| WhiSPA: Semantically and Psychologically Aligned Whisper with Self-Supervised Contrastive and Student-Teacher Learning | Jan 15, 2025 | Cross-Modal Alignment, Language Modeling | Code Available | 1 |
| Gandalf the Red: Adaptive Security for LLMs | Jan 14, 2025 | Blocking, Language Modeling | Code Available | 1 |
| 3UR-LLM: An End-to-End Multimodal Large Language Model for 3D Scene Understanding | Jan 14, 2025 | Language Modeling | Code Available | 1 |
| VASparse: Towards Efficient Visual Hallucination Mitigation for Large Vision-Language Model via Visual-Aware Sparsification | Jan 11, 2025 | Hallucination, Language Modeling | Code Available | 1 |
| Merging Feed-Forward Sublayers for Compressed Transformers | Jan 10, 2025 | Image Classification | Code Available | 1 |