| SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model | Feb 4, 2025 | Instruction Following, Language Modeling | Unverified | 0 |
| Prompt-based Depth Pruning of Large Language Models | Feb 4, 2025 | Language Modeling | Unverified | 0 |
| Reviving The Classics: Active Reward Modeling in Large Language Model Alignment | Feb 4, 2025 | Computational Efficiency, Experimental Design | Code Available | 2 |
| Reusing Embeddings: Reproducible Reward Model Research in Large Language Model Alignment without GPUs | Feb 4, 2025 | Code Generation, Language Modeling | Code Available | 2 |
| JingFang: A Traditional Chinese Medicine Large Language Model of Expert-Level Medical Diagnosis and Syndrome Differentiation-Based Treatment | Feb 4, 2025 | Diagnostic, Language Modeling | Unverified | 0 |
| Flatten Graphs as Sequences: Transformers are Scalable Graph Generators | Feb 4, 2025 | Decoder, Graph Generation | Unverified | 0 |
| Rethinking Homogeneity of Vision and Text Tokens in Large Vision-and-Language Models | Feb 4, 2025 | Language Modeling | Unverified | 0 |
| Analyzing Similarity Metrics for Data Selection for Language Model Pretraining | Feb 4, 2025 | Decoder, Language Modeling | Unverified | 0 |
| Unlocking Efficient Large Inference Models: One-Bit Unrolling Tips the Scales | Feb 4, 2025 | Language Modeling | Unverified | 0 |
| Connections between Schedule-Free Optimizers, AdEMAMix, and Accelerated SGD Variants | Feb 4, 2025 | Language Modeling | Code Available | 0 |
| LLM-USO: Large Language Model-based Universal Sizing Optimizer | Feb 4, 2025 | Bayesian Optimization, Language Modeling | Unverified | 0 |
| MPIC: Position-Independent Multimodal Context Caching System for Efficient MLLM Serving | Feb 4, 2025 | Language Modeling | Unverified | 0 |
| CITER: Collaborative Inference for Efficient Large Language Model Decoding with Token-Level Routing | Feb 4, 2025 | Collaborative Inference, Language Modeling | Code Available | 1 |
| ComplexDec: A Domain-robust High-fidelity Neural Audio Codec with Complex Spectrum Modeling | Feb 4, 2025 | Language Modeling | Unverified | 0 |
| EditIQ: Automated Cinematic Editing of Static Wide-Angle Videos via Dialogue Interpretation and Saliency Cues | Feb 4, 2025 | Dialogue Interpretation, Dialogue Understanding | Unverified | 0 |
| Knowledge Synthesis of Photosynthesis Research Using a Large Language Model | Feb 3, 2025 | Language Modeling | Unverified | 0 |
| Eliciting Language Model Behaviors with Investigator Agents | Feb 3, 2025 | Bayesian Inference, Hallucination | Unverified | 0 |
| InfoBridge: Mutual Information estimation via Bridge Matching | Feb 3, 2025 | Language Modeling | Unverified | 0 |
| Scaling Embedding Layers in Language Models | Feb 3, 2025 | Language Modeling | Unverified | 0 |
| Learning to Learn Weight Generation via Local Consistency Diffusion | Feb 3, 2025 | Domain Generalization, Few-Shot Learning | Unverified | 0 |
| Scalable Language Models with Posterior Inference of Latent Thought Vectors | Feb 3, 2025 | Decoder, Language Modeling | Unverified | 0 |
| The Differences Between Direct Alignment Algorithms are a Blur | Feb 3, 2025 | Language Modeling | Unverified | 0 |
| FALCON: Fine-grained Activation Manipulation by Contrastive Orthogonal Unalignment for Large Language Model | Feb 3, 2025 | Language Modeling | Unverified | 0 |
| Soup-of-Experts: Pretraining Specialist Models via Parameters Averaging | Feb 3, 2025 | Language Modeling | Unverified | 0 |
| QLESS: A Quantized Approach for Data Valuation and Selection in Large Language Model Fine-Tuning | Feb 3, 2025 | Data Valuation, Language Modeling | Code Available | 0 |