| Small Language Model Makes an Effective Long Text Extractor | Feb 11, 2025 | GPU, Language Modeling | Code Available | 1 |
| JamendoMaxCaps: A Large Scale Music-caption Dataset with Imputed Metadata | Feb 11, 2025 | Language Modeling | Code Available | 1 |
| DrugImproverGPT: A Large Language Model for Drug Optimization with Fine-Tuning via Structured Policy Optimization | Feb 11, 2025 | Language Modeling | Code Available | 0 |
| RomanLens: Latent Romanization and its role in Multilinguality in LLMs | Feb 11, 2025 | Language Modeling | Unverified | 0 |
| Mask-Enhanced Autoregressive Prediction: Pay Less Attention to Learn More | Feb 11, 2025 | Decoder, Information Retrieval | Code Available | 0 |
| Auditing Prompt Caching in Language Model APIs | Feb 11, 2025 | Decoder, Language Modeling | Code Available | 0 |
| Implicit Language Models are RNNs: Balancing Parallelization and Expressivity | Feb 10, 2025 | Language Modeling | Code Available | 1 |
| AppVLM: A Lightweight Vision Language Model for Online App Control | Feb 10, 2025 | Language Modeling | Unverified | 0 |
| Steel-LLM: From Scratch to Open Source -- A Personal Journey in Building a Chinese-Centric LLM | Feb 10, 2025 | Language Modeling | Code Available | 4 |
| ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates | Feb 10, 2025 | Hierarchical Reinforcement Learning, Language Modeling | Code Available | 4 |