| Incremental Sequence Classification with Temporal Consistency | May 22, 2025 | ClassificationLanguage Modeling | —Unverified | 0 |
| Mechanistic Understanding and Mitigation of Language Confusion in English-Centric Large Language Models | May 22, 2025 | BenchmarkingLanguage Modeling | —Unverified | 0 |
| TensorAR: Refinement is All You Need in Autoregressive Image Generation | May 22, 2025 | AllImage Generation | —Unverified | 0 |
| Evaluating Large Language Model with Knowledge Oriented Language Specific Simple Question Answering | May 22, 2025 | Global FactsLanguage Modeling | CodeCode Available | 0 |
| A Japanese Language Model and Three New Evaluation Benchmarks for Pharmaceutical NLP | May 22, 2025 | Continual PretrainingDiagnostic | CodeCode Available | 0 |
| SATURN: SAT-based Reinforcement Learning to Unleash Language Model Reasoning | May 22, 2025 | Language ModelingLanguage Modelling | CodeCode Available | 0 |
| CASTILLO: Characterizing Response Length Distributions of Large Language Models | May 22, 2025 | Instruction FollowingLanguage Modeling | CodeCode Available | 0 |
| Any Large Language Model Can Be a Reliable Judge: Debiasing with a Reasoning-based Bias Detector | May 21, 2025 | Bias DetectionIn-Context Learning | —Unverified | 0 |
| Leveraging Online Data to Enhance Medical Knowledge in a Small Persian Language Model | May 21, 2025 | Language ModelingLanguage Modelling | CodeCode Available | 0 |
| Keep Security! Benchmarking Security Policy Preservation in Large Language Model Contexts Against Indirect Attacks in Question Answering | May 21, 2025 | BenchmarkingLanguage Modeling | CodeCode Available | 0 |