| Leveraging Online Data to Enhance Medical Knowledge in a Small Persian Language Model | May 21, 2025 | Language Modeling | Code Available | 0 |
| Keep Security! Benchmarking Security Policy Preservation in Large Language Model Contexts Against Indirect Attacks in Question Answering | May 21, 2025 | Benchmarking, Language Modeling | Code Available | 0 |
| Diagnosing our datasets: How does my language model learn clinical information? | May 21, 2025 | Language Modeling | Code Available | 0 |
| Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective | May 21, 2025 | Instruction Following, Language Modeling | Unverified | 0 |
| LyapLock: Bounded Knowledge Preservation in Sequential Large Language Model Editing | May 21, 2025 | Language Modeling | Code Available | 0 |
| Revealing Language Model Trajectories via Kullback-Leibler Divergence | May 21, 2025 | Language Modeling | Unverified | 0 |
| X-WebAgentBench: A Multilingual Interactive Web Benchmark for Evaluating Global Agentic System | May 21, 2025 | Language Modeling | Code Available | 0 |
| DEBATE, TRAIN, EVOLVE: Self Evolution of Language Model Reasoning | May 21, 2025 | Domain Generalization, Language Modeling | Unverified | 0 |
| Listen to the Context: Towards Faithful Large Language Models for Retrieval Augmented Generation on Climate Questions | May 21, 2025 | Language Modeling | Unverified | 0 |
| Trajectory Bellman Residual Minimization: A Simple Value-Based Method for LLM Reasoning | May 21, 2025 | Language Modeling | Unverified | 0 |