| LAG-MMLU: Benchmarking Frontier LLM Understanding in Latvian and Giriama | Mar 14, 2025 | Benchmarking, MMLU | Unverified | 0 | 0 |
| Language Complexity Measurement as a Noisy Zero-Shot Proxy for Evaluating LLM Performance | Feb 17, 2025 | Benchmarking, Dependency Parsing | Unverified | 0 | 0 |
| Large Language Models Could Be Rote Learners | Apr 11, 2025 | Memorization, MMLU | Unverified | 0 | 0 |
| Large Language Models Often Know When They Are Being Evaluated | May 28, 2025 | MMLU, Multiple-choice | Unverified | 0 | 0 |
| Learning from "Silly" Questions Improves Large Language Models, But Only Slightly | Nov 21, 2024 | Econometrics, Global Facts | Unverified | 0 | 0 |
| Learning What Matters: Probabilistic Task Selection via Mutual Information for Model Finetuning | Jul 16, 2025 | Diversity, MMLU | Unverified | 0 | 0 |
| Let's Do a Thought Experiment: Using Counterfactuals to Improve Moral Reasoning | Jun 25, 2023 | counterfactual, Math | Unverified | 0 | 0 |
| Leveraging Approximate Caching for Faster Retrieval-Augmented Generation | Mar 7, 2025 | Language Modeling, Language Modelling | Unverified | 0 | 0 |
| Leveraging Uncertainty Estimation for Efficient LLM Routing | Feb 16, 2025 | GSM8K, MMLU | Unverified | 0 | 0 |
| Lizard: An Efficient Linearization Framework for Large Language Models | Jul 11, 2025 | Language Modeling, Language Modelling | Unverified | 0 | 0 |