| Contrastive Decoding Improves Reasoning in Large Language Models | Sep 17, 2023 | GSM8K, HellaSwag | Unverified | 0 |
| Towards Multilingual LLM Evaluation for European Languages | Oct 11, 2024 | ARC, GSM8K | Unverified | 0 |
| GRIN: GRadient-INformed MoE | Sep 18, 2024 | HellaSwag, HumanEval | Unverified | 0 |
| When Chosen Wisely, More Data Is What You Need: A Universal Sample-Efficient Strategy For Data Augmentation | Nov 16, 2021 | Data Augmentation, HellaSwag | Unverified | 0 |
| Who's Harry Potter? Approximate Unlearning in LLMs | Oct 3, 2023 | ARC, GPU | Unverified | 0 |
| HellaSwag-Pro: A Large-Scale Bilingual Benchmark for Evaluating the Robustness of LLMs in Commonsense Reasoning | Feb 17, 2025 | HellaSwag | Unverified | 0 |
| Domain-Adaptive Continued Pre-Training of Small Language Models | Apr 13, 2025 | Domain Adaptation, HellaSwag | Unverified | 0 |
| You can remove GPT2's LayerNorm by fine-tuning | Sep 6, 2024 | HellaSwag | Code Available | 0 |
| Attacks on Node Attributes in Graph Neural Networks | Feb 19, 2024 | Contrastive Learning, HellaSwag | Code Available | 0 |
| FinerWeb-10BT: Refining Web Data with LLM-Based Line-Level Filtering | Jan 13, 2025 | Descriptive, HellaSwag | Code Available | 0 |