| LoLCATs: On Low-Rank Linearizing of Large Language Models | Oct 14, 2024 | MMLU | CodeCode Available | 3 |
| Scaling Instruction-Finetuned Language Models | Oct 20, 2022 | Coreference ResolutionCross-Lingual Question Answering | CodeCode Available | 3 |
| Are We Done with MMLU? | Jun 6, 2024 | MMLUVirology | CodeCode Available | 3 |
| Compact Language Models via Pruning and Knowledge Distillation | Jul 19, 2024 | Knowledge DistillationLanguage Modeling | CodeCode Available | 3 |
| General-Reasoner: Advancing LLM Reasoning Across All Domains | May 20, 2025 | AllMath | CodeCode Available | 3 |
| ChatMusician: Understanding and Generating Music Intrinsically with LLM | Feb 25, 2024 | MMLUText Generation | CodeCode Available | 3 |
| HadaCore: Tensor Core Accelerated Hadamard Transform Kernel | Dec 12, 2024 | GPUMMLU | CodeCode Available | 3 |
| MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark | Jun 3, 2024 | MMLUMulti-task Language Understanding | CodeCode Available | 3 |
| ReasonIR: Training Retrievers for Reasoning Tasks | Apr 29, 2025 | Information RetrievalMMLU | CodeCode Available | 3 |
| YourBench: Easy Custom Evaluation Sets for Everyone | Apr 2, 2025 | MMLU | CodeCode Available | 3 |