| ORBIT: Cost-Effective Dataset Curation for Large Language Model Domain Adaptation with an Astronomy Case Study | Dec 19, 2024 | Astronomy, Domain Adaptation | Code Available | 0 |
| MixLLM: LLM Quantization with Global Mixed-precision between Output-features and Highly-efficient System Design | Dec 19, 2024 | MMLU, Quantization | Unverified | 0 |
| ChainRank-DPO: Chain Rank Direct Preference Optimization for LLM Rankers | Dec 18, 2024 | MMLU, Reranking | Unverified | 0 |
| Unveiling the Secret Recipe: A Guide For Supervised Fine-Tuning Small LLMs | Dec 17, 2024 | MMLU | Unverified | 0 |
| Nanoscaling Floating-Point (NxFP): NanoMantissa, Adaptive Microexponents, and Code Recycling for Direct-Cast Compression of Large Language Models | Dec 15, 2024 | MMLU, Quantization | Unverified | 0 |
| LLM Distillation for Efficient Few-Shot Multiple Choice Question Answering | Dec 13, 2024 | Few-Shot Learning, Knowledge Distillation | Unverified | 0 |
| Llama 3 Meets MoE: Efficient Upcycling | Dec 13, 2024 | Mixture-of-Experts, MMLU | Unverified | 0 |
| Global MMLU: Understanding and Addressing Cultural and Linguistic Biases in Multilingual Evaluation | Dec 4, 2024 | MMLU | Unverified | 0 |
| Nemotron-CC: Transforming Common Crawl into a Refined Long-Horizon Pretraining Dataset | Dec 3, 2024 | ARC, MMLU | Unverified | 0 |
| Noise Injection Reveals Hidden Capabilities of Sandbagging Language Models | Dec 2, 2024 | MMLU, Multiple-choice | Code Available | 0 |