| Automatic Instruction Evolving for Large Language Models | Jun 2, 2024 | GSM8K, HumanEval | Code Available | 3 |
| GKT: A Novel Guidance-Based Knowledge Transfer Framework For Efficient Cloud-edge Collaboration LLM Deployment | May 30, 2024 | GSM8K, Knowledge Distillation | Code Available | 0 |
| SpecDec++: Boosting Speculative Decoding via Adaptive Candidate Lengths | May 30, 2024 | GSM8K, HumanEval | Unverified | 0 |
| Arithmetic Reasoning with LLM: Prolog Generation & Permutation | May 28, 2024 | Arithmetic Reasoning, Data Augmentation | Unverified | 0 |
| LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters | May 27, 2024 | Benchmarking, GSM8K | Code Available | 2 |
| Multi-Reference Preference Optimization for Large Language Models | May 26, 2024 | GSM8K, TruthfulQA | Unverified | 0 |
| MindStar: Enhancing Math Reasoning in Pre-trained LLMs at Inference Time | May 25, 2024 | GSM8K, Math | Unverified | 0 |
| Revisiting MoE and Dense Speed-Accuracy Comparisons for LLM Training | May 23, 2024 | GSM8K, Mixture-of-Experts | Code Available | 7 |
| ZipCache: Accurate and Efficient KV Cache Quantization with Salient Token Identification | May 23, 2024 | GPU, GSM8K | Code Available | 1 |
| Unchosen Experts Can Contribute Too: Unleashing MoE Models' Power by Self-Contrast | May 23, 2024 | Computational Efficiency, GSM8K | Code Available | 1 |