| LM-Cocktail: Resilient Tuning of Language Models via Model Merging | Nov 22, 2023 | Language Modeling, Language Modelling | Unverified | 0 |
| ComPEFT: Compression for Communicating Parameter Efficient Updates via Sparsification and Quantization | Nov 22, 2023 | GPU, Language Modelling | Code Available | 1 |
| AcademicGPT: Empowering Academic Research | Nov 21, 2023 | Abstract generation, General Knowledge | Unverified | 0 |
| Investigating Data Contamination in Modern Benchmarks for Large Language Models | Nov 16, 2023 | Common Sense Reasoning, MMLU | Unverified | 0 |
| ConceptPsy: A Benchmark Suite with Conceptual Comprehensiveness in Psychology | Nov 16, 2023 | MMLU, Multiple-choice | Unverified | 0 |
| MedAgents: Large Language Models as Collaborators for Zero-shot Medical Reasoning | Nov 16, 2023 | MedQA, MMLU | Code Available | 2 |
| Rethinking Benchmark and Contamination for Language Models with Rephrased Samples | Nov 8, 2023 | HumanEval, MMLU | Code Available | 2 |
| The Alignment Ceiling: Objective Mismatch in Reinforcement Learning from Human Feedback | Oct 31, 2023 | GSM8K, MMLU | Unverified | 0 |
| TeacherLM: Teaching to Fish Rather Than Giving the Fish, Language Modeling Likewise | Oct 29, 2023 | Data Augmentation, Language Modeling | Unverified | 0 |
| An Open Source Data Contamination Report for Large Language Models | Oct 26, 2023 | HellaSwag, Language Modeling | Code Available | 1 |