| ConceptPsy: A Benchmark Suite with Conceptual Comprehensiveness in Psychology | Nov 16, 2023 | MMLU, Multiple-choice | Unverified | 0 |
| The Alignment Ceiling: Objective Mismatch in Reinforcement Learning from Human Feedback | Oct 31, 2023 | GSM8K, MMLU | Unverified | 0 |
| TeacherLM: Teaching to Fish Rather Than Giving the Fish, Language Modeling Likewise | Oct 29, 2023 | Data Augmentation, Language Modeling | Unverified | 0 |
| Evaluation of large language models using an Indian language LGBTI+ lexicon | Oct 26, 2023 | Machine Translation, MMLU | Unverified | 0 |
| Irreducible Curriculum for Language Model Pretraining | Oct 23, 2023 | Language Modeling, Language Modelling | Unverified | 0 |
| Instruction Tuning with Human Curriculum | Oct 14, 2023 | ARC, MMLU | Unverified | 0 |
| Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models | Oct 9, 2023 | MMLU | Unverified | 0 |
| Rethinking Channel Dimensions to Isolate Outliers for Low-bit Weight Quantization of Large Language Models | Sep 27, 2023 | HumanEval, Language Modeling | Code Available | 0 |
| Pruning Large Language Models via Accuracy Predictor | Sep 18, 2023 | MMLU, Model Compression | Unverified | 0 |
| Empowering Cross-lingual Abilities of Instruction-tuned Large Language Models by Translation-following demonstrations | Aug 27, 2023 | Instruction Following, MMLU | Code Available | 0 |
| The Poison of Alignment | Aug 25, 2023 | MMLU | Unverified | 0 |
| Let's Do a Thought Experiment: Using Counterfactuals to Improve Moral Reasoning | Jun 25, 2023 | counterfactual, Math | Unverified | 0 |
| Inconsistencies in Masked Language Models | Dec 30, 2022 | LAMBADA, MMLU | Code Available | 0 |
| Measuring Progress on Scalable Oversight for Large Language Models | Nov 4, 2022 | Experimental Design, Language Modelling | Unverified | 0 |
| Transcending Scaling Laws with 0.1% Extra Compute | Oct 20, 2022 | Arithmetic Reasoning, Cross-Lingual Question Answering | Unverified | 0 |