| LM-Cocktail: Resilient Tuning of Language Models via Model Merging | Nov 22, 2023 | Language Modeling, Language Modelling | Unverified | 0 |
| ComPEFT: Compression for Communicating Parameter Efficient Updates via Sparsification and Quantization | Nov 22, 2023 | GPU, Language Modelling | Code Available | 1 |
| AcademicGPT: Empowering Academic Research | Nov 21, 2023 | Abstract generation, General Knowledge | Unverified | 0 |
| Investigating Data Contamination in Modern Benchmarks for Large Language Models | Nov 16, 2023 | Common Sense Reasoning, MMLU | Unverified | 0 |
| ConceptPsy: A Benchmark Suite with Conceptual Comprehensiveness in Psychology | Nov 16, 2023 | MMLU, Multiple-choice | Unverified | 0 |
| MedAgents: Large Language Models as Collaborators for Zero-shot Medical Reasoning | Nov 16, 2023 | MedQA, MMLU | Code Available | 2 |
| Rethinking Benchmark and Contamination for Language Models with Rephrased Samples | Nov 8, 2023 | HumanEval, MMLU | Code Available | 2 |
| The Alignment Ceiling: Objective Mismatch in Reinforcement Learning from Human Feedback | Oct 31, 2023 | GSM8K, MMLU | Unverified | 0 |
| TeacherLM: Teaching to Fish Rather Than Giving the Fish, Language Modeling Likewise | Oct 29, 2023 | Data Augmentation, Language Modeling | Unverified | 0 |
| An Open Source Data Contamination Report for Large Language Models | Oct 26, 2023 | HellaSwag, Language Modeling | Code Available | 1 |
| Evaluation of large language models using an Indian language LGBTI+ lexicon | Oct 26, 2023 | Machine Translation, MMLU | Unverified | 0 |
| Irreducible Curriculum for Language Model Pretraining | Oct 23, 2023 | Language Modeling, Language Modelling | Unverified | 0 |
| Instruction Tuning with Human Curriculum | Oct 14, 2023 | ARC, MMLU | Unverified | 0 |
| Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models | Oct 9, 2023 | MMLU | Unverified | 0 |
| Compresso: Structured Pruning with Collaborative Prompting Learns Compact Large Language Models | Oct 8, 2023 | MMLU, Natural Language Understanding | Code Available | 1 |
| A Dynamic LLM-Powered Agent Network for Task-Oriented Agent Collaboration | Oct 3, 2023 | Arithmetic Reasoning, Code Generation | Code Available | 1 |
| Rethinking Channel Dimensions to Isolate Outliers for Low-bit Weight Quantization of Large Language Models | Sep 27, 2023 | HumanEval, Language Modeling | Code Available | 0 |
| Baichuan 2: Open Large-scale Language Models | Sep 19, 2023 | Feature Engineering, GSM8K | Code Available | 4 |
| OpenBA: An Open-sourced 15B Bilingual Asymmetric seq2seq Model Pre-trained from Scratch | Sep 19, 2023 | Belebele, MMLU | Code Available | 1 |
| Pruning Large Language Models via Accuracy Predictor | Sep 18, 2023 | MMLU, Model Compression | Unverified | 0 |
| Empowering Cross-lingual Abilities of Instruction-tuned Large Language Models by Translation-following demonstrations | Aug 27, 2023 | Instruction Following, MMLU | Code Available | 0 |
| The Poison of Alignment | Aug 25, 2023 | MMLU | Unverified | 0 |
| Red-Teaming Large Language Models using Chain of Utterances for Safety-Alignment | Aug 18, 2023 | MMLU, Red Teaming | Code Available | 1 |
| Let's Do a Thought Experiment: Using Counterfactuals to Improve Moral Reasoning | Jun 25, 2023 | counterfactual, Math | Unverified | 0 |
| Augmentation-Adapted Retriever Improves Generalization of Language Models as Generic Plug-In | May 27, 2023 | MMLU, Retrieval | Code Available | 1 |
| The Art of SOCRATIC QUESTIONING: Recursive Thinking with Large Language Models | May 24, 2023 | Language Modelling, Math | Code Available | 1 |
| Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers | May 21, 2023 | MMLU, Zero-shot Generalization | Code Available | 1 |
| Towards Expert-Level Medical Question Answering with Large Language Models | May 16, 2023 | Medical Question Answering, MedQA | Code Available | 1 |
| From Zero to Hero: Examining the Power of Symbolic Tasks in Instruction Tuning | Apr 17, 2023 | MMLU, Zero-shot Generalization | Code Available | 1 |
| ART: Automatic multi-step reasoning and tool-use for large language models | Mar 16, 2023 | MMLU | Code Available | 6 |
| REPLUG: Retrieval-Augmented Black-Box Language Models | Jan 30, 2023 | Language Modeling, Language Modelling | Code Available | 3 |
| Inconsistencies in Masked Language Models | Dec 30, 2022 | LAMBADA, MMLU | Code Available | 0 |
| Large Language Models Encode Clinical Knowledge | Dec 26, 2022 | Clinical Knowledge, MedQA | Code Available | 1 |
| Galactica: A Large Language Model for Science | Nov 16, 2022 | Anachronisms, Bias Detection | Code Available | 4 |
| Measuring Progress on Scalable Oversight for Large Language Models | Nov 4, 2022 | Experimental Design, Language Modelling | Unverified | 0 |
| Scaling Instruction-Finetuned Language Models | Oct 20, 2022 | Coreference Resolution, Cross-Lingual Question Answering | Code Available | 3 |
| Transcending Scaling Laws with 0.1% Extra Compute | Oct 20, 2022 | Arithmetic Reasoning, Cross-Lingual Question Answering | Unverified | 0 |
| Atlas: Few-shot Learning with Retrieval Augmented Language Models | Aug 5, 2022 | Fact Checking, Few-Shot Learning | Code Available | 2 |
| UL2: Unifying Language Learning Paradigms | May 10, 2022 | Arithmetic Reasoning, Common Sense Reasoning | Code Available | 1 |
| Training Compute-Optimal Large Language Models | Mar 29, 2022 | Anachronisms, Analogical Similarity | Code Available | 6 |