| Llama 2: Open Foundation and Fine-Tuned Chat Models | Jul 18, 2023 | Arithmetic Reasoning | Code Available | 8 | 5 |
| LLaMA: Open and Efficient Foundation Language Models | Feb 27, 2023 | Arithmetic Reasoning, Code Generation | Code Available | 7 | 5 |
| GPT-4 Technical Report | Mar 15, 2023 | answerability prediction, Arithmetic Reasoning | Code Available | 6 | 5 |
| Mamba: Linear-Time Sequence Modeling with Selective State Spaces | Dec 1, 2023 | 2D Pose Estimation, Common Sense Reasoning | Code Available | 6 | 5 |
| Training Compute-Optimal Large Language Models | Mar 29, 2022 | Anachronisms, Analogical Similarity | Code Available | 6 | 5 |
| Mistral 7B | Oct 10, 2023 | answerability prediction, Arithmetic Reasoning | Code Available | 6 | 5 |
| Factuality Enhanced Language Models for Open-Ended Text Generation | Jun 9, 2022 | Misconceptions, Sentence | Code Available | 5 | 5 |
| Language Models are Few-Shot Learners | May 28, 2020 | answerability prediction, Articles | Code Available | 3 | 5 |
| Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model | Jan 28, 2022 | Few-Shot Learning, Language Modeling | Code Available | 3 | 5 |
| Finetuned Language Models Are Zero-Shot Learners | Sep 3, 2021 | ARC, Common Sense Reasoning | Code Available | 3 | 5 |
| MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts | Apr 22, 2024 | Common Sense Reasoning, GPU | Code Available | 3 | 5 |
| Parameter-Efficient Sparsity Crafting from Dense to Mixture-of-Experts for Instruction Tuning on General Tasks | Jan 5, 2024 | Arithmetic Reasoning, Code Generation | Code Available | 2 | 5 |
| LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions | Apr 27, 2023 | Common Sense Reasoning, Coreference Resolution | Code Available | 2 | 5 |
| The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning | May 23, 2023 | Common Sense Reasoning, Common Sense Reasoning (Zero-Shot) | Code Available | 2 | 5 |
| Crosslingual Generalization through Multitask Finetuning | Nov 3, 2022 | Coreference Resolution, Cross-Lingual Transfer | Code Available | 2 | 5 |
| DeBERTa: Decoding-enhanced BERT with Disentangled Attention | Jun 5, 2020 | Common Sense Reasoning, Coreference Resolution | Code Available | 2 | 5 |
| Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning | Oct 10, 2023 | Language Modeling, Language Modelling | Code Available | 2 | 5 |
| Scaling Language Models: Methods, Analysis & Insights from Training Gopher | Dec 8, 2021 | Abstract Algebra, Anachronisms | Code Available | 2 | 5 |
| PaLM: Scaling Language Modeling with Pathways | Apr 5, 2022 | Auto Debugging, Code Generation | Code Available | 2 | 5 |
| UNICORN on RAINBOW: A Universal Commonsense Reasoning Model on a New Multitask Benchmark | Mar 24, 2021 | Common Sense Reasoning, HellaSwag | Code Available | 1 | 5 |
| Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners | Oct 6, 2022 | Common Sense Reasoning, Coreference Resolution | Code Available | 1 | 5 |
| HONEST: Measuring Hurtful Sentence Completion in Language Models | Jun 1, 2021 | Hate Speech Detection, Hurtful Sentence Completion | Code Available | 1 | 5 |
| Task Compass: Scaling Multi-task Pre-training with Task Prefix | Oct 12, 2022 | Common Sense Reasoning, Data Augmentation | Code Available | 1 | 5 |
| GePpeTto Carves Italian into a Language Model | Apr 29, 2020 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| Two is Better than Many? Binary Classification as an Effective Approach to Multi-Choice Question Answering | Oct 29, 2022 | Binary Classification, Question Answering | Code Available | 1 | 5 |
| RoBERTa: A Robustly Optimized BERT Pretraining Approach | Jul 26, 2019 | Common Sense Reasoning, Document Image Classification | Code Available | 1 | 5 |
| Measuring Harmful Sentence Completion in Language Models for LGBTQIA+ Individuals | May 1, 2022 | Sentence, Sentence Completion | Code Available | 1 | 5 |
| Exploring the Benefits of Training Expert Language Models over Instruction Tuning | Feb 7, 2023 | Common Sense Reasoning, Coreference Resolution | Code Available | 1 | 5 |
| A Deep Architecture for Semantic Matching with Multiple Positional Sentence Representations | Nov 26, 2015 | Information Retrieval, Question Answering | Code Available | 0 | 5 |
| CODAH: An Adversarially Authored Question-Answer Dataset for Common Sense | Apr 8, 2019 | Common Sense Reasoning, Question Answering | Code Available | 0 | 5 |
| BloombergGPT: A Large Language Model for Finance | Mar 30, 2023 | Causal Judgment, Common Sense Reasoning | Code Available | 0 | 5 |
| BTRec: BERT-Based Trajectory Recommendation for Personalized Tours | Oct 30, 2023 | Language Modelling, Sentence | Code Available | 0 | 5 |
| CODAH: An Adversarially-Authored Question Answering Dataset for Common Sense | Jun 1, 2019 | Common Sense Reasoning, Question Answering | Code Available | 0 | 5 |
| Dependency Recurrent Neural Language Models for Sentence Completion | Jul 5, 2015 | Language Modeling, Language Modelling | Code Available | 0 | 5 |
| DiscoSense: Commonsense Reasoning with Discourse Connectives | Oct 22, 2022 | Sentence Completion | Code Available | 0 | 5 |
| HellaSwag: Can a Machine Really Finish Your Sentence? | May 19, 2019 | HellaSwag, Natural Language Inference | Code Available | 0 | 5 |
| Investigating Subtler Biases in LLMs: Ageism, Beauty, Institutional, and Nationality Bias in Generative Models | Sep 16, 2023 | Age/Bias-conflicting, Bias Detection | Code Available | 0 | 5 |
| KatzBot: Revolutionizing Academic Chatbot for Enhanced Communication | Oct 21, 2024 | Chatbot, Language Modeling | Code Available | 0 | 5 |
| Language Model Sentence Completion with a Parser-Driven Rhetorical Control Method | Feb 9, 2024 | Language Modeling, Language Modelling | Code Available | 0 | 5 |
| Learning Semantically and Additively Compositional Distributional Representations | Jun 8, 2016 | General Classification, Relation Classification | Code Available | 0 | 5 |
| mahaNLP: A Marathi Natural Language Processing Library | Nov 5, 2023 | Hate Speech Detection, NER | Code Available | 0 | 5 |
| Mixture-of-Subspaces in Low-Rank Adaptation | Jun 16, 2024 | Common Sense Reasoning, Image Generation | Code Available | 0 | 5 |
| Muppet: Massive Multi-task Representations with Pre-Finetuning | Jan 26, 2021 | Abstractive Text Summarization, Common Sense Reasoning | Code Available | 0 | 5 |
| PaLM 2 Technical Report | May 17, 2023 | Code Generation, Common Sense Reasoning | Code Available | 0 | 5 |
| Recurrent Memory Networks for Language Modeling | Jan 6, 2016 | Language Modeling, Language Modelling | Code Available | 0 | 5 |
| ScoNe: Benchmarking Negation Reasoning in Language Models With Fine-Tuning and In-Context Learning | May 30, 2023 | Benchmarking, In-Context Learning | Code Available | 0 | 5 |
| SC-Ques: A Sentence Completion Question Dataset for English as a Second Language Learners | Jun 24, 2022 | Sentence, Sentence Completion | Code Available | 0 | 5 |
| Solving ESL Sentence Completion Questions via Pre-trained Neural Language Models | Jul 15, 2021 | Sentence, Sentence Completion | Code Available | 0 | 5 |
| Top-down Tree Long Short-Term Memory Networks | Oct 31, 2015 | Dependency Parsing, Reranking | Code Available | 0 | 5 |
| Learning Word Representations with Hierarchical Sparse Coding | Jun 8, 2014 | Sentence, Sentence Completion | Unverified | 0 | 0 |