| Reasoning with Language Model Prompting: A Survey | Dec 19, 2022 | Arithmetic Reasoning, Common Sense Reasoning | Code Available | 3 |
| Discovering Language Model Behaviors with Model-Written Evaluations | Dec 19, 2022 | Language Modeling, Language Modelling | Code Available | 3 |
| Prompting Is Programming: A Query Language for Large Language Models | Dec 12, 2022 | Code Generation, Language Modeling | Code Available | 3 |
| Human-level play in the game of Diplomacy by combining language models with strategic reasoning | Nov 22, 2022 | AI Agent, Language Modeling | Code Available | 3 |
| What Language Model to Train if You Have One Million GPU Hours? | Oct 27, 2022 | GPU, Language Modeling | Code Available | 3 |
| Diffusion-LM Improves Controllable Text Generation | May 27, 2022 | Language Modeling, Language Modelling | Code Available | 3 |
| A Systematic Evaluation of Large Language Models of Code | Feb 26, 2022 | Language Modeling, Language Modelling | Code Available | 3 |
| Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model | Jan 28, 2022 | Few-Shot Learning, Language Modeling | Code Available | 3 |
| Datasheet for the Pile | Jan 13, 2022 | Language Modeling, Language Modelling | Code Available | 3 |
| 8-bit Optimizers via Block-wise Quantization | Oct 6, 2021 | Language Modeling, Language Modelling | Code Available | 3 |
| Finetuned Language Models Are Zero-Shot Learners | Sep 3, 2021 | ARC, Common Sense Reasoning | Code Available | 3 |
| W2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training | Aug 7, 2021 | Contrastive Learning, Language Modeling | Code Available | 3 |
| Evaluating Large Language Models Trained on Code | Jul 7, 2021 | Code Generation, HumanEval | Code Available | 3 |
| Multi-objective Asynchronous Successive Halving | Jun 23, 2021 | Fairness, Hyperparameter Optimization | Code Available | 3 |
| GLM: General Language Model Pretraining with Autoregressive Blank Infilling | Mar 18, 2021 | Abstractive Text Summarization, Classification | Code Available | 3 |
| Prefix-Tuning: Optimizing Continuous Prompts for Generation | Jan 1, 2021 | Language Modeling, Language Modelling | Code Available | 3 |
| PGL at TextGraphs 2020 Shared Task: Explanation Regeneration using Language and Graph Learning Methods | Dec 1, 2020 | Graph Learning, Language Modeling | Code Available | 3 |
| ERNIE-Gram: Pre-Training with Explicitly N-Gram Masked Language Modeling for Natural Language Understanding | Oct 23, 2020 | Language Modeling, Language Modelling | Code Available | 3 |
| Language Models are Few-Shot Learners | May 28, 2020 | answerability prediction, Articles | Code Available | 3 |
| Conformer: Convolution-augmented Transformer for Speech Recognition | May 16, 2020 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 3 |
| Revisiting Pre-Trained Models for Chinese Natural Language Processing | Apr 29, 2020 | Language Modeling, Language Modelling | Code Available | 3 |
| Longformer: The Long-Document Transformer | Apr 10, 2020 | Decoder, Language Modeling | Code Available | 3 |
| Semi-Supervised Speech Recognition via Local Prior Matching | Feb 24, 2020 | Knowledge Distillation, Language Modeling | Code Available | 3 |
| Universal Language Model Fine-tuning for Text Classification | Jan 18, 2018 | General Classification, Language Modeling | Code Available | 3 |
| Order Matters: Sequence to sequence for sets | Nov 19, 2015 | Language Modeling | Code Available | 3 |