| Llama 2: Open Foundation and Fine-Tuned Chat Models | Jul 18, 2023 | Arithmetic Reasoning | CodeCode Available | 8 |
| Stay on topic with Classifier-Free Guidance | Jun 30, 2023 | Code GenerationCommon Sense Reasoning | —Unverified | 0 |
| ScoNe: Benchmarking Negation Reasoning in Language Models With Fine-Tuning and In-Context Learning | May 30, 2023 | BenchmarkingIn-Context Learning | CodeCode Available | 0 |
| The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning | May 23, 2023 | Common Sense ReasoningCommon Sense Reasoning (Zero-Shot) | CodeCode Available | 2 |
| PaLM 2 Technical Report | May 17, 2023 | Code GenerationCommon Sense Reasoning | CodeCode Available | 0 |
| LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions | Apr 27, 2023 | Common Sense ReasoningCoreference Resolution | CodeCode Available | 2 |
| BloombergGPT: A Large Language Model for Finance | Mar 30, 2023 | Causal JudgmentCommon Sense Reasoning | CodeCode Available | 0 |
| GPT-4 Technical Report | Mar 15, 2023 | answerability predictionArithmetic Reasoning | CodeCode Available | 6 |
| LLaMA: Open and Efficient Foundation Language Models | Feb 27, 2023 | Arithmetic ReasoningCode Generation | CodeCode Available | 7 |
| Exploring the Benefits of Training Expert Language Models over Instruction Tuning | Feb 7, 2023 | Common Sense ReasoningCoreference Resolution | CodeCode Available | 1 |