| Two is Better than Many? Binary Classification as an Effective Approach to Multi-Choice Question Answering | Oct 29, 2022 | Binary Classification, Question Answering | Code Available | 1 |
| Task Compass: Scaling Multi-task Pre-training with Task Prefix | Oct 12, 2022 | Common Sense Reasoning, Data Augmentation | Code Available | 1 |
| Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners | Oct 6, 2022 | Common Sense Reasoning, Coreference Resolution | Code Available | 1 |
| Measuring Harmful Sentence Completion in Language Models for LGBTQIA+ Individuals | May 1, 2022 | Sentence, Sentence Completion | Code Available | 1 |
| HONEST: Measuring Hurtful Sentence Completion in Language Models | Jun 1, 2021 | Hate Speech Detection, Hurtful Sentence Completion | Code Available | 1 |
| UNICORN on RAINBOW: A Universal Commonsense Reasoning Model on a New Multitask Benchmark | Mar 24, 2021 | Common Sense Reasoning, HellaSwag | Code Available | 1 |
| GePpeTto Carves Italian into a Language Model | Apr 29, 2020 | Language Modeling, Language Modelling | Code Available | 1 |
| RoBERTa: A Robustly Optimized BERT Pretraining Approach | Jul 26, 2019 | Common Sense Reasoning, Document Image Classification | Code Available | 1 |
| Evaluating Gender Bias in Large Language Models | Nov 14, 2024 | Model Selection, Sentence | Unverified | 0 |
| KatzBot: Revolutionizing Academic Chatbot for Enhanced Communication | Oct 21, 2024 | Chatbot, Language Modeling | Code Available | 0 |