| Can Large Language Models Provide Security & Privacy Advice? Measuring the Ability of LLMs to Refute Misconceptions | Oct 3, 2023 | Misconceptions, Multiple-choice | Code Available | 0 | 5 |
| KnowledgePrompts: Exploring the Abilities of Large Language Models to Solve Proportional Analogies via Knowledge-Enhanced Prompting | Dec 1, 2024 | Multiple-choice, Multiple Choice Question Answering (MCQA) | Code Available | 0 | 5 |
| Differentiating Choices via Commonality for Multiple-Choice Question Answering | Aug 21, 2024 | Multiple-choice, Multiple Choice Question Answering (MCQA) | Code Available | 0 | 5 |
| Iterative Forward Tuning Boosts In-Context Learning in Language Models | May 22, 2023 | Decision Making, In-Context Learning | Code Available | 0 | 5 |
| It's Not Easy Being Wrong: Large Language Models Struggle with Process of Elimination Reasoning | Nov 13, 2023 | Multiple-choice | Code Available | 0 | 5 |
| iREL at SemEval-2024 Task 9: Improving Conventional Prompting Methods for Brain Teasers | May 25, 2024 | Common Sense Reasoning, Multiple-choice | Code Available | 0 | 5 |
| IPEval: A Bilingual Intellectual Property Agency Consultation Evaluation Benchmark for Large Language Models | Jun 18, 2024 | Management, Multiple-choice | Code Available | 0 | 5 |
| Exposing the Limits of Video-Text Models through Contrast Sets | Jul 1, 2022 | Language Modeling, Language Modelling | Code Available | 0 | 5 |
| Extracting Keywords from Open-Ended Business Survey Questions | Aug 31, 2018 | Multiple-choice, Survey | Code Available | 0 | 5 |
| Introducing Flexible Monotone Multiple Choice Item Response Theory Models and Bit Scales | Oct 2, 2024 | Multiple-choice | Code Available | 0 | 5 |