Multiple-choice

Papers

Showing 426–450 of 1107 papers

Title	Date	Tasks	Status	Score
DetectBench: Can Large Language Model Detect and Piece Together Implicit Evidence?	Jun 18, 2024	Language Modeling, Language Modelling	Code Available	5
Are Large Language Models Consistent over Value-laden Questions?	Jul 3, 2024	Multiple-choice	Code Available	5
Probabilities of Chat LLMs Are Miscalibrated but Still Predict Correctness on Multiple-Choice Q&A	Feb 20, 2024	Language Modelling, Large Language Model	Code Available	5
It's Not Easy Being Wrong: Large Language Models Struggle with Process of Elimination Reasoning	Nov 13, 2023	Multiple-choice	Code Available	5
Fleurs-SLU: A Massively Multilingual Benchmark for Spoken Language Understanding	Jan 10, 2025	Automatic Speech Recognition, Classification	Code Available	5
Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation	Apr 9, 2025	Multiple-choice	Code Available	5
HSI: Head-Specific Intervention Can Induce Misaligned AI Coordination in Large Language Models	Feb 9, 2025	Answer Generation, Language Modeling	Code Available	5
IPEval: A Bilingual Intellectual Property Agency Consultation Evaluation Benchmark for Large Language Models	Jun 18, 2024	Management, Multiple-choice	Code Available	5
Investigating the Shortcomings of LLMs in Step-by-Step Legal Reasoning	Feb 8, 2025	Legal Reasoning, Multiple-choice	Code Available	5
iREL at SemEval-2024 Task 9: Improving Conventional Prompting Methods for Brain Teasers	May 25, 2024	Common Sense Reasoning, Multiple-choice	Code Available	5
DefAn: Definitive Answer Dataset for LLMs Hallucination Evaluation	Jun 13, 2024	Benchmarking, Hallucination	Code Available	5
Anchored Answers: Unravelling Positional Bias in GPT-2's Multiple-Choice Questions	May 6, 2024	Decision Making, Multiple-choice	Code Available	5
StoryAnalogy: Deriving Story-level Analogies from Large Language Models to Unlock Analogical Understanding	Oct 19, 2023	Multiple-choice, Natural Language Understanding	Code Available	5
Introducing a framework to assess newly created questions with Natural Language Processing	Apr 28, 2020	Multiple-choice	Code Available	5
DE-COP: Detecting Copyrighted Content in Language Models Training Data	Feb 15, 2024	Language Modeling, Language Modelling	Code Available	5
An Automatic Question Usability Evaluation Toolkit	May 30, 2024	Multiple-choice, Word Embeddings	Code Available	5
Introducing Flexible Monotone Multiple Choice Item Response Theory Models and Bit Scales	Oct 2, 2024	Multiple-choice	Code Available	5
Is Your Large Language Model Knowledgeable or a Choices-Only Cheater?	Jul 2, 2024	Graph Mining, Language Modeling	Code Available	5
A Profit-Maximizing Strategy for Advertising on the e-Commerce Platforms	Oct 31, 2022	Management, Multiple-choice	Code Available	5
Fusing Models with Complementary Expertise	Oct 2, 2023	Multiple-choice, text-classification	Code Available	5
TAXI: Evaluating Categorical Knowledge Editing for Language Models	Apr 23, 2024	knowledge editing, Multiple-choice	Code Available	5
Automated Generation and Tagging of Knowledge Components from Multiple-Choice Questions	May 30, 2024	Language Modelling, Large Language Model	Code Available	5
Chance-Constrained Multiple-Choice Knapsack Problem: Model, Algorithms, and Applications	Jun 26, 2023	Combinatorial Optimization, Multiple-choice	Code Available	5
Improving Question Answering with External Knowledge	Feb 3, 2019	ARC, Multiple-choice	Code Available	5
DAHL: Domain-specific Automated Hallucination Evaluation of Long-Form Text through a Benchmark Dataset in Biomedicine	Nov 14, 2024	Form, Hallucination	Code Available	5

No leaderboard results yet.