Multiple-choice

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 126–150 of 1107 papers

Title	Date	Tasks	Status	Hype	Score
MindGames: Targeting Theory of Mind in Large Language Models with Dynamic Epistemic Modal Logic	May 5, 2023	Epistemic ReasoningLanguage Modeling	CodeCode Available	1	5
IndicNLPSuite: Monolingual Corpora, Evaluation Benchmarks and Pre-trained Multilingual Language Models for Indian Languages	Nov 8, 2020	Genre classificationMultiple-choice	CodeCode Available	1	5
CoLoR-Filter: Conditional Loss Reduction Filtering for Targeted Language Model Pre-training	Jun 15, 2024	Domain AdaptationLanguage Modeling	CodeCode Available	1	5
A Hitchhikers Guide to Fine-Grained Face Forgery Detection Using Common Sense Reasoning	Oct 1, 2024	Common Sense ReasoningDeepFake Detection	CodeCode Available	1	5
IllusionVQA: A Challenging Optical Illusion Dataset for Vision Language Models	Mar 23, 2024	Common Sense ReasoningIn-Context Learning	CodeCode Available	1	5
CodeApex: A Bilingual Programming Evaluation Benchmark for Large Language Models	Sep 5, 2023	Code GenerationMultiple-choice	CodeCode Available	1	5
CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge	Nov 2, 2018	Common Sense ReasoningMultiple-choice	CodeCode Available	1	5
InfiniBench: A Comprehensive Benchmark for Large Multimodal Models in Very Long Video Understanding	Jun 28, 2024	Multiple-choiceVideo Understanding	CodeCode Available	1	5
Clues Before Answers: Generation-Enhanced Multiple-Choice QA	Apr 30, 2022	DecoderMultiple-choice	CodeCode Available	1	5
Ranked Voting based Self-Consistency of Large Language Models	May 16, 2025	Multiple-choiceOpen-Ended Question Answering	CodeCode Available	1	5
An Open Source Data Contamination Report for Large Language Models	Oct 26, 2023	HellaSwagLanguage Modeling	CodeCode Available	1	5
Annealed Winner-Takes-All for Motion Forecasting	Sep 17, 2024	AllAutonomous Driving	CodeCode Available	1	5
ChartQAPro: A More Diverse and Challenging Benchmark for Chart Question Answering	Apr 7, 2025	Chart Question AnsweringChart Understanding	CodeCode Available	1	5
Annealed Multiple Choice Learning: Overcoming limitations of Winner-takes-all with annealing	Jul 22, 2024	AllDiversity	CodeCode Available	1	5
CMMU: A Benchmark for Chinese Multi-modal Multi-type Question Understanding and Reasoning	Jan 25, 2024	Multiple-choicePosition	CodeCode Available	1	5
HCQA @ Ego4D EgoSchema Challenge 2024	Jun 22, 2024	Caption Generation	CodeCode Available	1	5
Hybrid-Level Instruction Injection for Video Token Compression in Multi-modal Large Language Models	Mar 20, 2025	Multiple-choiceVideo Understanding	CodeCode Available	1	5
INS-MMBench: A Comprehensive Benchmark for Evaluating LVLMs' Performance in Insurance	Jun 13, 2024	Multiple-choiceVisual Reasoning	CodeCode Available	1	5
An MRC Framework for Semantic Role Labeling	Sep 14, 2021	Computational EfficiencyMachine Reading Comprehension	CodeCode Available	1	5
African or European Swallow? Benchmarking Large Vision-Language Models for Fine-Grained Object Classification	Jun 20, 2024	BenchmarkingClassification	CodeCode Available	1	5
An In-depth Look at Gemini's Language Abilities	Dec 18, 2023	Instruction FollowingMath	CodeCode Available	1	5
General-Purpose Question-Answering with Macaw	Sep 6, 2021	Generative Question AnsweringMultiple-choice	CodeCode Available	1	5
Generating Distractors for Reading Comprehension Questions from Real Examinations	Sep 8, 2018	DecoderDistractor Generation	CodeCode Available	1	5
GPT as Knowledge Worker: A Zero-Shot Evaluation of (AI)CPA Capabilities	Jan 11, 2023	Multiple-choice	CodeCode Available	1	5
FoodieQA: A Multimodal Dataset for Fine-Grained Understanding of Chinese Food Culture	Jun 16, 2024	DiversityMultiple-choice	CodeCode Available	1	5

Show:10 25 50

← PrevPage 6 of 45Next →

No leaderboard results yet.