Multiple-choice

Papers

Showing 151–175 of 1107 papers

Title	Date	Tasks	Status	Hype	Score
CMMU: A Benchmark for Chinese Multi-modal Multi-type Question Understanding and Reasoning	Jan 25, 2024	Multiple-choice, Position	Code Available	1	5
AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information?	Dec 3, 2024	Multiple-choice	Code Available	1	5
Clues Before Answers: Generation-Enhanced Multiple-Choice QA	Apr 30, 2022	Decoder, Multiple-choice	Code Available	1	5
IndicNLPSuite: Monolingual Corpora, Evaluation Benchmarks and Pre-trained Multilingual Language Models for Indian Languages	Nov 8, 2020	Genre classification, Multiple-choice	Code Available	1	5
African or European Swallow? Benchmarking Large Vision-Language Models for Fine-Grained Object Classification	Jun 20, 2024	Benchmarking, Classification	Code Available	1	5
Benchmarking AI scientists in omics data-driven biological research	May 13, 2025	Benchmarking, Multiple-choice	Code Available	1	5
An MRC Framework for Semantic Role Labeling	Sep 14, 2021	Computational Efficiency, Machine Reading Comprehension	Code Available	1	5
Benchmarking Large Language Models on Answering and Explaining Challenging Medical Questions	Feb 28, 2024	Benchmarking, Multiple-choice	Code Available	1	5
InfiniBench: A Comprehensive Benchmark for Large Multimodal Models in Very Long Video Understanding	Jun 28, 2024	Multiple-choice, Video Understanding	Code Available	1	5
IntentionQA: A Benchmark for Evaluating Purchase Intention Comprehension Abilities of Language Models in E-commerce	Jun 14, 2024	Multiple-choice, Question Answering	Code Available	1	5
Annealed Multiple Choice Learning: Overcoming limitations of Winner-takes-all with annealing	Jul 22, 2024	All, Diversity	Code Available	1	5
Conformal Prediction with Large Language Models for Multi-Choice Question Answering	May 28, 2023	Conformal Prediction, Multiple-choice	Code Available	1	5
Annealed Winner-Takes-All for Motion Forecasting	Sep 17, 2024	All, Autonomous Driving	Code Available	1	5
Counterfactual Variable Control for Robust and Interpretable Question Answering	Oct 12, 2020	Causal Inference, counterfactual	Code Available	1	5
An Open Source Data Contamination Report for Large Language Models	Oct 26, 2023	HellaSwag, Language Modeling	Code Available	1	5
CHOICE: Benchmarking the Remote Sensing Capabilities of Large Vision-Language Models	Nov 27, 2024	Benchmarking, Earth Observation	Code Available	1	5
Ranked Voting based Self-Consistency of Large Language Models	May 16, 2025	Multiple-choice, Open-Ended Question Answering	Code Available	1	5
CUPCase: Clinically Uncommon Patient Cases and Diagnoses Dataset	Mar 8, 2025	Multiple-choice	Code Available	1	5
Large Language Models Encode Clinical Knowledge	Dec 26, 2022	Clinical Knowledge, MedQA	Code Available	1	5
Hybrid-Level Instruction Injection for Video Token Compression in Multi-modal Large Language Models	Mar 20, 2025	Multiple-choice, Video Understanding	Code Available	1	5
IllusionVQA: A Challenging Optical Illusion Dataset for Vision Language Models	Mar 23, 2024	Common Sense Reasoning, In-Context Learning	Code Available	1	5
Is Bigger and Deeper Always Better? Probing LLaMA Across Scales and Layers	Dec 7, 2023	Math, Multiple-choice	Code Available	1	5
Delving into the Reversal Curse: How Far Can Large Language Models Generalize?	Oct 24, 2024	Multiple-choice	Code Available	1	5
Let Androids Dream of Electric Sheep: A Human-like Image Implication Understanding and Reasoning Framework	May 22, 2025	Multiple-choice, Visual Question Answering (VQA)	Code Available	1	5
ChartQAPro: A More Diverse and Challenging Benchmark for Chart Question Answering	Apr 7, 2025	Chart Question Answering, Chart Understanding	Code Available	1	5

No leaderboard results yet.