Multiple-choice

Papers

Showing 331–340 of 1107 papers

Title	Date	Tasks	Status	Hype
Sample then Identify: A General Framework for Risk Control and Assessment in Multimodal Large Language Models	Oct 10, 2024	Conformal Prediction, Language Modeling	Unverified	0
MRAG-Bench: Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models	Oct 10, 2024	Multiple-choice, Question Answering	Unverified	0
TVBench: Redesigning Video-Language Evaluation	Oct 10, 2024	Multiple-choice, Open-Ended Question Answering	Unverified	0
Answering Questions in Stages: Prompt Chaining for Contract QA	Oct 9, 2024	Multiple-choice	Unverified	0
Utilize the Flow before Stepping into the Same River Twice: Certainty Represented Knowledge Flow for Refusal-Aware Instruction Tuning	Oct 9, 2024	Hallucination, Multiple-choice	Code Available	0
ACPBench: Reasoning about Action, Change, and Planning	Oct 8, 2024	Multiple-choice	Unverified	0
ActionAtlas: A VideoQA Benchmark for Domain-specialized Action Recognition	Oct 8, 2024	Action Recognition, Multiple-choice	Unverified	0
Plausibly Problematic Questions in Multiple-Choice Benchmarks for Commonsense Reasoning	Oct 6, 2024	Multiple-choice	Code Available	0
Listening to the Wise Few: Select-and-Copy Attention Heads for Multiple-Choice QA	Oct 3, 2024	Multiple-choice, Question Answering	Unverified	0
Video Instruction Tuning With Synthetic Data	Oct 3, 2024	3D Question Answering (3D-QA)	Unverified	0

No leaderboard results yet.