SOTAVerified

Multiple-choice

Papers

Showing 51–60 of 1107 papers

Title | Status | Hype
Let Androids Dream of Electric Sheep: A Human-like Image Implication Understanding and Reasoning Framework | Code | 1
Improving LLM First-Token Predictions in Multiple-Choice Question Answering via Prefilling Attack | | 0
Set-LLM: A Permutation-Invariant LLM | | 0
Robo2VLM: Visual Question Answering from Large-Scale In-the-Wild Robot Manipulation Datasets | | 0
Uncovering Cultural Representation Disparities in Vision-Language Models | | 0
VideoEval-Pro: Robust and Realistic Long Video Understanding Evaluation | Code | 4
WirelessMathBench: A Mathematical Modeling Benchmark for LLMs in Wireless Communications | | 0
MR. Judge: Multimodal Reasoner as a Judge | | 0
LEXam: Benchmarking Legal Reasoning on 340 Law Exams | | 0
Teach2Eval: An Indirect Evaluation Method for LLM by Judging How It Teaches | Code | 0
Page 6 of 111

No leaderboard results yet.