Multiple-choice

Papers

Showing 121–130 of 1107 papers

Title	Date	Tasks	Status	Hype
Annealed Multiple Choice Learning: Overcoming limitations of Winner-takes-all with annealing	Jul 22, 2024	All, Diversity	Code Available	1
Evaluating language models as risk scores	Jul 19, 2024	Multiple-choice, Question Answering	Code Available	1
TurkishMMLU: Measuring Massive Multitask Language Understanding in Turkish	Jul 17, 2024	Math, Multiple-choice	Code Available	1
Fine-tuning Multimodal Large Language Models for Product Bundling	Jul 16, 2024	In-Context Learning, Multiple-choice	Code Available	1
Uncertainty is Fragile: Manipulating Uncertainty in Large Language Models	Jul 15, 2024	Backdoor Attack, Multiple-choice	Code Available	1
ORAN-Bench-13K: An Open Source Benchmark for Assessing LLMs in Open Radio Access Networks	Jul 8, 2024	Anomaly Detection, Code Generation	Code Available	1
LogicVista: Multimodal LLM Logical Reasoning Benchmark in Visual Contexts	Jul 6, 2024	Logical Reasoning, Mathematical Reasoning	Code Available	1
MMEvalPro: Calibrating Multimodal Benchmarks Towards Trustworthy and Efficient Evaluation	Jun 29, 2024	Multiple-choice	Code Available	1
InfiniBench: A Comprehensive Benchmark for Large Multimodal Models in Very Long Video Understanding	Jun 28, 2024	Multiple-choice, Video Understanding	Code Available	1
HCQA @ Ego4D EgoSchema Challenge 2024	Jun 22, 2024	Caption Generation	Code Available	1

No leaderboard results yet.