Multiple-choice

Papers

Showing 261–270 of 1107 papers

Title	Date	Tasks	Status	Hype
MANTA: A Large-Scale Multi-View and Visual-Text Anomaly Detection Dataset for Tiny Objects	Dec 6, 2024	2k, Anomaly Detection	Unverified	0
Establishing Task Scaling Laws via Compute-Efficient Model Ladders	Dec 5, 2024	Language Modeling, Language Modelling	Unverified	0
GRAF: Graph Retrieval Augmented by Facts for Romanian Legal Multi-Choice Question Answering	Dec 5, 2024	Information Retrieval, Multiple-choice	Unverified	0
AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information?	Dec 3, 2024	Multiple-choice	Code Available	1
SailCompass: Towards Reproducible and Robust Evaluation for Southeast Asian Languages	Dec 2, 2024	Multiple-choice	Code Available	1
The use of large language models to enhance cancer clinical trial educational materials	Dec 2, 2024	Misinformation, Multiple-choice	Unverified	0
Unlocking Video-LLM via Agent-of-Thoughts Distillation	Dec 2, 2024	Language Modeling, Language Modelling	Unverified	0
Noise Injection Reveals Hidden Capabilities of Sandbagging Language Models	Dec 2, 2024	MMLU, Multiple-choice	Code Available	0
Uhura: A Benchmark for Evaluating Scientific Question Answering and Truthfulness in Low-Resource African Languages	Dec 1, 2024	ARC, Multiple-choice	Unverified	0
VisOnlyQA: Large Vision Language Models Still Struggle with Visual Perception of Geometric Information	Dec 1, 2024	Multiple-choice	Code Available	1

No leaderboard results yet.