Multiple-choice

Papers

Showing 131–140 of 1107 papers

Title	Date	Tasks	Status	Hype
African or European Swallow? Benchmarking Large Vision-Language Models for Fine-Grained Object Classification	Jun 20, 2024	Benchmarking, Classification	Code Available	1
FoodieQA: A Multimodal Dataset for Fine-Grained Understanding of Chinese Food Culture	Jun 16, 2024	Diversity, Multiple-choice	Code Available	1
CoLoR-Filter: Conditional Loss Reduction Filtering for Targeted Language Model Pre-training	Jun 15, 2024	Domain Adaptation, Language Modeling	Code Available	1
IntentionQA: A Benchmark for Evaluating Purchase Intention Comprehension Abilities of Language Models in E-commerce	Jun 14, 2024	Multiple-choice, Question Answering	Code Available	1
BLEnD: A Benchmark for LLMs on Everyday Knowledge in Diverse Cultures and Languages	Jun 14, 2024	Multiple-choice	Code Available	1
INS-MMBench: A Comprehensive Benchmark for Evaluating LVLMs' Performance in Insurance	Jun 13, 2024	Multiple-choice, Visual Reasoning	Code Available	1
MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding	Jun 13, 2024	Multiple-choice, Scene Understanding	Code Available	1
A Fine-tuning Dataset and Benchmark for Large Language Models for Protein Understanding	Jun 8, 2024	Descriptive, Language Modelling	Code Available	1
TopViewRS: Vision-Language Models as Top-View Spatial Reasoners	Jun 4, 2024	Multiple-choice, Spatial Reasoning	Code Available	1
Embedding Trajectory for Out-of-Distribution Detection in Mathematical Reasoning	May 22, 2024	Mathematical Reasoning, Multiple-choice	Code Available	1
