
Multiple-choice

Papers


Showing 361–370 of 1107 papers

Title	Date	Tasks	Status	Hype
SeqSAM: Autoregressive Multiple Hypothesis Prediction for Medical Image Segmentation using SAM	Mar 12, 2025	Image Segmentation, Medical Image Segmentation	Code Available	0
VisBias: Measuring Explicit and Implicit Social Biases in Vision Language Models	Mar 10, 2025	Image Description, Multiple-choice	Code Available	0
Identity Lock: Locking API Fine-tuned LLMs With Identity-based Wake Words	Mar 10, 2025	Multiple-choice	Unverified	0
Social Bias Benchmark for Generation: A Comparison of Generation and QA-Based Evaluations	Mar 10, 2025	Form, Multiple-choice	Unverified	0
Towards Conversational AI for Disease Management	Mar 8, 2025	Clinical Knowledge, Diagnostic	Unverified	0
UrbanVideo-Bench: Benchmarking Vision-Language Models on Embodied Intelligence with Video Data in Urban Spaces	Mar 8, 2025	Benchmarking, counterfactual	Unverified	0
SCoRE: Benchmarking Long-Chain Reasoning in Commonsense Scenarios	Mar 8, 2025	Benchmarking, Diagnostic	Code Available	0
This Is Your Doge, If It Please You: Exploring Deception and Robustness in Mixture of LLMs	Mar 7, 2025	Large Language Model, Multiple-choice	Code Available	0
Correctness Coverage Evaluation for Medical Multiple-Choice Question Answering Based on the Enhanced Conformal Prediction Framework	Mar 7, 2025	Conformal Prediction, Medical Question Answering	Unverified	0
Analogical Reasoning Inside Large Language Models: Concept Vectors and the Limits of Abstraction	Mar 5, 2025	In-Context Learning, Multiple-choice	Code Available	0



No leaderboard results yet.