SOTAVerified

Multiple-choice

Papers

Showing 361-370 of 1107 papers

Title | Status | Hype
SeqSAM: Autoregressive Multiple Hypothesis Prediction for Medical Image Segmentation using SAM | Code | 0
VisBias: Measuring Explicit and Implicit Social Biases in Vision Language Models | Code | 0
Identity Lock: Locking API Fine-tuned LLMs With Identity-based Wake Words |  | 0
Social Bias Benchmark for Generation: A Comparison of Generation and QA-Based Evaluations |  | 0
Towards Conversational AI for Disease Management |  | 0
UrbanVideo-Bench: Benchmarking Vision-Language Models on Embodied Intelligence with Video Data in Urban Spaces |  | 0
SCoRE: Benchmarking Long-Chain Reasoning in Commonsense Scenarios | Code | 0
This Is Your Doge, If It Please You: Exploring Deception and Robustness in Mixture of LLMs | Code | 0
Correctness Coverage Evaluation for Medical Multiple-Choice Question Answering Based on the Enhanced Conformal Prediction Framework |  | 0
Analogical Reasoning Inside Large Language Models: Concept Vectors and the Limits of Abstraction | Code | 0

No leaderboard results yet.