SOTAVerified

Benchmarking

Papers

Showing 591–600 of 5548 papers

Title | Status | Hype
Do Vision & Language Decoders use Images and Text equally? How Self-consistent are their Explanations? | Code | 1
Benchmarking Generated Poses: How Rational is Structure-based Drug Design with Generative Models? | Code | 1
Benchmarking Generation and Evaluation Capabilities of Large Language Models for Instruction Controllable Summarization | Code | 1
A multi-schematic classifier-independent oversampling approach for imbalanced datasets | Code | 1
CHOICE: Benchmarking the Remote Sensing Capabilities of Large Vision-Language Models | Code | 1
CosPGD: an efficient white-box adversarial attack for pixel-wise prediction tasks | Code | 1
Benchmarking for Biomedical Natural Language Processing Tasks with a Domain Specific ALBERT | Code | 1
AdaPool: Exponential Adaptive Pooling for Information-Retaining Downsampling | Code | 1
A Multifaceted Benchmarking of Synthetic Electronic Health Record Generation Models | Code | 1
M4-SAR: A Multi-Resolution, Multi-Polarization, Multi-Scene, Multi-Source Dataset and Benchmark for Optical-SAR Fusion Object Detection | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified