SOTAVerified

Benchmarking

Papers

Showing 2721–2730 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| Deep Unlearn: Benchmarking Machine Unlearning | | 0 |
| CXPMRG-Bench: Pre-training and Benchmarking for X-ray Medical Report Generation on CheXpert Plus Dataset | | 0 |
| FMBench: Benchmarking Fairness in Multimodal Large Language Models on Medical Tasks | | 0 |
| Benchmarking Large Language Models for Conversational Question Answering in Multi-instructional Documents | | 0 |
| Match Stereo Videos via Bidirectional Alignment | | 0 |
| Benchmarking Adaptive Intelligence and Computer Vision on Human-Robot Collaboration | | 0 |
| Q-Bench-Video: Benchmarking the Video Quality Understanding of LMMs | | 0 |
| ImmersePro: End-to-End Stereo Video Synthesis Via Implicit Disparity Learning | Code | 0 |
| Constrained Reinforcement Learning for Safe Heat Pump Control | Code | 0 |
| GenTel-Safe: A Unified Benchmark and Shielding Framework for Defending Against Prompt Injection Attacks | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |