Benchmarking

Papers

Showing 2721–2730 of 5548 papers

Title	Date	Tasks	Status	Hype
Deep Unlearn: Benchmarking Machine Unlearning	Oct 2, 2024	Benchmarking, Machine Unlearning	Unverified	0
CXPMRG-Bench: Pre-training and Benchmarking for X-ray Medical Report Generation on CheXpert Plus Dataset	Oct 1, 2024	Benchmarking, Contrastive Learning	Unverified	0
FMBench: Benchmarking Fairness in Multimodal Large Language Models on Medical Tasks	Oct 1, 2024	Benchmarking, Fairness	Unverified	0
Benchmarking Large Language Models for Conversational Question Answering in Multi-instructional Documents	Oct 1, 2024	Benchmarking, Conversational Question Answering	Unverified	0
Match Stereo Videos via Bidirectional Alignment	Sep 30, 2024	Benchmarking, Stereo Matching	Unverified	0
Benchmarking Adaptive Intelligence and Computer Vision on Human-Robot Collaboration	Sep 30, 2024	Benchmarking, Intent Detection	Unverified	0
Q-Bench-Video: Benchmarking the Video Quality Understanding of LMMs	Sep 30, 2024	Benchmarking, Multiple-choice	Unverified	0
ImmersePro: End-to-End Stereo Video Synthesis Via Implicit Disparity Learning	Sep 30, 2024	Benchmarking, Disparity Estimation	Code Available	0
Constrained Reinforcement Learning for Safe Heat Pump Control	Sep 29, 2024	Benchmarking, reinforcement-learning	Code Available	0
GenTel-Safe: A Unified Benchmark and Shielding Framework for Defending Against Prompt Injection Attacks	Sep 29, 2024	Benchmarking	Unverified	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified