SOTAVerified

Benchmarking

Papers

Showing 871–880 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
AIGV-Assessor: Benchmarking and Evaluating the Perceptual Quality of Text-to-Video Generation with LMM	Nov 26, 2024	Benchmarking, Text-to-Video Generation	Code Available	1	5
Benchmarking Bias Mitigation Algorithms in Representation Learning through Fairness Metrics	Jun 8, 2021	Age And Gender Classification, Benchmarking	Code Available	1	5
JaxRobotarium: Training and Deploying Multi-Robot Policies in 10 Minutes	May 10, 2025	Benchmarking, GPU	Code Available	1	5
Job-SDF: A Multi-Granularity Dataset for Job Skill Demand Forecasting and Benchmarking	Jun 17, 2024	Benchmarking, Demand Forecasting	Code Available	1	5
Benchmarking Low-Shot Robustness to Natural Distribution Shifts	Apr 21, 2023	Benchmarking	Code Available	1	5
Jojajovai: A Parallel Guarani-Spanish Corpus for MT Benchmarking	Jun 1, 2022	Benchmarking, Sentence	Code Available	1	5
ClearPose: Large-scale Transparent Object Dataset and Benchmark	Mar 8, 2022	Benchmarking, Depth Completion	Code Available	1	5
EgoPlan-Bench: Benchmarking Multimodal Large Language Models for Human-Level Planning	Dec 11, 2023	Benchmarking, Human-Object Interaction Detection	Code Available	1	5
Benchmarking and scaling of deep learning models for land cover image classification	Nov 18, 2021	Benchmarking, Classification	Code Available	1	5
Benchmarking Local Robustness of High-Accuracy Binary Neural Networks for Enhanced Traffic Sign Recognition	Sep 25, 2023	Autonomous Driving, Benchmarking	Code Available	1	5

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified