Benchmarking

Papers

Showing 231–240 of 5548 papers

Title	Date	Tasks	Status	Hype
WayveScenes101: A Dataset and Benchmark for Novel View Synthesis in Autonomous Driving	Jul 11, 2024	Autonomous Driving, Benchmarking	Code Available	2
InstructLayout: Instruction-Driven 2D and 3D Layout Synthesis with Semantic Graph Prior	Jul 10, 2024	Benchmarking, Decoder	Code Available	2
HumanRefiner: Benchmarking Abnormal Human Generation and Refining with Coarse-to-fine Pose-Reversible Guidance	Jul 9, 2024	Benchmarking, Conditional Image Generation	Code Available	2
SH17: A Dataset for Human Safety and Personal Protective Equipment Detection in Manufacturing Industry	Jul 5, 2024	Benchmarking, object-detection	Code Available	2
Benchmarking Complex Instruction-Following with Multiple Constraints Composition	Jul 4, 2024	Benchmarking, Instruction Following	Code Available	2
Craftium: An Extensible Framework for Creating Reinforcement Learning Environments	Jul 4, 2024	Benchmarking, Minecraft	Code Available	2
CoIR: A Comprehensive Benchmark for Code Information Retrieval Models	Jul 3, 2024	Benchmarking, Code Search	Code Available	2
Benchmarking Predictive Coding Networks -- Made Simple	Jul 1, 2024	Benchmarking	Code Available	2
FairMedFM: Fairness Benchmarking for Medical Imaging Foundation Models	Jul 1, 2024	Benchmarking, Fairness	Code Available	2
MMLongBench-Doc: Benchmarking Long-context Document Understanding with Visualizations	Jul 1, 2024	Benchmarking, document understanding	Code Available	2

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified