SOTAVerified

Benchmarking

Papers

Showing 2561–2570 of 5548 papers

Title	Date	Tasks	Status	Hype
Benchmarking LLMs' Judgments with No Gold Standard	Nov 11, 2024	Benchmarking, Machine Translation	Code Available	0
MolMiner: Towards Controllable, 3D-Aware, Fragment-Based Molecular Design	Nov 10, 2024	3D geometry, Benchmarking	Unverified	0
Low Dynamic Range for RIS-aided Bistatic Integrated Sensing and Communication	Nov 9, 2024	Benchmarking, Integrated sensing and communication	Unverified	0
Benchmarking Distributional Alignment of Large Language Models	Nov 8, 2024	Benchmarking	Code Available	0
Open-set object detection: towards unified problem formulation and benchmarking	Nov 8, 2024	Autonomous Driving, Benchmarking	Unverified	0
Benchmarking 3D multi-coil NC-PDNet MRI reconstruction	Nov 8, 2024	3D Reconstruction, Benchmarking	Unverified	0
FactLens: Benchmarking Fine-Grained Fact Verification	Nov 8, 2024	Benchmarking, Fact Verification	Unverified	0
A Retrospective on the Robot Air Hockey Challenge: Benchmarking Robust, Reliable, and Safe Learning Techniques for Real-world Robotics	Nov 8, 2024	Benchmarking	Unverified	0
Deep Learning Models for UAV-Assisted Bridge Inspection: A YOLO Benchmark Analysis	Nov 7, 2024	Benchmarking, Model Selection	Unverified	0
HandCraft: Anatomically Correct Restoration of Malformed Hands in Diffusion Generated Images	Nov 7, 2024	Anatomy, Benchmarking	Unverified	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified