SOTAVerified

Benchmarking

Papers

Showing 2561–2570 of 5548 papers

Title | Status | Hype
Benchmarking LLMs' Judgments with No Gold Standard | Code | 0
MolMiner: Towards Controllable, 3D-Aware, Fragment-Based Molecular Design |  | 0
Low Dynamic Range for RIS-aided Bistatic Integrated Sensing and Communication |  | 0
Benchmarking Distributional Alignment of Large Language Models | Code | 0
Open-set object detection: towards unified problem formulation and benchmarking |  | 0
Benchmarking 3D multi-coil NC-PDNet MRI reconstruction |  | 0
FactLens: Benchmarking Fine-Grained Fact Verification |  | 0
A Retrospective on the Robot Air Hockey Challenge: Benchmarking Robust, Reliable, and Safe Learning Techniques for Real-world Robotics |  | 0
Deep Learning Models for UAV-Assisted Bridge Inspection: A YOLO Benchmark Analysis |  | 0
HandCraft: Anatomically Correct Restoration of Malformed Hands in Diffusion Generated Images |  | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 |  | Unverified