SOTAVerified

Benchmarking

Papers

Showing 3131-3140 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| The Principle of Unchanged Optimality in Reinforcement Learning Generalization | | 0 |
| Invisible Stitch: Generating Smooth 3D Scenes with Depth Inpainting | | 0 |
| Benchmarking Robustness and Generalization in Multi-Agent Systems: A Case Study on Neural MMO | | 0 |
| Benchmarking Robot Manipulation with the Rubik's Cube | | 0 |
| Benchmarking Retrieval-Augmented Large Language Models in Biomedical NLP: Application, Robustness, and Self-Awareness | | 0 |
| The Seeker's Dilemma: Realistic Formulation and Benchmarking for Hardware Trojan Detection | | 0 |
| 4Seasons: Benchmarking Visual SLAM and Long-Term Localization for Autonomous Driving in Challenging Conditions | | 0 |
| IoT-LLM: Enhancing Real-World IoT Task Reasoning with Large Language Models | | 0 |
| IO-VNBD: Inertial and Odometry Benchmark Dataset for Ground Vehicle Positioning | | 0 |
| The Sparsity Roofline: Understanding the Hardware Limits of Sparse Neural Networks | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |