SOTAVerified|Agents Browse Leaderboard About Blog

Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 3131–3140 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
The Principle of Unchanged Optimality in Reinforcement Learning Generalization	Jun 2, 2019	Benchmarkingreinforcement-learning	—Unverified	0	0
Invisible Stitch: Generating Smooth 3D Scenes with Depth Inpainting	Apr 30, 2024	BenchmarkingDepth Completion	—Unverified	0	0
Benchmarking Robustness and Generalization in Multi-Agent Systems: A Case Study on Neural MMO	Aug 30, 2023	BenchmarkingReinforcement Learning (RL)	—Unverified	0	0
Benchmarking Robot Manipulation with the Rubik's Cube	Feb 14, 2022	BenchmarkingRobot Manipulation	—Unverified	0	0
Benchmarking Retrieval-Augmented Large Language Models in Biomedical NLP: Application, Robustness, and Self-Awareness	May 13, 2024	Benchmarkingcounterfactual	—Unverified	0	0
The Seeker's Dilemma: Realistic Formulation and Benchmarking for Hardware Trojan Detection	Feb 27, 2024	Benchmarking	—Unverified	0	0
4Seasons: Benchmarking Visual SLAM and Long-Term Localization for Autonomous Driving in Challenging Conditions	Dec 31, 2022	Autonomous DrivingBenchmarking	—Unverified	0	0
IoT-LLM: Enhancing Real-World IoT Task Reasoning with Large Language Models	Oct 3, 2024	BenchmarkingIn-Context Learning	—Unverified	0	0
IO-VNBD: Inertial and Odometry Benchmark Dataset for Ground Vehicle Positioning	May 4, 2020	Autonomous VehiclesBenchmarking	—Unverified	0	0
The Sparsity Roofline: Understanding the Hardware Limits of Sparse Neural Networks	Sep 30, 2023	Benchmarking	—Unverified	0	0

Show:10 25 50

← PrevPage 314 of 555Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified