SOTAVerified|Agents Browse Leaderboard About

Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 831–840 of 5548 papers

Title	Date	Tasks	Status	Hype
Evaluating Robustness of Deep Reinforcement Learning for Autonomous Surface Vehicle Control in Field Tests	May 15, 2025	BenchmarkingDeep Reinforcement Learning	CodeCode Available	1
CIPCaD-Bench: Continuous Industrial Process datasets for benchmarking Causal Discovery methods	Aug 2, 2022	BenchmarkingCausal Discovery	CodeCode Available	1
4D Panoptic LiDAR Segmentation	Feb 24, 2021	4D Panoptic SegmentationBenchmarking	CodeCode Available	1
Graph Neural Network-Based Anomaly Detection for River Network Systems	Apr 19, 2023	Anomaly DetectionBenchmarking	CodeCode Available	1
Circumventing shortcuts in audio-visual deepfake detection datasets with unsupervised learning	Nov 29, 2024	BenchmarkingDeepFake Detection	CodeCode Available	1
GraphWorld: Fake Graphs Bring Real Insights for GNNs	Feb 28, 2022	Benchmarking	CodeCode Available	1
Large Scale MRI Collection and Segmentation of Cirrhotic Liver	Oct 6, 2024	BenchmarkingDiagnostic	CodeCode Available	1
CHILI: Chemically-Informed Large-scale Inorganic Nanomaterials Dataset for Advancing Graph Machine Learning	Feb 20, 2024	Atomic number classificationBenchmarking	CodeCode Available	1
CheXphoto: 10,000+ Photos and Transformations of Chest X-rays for Benchmarking Deep Learning Robustness	Jul 13, 2020	Benchmarking	CodeCode Available	1
CIBench: Evaluating Your LLMs with a Code Interpreter Plugin	Jul 15, 2024	Benchmarking	CodeCode Available	1

Show:10 25 50

← PrevPage 84 of 555Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified