SOTAVerified

Benchmarking

Papers

Showing 831-840 of 5548 papers

Title | Status | Hype
Evaluating Robustness of Deep Reinforcement Learning for Autonomous Surface Vehicle Control in Field Tests | Code | 1
CIPCaD-Bench: Continuous Industrial Process datasets for benchmarking Causal Discovery methods | Code | 1
4D Panoptic LiDAR Segmentation | Code | 1
Graph Neural Network-Based Anomaly Detection for River Network Systems | Code | 1
Circumventing shortcuts in audio-visual deepfake detection datasets with unsupervised learning | Code | 1
GraphWorld: Fake Graphs Bring Real Insights for GNNs | Code | 1
Large Scale MRI Collection and Segmentation of Cirrhotic Liver | Code | 1
CHILI: Chemically-Informed Large-scale Inorganic Nanomaterials Dataset for Advancing Graph Machine Learning | Code | 1
CheXphoto: 10,000+ Photos and Transformations of Chest X-rays for Benchmarking Deep Learning Robustness | Code | 1
CIBench: Evaluating Your LLMs with a Code Interpreter Plugin | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified