SOTAVerified

Benchmarking

Papers

Showing 1001-1010 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| Benchmarking Omni-Vision Representation through the Lens of Visual Realms | Code | 1 |
| Neural Multi-Hop Reasoning With Logical Rules on Biomedical Knowledge Graphs | Code | 1 |
| EXPObench: Benchmarking Surrogate-based Optimisation Algorithms on Expensive Black-box Functions | Code | 1 |
| DLBacktrace: A Model Agnostic Explainability for any Deep Learning Models | Code | 1 |
| dMelodies: A Music Dataset for Disentanglement Learning | Code | 1 |
| Benchmarking Offline Reinforcement Learning on Real-Robot Hardware | Code | 1 |
| Benchmarking Object Detectors with COCO: A New Path Forward | Code | 1 |
| Benchmarking Generated Poses: How Rational is Structure-based Drug Design with Generative Models? | Code | 1 |
| Benchmarking Generation and Evaluation Capabilities of Large Language Models for Instruction Controllable Summarization | Code | 1 |
| Benchmarking: Past, Present and Future | Code | 1 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |