SOTAVerified

Benchmarking

Papers

Showing 4221–4230 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| SATBench: Benchmarking the speed-accuracy tradeoff in object recognition by humans and dynamic neural networks | Code | 0 |
| Benchmarking Heterogeneous Treatment Effect Models through the Lens of Interpretability | | 0 |
| Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models | | 0 |
| BEHAVIOR in Habitat 2.0: Simulator-Independent Logical Task Description for Benchmarking Embodied AI Agents | | 0 |
| EmProx: Neural Network Performance Estimation For Neural Architecture Search | Code | 0 |
| CodeS: Towards Code Model Generalization Under Distribution Shift | Code | 0 |
| SAIBench: Benchmarking AI for Science | | 0 |
| Functional Code Building Genetic Programming | | 0 |
| FedHPO-B: A Benchmark Suite for Federated Hyperparameter Optimization | | 0 |
| Benchmarking Bayesian neural networks and evaluation metrics for regression tasks | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |