SOTAVerified

Benchmarking

Papers

Showing 3701–3725 of 5548 papers

Title | Status | Hype
Predicting Football Match Outcomes with eXplainable Machine Learning and the Kelly Index | | 0
Predicting Quantum Potentials by Deep Neural Network and Metropolis Sampling | | 0
Predicting the Performance of a Computing System with Deep Networks | | 0
Predicting the Probability of Collision of a Satellite with Space Debris: A Bayesian Machine Learning Approach | | 0
Prediction Accuracy & Reliability: Classification and Object Localization under Distribution Shift | | 0
Prediction of cancer driver genes and mutations: the potential of integrative computational frameworks | | 0
Prediction of the Influence of Navigation Scan-path on Perceived Quality of Free-Viewpoint Videos | | 0
Predictive modelling of a novel anti-adhesion therapy to combat bacterial colonisation of burn wounds | | 0
Predictive Models from Quantum Computer Benchmarks | | 0
Prepare for Trouble and Make it Double. Supervised and Unsupervised Stacking for AnomalyBased Intrusion Detection | | 0
Benchmarking Machine Reading Comprehension: A Psychological Perspective | | 0
Pretraining boosts out-of-domain robustness for pose estimation | | 0
Principles and Guidelines for Evaluating Social Robot Navigation Algorithms | | 0
PRISM: Complete Online Decentralized Multi-Agent Pathfinding with Rapid Information Sharing using Motion Constraints | | 0
Prism: Dynamic and Flexible Benchmarking of LLMs Code Generation with Monte Carlo Tree Search | | 0
Privacy-Preserving Language Model Inference with Instance Obfuscation | | 0
Privacy Protection in Street-View Panoramas using Depth and Multi-View Imagery | | 0
Probabilistic Robustness in Deep Learning: A Concise yet Comprehensive Guide | | 0
ProBench: Benchmarking Large Language Models in Competitive Programming | | 0
Problem-solving benefits of down-sampled lexicase selection | | 0
Procedural Content Generation: Better Benchmarks for Transfer Reinforcement Learning | | 0
Procedural Generalization by Planning with Self-Supervised World Models | | 0
ProductAgent: Benchmarking Conversational Product Search Agent with Asking Clarification Questions | | 0
Profit: Benchmarking Personalization and Robustness Trade-off in Federated Prompt Tuning | | 0
Progressive Class-level Distillation | | 0
Page 149 of 222

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified