SOTAVerified

Benchmarking

Papers

Showing 3221–3230 of 5548 papers

Title | Status | Hype
Knowing-how & Knowing-that: A New Task for Machine Comprehension of User Manuals | Code | 0
Benchmarking Foundation Models with Language-Model-as-an-Examiner | | 0
Self-Adjusting Weighted Expected Improvement for Bayesian Optimization | Code | 0
ICON^2: Reliably Benchmarking Predictive Inequity in Object Detection | | 0
Benchmarking Robustness of AI-Enabled Multi-sensor Fusion Systems: Challenges and Opportunities | | 0
Explainable AI using expressive Boolean formulas | | 0
Applying Standards to Advance Upstream & Downstream Ethics in Large Language Models | | 0
Financial Numeric Extreme Labelling: A Dataset and Benchmarking for XBRL Tagging | | 0
LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning | Code | 3
Str2Str: A Score-based Framework for Zero-shot Protein Conformation Sampling | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified