SOTAVerified

Benchmarking

Papers

Showing 2101-2125 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| Evolutionary Multimodal Optimization: A Short Survey | | 0 |
| Can AI Master Construction Management (CM)? Benchmarking State-of-the-Art Large Language Models on CM Certification Exams | | 0 |
| Can AI Freelancers Compete? Benchmarking Earnings, Reliability, and Task Success at Scale | | 0 |
| Benchmarking air-conditioning energy performance of residential rooms based on regression and clustering techniques | | 0 |
| Benchmarking AI Models in Software Engineering: A Review, Search Tool, and Enhancement Protocol | | 0 |
| Benchmarking Model Predictive Control Algorithms in Building Optimization Testing Framework (BOPTEST) | | 0 |
| A Benchmarking Protocol for SAR Colorization: From Regression to Deep Learning Approaches | | 0 |
| Evolving Evolutionary Algorithms using Linear Genetic Programming | | 0 |
| CaMMT: Benchmarking Culturally Aware Multimodal Machine Translation | | 0 |
| CameraBench: Benchmarking Visual Reasoning in MLLMs via Photography | | 0 |
| Analyzing the Effectiveness of Listwise Reranking with Positional Invariance on Temporal Generalizability | | 0 |
| CallNavi, A Challenge and Empirical Study on LLM Function Calling and Routing | | 0 |
| Call for Action: towards the next generation of symbolic regression benchmark | | 0 |
| Benchmarking Agility and Reconfigurability in Satellite Systems for Tropical Cyclone Monitoring | | 0 |
| A Data-Driven Method to Identify IBRs with Dominant Participation in Sub-Synchronous Oscillations | | 0 |
| Benchmarking Aggression Identification in Social Media | | 0 |
| Calibrating chemical multisensory devices for real world applications: An in-depth comparison of quantitative Machine Learning approaches | | 0 |
| Benchmarking In-the-wild Multimodal Disease Recognition and A Versatile Baseline | | 0 |
| Calibrated and Robust Foundation Models for Vision-Language and Medical Image Tasks Under Distribution Shift | | 0 |
| Analyzing the behaviour of D'WAVE quantum annealer: fine-tuning parameterization and tests with restrictive Hamiltonian formulations | | 0 |
| Ev-Layout: A Large-scale Event-based Multi-modal Dataset for Indoor Layout Estimation and Tracking | | 0 |
| EvoGPT-f: An Evolutionary GPT Framework for Benchmarking Formal Math Languages | | 0 |
| Evolving Hard Maximum Cut Instances for Quantum Approximate Optimization Algorithms | | 0 |
| Exact lattice-based stochastic cell culture simulation algorithms incorporating spontaneous and contact-dependent reactions | | 0 |
| Explainable AI using expressive Boolean formulas | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |