SOTAVerified

Benchmarking

Papers

Showing 2451-2475 of 5548 papers

Title | Status | Hype
Benchmarking symbolic regression constant optimization schemes |  | 0
Benchmarking Surrogate-Assisted Genetic Recommender Systems |  | 0
A Unified Framework for Provably Efficient Algorithms to Estimate Shapley Values |  | 0
A large-scale, physically-based synthetic dataset for satellite pose estimation |  | 0
Benchmarking Super-Resolution Algorithms on Real Data |  | 0
A Unified Framework and Dataset for Assessing Societal Bias in Vision-Language Models |  | 0
A Comprehensive Survey on Video Scene Parsing: Advances, Challenges, and Prospects |  | 0
Stereotype Detection in LLMs: A Multiclass, Explainable, and Benchmark-Driven Approach |  | 0
Benchmarking Sub-Genre Classification For Mainstage Dance Music |  | 0
A large-scale heterogeneous 3D magnetic resonance brain imaging dataset for self-supervised learning |  | 0
Deep Reinforcement Learning for Dynamic Order Picking in Warehouse Operations |  | 0
Generation of Large District Heating System Models Using Open-Source Data and Tools: An Exemplary Workflow |  | 0
Generative AI for Programming Education: Benchmarking ChatGPT, GPT-4, and Human Tutors |  | 0
Genicious: Contextual Few-shot Prompting for Insights Discovery |  | 0
Benchmarking state-of-the-art gradient boosting algorithms for classification |  | 0
Audio-Visual Class-Incremental Learning for Fish Feeding intensity Assessment in Aquaculture |  | 0
Benchmarking State-of-the-Art Deep Learning Software Tools |  | 0
A Large-Scale Evaluation of Speech Foundation Models |  | 0
Generating Artificial Outliers in the Absence of Genuine Ones -- a Survey |  | 0
Benchmarking Spiking Neural Network Learning Methods with Varying Locality |  | 0
A Large-scale Evaluation of Pretraining Paradigms for the Detection of Defects in Electroluminescence Solar Cell Images |  | 0
A2Perf: Real-World Autonomous Agents Benchmark |  | 0
A 28-nm Convolutional Neuromorphic Processor Enabling Online Learning with Spike-Based Retinas |  | 0
Generating Automotive Code: Large Language Models for Software Development and Verification in Safety-Critical Systems |  | 0
Benchmarking sparse system identification with low-dimensional chaos |  | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 |  | Unverified