SOTAVerified

Benchmarking

Papers

Showing 3501–3510 of 5548 papers

Title | Status | Hype
Benchmarking Large Multimodal Models for Ophthalmic Visual Question Answering with OphthalWeChat | | 0
MATH-Perturb: Benchmarking LLMs' Math Reasoning Abilities against Hard Perturbations | | 0
MathTutorBench: A Benchmark for Measuring Open-ended Pedagogical Capabilities of LLM Tutors | | 0
Matrix-Free Preconditioning in Online Learning | | 0
Benchmarking Large Language Model Volatility | | 0
Benchmarking Large Language Models with Integer Sequence Generation Tasks | | 0
Maximum Categorical Cross Entropy (MCCE): A noise-robust alternative loss function to mitigate racial bias in Convolutional Neural Networks (CNNs) by reducing overfitting | | 0
MaxpoolNMS: Getting Rid of NMS Bottlenecks in Two-Stage Object Detectors | | 0
Benchmarking Pre-Trained Time Series Models for Electricity Price Forecasting | | 0
MBA-VO: Motion Blur Aware Visual Odometry | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified