SOTAVerified

Benchmarking

Papers

Showing 2171–2180 of 5548 papers

Title | Status | Hype
Solar Multimodal Transformer: Intraday Solar Irradiance Predictor using Public Cameras and Time Series | | 0
Large Language Model-Based Benchmarking Experiment Settings for Evolutionary Multi-Objective Optimization | | 0
PsychBench: A comprehensive and professional benchmark for evaluating the performance of LLM-assisted psychiatric clinical practice | | 0
ProBench: Benchmarking Large Language Models in Competitive Programming | | 0
NeuroMorse: A Temporally Structured Dataset For Neuromorphic Computing | Code | 0
ConvCodeWorld: Benchmarking Conversational Code Generation in Reproducible Feedback Environments | | 0
MMSciBench: Benchmarking Language Models on Multimodal Scientific Problems | | 0
LimeSoDa: A Dataset Collection for Benchmarking of Machine Learning Regressors in Digital Soil Mapping | Code | 0
Machine-learning for photoplethysmography analysis: Benchmarking feature, image, and signal-based approaches | Code | 0
MEBench: Benchmarking Large Language Models for Cross-Document Multi-Entity Question Answering | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified