Benchmarking

Papers

Showing 2171–2180 of 5548 papers

Title	Date	Tasks	Status	Hype
Solar Multimodal Transformer: Intraday Solar Irradiance Predictor using Public Cameras and Time Series	Feb 28, 2025	Benchmarking, Solar Irradiance Forecasting	Unverified	0
Large Language Model-Based Benchmarking Experiment Settings for Evolutionary Multi-Objective Optimization	Feb 28, 2025	Benchmarking, Language Modeling	Unverified	0
PsychBench: A comprehensive and professional benchmark for evaluating the performance of LLM-assisted psychiatric clinical practice	Feb 28, 2025	Benchmarking, Diagnostic	Unverified	0
ProBench: Benchmarking Large Language Models in Competitive Programming	Feb 28, 2025	Attribute, Benchmarking	Unverified	0
NeuroMorse: A Temporally Structured Dataset For Neuromorphic Computing	Feb 28, 2025	Benchmarking	Code Available	0
ConvCodeWorld: Benchmarking Conversational Code Generation in Reproducible Feedback Environments	Feb 27, 2025	Benchmarking, Code Generation	Unverified	0
MMSciBench: Benchmarking Language Models on Multimodal Scientific Problems	Feb 27, 2025	Benchmarking, Visual Reasoning	Unverified	0
LimeSoDa: A Dataset Collection for Benchmarking of Machine Learning Regressors in Digital Soil Mapping	Feb 27, 2025	Benchmarking	Code Available	0
Machine-learning for photoplethysmography analysis: Benchmarking feature, image, and signal-based approaches	Feb 27, 2025	Benchmarking, Photoplethysmography (PPG)	Code Available	0
MEBench: Benchmarking Large Language Models for Cross-Document Multi-Entity Question Answering	Feb 26, 2025	Benchmarking, Question Answering	Unverified	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified