SOTAVerified

Benchmarking

Papers

Showing 2251–2260 of 5548 papers

Title | Status | Hype
Constellation Dataset: Benchmarking High-Altitude Object Detection for an Urban Intersection | Code | 1
Benchmarking Mobile Device Control Agents across Diverse Configurations | | 0
SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension | Code | 3
ApisTox: a new benchmark dataset for the classification of small molecules toxicity on honey bees | Code | 0
SynthEval: A Framework for Detailed Utility and Privacy Evaluation of Tabular Synthetic Data | Code | 1
Empirical Analysis of the Dynamic Binary Value Problem with IOHprofiler | | 0
ImplicitAVE: An Open-Source Dataset and Multimodal LLMs Benchmark for Implicit Attribute Value Extraction | Code | 1
DPO: A Differential and Pointwise Control Approach to Reinforcement Learning | | 0
Importance of Disjoint Sampling in Conventional and Transformer Models for Hyperspectral Image Classification | Code | 0
The Adversarial AI-Art: Understanding, Generation, Detection, and Benchmarking | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified