SOTAVerified

Benchmarking

Papers

Showing 2661–2670 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| Free Performance Gain from Mixing Multiple Partially Labeled Samples in Multi-label Image Classification | | 0 |
| Benchmarking Single-Image Reflection Removal Algorithms | | 0 |
| Benchmarking projective simulation in navigation problems | | 0 |
| From Audio Encoders to Piano Judges: Benchmarking Performance Understanding for Solo Piano | | 0 |
| A Survey on LLM-based News Recommender Systems | | 0 |
| From Blind Solvers to Logical Thinkers: Benchmarking LLMs' Logical Integrity on Faulty Mathematical Problems | | 0 |
| Benchmarking SMT Performance for Farsi Using the TEP++ Corpus | | 0 |
| From Code to Play: Benchmarking Program Search for Games Using Large Language Models | | 0 |
| From Environmental Sound Representation to Robustness of 2D CNN Models Against Adversarial Attacks | | 0 |
| Holistic Inverse Rendering of Complex Facade via Aerial 3D Scanning | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |