SOTAVerified

Benchmarking

Papers

Showing 1261-1270 of 5548 papers

Title | Status | Hype
PASS: An ImageNet replacement for self-supervised pretraining without humans | Code | 1
Disentangled Feature Representation for Few-shot Image Classification | Code | 1
Don't be Contradicted with Anything! CI-ToD: Towards Benchmarking Consistency for Task-oriented Dialogue System | Code | 1
SubseasonalClimateUSA: A Dataset for Subseasonal Forecasting and Benchmarking | Code | 1
Benchmarking the Combinatorial Generalizability of Complex Query Answering on Knowledge Graphs | Code | 1
AI Accelerator Survey and Trends | Code | 1
Benchmarking Commonsense Knowledge Base Population with an Effective Evaluation Dataset | Code | 1
OPV2V: An Open Benchmark Dataset and Fusion Pipeline for Perception with Vehicle-to-Vehicle Communication | Code | 1
Benchmarking the Spectrum of Agent Capabilities | Code | 1
RobustART: Benchmarking Robustness on Architecture Design and Training Techniques | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified