SOTAVerified

Benchmarking

Papers

Showing 621-630 of 5548 papers

Title | Status | Hype
Towards Reliable Detection of LLM-Generated Texts: A Comprehensive Evaluation Framework with CUDRT | Code | 1
CSAW-M: An Ordinal Classification Dataset for Benchmarking Mammographic Masking of Cancer | Code | 1
Curious Hierarchical Actor-Critic Reinforcement Learning | Code | 1
D2S: Document-to-Slide Generation Via Query-Based Text Summarization | Code | 1
Benchmarking Graph Neural Networks on Dynamic Link Prediction | Code | 1
ALTO: A Large-Scale Dataset for UAV Visual Place Recognition and Localization | Code | 1
CRoW: Benchmarking Commonsense Reasoning in Real-World Tasks | Code | 1
Benchmarking Graph Neural Networks for FMRI analysis | Code | 1
CryptOpt: Verified Compilation with Randomized Program Search for Cryptographic Primitives (full version) | Code | 1
DACBench: A Benchmark Library for Dynamic Algorithm Configuration | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified