SOTAVerified

Benchmarking

Papers

Showing 1351–1360 of 5548 papers

Title | Status | Hype
Benchmarking human visual search computational models in natural scenes: models comparison and reference datasets | Code | 1
DetectRL: Benchmarking LLM-Generated Text Detection in Real-World Scenarios | Code | 1
IMGTB: A Framework for Machine-Generated Text Detection Benchmarking | Code | 1
Benchmarking Test-Time Adaptation against Distribution Shifts in Image Classification | Code | 1
A Unified Taxonomy and Multimodal Dataset for Events in Invasion Games | Code | 1
Benchmarking the Abilities of Large Language Models for RDF Knowledge Graph Creation and Comprehension: How Well Do LLMs Speak Turtle? | Code | 1
Benchmarking the Spectrum of Agent Capabilities | Code | 1
DFGC 2021: A DeepFake Game Competition | Code | 1
Aquatic Navigation: A Challenging Benchmark for Deep Reinforcement Learning | Code | 1
ImageNet-E: Benchmarking Neural Network Robustness via Attribute Editing | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified