SOTAVerified

Benchmarking

Papers

Showing 3901–3910 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| SCPO: Safe Reinforcement Learning with Safety Critic Policy Optimization | | 0 |
| SDFR: Synthetic Data for Face Recognition Competition | | 0 |
| Uncertainty in GNN Learning Evaluations: The Importance of a Consistent Benchmark for Community Detection | | 0 |
| SE Arena: An Interactive Platform for Evaluating Foundation Models in Software Engineering | | 0 |
| SeaTurtleID2022: A long-span dataset for reliable sea turtle re-identification | | 0 |
| SeaTurtleID2022: A long-span dataset for reliable sea turtle re-identification | | 0 |
| SecBench: A Comprehensive Multi-Dimensional Benchmarking Dataset for LLMs in Cybersecurity | | 0 |
| SecRepoBench: Benchmarking LLMs for Secure Code Generation in Real-World Repositories | | 0 |
| Secure Neuroimaging Analysis using Federated Learning with Homomorphic Encryption | | 0 |
| Securing the Skies: A Comprehensive Survey on Anti-UAV Methods, Benchmarking, and Future Directions | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |