SOTAVerified

Benchmarking

Papers

Showing 1201-1210 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| Benchmarking Large Language Models for Automated Verilog RTL Code Generation | Code | 1 |
| ByzFL: Research Framework for Robust Federated Learning | Code | 1 |
| GraphGallery: A Platform for Fast Benchmarking and Easy Development of Graph Neural Networks Based Intelligent Software | Code | 1 |
| Benchmarking Object Detectors with COCO: A New Path Forward | Code | 1 |
| A Reinforcement Learning Environment for Multi-Service UAV-enabled Wireless Systems | Code | 1 |
| Kimera-Multi: Robust, Distributed, Dense Metric-Semantic SLAM for Multi-Robot Systems | Code | 1 |
| Benchmarking Self-Supervised Learning on Diverse Pathology Datasets | Code | 1 |
| Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs | Code | 1 |
| Can language agents be alternatives to PPO? A Preliminary Empirical Study On OpenAI Gym | Code | 1 |
| Graph Neural Network-Based Anomaly Detection for River Network Systems | Code | 1 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |