SOTAVerified

Benchmarking

Papers

Showing 2121-2130 of 5548 papers

Title | Status | Hype
MarineGym: A High-Performance Reinforcement Learning Platform for Underwater Robotics | - | 0
SciHorizon: Benchmarking AI-for-Science Readiness from Scientific Data to Large Language Models | - | 0
Ev-Layout: A Large-scale Event-based Multi-modal Dataset for Indoor Layout Estimation and Tracking | - | 0
Comprehensive Benchmarking of Machine Learning Methods for Risk Prediction Modelling from Large-Scale Survival Data: A UK Biobank Study | - | 0
Large Language Models for Outpatient Referral: Problem Definition, Benchmarking and Challenges | Code | 0
Integration of nested cross-validation, automated hyperparameter optimization, high-performance computing to reduce and quantify the variance of test performance estimation of deep learning models | Code | 0
ResBench: Benchmarking LLM-Generated FPGA Designs with Resource Awareness | - | 0
Benchmarking Chinese Medical LLMs: A Medbench-based Analysis of Performance Gaps and Hierarchical Optimization Strategies | - | 0
Skelite: Compact Neural Networks for Efficient Iterative Skeletonization | Code | 0
Towards Large Language Models that Benefit for All: Benchmarking Group Fairness in Reward Models | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified