SOTAVerified

Benchmarking

Papers

Showing 781–790 of 5548 papers

Title | Status | Hype
Benchmarking Vision, Language, & Action Models on Robotic Learning Tasks | Code | 1
GeSS: Benchmarking Geometric Deep Learning under Scientific Applications with Distribution Shifts | Code | 1
4D Panoptic LiDAR Segmentation | Code | 1
Benchmarking Large Language Models on CMExam -- A Comprehensive Chinese Medical Exam Dataset | Code | 1
Disentangled Feature Representation for Few-shot Image Classification | Code | 1
Benchmarking Actor-Critic Deep Reinforcement Learning Algorithms for Robotics Control with Action Constraints | Code | 1
Benchmark on Drug Target Interaction Modeling from a Structure Perspective | Code | 1
BenchML: an extensible pipelining framework for benchmarking representations of materials and molecules at scale | Code | 1
Benchmarks for Deep Off-Policy Evaluation | Code | 1
Benchmarking Large Language Models on Controllable Generation under Diversified Instructions | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified