SOTAVerified

Benchmarking

Papers

Showing 921-930 of 5548 papers

Title | Status | Hype
Evaluation of large language models for discovery of gene set function | Code | 1
A skeletonization algorithm for gradient-based optimization | Code | 1
Benchmarking Autoregressive Conditional Diffusion Models for Turbulent Flow Simulation | Code | 1
Developing a Scalable Benchmark for Assessing Large Language Models in Knowledge Graph Engineering | Code | 1
Benchmarking the Generation of Fact Checking Explanations | Code | 1
Towards quantitative precision for ECG analysis: Leveraging state space models, self-supervision and patient metadata | Code | 1
MLLM-DataEngine: An Iterative Refinement Approach for MLLM | Code | 1
LLMRec: Benchmarking Large Language Models on Recommendation Task | Code | 1
VI-Net: Boosting Category-level 6D Object Pose Estimation via Learning Decoupled Rotations on the Spherical Representations | Code | 1
Benchmarking Neural Network Generalization for Grammar Induction | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified