SOTAVerified

Benchmarking

Papers

Showing 1481-1490 of 5548 papers

Title | Status | Hype
featsel: A framework for benchmarking of feature selection algorithms and cost functions | Code | 1
SciFIBench: Benchmarking Large Multimodal Models for Scientific Figure Interpretation | Code | 1
FedAIoT: A Federated Learning Benchmark for Artificial Intelligence of Things | Code | 1
FedCV: A Federated Learning Framework for Diverse Computer Vision Tasks | Code | 1
Beyond neural scaling laws: beating power law scaling via data pruning | Code | 1
Beyond Normal: On the Evaluation of Mutual Information Estimators | Code | 1
Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs | Code | 1
KoLA: Carefully Benchmarking World Knowledge of Large Language Models | Code | 1
LagrangeBench: A Lagrangian Fluid Mechanics Benchmarking Suite | Code | 1
LongMamba: Enhancing Mamba's Long Context Capabilities via Training-Free Receptive Field Enlargement | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified