SOTAVerified

Benchmarking

Papers

Showing 461-470 of 5548 papers

Title | Status | Hype
The ML.ENERGY Benchmark: Toward Automated Inference Energy Measurement and Optimization | Code | 3
Healthy LLMs? Benchmarking LLM Knowledge of UK Government Public Health Information | - | 0
Autoregressive Stochastic Clock Jitter Compensation in Analog-to-Digital Converters | - | 0
Enhancing Treatment Effect Estimation via Active Learning: A Counterfactual Covering Perspective | Code | 0
Federated Deconfounding and Debiasing Learning for Out-of-Distribution Generalization | - | 0
QualBench: Benchmarking Chinese LLMs with Localized Professional Qualifications for Vertical Domain Evaluation | - | 0
scDrugMap: Benchmarking Large Foundation Models for Drug Response Prediction | Code | 1
Benchmarking Vision, Language, & Action Models in Procedurally Generated, Open Ended Action Environments | Code | 1
clem:todd: A Framework for the Systematic Benchmarking of LLM-Based Task-Oriented Dialogue System Realisations | - | 0
PyTDC: A multimodal machine learning training, evaluation, and inference platform for biomedical foundation models | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified