SOTAVerified

Benchmarking

Papers

Showing 3601–3610 of 5548 papers

Title | Status | Hype
Benchmarking Histopathology Foundation Models for Ovarian Cancer Bevacizumab Treatment Response Prediction from Whole Slide Images | | 0
Benchmarking high-fidelity pedestrian tracking systems for research, real-time monitoring and crowd control | | 0
What Emotions Make One or Five Stars? Understanding Ratings of Online Product Reviews by Sentiment Analysis and XAI | | 0
Benchmarking Hierarchical Image Pyramid Transformer for the classification of colon biopsies and polyps in histopathology images | | 0
ADCB: An Alzheimer's disease benchmark for evaluating observational estimators of causal effects | | 0
MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems | | 0
MIRAI: Evaluating LLM Agents for Event Forecasting | | 0
MIR-Bench: Can Your LLM Recognize Complicated Patterns via Many-Shot In-Context Reasoning? | | 0
Benchmarking Heterogeneous Treatment Effect Models through the Lens of Interpretability | | 0
Towards Large Language Models that Benefit for All: Benchmarking Group Fairness in Reward Models | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified