SOTAVerified

Benchmarking

Papers

Showing 3601-3625 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| Benchmarking Histopathology Foundation Models for Ovarian Cancer Bevacizumab Treatment Response Prediction from Whole Slide Images | | 0 |
| Benchmarking high-fidelity pedestrian tracking systems for research, real-time monitoring and crowd control | | 0 |
| What Emotions Make One or Five Stars? Understanding Ratings of Online Product Reviews by Sentiment Analysis and XAI | | 0 |
| Benchmarking Hierarchical Image Pyramid Transformer for the classification of colon biopsies and polyps in histopathology images | | 0 |
| ADCB: An Alzheimer's disease benchmark for evaluating observational estimators of causal effects | | 0 |
| MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems | | 0 |
| MIRAI: Evaluating LLM Agents for Event Forecasting | | 0 |
| MIR-Bench: Can Your LLM Recognize Complicated Patterns via Many-Shot In-Context Reasoning? | | 0 |
| Benchmarking Heterogeneous Treatment Effect Models through the Lens of Interpretability | | 0 |
| Towards Large Language Models that Benefit for All: Benchmarking Group Fairness in Reward Models | | 0 |
| Benchmarking Hebbian learning rules for associative memory | | 0 |
| Mitigating severe over-parameterization in deep convolutional neural networks through forced feature abstraction and compression with an entropy-based heuristic | | 0 |
| Mixed-Precision Quantization for Federated Learning on Resource-Constrained Heterogeneous Devices | | 0 |
| A Dataset Similarity Evaluation Framework for Wireless Communications and Sensing | | 0 |
| Benchmarking Harmonized Tariff Schedule Classification Models | | 0 |
| MJ-VIDEO: Fine-Grained Benchmarking and Rewarding Video Preferences in Video Generation | | 0 |
| Towards Large-Scale Small Object Detection: Survey and Benchmarks | | 0 |
| MLAR: Multi-layer Large Language Model-based Robotic Process Automation Applicant Tracking | | 0 |
| Towards Long-Term predictions of Turbulence using Neural Operators | | 0 |
| Benchmarking Graph Neural Networks on Link Prediction | | 0 |
| MLHarness: A Scalable Benchmarking System for MLCommons | | 0 |
| Benchmarking Graph Neural Networks for Document Layout Analysis in Public Affairs | | 0 |
| MLModelScope: A Distributed Platform for ML Model Evaluation and Benchmarking at Scale | | 0 |
| MLModelScope: A Distributed Platform for Model Evaluation and Benchmarking at Scale | | 0 |
| A Dataset for Movie Description | | 0 |
Page 145 of 222

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |