SOTAVerified

Benchmarking

Papers

Showing 2251-2275 of 5548 papers

Title | Status | Hype
A CUDA-Based Real Parameter Optimization Benchmark | | 0
Beyond Text: A Deep Dive into Large Language Models' Ability on Understanding Graph Data | | 0
BEADs: Bias Evaluation Across Domains | | 0
FedSym: Unleashing the Power of Entropy for Benchmarking the Algorithms for Federated Learning | | 0
FERA 2017 - Addressing Head Pose in the Third Facial Expression Recognition and Analysis Challenge | | 0
Energy Models for Better Pseudo-Labels: Improving Semi-Supervised Classification with the 1-Laplacian Graph Energy | | 0
Beyond Static Models and Test Sets: Benchmarking the Potential of Pre-trained Models Across Tasks and Languages | | 0
Beyond Specialization: Benchmarking LLMs for Transliteration of Indian Languages | | 0
BEACON: A Benchmark for Efficient and Accurate Counting of Subgraphs | | 0
FIMP: Foundation Model-Informed Message Passing for Graph Neural Networks | | 0
Beyond Single-Model Views for Deep Learning: Optimization versus Generalizability of Stochastic Optimization Algorithms | | 0
Beyond Self-Talk: A Communication-Centric Survey of LLM-Based Multi-Agent Systems | | 0
ActPlan-1K: Benchmarking the Procedural Planning Ability of Visual Language Models in Household Activities | | 0
BBOB Instance Analysis: Landscape Properties and Algorithm Performance across Problem Instances | | 0
A Benchmark for Multi-speaker Anonymization | | 0
FedHPO-B: A Benchmark Suite for Federated Hyperparameter Optimization | | 0
FedNLP: Benchmarking Federated Learning Methods for Natural Language Processing Tasks | | 0
FER-C: Benchmarking Out-of-Distribution Soft Calibration for Facial Expression Recognition | | 0
A Modular Framework for Centrality and Clustering in Complex Networks | | 0
Beyond Monocular Deraining: Stereo Image Deraining via Semantic Understanding | | 0
Beyond Monocular Deraining: Parallel Stereo Deraining Network Via Semantic Prior | | 0
Bayesian Neural Networks at Scale: A Performance Analysis and Pruning Study | | 0
SPINEX-TimeSeries: Similarity-based Predictions with Explainable Neighbors Exploration for Time Series and Forecasting Problems | | 0
Beyond Metrics: A Critical Analysis of the Variability in Large Language Model Evaluation Frameworks | | 0
Bayesian Multi-type Mean Field Multi-agent Imitation Learning | | 0
Page 91 of 222

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified