SOTAVerified

Benchmarking

Papers

Showing 4651–4675 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| Mamba-Based Ensemble learning for White Blood Cell Classification | Code | 0 |
| Better Late Than Never: Formulating and Benchmarking Recommendation Editing | Code | 0 |
| Better force fields start with better data -- A data set of cation dipeptide interactions | Code | 0 |
| MANTRA: The Manifold Triangulations Assemblage | Code | 0 |
| BeSt-LeS: Benchmarking Stroke Lesion Segmentation using Deep Supervision | Code | 0 |
| debiaSAE: Benchmarking and Mitigating Vision-Language Model Bias | Code | 0 |
| VizSeq: A Visual Analysis Toolkit for Text Generation Tasks | Code | 0 |
| PATH: A Discrete-sequence Dataset for Evaluating Online Unsupervised Anomaly Detection Approaches for Multivariate Time Series | Code | 0 |
| Hi Guys or Hi Folks? Benchmarking Gender-Neutral Machine Translation with the GeNTE Corpus | Code | 0 |
| Margin-bounded Confidence Scores for Out-of-Distribution Detection | Code | 0 |
| Benchmarks for Graph Embedding Evaluation | Code | 0 |
| High-Quality, ROS Compatible Video Encoding and Decoding for High-Definition Datasets | Code | 0 |
| MARS: Benchmarking the Metaphysical Reasoning Abilities of Language Models with a Multi-task Evaluation Dataset | Code | 0 |
| MARTA: a model for the automatic phonemic grouping of the parkinsonian speech | Code | 0 |
| High-Dynamic-Range Imaging for Cloud Segmentation | Code | 0 |
| Hierarchical Neural Networks for Sequential Sentence Classification in Medical Scientific Abstracts | Code | 0 |
| The Freiburg Groceries Dataset | Code | 0 |
| AMPCliff: quantitative definition and benchmarking of activity cliffs in antimicrobial peptides | Code | 0 |
| Z_2 × Z_2 Equivariant Quantum Neural Networks: Benchmarking against Classical Neural Networks | Code | 0 |
| Benchmark of Deep Learning Models on Large Healthcare MIMIC Datasets | Code | 0 |
| Hi-EF: Benchmarking Emotion Forecasting in Human-interaction | Code | 0 |
| Heterogeneous Datasets for Federated Survival Analysis Simulation | Code | 0 |
| Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study | Code | 0 |
| Robust 2D/3D Vehicle Parsing in Arbitrary Camera Views for CVIS | Code | 0 |
| Adaptive Visual Scene Understanding: Incremental Scene Graph Generation | Code | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified |