SOTAVerified

Benchmarking

Papers

Showing 3576-3600 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| Alexpaca: Learning Factual Clarification Question Generation Without Examples | | 0 |
| BanglaNLP at BLP-2023 Task 1: Benchmarking different Transformer Models for Violence Inciting Text Detection in Bengali | | 0 |
| A Novel Benchmarking Paradigm and a Scale- and Motion-Aware Model for Egocentric Pedestrian Trajectory Prediction | | 0 |
| An Empirical Study of Super-resolution on Low-resolution Micro-expression Recognition | | 0 |
| TRIGO: Benchmarking Formal Mathematical Proof Reduction for Generative Language Models | Code | 0 |
| Assessing Encoder-Decoder Architectures for Robust Coronary Artery Segmentation | | 0 |
| Evaluating Robustness of Visual Representations for Object Assembly Task Requiring Spatio-Geometrical Reasoning | | 0 |
| Prompting Scientific Names for Zero-Shot Species Recognition | | 0 |
| Benchmarking the Sim-to-Real Gap in Cloth Manipulation | | 0 |
| Randomized Benchmarking of Local Zeroth-Order Optimizers for Variational Quantum Systems | Code | 0 |
| Mirage: Model-Agnostic Graph Distillation for Graph Classification | Code | 0 |
| BanglaNLP at BLP-2023 Task 2: Benchmarking different Transformer Models for Sentiment Analysis of Bangla Social Media Posts | Code | 0 |
| A Benchmarking Protocol for SAR Colorization: From Regression to Deep Learning Approaches | | 0 |
| Who Said That? Benchmarking Social Media AI Detection | | 0 |
| Investigating the Robustness and Properties of Detection Transformers (DETR) Toward Difficult Images | | 0 |
| Psychoacoustic Challenges Of Speech Enhancement On VoIP Platforms | | 0 |
| Deep Reinforcement Learning for Autonomous Cyber Defence: A Survey | | 0 |
| Risk Aware Benchmarking of Large Language Models | | 0 |
| Transformers for Green Semantic Communication: Less Energy, More Semantics | Code | 0 |
| FedSym: Unleashing the Power of Entropy for Benchmarking the Algorithms for Federated Learning | | 0 |
| Hypergraph Neural Networks through the Lens of Message Passing: A Common Perspective to Homophily and Architecture Design | | 0 |
| BeSt-LeS: Benchmarking Stroke Lesion Segmentation using Deep Supervision | Code | 0 |
| On the Evaluation and Refinement of Vision-Language Instruction Tuning Datasets | | 0 |
| CAFA-evaluator: A Python Tool for Benchmarking Ontological Classification Methods | | 0 |
| Distributed Evolution Strategies with Multi-Level Learning for Large-Scale Black-Box Optimization | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |