SOTAVerified

Benchmarking

Papers

Showing 2801–2825 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| Benchmarking and Validation of Sub-mW 30GHz VG-LNAs in 22nm FDSOI CMOS for 5G/6G Phased-Array Receivers | | 0 |
| Mahalanobis k-NN: A Statistical Lens for Robust Point-Cloud Registrations | Code | 0 |
| VoiceWukong: Benchmarking Deepfake Voice Detection | | 0 |
| Benchmarking Sub-Genre Classification For Mainstage Dance Music | | 0 |
| Ransomware Detection Using Machine Learning in the Linux Kernel | | 0 |
| MIP-GAF: A MLLM-annotated Benchmark for Most Important Person Localization and Group Context Understanding | Code | 0 |
| CKnowEdit: A New Chinese Knowledge Editing Dataset for Linguistics, Facts, and Logic Error Correction in LLMs | | 0 |
| Selecting Differential Splicing Methods: Practical Considerations | | 0 |
| Benchmarking and Building Zero-Shot Hindi Retrieval Model with Hindi-BEIR and NLLB-E5 | | 0 |
| RBoard: A Unified Platform for Reproducible and Reusable Recommender System Benchmarks | | 0 |
| NeIn: Telling What You Don't Want | | 0 |
| DetoxBench: Benchmarking Large Language Models for Multitask Fraud & Abuse Detection | | 0 |
| A Framework for Evaluating PM2.5 Forecasts from the Perspective of Individual Decision Making | Code | 0 |
| Quantum Kernel Methods under Scrutiny: A Benchmarking Study | | 0 |
| Absolute Ranking: An Essential Normalization for Benchmarking Optimization Algorithms | | 0 |
| Benchmarking Estimators for Natural Experiments: A Novel Dataset and a Doubly Robust Algorithm | | 0 |
| Question-Answering Dense Video Events | Code | 0 |
| Shuffle Vision Transformer: Lightweight, Fast and Efficient Recognition of Driver Facial Expression | | 0 |
| LLM Detectors Still Fall Short of Real World: Case of LLM-Generated Short News-Like Posts | Code | 0 |
| InfraLib: Enabling Reinforcement Learning and Decision-Making for Large-Scale Infrastructure Management | | 0 |
| Prediction Accuracy & Reliability: Classification and Object Localization under Distribution Shift | | 0 |
| Benchmarking Spurious Bias in Few-Shot Image Classifiers | Code | 0 |
| PUB: Plot Understanding Benchmark and Dataset for Evaluating Large Language Models on Synthetic Visual Data Interpretation | | 0 |
| NUMOSIM: A Synthetic Mobility Dataset with Anomaly Detection Benchmarks | | 0 |
| EgoPressure: A Dataset for Hand Pressure and Pose Estimation in Egocentric Vision | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |