SOTAVerified

Benchmarking

Papers

Showing 1551–1575 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| Identifiable Convex-Concave Regression via Sub-gradient Regularised Least Squares | | 0 |
| Leveling the Playing Field: Carefully Comparing Classical and Learned Controllers for Quadrotor Trajectory Tracking | | 0 |
| Universal Music Representations? Evaluating Foundation Models on World Music Corpora | Code | 0 |
| A Comparative Analysis of Principal Component Analysis (PCA) and Singular Value Decomposition (SVD) as Dimensionality Reduction Techniques | | 0 |
| OSWorld-Human: Benchmarking the Efficiency of Computer-Use Agents | | 0 |
| Spotting tell-tale visual artifacts in face swapping videos: strengths and pitfalls of CNN detectors | | 0 |
| Finance Language Model Evaluation (FLaME) | | 0 |
| Q2SAR: A Quantum Multiple Kernel Learning Approach for Drug Discovery | | 0 |
| ImpliRet: Benchmarking the Implicit Fact Retrieval Challenge | Code | 0 |
| Egocentric Human-Object Interaction Detection: A New Benchmark and Method | | 0 |
| A large-scale heterogeneous 3D magnetic resonance brain imaging dataset for self-supervised learning | | 0 |
| PGLib-CO2: A Power Grid Library for Computing and Optimizing Carbon Emissions | | 0 |
| Deep Diffusion Models and Unsupervised Hyperspectral Unmixing for Realistic Abundance Map Synthesis | | 0 |
| Robustness of Reinforcement Learning-Based Traffic Signal Control under Incidents: A Comparative Study | | 0 |
| JENGA: Object selection and pose estimation for robotic grasping from a stack | | 0 |
| A Comprehensive Survey on Video Scene Parsing: Advances, Challenges, and Prospects | | 0 |
| Few-Shot Learning for Industrial Time Series: A Comparative Analysis Using the Example of Screw-Fastening Process Monitoring | | 0 |
| C-TLSAN: Content-Enhanced Time-Aware Long- and Short-Term Attention Network for Personalized Recommendation | Code | 0 |
| MLDebugging: Towards Benchmarking Code Debugging Across Multi-Library Scenarios | Code | 0 |
| A large-scale, physically-based synthetic dataset for satellite pose estimation | | 0 |
| Learning Best Paths in Quantum Networks | | 0 |
| Delving into Instance-Dependent Label Noise in Graph Data: A Comprehensive Study and Benchmark | Code | 0 |
| Temporal cross-validation impacts multivariate time series subsequence anomaly detection evaluation | | 0 |
| SemanticST: Spatially Informed Semantic Graph Learning for Clustering, Integration, and Scalable Analysis of Spatial Transcriptomics | | 0 |
| EconGym: A Scalable AI Testbed with Diverse Economic Tasks | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |