SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 5251-5275 of 661,570 papers

| Title | Status | Hype |
| --- | --- | --- |
| Cropping outperforms dropout as an augmentation strategy for self-supervised training of text embeddings | | 0 |
| STEMTOX: From Social Tags to Fine-Grained Toxic Meme Detection via Entropy-Guided Multi-Task Learning | | 0 |
| Towards Privacy-Preserving Machine Translation at the Inference Stage: A New Task and Benchmark | | 0 |
| Benchmarking LLM-based agents for single-cell omics analysis | | 0 |
| Surgical Video Understanding with Label Interpolation | | 0 |
| EMMA: Generalizing Real-World Robot Manipulation via Generative Visual Transfer | | 0 |
| Grounded Misunderstandings in Asymmetric Dialogue: A Perspectivist Annotation Scheme for MapTask | | 0 |
| Collaborating Vision, Depth, and Thermal Signals for Multi-Modal Tracking: Dataset and Algorithm | | 0 |
| YOLO26: Key Architectural Enhancements and Performance Benchmarking for Real-Time Object Detection | | 0 |
| Convergence of Distributionally Robust Q-Learning with Linear Function Approximation | | 0 |
| Near-Equilibrium Propagation training in nonlinear wave systems | | 0 |
| Diffusion-Classifier Synergy: Reward-Aligned Learning via Mutual Boosting Loop for FSCIL | | 0 |
| Diverse Text-to-Image Generation via Contrastive Noise Optimization | | 0 |
| Watch and Learn: Learning to Use Computers from Online Videos | | 0 |
| Dynamic Stress Detection: A Study of Temporal Progression Modelling of Stress in Speech | | 0 |
| Data-intrinsic approximation in metric spaces | | 0 |
| Qubit-centric Transformer for Surface Code Decoding | | 0 |
| A Functional Perspective on Knowledge Distillation in Neural Networks | | 0 |
| Feature-driven reinforcement learning for photovoltaic in continuous intraday trading | | 0 |
| SemBench: A Benchmark for Semantic Query Processing Engines | | 0 |
| First Proof | | 0 |
| VLAD-Grasp: Zero-shot Grasp Detection via Vision-Language Models | | 0 |
| MedPT: A Massive Medical Question Answering Dataset for Brazilian-Portuguese Speakers | | 0 |
| Tractable Probabilistic Models for Investment Planning | | 0 |
| Cheating Stereo Matching in Full-scale: Physical Adversarial Attack against Binocular Depth Estimation in Autonomous Driving | | 0 |
Page 211 of 26463