SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 5251–5300 of 661,570 papers

| Title | Status | Hype |
| --- | --- | --- |
| Cropping outperforms dropout as an augmentation strategy for self-supervised training of text embeddings | | 0 |
| STEMTOX: From Social Tags to Fine-Grained Toxic Meme Detection via Entropy-Guided Multi-Task Learning | | 0 |
| Towards Privacy-Preserving Machine Translation at the Inference Stage: A New Task and Benchmark | | 0 |
| Benchmarking LLM-based agents for single-cell omics analysis | | 0 |
| Surgical Video Understanding with Label Interpolation | | 0 |
| EMMA: Generalizing Real-World Robot Manipulation via Generative Visual Transfer | | 0 |
| Grounded Misunderstandings in Asymmetric Dialogue: A Perspectivist Annotation Scheme for MapTask | | 0 |
| Collaborating Vision, Depth, and Thermal Signals for Multi-Modal Tracking: Dataset and Algorithm | | 0 |
| YOLO26: Key Architectural Enhancements and Performance Benchmarking for Real-Time Object Detection | | 0 |
| Convergence of Distributionally Robust Q-Learning with Linear Function Approximation | | 0 |
| Near-Equilibrium Propagation training in nonlinear wave systems | | 0 |
| Diffusion-Classifier Synergy: Reward-Aligned Learning via Mutual Boosting Loop for FSCIL | | 0 |
| Diverse Text-to-Image Generation via Contrastive Noise Optimization | | 0 |
| Watch and Learn: Learning to Use Computers from Online Videos | | 0 |
| Dynamic Stress Detection: A Study of Temporal Progression Modelling of Stress in Speech | | 0 |
| Data-intrinsic approximation in metric spaces | | 0 |
| Qubit-centric Transformer for Surface Code Decoding | | 0 |
| A Functional Perspective on Knowledge Distillation in Neural Networks | | 0 |
| Feature-driven reinforcement learning for photovoltaic in continuous intraday trading | | 0 |
| SemBench: A Benchmark for Semantic Query Processing Engines | | 0 |
| First Proof | | 0 |
| VLAD-Grasp: Zero-shot Grasp Detection via Vision-Language Models | | 0 |
| MedPT: A Massive Medical Question Answering Dataset for Brazilian-Portuguese Speakers | | 0 |
| Tractable Probabilistic Models for Investment Planning | | 0 |
| Cheating Stereo Matching in Full-scale: Physical Adversarial Attack against Binocular Depth Estimation in Autonomous Driving | | 0 |
| SVG360: Multi-View SVG Generation with Geometric and Color Consistency from a Single SVG | | 0 |
| ConsistCompose: Unified Multimodal Layout Control for Image Composition | | 0 |
| Scale Where It Matters: Training-Free Localized Scaling for Diffusion Models | | 0 |
| GENA3D: Generative Amodal 3D Modeling by Bridging 2D Priors and 3D Coherence | | 0 |
| STAGE: Storyboard-Anchored Generation for Cinematic Multi-shot Narrative | | 0 |
| MatAnyone 2: Scaling Video Matting via a Learned Quality Evaluator | | 4 |
| Setting the Stage: Text-Driven Scene-Consistent Image Generation | | 0 |
| Training-Free Global Geometric Association for 4D LiDAR Panoptic Segmentation | | 0 |
| Assessing generative modeling approaches for free energy estimates in condensed matter | | 0 |
| Agentic Retoucher for Text-To-Image Generation | | 0 |
| WebCoderBench: Benchmarking Web Application Generation with Comprehensive and Interpretable Evaluation Metrics | | 0 |
| MorphGS: Morphology-Adaptive Articulated 3D Motion Transfer from Videos | | 0 |
| From Entropy to Epiplexity: Rethinking Information for Computationally Bounded Intelligence | | 0 |
| LAMB: LLM-based Audio Captioning with Modality Gap Bridging via Cauchy-Schwarz Divergence | | 0 |
| Boosting Latent Diffusion Models via Disentangled Representation Alignment | | 0 |
| RAG-3DSG: Enhancing 3D Scene Graphs with Re-Shot Guided Retrieval-Augmented Generation | | 0 |
| Sparks of Cooperative Reasoning: LLMs as Strategic Hanabi Agents | | 0 |
| NaVIDA: Vision-Language Navigation with Inverse Dynamics Augmentation | | 0 |
| BabyReasoningBench: Generating Developmentally-Inspired Reasoning Tasks for Evaluating Baby Language Models | | 0 |
| The Geometric Mechanics of Contrastive Learning: Alignment Potentials, Entropic Dispersion, and Modality Gap | | 0 |
| CRAFT: Calibrated Reasoning with Answer-Faithful Traces via Reinforcement Learning for Multi-Hop Question Answering | | 0 |
| The Wisdom of Many Queries: Complexity-Diversity Principle for Dense Retriever Training | | 0 |
| Prism: Efficient Test-Time Scaling via Hierarchical Search and Self-Verification for Discrete Diffusion Language Models | Code | 0 |
| SoulX-Singer: Towards High-Quality Zero-Shot Singing Voice Synthesis | | 3 |
| HyperTokens: Controlling Token Dynamics for Continual Video-Language Understanding | | 0 |
Page 106 of 13,232