SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers

Showing 9701-9725 of 474278 papers

Title | Status | Hype
InqEduAgent: Adaptive AI Learning Partners with Gaussian Process Augmentation | Code | 0
ArabJobs: A Multinational Corpus of Arabic Job Ads | Code | 0
Shape-for-Motion: Precise and Consistent Video Editing with 3D Proxy | | 0
KV Cache Steering for Controlling Frozen LLMs | | 0
UltraHorizon: Benchmarking Agent Capabilities in Ultra Long-Horizon Scenarios | Code | 0
SemanticControl: A Training-Free Approach for Handling Loosely Aligned Visual Conditions in ControlNet | | 0
Mind-the-Glitch: Visual Correspondence for Detecting Inconsistencies in Subject-Driven Generation | | 0
MesaTask: Towards Task-Driven Tabletop Scene Generation via 3D Spatial Reasoning | | 0
StateX: Enhancing RNN Recall via Post-training State Expansion | | 0
SPARK: Synergistic Policy And Reward Co-Evolving Framework | | 0
WebGen-Agent: Enhancing Interactive Website Generation with Multi-Level Feedback and Step-Level Reinforcement Learning | | 0
VoiceAssistant-Eval: Benchmarking AI Assistants across Listening, Speaking, and Viewing | | 0
SpotEdit: Evaluating Visually-Guided Image Editing Methods | Code | 0
A benchmark for vericoding: formally verified program synthesis | Code | 0
Infusing Theory of Mind into Socially Intelligent LLM Agents | | 0
VideoScore2: Think before You Score in Generative Video Evaluation | | 0
CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning | Code | 0
WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents | | 0
StableToken: A Noise-Robust Semantic Speech Tokenizer for Resilient SpeechLLMs | | 0
CHRONOBERG: Capturing Language Evolution and Temporal Awareness in Foundation Models | Code | 0
Debiased Front-Door Learners for Heterogeneous Effects | Code | 0
HiPerformer: A High-Performance Global-Local Segmentation Model with Modular Hierarchical Fusion Strategy | Code | 0
LFA-Net: A Lightweight Network with LiteFusion Attention for Retinal Vessel Segmentation | Code | 0
LongScape: Advancing Long-Horizon Embodied World Models with Context-Aware MoE | Code | 0
Graph of Agents: Principled Long Context Modeling by Emergent Multi-Agent Collaboration | Code | 0
Page 389 of 18972