SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 5501–5525 of 474,278 papers

Title | Status | Hype
Dynamic Early Exit in Reasoning Models | Code | 2
Text-based Animatable 3D Avatars with Morphable Model Alignment | Code | 2
WASP: Benchmarking Web Agent Security Against Prompt Injection Attacks | Code | 2
WALL-E 2.0: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents | Code | 2
Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning | Code | 2
MARFT: Multi-Agent Reinforcement Fine-Tuning | Code | 2
Vision6D: 3D-to-2D Interactive Visualization and Annotation Tool for 6D Pose Estimation | Code | 2
Learning Adaptive Parallel Reasoning with Language Models | Code | 2
FlowReasoner: Reinforcing Query-Level Meta-Agents | Code | 2
Retrieval Augmented Generation Evaluation in the Era of Large Language Models: A Comprehensive Survey | Code | 2
DyFo: A Training-Free Dynamic Focus Visual Search for Enhancing LMMs in Fine-Grained Visual Understanding | Code | 2
Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction | Code | 2
Seeing from Another Perspective: Evaluating Multi-View Understanding in MLLMs | Code | 2
Generative Auto-Bidding with Value-Guided Explorations | Code | 2
NTIRE 2025 Challenge on Image Super-Resolution (×4): Methods and Results | Code | 2
DreamID: High-Fidelity and Fast diffusion-based Face Swapping via Triplet ID Group Learning | Code | 2
SG-Reg: Generalizable and Efficient Scene Graph Registration | Code | 2
Seurat: From Moving Points to Depth | Code | 2
SphereDiff: Tuning-free Omnidirectional Panoramic Image and Video Generation via Spherical Latent Representation | Code | 2
Know Me, Respond to Me: Benchmarking LLMs for Dynamic User Profiling and Personalized Responses at Scale | Code | 2
InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners | Code | 2
CLIP-Powered Domain Generalization and Domain Adaptation: A Comprehensive Survey | Code | 2
LangCoop: Collaborative Driving with Language | Code | 2
EyecareGPT: Boosting Comprehensive Ophthalmology Understanding with Tailored Dataset, Benchmark and Model | Code | 2
TongUI: Building Generalized GUI Agents by Learning from Multimodal Web Tutorials | Code | 2
Page 221 of 18972