SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 8226–8250 of 474,278 papers

Title | Status | Hype
Structure-Aware Fusion with Progressive Injection for Multimodal Molecular Representation Learning | Code | 0
Text-conditioned State Space Model For Domain-generalized Change Detection Visual Question Answering | Code | 0
AstaBench: Rigorous Benchmarking of AI Agents with a Scientific Research Suite |  | 0
Sparser Block-Sparse Attention via Token Permutation | Code | 0
ReDit: Reward Dithering for Improved LLM Policy Optimization |  | 0
Unified Implementations of Recurrent Neural Networks in Multiple Deep Learning Frameworks |  | 0
Towards Comprehensive Scene Understanding: Integrating First and Third-Person Views for LVLMs | Code | 0
zip2zip: Inference-Time Adaptive Tokenization via Online Compression |  | 0
Video-Skill-CoT: Skill-based Chain-of-Thoughts for Domain-Adaptive Video Reasoning |  | 0
ColorAgent: Building A Robust, Personalized, and Interactive OS Agent |  | 0
SolarBoost: Distributed Photovoltaic Power Forecasting Amid Time-varying Grid Capacity | Code | 0
Model Merging with Functional Dual Anchors |  | 0
Pctx: Tokenizing Personalized Context for Generative Recommendation | Code | 0
Redefining Retrieval Evaluation in the Era of LLMs |  | 0
PhysWorld: From Real Videos to World Models of Deformable Objects via Physics-Aware Demonstration Synthesis |  | 0
Co-Sight: Enhancing LLM-Based Agents via Conflict-Aware Meta-Verification and Trustworthy Reasoning with Structured Facts | Code | 0
WorldGrow: Generating Infinite 3D World |  | 0
Visual Diffusion Models are Geometric Solvers |  | 0
ArchISMiner: A Framework for Automatic Mining of Architectural Issue-Solution Pairs from Online Developer Communities | Code | 0
FlowOpt: Fast Optimization Through Whole Flow Processes for Training-Free Editing |  | 0
FITS: Towards an AI-Driven Fashion Information Tool for Sustainability | Code | 0
DictPFL: Efficient and Private Federated Learning on Encrypted Gradients | Code | 0
ExpressNet-MoE: A Hybrid Deep Neural Network for Emotion Recognition | Code | 0
Understanding and Mitigating Numerical Sources of Nondeterminism in LLM Inference | Code | 0
Scalable Valuation of Human Feedback through Provably Robust Model Alignment | Code | 0
Page 330 of 18,972