SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers

Showing 1751-1775 of 177,339 papers

Title | Status | Hype
REFINE: Inversion-Free Backdoor Defense via Model Reprogramming | Code | 4
Relationships are Complicated! An Analysis of Relationships Between Datasets on the Web | Code | 4
Benchmarking Graphormer on Large-Scale Molecular Modeling Datasets | Code | 4
LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment | Code | 4
SWE-Search: Enhancing Software Agents with Monte Carlo Tree Search and Iterative Refinement | Code | 4
R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization | Code | 4
Recurrent Partial Kernel Network for Efficient Optical Flow Estimation | Code | 4
DeXtreme: Transfer of Agile In-hand Manipulation from Simulation to Reality | Code | 4
Fin-R1: A Large Language Model for Financial Reasoning through Reinforcement Learning | Code | 4
Are Transformers Effective for Time Series Forecasting? | Code | 4
Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models | Code | 4
Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation | Code | 4
SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot | Code | 4
AlignScore: Evaluating Factual Consistency with a Unified Alignment Function | Code | 4
TRAM: Global Trajectory and Motion of 3D Humans from in-the-wild Videos | Code | 4
TableGPT2: A Large Multimodal Model with Tabular Data Integration | Code | 4
Human-Humanoid Robots Cross-Embodiment Behavior-Skill Transfer Using Decomposed Adversarial Learning from Demonstration | Code | 4
FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds | Code | 4
MovieChat+: Question-aware Sparse Memory for Long Video Question Answering | Code | 4
Knowledge Fusion of Chat LLMs: A Preliminary Technical Report | Code | 4
R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning | Code | 4
The case for 4-bit precision: k-bit Inference Scaling Laws | Code | 4
ActiveAnno3D -- An Active Learning Framework for Multi-Modal 3D Object Detection | Code | 4
DepGraph: Towards Any Structural Pruning | Code | 4
Improving Training Stability for Multitask Ranking Models in Recommender Systems | Code | 4
Page 71 of 7094