SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers, 248,104 code links, 4,818 tasks

Papers

Showing 1376-1400 of 659,983 papers

Title | Status | Hype
Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement | Code | 4
VideoPainter: Any-length Video Inpainting and Editing with Plug-and-Play Context Control | Code | 4
R1-Zero's "Aha Moment" in Visual Reasoning on a 2B Non-SFT Model | Code | 4
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning | Code | 4
Unified Reward Model for Multimodal Understanding and Generation | Code | 4
ReasonGraph: Visualisation of Reasoning Paths | Code | 4
Factorio Learning Environment | Code | 4
DeepRetrieval: Hacking Real Search Engines and Retrievers with Large Language Models via Reinforcement Learning | Code | 4
UniTok: A Unified Tokenizer for Visual Generation and Understanding | Code | 4
HVI: A New color space for Low-light Image Enhancement | Code | 4
OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels | Code | 4
Distill Any Depth: Distillation Creates a Stronger Monocular Depth Estimator | Code | 4
ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents | Code | 4
SpargeAttention: Accurate and Training-free Sparse Attention Accelerating Any Model Inference | Code | 4
TDMPBC: Self-Imitative Reinforcement Learning for Humanoid Robot Control | Code | 4
R1-Onevision:An Open-Source Multimodal Large Language Model Capable of Deep Reasoning | Code | 4
LettuceDetect: A Hallucination Detection Framework for RAG Applications | Code | 4
Recent Advances in Large Langauge Model Benchmarks against Data Contamination: From Static to Dynamic Evaluation | Code | 4
REFINE: Inversion-Free Backdoor Defense via Model Reprogramming | Code | 4
Natural Language Generation | Code | 4
SurveyX: Academic Survey Automation via Large Language Models | Code | 4
Building reliable sim driving agents by scaling self-play | Code | 4
LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention | Code | 4
Craw4LLM: Efficient Web Crawling for LLM Pretraining | Code | 4
A deep learning framework for efficient pathology image analysis | Code | 4
Page 56 of 26400