The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers


Showing 4901–4950 of 661,570 papers

Title	Date	Status	Hype
Spanning the Visual Analogy Space with a Weight Basis of LoRAs	Feb 17, 2026	Unverified	2
AIRS-Bench: a Suite of Tasks for Frontier AI Research Science Agents	Feb 16, 2026	Unverified	2
AnchorWeave: World-Consistent Video Generation with Retrieved Local Spatial Memories	Feb 16, 2026	Unverified	2
REDSearcher: A Scalable and Cost-Efficient Framework for Long-Horizon Search Agents	Feb 15, 2026	Unverified	2
Experiential Reinforcement Learning	Feb 15, 2026	Unverified	2
Endless Terminals: Scaling RL Environments for Terminal Agents	Feb 14, 2026	Unverified	2
ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference	Feb 14, 2026	Unverified	2
VLA-JEPA: Enhancing Vision-Language-Action Model with Latent World Model	Feb 14, 2026	Unverified	2
Latent Denoising Makes Good Tokenizers	Feb 14, 2026	Unverified	2
GeoAgent: Learning to Geolocate Everywhere with Reinforced Geographic Characteristics	Feb 13, 2026	Unverified	2
FISHER: A Foundation Model for Multi-Modal Industrial Signal Comprehensive Representation	Feb 13, 2026	Unverified	2
Towards Universal Video MLLMs with Attribute-Structured and Quality-Verified Instructions	Feb 13, 2026	Unverified	2
Less is Enough: Synthesizing Diverse Data in Feature Space of LLMs	Feb 12, 2026	Unverified	2
ContextGen: Contextual Layout Anchoring for Identity-Consistent Multi-Instance Generation	Feb 12, 2026	Unverified	2
Unveiling Implicit Advantage Symmetry: Why GRPO Struggles with Exploration and Difficulty Adaptation	Feb 12, 2026	Unverified	2
ABot-N0: Technical Report on the VLA Foundation Model for Versatile Embodied Navigation	Feb 12, 2026	Unverified	2
CLI-Gym: Scalable CLI Task Generation via Agentic Environment Inversion	Feb 11, 2026	Unverified	2
DeepImageSearch: Benchmarking Multimodal Agents for Context-Aware Image Retrieval in Visual Histories	Feb 11, 2026	Unverified	2
Accelerating Streaming Video Large Language Models via Hierarchical Token Compression	Feb 11, 2026	Unverified	2
The Million-Label NER: Breaking Scale Barriers with GLiNER bi-encoder	Feb 11, 2026	Unverified	2
Rethinking Memory Mechanisms of Foundation Agents in the Second Half: A Survey	Feb 10, 2026	Unverified	2
Olaf-World: Orienting Latent Actions for Video World Modeling	Feb 10, 2026	Unverified	2
Evolving Interactive Diagnostic Agents in a Virtual Clinical Environment	Feb 10, 2026	Unverified	2
Weak-Driven Learning: How Weak Agents make Strong Agents Stronger	Feb 9, 2026	Unverified	2
Dr. MAS: Stable Reinforcement Learning for Multi-Agent LLM Systems	Feb 9, 2026	Unverified	2
Bolmo: Byteifying the Next Generation of Language Models	Feb 9, 2026	Unverified	2
ArcFlow: Unleashing 2-Step Text-to-Image Generation via High-Precision Non-Linear Flow Distillation	Feb 9, 2026	Unverified	2
How to Correctly Report LLM-as-a-Judge Evaluations	Feb 9, 2026	Unverified	2
MotionCrafter: Dense Geometry and Motion Reconstruction with a 4D VAE	Feb 9, 2026	Unverified	2
InternSVG: Towards Unified SVG Tasks with Multimodal Large Language Models	Feb 9, 2026	Unverified	2
PISCO: Precise Video Instance Insertion with Sparse Control	Feb 9, 2026	Unverified	2
RAP: 3D Rasterization Augmented End-to-End Planning	Feb 8, 2026	Unverified	2
Learning to Continually Learn via Meta-learning Agentic Memory Designs	Feb 8, 2026	Unverified	2
Error-Free Linear Attention is a Free Lunch: Exact Solution from Continuous-Time Dynamics	Feb 7, 2026	Unverified	2
RealPDEBench: A Benchmark for Complex Physical Systems with Real-World Data	Feb 7, 2026	Unverified	2
compar:IA: The French Government's LLM arena to collect French-language human prompts and preference data	Feb 6, 2026	Unverified	2
FloodDiffusion: Tailored Diffusion Forcing for Streaming Motion Generation	Feb 6, 2026	Unverified	2
Learning a Generative Meta-Model of LLM Activations	Feb 6, 2026	Unverified	2
EEG Foundation Models: Progresses, Benchmarking, and Open Problems	Feb 5, 2026	Unverified	2
Sparse Video Generation Propels Real-World Beyond-the-View Vision-Language Navigation	Feb 5, 2026	Unverified	2
Context Forcing: Consistent Autoregressive Video Generation with Long Context	Feb 5, 2026	Unverified	2
Theory of Space: Can Foundation Models Construct Spatial Beliefs through Active Exploration?	Feb 4, 2026	Unverified	2
Rethinking the Trust Region in LLM Reinforcement Learning	Feb 4, 2026	Unverified	2
Infinite-World: Scaling Interactive World Models to 1000-Frame Horizons via Pose-Free Hierarchical Memory	Feb 3, 2026	Unverified	2
Diversity-Preserved Distribution Matching Distillation for Fast Visual Synthesis	Feb 3, 2026	Unverified	2
Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models	Feb 2, 2026	Unverified	2
SERA: Soft-Verified Efficient Repository Agents	Feb 2, 2026	Unverified	2
A Survey on Efficient Vision-Language-Action Models	Feb 2, 2026	Unverified	2
RE-TRAC: REcursive TRAjectory Compression for Deep Search Agents	Feb 2, 2026	Unverified	2
Mind-Brush: Integrating Agentic Cognitive Search and Reasoning into Image Generation	Feb 2, 2026	Unverified	2