The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 4,951–5,000 of 661,570 papers

Title	Date	Tasks	Status	Hype
SimToolReal: An Object-Centric Policy for Zero-Shot Dexterous Tool Manipulation	Feb 24, 2026		Unverified	2
A Survey on Efficient Vision-Language-Action Models	Feb 2, 2026		Unverified	2
BPMN Assistant: An LLM-Based Approach to Business Process Modeling	Jan 22, 2026		Unverified	2
Latent Particle World Models: Self-supervised Object-centric Stochastic Dynamics Modeling	Mar 4, 2026		Unverified	2
VGGT-Det: Mining VGGT Internal Priors for Sensor-Geometry-Free Multi-View Indoor 3D Object Detection	Mar 1, 2026		Unverified	2
ArcFlow: Unleashing 2-Step Text-to-Image Generation via High-Precision Non-Linear Flow Distillation	Feb 9, 2026		Unverified	2
UniReason 1.0: A Unified Reasoning Framework for World Knowledge Aligned Image Generation and Editing	Feb 20, 2026		Unverified	2
InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation	Mar 3, 2026		Unverified	2
Physical Simulator In-the-Loop Video Generation	Mar 6, 2026		Unverified	2
On Predictability of Reinforcement Learning Dynamics for Large Language Models	Feb 22, 2026		Unverified	2
WorldVQA: Measuring Atomic World Knowledge in Multimodal Large Language Models	Jan 28, 2026		Unverified	2
NewtonBench: Benchmarking Generalizable Scientific Law Discovery in LLM Agents	Feb 24, 2026		Unverified	2
CubeComposer: Spatio-Temporal Autoregressive 4K 360° Video Generation from Perspective Video	Mar 4, 2026		Unverified	2
MorphAny3D: Unleashing the Power of Structured Latent in 3D Morphing	Mar 5, 2026		Unverified	2
Shuffle-R1: Efficient RL framework for Multimodal Large Language Models via Data-centric Dynamic Shuffle	Mar 3, 2026		Unverified	2
ABot-N0: Technical Report on the VLA Foundation Model for Versatile Embodied Navigation	Feb 12, 2026		Unverified	2
FloodDiffusion: Tailored Diffusion Forcing for Streaming Motion Generation	Feb 6, 2026		Unverified	2
Weak-Driven Learning: How Weak Agents make Strong Agents Stronger	Feb 9, 2026		Unverified	2
GlyphPrinter: Region-Grouped Direct Preference Optimization for Glyph-Accurate Visual Text Rendering	Mar 16, 2026		Unverified	2
Geometry-Guided Reinforcement Learning for Multi-view Consistent 3D Scene Editing	Mar 3, 2026		Unverified	2
HiAR: Efficient Autoregressive Long Video Generation via Hierarchical Denoising	Mar 9, 2026		Unverified	2
Bolmo: Byteifying the Next Generation of Language Models	Feb 9, 2026		Unverified	2
Spanning the Visual Analogy Space with a Weight Basis of LoRAs	Feb 17, 2026		Unverified	2
REDSearcher: A Scalable and Cost-Efficient Framework for Long-Horizon Search Agents	Feb 15, 2026		Unverified	2
compar:IA: The French Government's LLM arena to collect French-language human prompts and preference data	Feb 6, 2026		Unverified	2
End-to-End Agentic RAG System Training for Traceable Diagnostic Reasoning	Feb 1, 2026		Unverified	2
SocialOmni: Benchmarking Audio-Visual Social Interactivity in Omni Models	Mar 17, 2026		Unverified	2
EmbodiedSplat: Online Feed-Forward Semantic 3DGS for Open-Vocabulary 3D Scene Understanding	Mar 4, 2026		Unverified	2
The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models	Mar 19, 2026		Unverified	2
Stochastic Self-Guidance for Training-Free Enhancement of Diffusion Models	Mar 4, 2026		Unverified	2
ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning	Feb 28, 2026		Unverified	2
Sparse Video Generation Propels Real-World Beyond-the-View Vision-Language Navigation	Feb 5, 2026		Unverified	2
InternSVG: Towards Unified SVG Tasks with Multimodal Large Language Models	Feb 9, 2026		Unverified	2
RealPDEBench: A Benchmark for Complex Physical Systems with Real-World Data	Feb 7, 2026		Unverified	2
Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training	Mar 12, 2026		Unverified	2
VLANeXt: Recipes for Building Strong VLA Models	Feb 20, 2026		Unverified	2
Track4World: Feedforward World-centric Dense 3D Tracking of All Pixels	Mar 5, 2026		Unverified	2
Olaf-World: Orienting Latent Actions for Video World Modeling	Feb 10, 2026		Unverified	2
Unleashing Scientific Reasoning for Bio-experimental Protocol Generation via Structured Component-based Reward Mechanism	Jan 27, 2026		Unverified	2
LARGE: Legal Retrieval Augmented Generation Evaluation Tool	Apr 2, 2025	RAG, Retrieval	Code Available	2
Enhancing Person-to-Person Virtual Try-On with Multi-Garment Virtual Try-Off	Apr 17, 2025	Garment Reconstruction, Image Generation	Code Available	2
Fine-Tuning can Distort Pretrained Features and Underperform Out-of-Distribution	Feb 21, 2022		Code Available	2
Toward Attention-based TinyML: A Heterogeneous Accelerated Architecture and Automated Deployment Flow	Aug 5, 2024		Code Available	2
AiSAQ: All-in-Storage ANNS with Product Quantization for DRAM-free Information Retrieval	Apr 9, 2024	All, Information Retrieval	Code Available	2
Learnable Prompting SAM-induced Knowledge Distillation for Semi-supervised Medical Image Segmentation	Dec 18, 2024	Image Segmentation, Knowledge Distillation	Code Available	2
DiffMM: Multi-Modal Diffusion Model for Recommendation	Jun 17, 2024	Contrastive Learning, model	Code Available	2
Blockwise Parallel Transformer for Large Context Models	May 30, 2023	Language Modeling, Language Modelling	Code Available	2
VkD: Improving Knowledge Distillation using Orthogonal Projections	Jan 1, 2024	Image Generation, Knowledge Distillation	Code Available	2
Mixture of Tokens: Continuous MoE through Cross-Example Aggregation	Oct 24, 2023	Language Modelling, Large Language Model	Code Available	2
Seeing the roads through the trees: A benchmark for modeling spatial dependencies with aerial imagery	Jan 12, 2024	Object Recognition, Road Segmentation	Code Available	2