The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 2,751–2,775 of 661,570 papers

Title	Date	Tasks	Status	Hype
MDCrow: Automating Molecular Dynamics Workflows with Large Language Models	Feb 13, 2025		Code Available	3
MetaDE: Evolving Differential Evolution by Differential Evolution	Feb 13, 2025	Computational Efficiency, GPU	Code Available	3
The Hidden Dimensions of LLM Alignment: A Multi-Dimensional Safety Analysis	Feb 13, 2025	Safety Alignment	Code Available	3
Ask in Any Modality: A Comprehensive Survey on Multimodal Retrieval-Augmented Generation	Feb 12, 2025	cross-modal alignment, multimodal generation	Code Available	3
Cognify: Supercharging Gen-AI Workflows With Hierarchical Autotuning	Feb 12, 2025	RAG, Text to SQL	Code Available	3
GENERator: A Long-Context Generative Genomic Foundation Model	Feb 11, 2025	model	Code Available	3
Goedel-Prover: A Frontier Model for Open-Source Automated Theorem Proving	Feb 11, 2025	Automated Theorem Proving, Large Language Model	Code Available	3
FinRL-DeepSeek: LLM-Infused Risk-Sensitive Reinforcement Learning for Trading Agents	Feb 11, 2025		Code Available	3
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling	Feb 10, 2025	Math	Code Available	3
EVEv2: Improved Baselines for Encoder-Free Vision-Language Models	Feb 10, 2025	Decoder	Code Available	3
History-Guided Video Diffusion	Feb 10, 2025	Video Generation	Code Available	3
Temporal Working Memory: Query-Guided Segment Refinement for Enhanced Multimodal Understanding	Feb 9, 2025	Image Captioning, Image-text Retrieval	Code Available	3
PINGS: Gaussian Splatting Meets Distance Fields within a Point-Based Implicit Neural Map	Feb 9, 2025		Code Available	3
ConRFT: A Reinforced Fine-tuning Method for VLA Models via Consistency Policy	Feb 8, 2025	Q-Learning, Safe Exploration	Code Available	3
FlashVideo: Flowing Fidelity to Detail for Efficient High-Resolution Video Generation	Feb 7, 2025	Computational Efficiency, Text-to-Video Generation	Code Available	3
Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy	Feb 7, 2025	4k, General Knowledge	Code Available	3
VideoRoPE: What Makes for Good Video Rotary Position Embedding?	Feb 7, 2025	Hallucination, Position	Code Available	3
ITBench: Evaluating AI Agents across Diverse Real-World IT Automation Tasks	Feb 7, 2025	Benchmarking	Code Available	3
Multi-agent Architecture Search via Agentic Supernet	Feb 6, 2025	Language Modeling, Language Modelling	Code Available	3
MedRAG: Enhancing Retrieval-augmented Generation with Knowledge Graph-Elicited Reasoning for Healthcare Copilot	Feb 6, 2025	Diagnostic, Large Language Model	Code Available	3
ConceptAttention: Diffusion Transformers Learn Highly Interpretable Features	Feb 6, 2025	Image Segmentation, Segmentation	Code Available	3
Ola: Pushing the Frontiers of Omni-Modal Language Model	Feb 6, 2025	cross-modal alignment, Language Modeling	Code Available	3
Demystifying Long Chain-of-Thought Reasoning in LLMs	Feb 5, 2025	Reinforcement Learning (RL)	Code Available	3
Transolver++: An Accurate Neural Solver for PDEs on Million-Scale Geometries	Feb 4, 2025	GPU	Code Available	3
ParetoQ: Scaling Laws in Extremely Low-bit LLM Quantization	Feb 4, 2025	Quantization	Code Available	3