The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers, 248,326 code links, 4,818 tasks

Papers


Showing 1,526–1,550 of 661,570 papers

Title	Date	Tasks	Status	Hype
SWE-Search: Enhancing Software Agents with Monte Carlo Tree Search and Iterative Refinement	Oct 26, 2024	Large Language Model	Code Available	4
Blendify -- Python rendering framework for Blender	Oct 23, 2024	10-shot image generation	Code Available	4
InternLM2.5-StepProver: Advancing Automated Theorem Proving via Expert Iteration on Large-Scale LEAN Problems	Oct 21, 2024	Automated Theorem Proving, CPU	Code Available	4
Improving Parallel Program Performance with LLM Optimizers via Agent-System Interfaces	Oct 21, 2024	Code Generation, scientific discovery	Code Available	4
SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree	Oct 21, 2024	Heuristic Search, Object	Code Available	4
SNAC: Multi-Scale Neural Audio Codec	Oct 18, 2024	Audio Compression, Audio Generation	Code Available	4
Chain of Ideas: Revolutionizing Research Via Novel Idea Development with LLM Agents	Oct 17, 2024	Experimental Design	Code Available	4
Proactive Agent: Shifting LLM Agents from Reactive Responses to Active Assistance	Oct 16, 2024	Human Agent Collaboration	Code Available	4
One Step Diffusion via Shortcut Models	Oct 16, 2024	Denoising, Scheduling	Code Available	4
MLPerf Power: Benchmarking the Energy Efficiency of Machine Learning Systems from Microwatts to Megawatts for Sustainable AI	Oct 15, 2024	Benchmarking	Code Available	4
MoH: Multi-Head Attention as Mixture-of-Head Attention	Oct 15, 2024	Mixture-of-Experts	Code Available	4
VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents	Oct 14, 2024	RAG, Retrieval	Code Available	4
EasyRAG: Efficient Retrieval-Augmented Generation Framework for Automated Network Operations	Oct 14, 2024	Answer Generation, Question Answering	Code Available	4
Agent-as-a-Judge: Evaluate Agents with Agents	Oct 14, 2024	Code Generation	Code Available	4
Generalizable Humanoid Manipulation with 3D Diffusion Policies	Oct 14, 2024	Camera Calibration, Point Cloud Segmentation	Code Available	4
Depth Any Video with Scalable Synthetic Data	Oct 14, 2024	Depth Estimation	Code Available	4
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads	Oct 14, 2024	GPU, Quantization	Code Available	4
When Does Perceptual Alignment Benefit Vision Representations?	Oct 14, 2024	Depth Estimation, Image Generation	Code Available	4
LLMMapReduce: Simplified Long-Sequence Processing using Large Language Models	Oct 12, 2024	document understanding	Code Available	4
SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights	Oct 11, 2024	GSM8K, Math	Code Available	4
Generalizable and Animatable Gaussian Head Avatar	Oct 10, 2024		Code Available	4
Taking a turn for the better: Conversation redirection throughout the course of mental-health therapy	Oct 9, 2024		Code Available	4
CoBa: Convergence Balancer for Multitask Finetuning of Large Language Models	Oct 9, 2024	Multi-Task Learning	Code Available	4
MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts	Oct 9, 2024	GPU, Mixture-of-Experts	Code Available	4
Improving Data Augmentation-based Cross-Speaker Style Transfer for TTS with Singing Voice, Style Filtering, and F0 Matching	Oct 8, 2024	Data Augmentation, Style Transfer	Code Available	4