The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 13001–13050 of 474278 papers

Title	Date	Tasks	Status	Hype
PyPop7: A Pure-Python Library for Population-Based Black-Box Optimization	Dec 12, 2022	Benchmarking, Evolutionary Algorithms	Code Available	2
InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management	Jun 28, 2024	Management, Text Generation	Code Available	2
MAC-SQL: A Multi-Agent Collaborative Framework for Text-to-SQL	Dec 18, 2023	SQL Parsing, Text to SQL	Code Available	2
Benchmarking Agentic Workflow Generation	Oct 10, 2024	Benchmarking	Code Available	2
Multi-IF: Benchmarking LLMs on Multi-Turn and Multilingual Instructions Following	Oct 21, 2024	Benchmarking, Instruction Following	Code Available	2
Chat AI: A Seamless Slurm-Native Solution for HPC-Based Services	Jun 27, 2024	Scheduling	Code Available	2
Aria Digital Twin: A New Benchmark Dataset for Egocentric 3D Machine Perception	Jun 10, 2023	3D Object Detection, Benchmarking	Code Available	2
A Survey of Time Series Foundation Models: Generalizing Time Series Representation with Large Language Model	May 3, 2024	Decision Making, Few-Shot Learning	Code Available	2
Uni-3DAR: Unified 3D Generation and Understanding via Autoregression on Compressed Spatial Tokens	Mar 20, 2025	3D Generation	Code Available	2
Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model	May 18, 2023	Image Generation, Language Modeling	Code Available	2
PointLLM: Empowering Large Language Models to Understand Point Clouds	Aug 31, 2023	3D Object Captioning, 3D Object Classification	Code Available	2
AUFormer: Vision Transformers are Parameter-Efficient Facial Action Unit Detectors	Mar 7, 2024	Facial Action Unit Detection, Transfer Learning	Code Available	2
Advances in Medical Image Analysis with Vision Transformers: A Comprehensive Review	Jan 9, 2023	Medical Image Analysis	Code Available	2
InstantStyle-Plus: Style Transfer with Content-Preserving in Text-to-Image Generation	Jun 30, 2024	Image Generation, Style Transfer	Code Available	2
Palu: Compressing KV-Cache with Low-Rank Projection	Jul 30, 2024	GPU, Quantization	Code Available	2
RoarGraph: A Projected Bipartite Graph for Efficient Cross-Modal Approximate Nearest Neighbor Search	Aug 16, 2024	Language Modelling, Large Language Model	Code Available	2
Image Segmentation in Foundation Model Era: A Survey	Aug 23, 2024	Image Segmentation, Instance Segmentation	Code Available	2
Monocular Obstacle Avoidance Based on Inverse PPO for Fixed-wing UAVs	Nov 27, 2024	Collision Avoidance, Deep Reinforcement Learning	Code Available	2
SIFU: Side-view Conditioned Implicit Function for Real-world Usable Clothed Human Reconstruction	Dec 10, 2023	Lifelike 3D Human Generation	Code Available	2
ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning	Jan 11, 2025	Drug Discovery	Code Available	2
Generative AI for Cel-Animation: A Survey	Jan 8, 2025	Colorization, Layout Design	Code Available	2
AutoPatent: A Multi-Agent Framework for Automatic Patent Generation	Dec 13, 2024	Text Generation	Code Available	2
ASCNet: Asymmetric Sampling Correction Network for Infrared Image Destriping	Jan 28, 2024	Feature Upsampling, Image Reconstruction	Code Available	2
Torchattacks: A PyTorch Repository for Adversarial Attacks	Sep 24, 2020	Deep Learning	Code Available	2
Scale-Aware Modulation Meet Transformer	Jul 17, 2023	object-detection, Object Detection	Code Available	2
Follow the Rules: Reasoning for Video Anomaly Detection with Large Language Models	Jul 14, 2024	Anomaly Detection, Video Anomaly Detection	Code Available	2
VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding	May 22, 2024	Dense Video Captioning, Highlight Detection	Code Available	2
Let LLMs Break Free from Overthinking via Self-Braking Tuning	May 20, 2025	GSM8K	Code Available	2
SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks	Jun 12, 2025	GitHub issue resolution, valid	Code Available	2
Towards Automatic Power Battery Detection: New Challenge, Benchmark Dataset and Baseline	Dec 5, 2023	Crowd Counting, object-detection	Code Available	2
robosuite: A Modular Simulation Framework and Benchmark for Robot Learning	Sep 25, 2020	Gesture Generation, MuJoCo	Code Available	2
Mask3D: Mask Transformer for 3D Semantic Instance Segmentation	Oct 6, 2022	3D Instance Segmentation, 3D Semantic Instance Segmentation	Code Available	2
MindMap: Knowledge Graph Prompting Sparks Graph of Thoughts in Large Language Models	Aug 17, 2023	Decision Making, Hallucination	Code Available	2
A Light-Weight Framework for Open-Set Object Detection with Decoupled Feature Alignment in Joint Space	Dec 19, 2024	Computational Efficiency, object-detection	Code Available	2
MCP Safety Audit: LLMs with the Model Context Protocol Allow Major Security Exploits	Apr 2, 2025		Code Available	2
Test-Time Zero-Shot Temporal Action Localization	Apr 8, 2024	Action Localization, Language Modelling	Code Available	2
Equivariant Diffusion for Molecule Generation in 3D	Mar 31, 2022	Unconditional Molecule Generation	Code Available	2
Demystifying the Compression of Mixture-of-Experts Through a Unified Framework	Jun 4, 2024	Mixture-of-Experts	Code Available	2
Neural Fields in Visual Computing and Beyond	Nov 22, 2021	3D Reconstruction, Image Animation	Code Available	2
RadarOcc: Robust 3D Occupancy Prediction with 4D Imaging Radar	May 22, 2024	Autonomous Driving, Prediction	Code Available	2
HiDe-PET: Continual Learning via Hierarchical Decomposition of Parameter-Efficient Tuning	Jul 7, 2024	Continual Learning, Representation Learning	Code Available	2
UI-R1: Enhancing Efficient Action Prediction of GUI Agents by Reinforcement Learning	Mar 27, 2025	Model Optimization, Reinforcement Learning (RL)	Code Available	2
We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning?	Jul 1, 2024	Math, Mathematical Reasoning	Code Available	2
Fine-Grained Prototypes Distillation for Few-Shot Object Detection	Jan 15, 2024	Few-Shot Object Detection, Meta-Learning	Code Available	2
Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark	Feb 18, 2024	Benchmarking	Code Available	2
Generative Image Dynamics	Sep 14, 2023		Code Available	2
Reasoning Language Models: A Blueprint	Jan 20, 2025	Reinforcement Learning (RL), Retrieval-augmented Generation	Code Available	2
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding	Jan 21, 2025	Video Understanding	Code Available	2
Instruct-NeRF2NeRF: Editing 3D Scenes with Instructions	Mar 22, 2023	NeRF	Code Available	2
Black-Box Prompt Optimization: Aligning Large Language Models without Model Training	Nov 7, 2023	GPU	Code Available	2