The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers, 248,326 code links, 4,818 tasks

Papers

Showing 4,151–4,175 of 661,570 papers

Title	Date	Tasks	Status	Hype
Putting the Object Back into Video Object Segmentation	Oct 19, 2023	Object, Segmentation	Code Available	3
AgentTuning: Enabling Generalized Agent Abilities for LLMs	Oct 19, 2023	Memorization	Code Available	3
Take the aTrain. Introducing an Interface for the Accessible Transcription of Interviews	Oct 18, 2023	CPU, GPU	Code Available	3
Llemma: An Open Language Model For Mathematics	Oct 16, 2023	Arithmetic Reasoning, Automated Theorem Proving	Code Available	3
MotionDirector: Motion Customization of Text-to-Video Diffusion Models	Oct 12, 2023		Code Available	3
Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting	Oct 12, 2023	Decoder, Probabilistic Time Series Forecasting	Code Available	3
Waymax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous Driving Research	Oct 12, 2023	Autonomous Driving	Code Available	3
NoMaD: Goal Masked Diffusion Policies for Navigation and Exploration	Oct 11, 2023		Code Available	3
CRITERIA: a New Benchmarking Paradigm for Evaluating Trajectory Prediction Models for Autonomous Driving	Oct 11, 2023	Autonomous Driving, Benchmarking	Code Available	3
MetaAgents: Simulating Interactions of Human Behaviors for LLM-based Task-oriented Coordination via Collaborative Generative Agents	Oct 10, 2023		Code Available	3
Text Embeddings Reveal (Almost) As Much As Text	Oct 10, 2023		Code Available	3
Exploring Progress in Multivariate Time Series Forecasting: Comprehensive Benchmarking and Heterogeneity Analysis	Oct 9, 2023	Benchmarking, Multivariate Time Series Forecasting	Code Available	3
How Abilities in Large Language Models are Affected by Supervised Fine-tuning Data Composition	Oct 9, 2023	Code Generation, Instruction Following	Code Available	3
Evaluating Hallucinations in Chinese Large Language Models	Oct 5, 2023	Hallucination, Question Answering	Code Available	3
T^3Bench: Benchmarking Current Progress in Text-to-3D Generation	Oct 4, 2023	3D Generation, Benchmarking	Code Available	3
MagicDrive: Street View Generation with Diverse 3D Geometry Control	Oct 4, 2023	3D geometry, 3D Object Detection	Code Available	3
Conceptual Framework for Autonomous Cognitive Entities	Oct 3, 2023		Code Available	3
OceanGPT: A Large Language Model for Ocean Science Tasks	Oct 3, 2023	Language Modeling, Language Modelling	Code Available	3
UltraFeedback: Boosting Language Models with Scaled AI Feedback	Oct 2, 2023	Language Modelling	Code Available	3
AutoAgents: A Framework for Automatic Agent Generation	Sep 29, 2023		Code Available	3
ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving	Sep 29, 2023	Arithmetic Reasoning, Computational Efficiency	Code Available	3
Data Filtering Networks	Sep 29, 2023	Language Modeling, Language Modelling	Code Available	3
SMPLer-X: Scaling Up Expressive Human Pose and Shape Estimation	Sep 29, 2023	3D Human Pose Estimation, 3D Human Reconstruction	Code Available	3
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation	Sep 27, 2023	GPU, Text-to-Video Generation	Code Available	3
Deformable 3D Gaussians for High-Fidelity Monocular Dynamic Scene Reconstruction	Sep 22, 2023	Dynamic Reconstruction, Neural Rendering	Code Available	3