The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers

Showing 951–1000 of 659,983 papers

Title	Date	Tasks	Status	Hype
Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts	May 18, 2024	Mixture-of-Experts, Visual Question Answering	Code Available	5
RLHF Workflow: From Reward Modeling to Online RLHF	May 13, 2024	Chatbot, HumanEval	Code Available	5
Single-seed generation of Brownian paths and integrals for adaptive and high order SDE solvers	May 10, 2024		Code Available	5
Evaluating Real-World Robot Manipulation Policies in Simulation	May 9, 2024	Robotic Grasping, Robot Manipulation	Code Available	5
Granite Code Models: A Family of Open Foundation Models for Code Intelligence	May 7, 2024	Code Generation, Decoder	Code Available	5
AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding	May 6, 2024	Metric Learning, Self-Supervised Learning	Code Available	5
When LLMs Meet Cybersecurity: A Systematic Literature Review	May 6, 2024	Systematic Literature Review	Code Available	5
Balance Reward and Safety Optimization for Safe Reinforcement Learning: A Perspective of Gradient Manipulation	May 2, 2024	MuJoCo, Reinforcement Learning (RL)	Code Available	5
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models	May 2, 2024	Language Modeling, Language Modelling	Code Available	5
XFeat: Accelerated Features for Lightweight Image Matching	Apr 30, 2024	CPU, Keypoint detection and image matching	Code Available	5
Make Your LLM Fully Utilize the Context	Apr 25, 2024	4k, Information Retrieval	Code Available	5
ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving	Apr 25, 2024	Diversity	Code Available	5
NTIRE 2024 Challenge on Low Light Image Enhancement: Methods and Results	Apr 22, 2024	4k, Image Enhancement	Code Available	5
MARIO Eval: Evaluate Your Math LLM with your Math LLM--A mathematical dataset evaluation toolkit	Apr 22, 2024	Math	Code Available	5
Do "English" Named Entity Recognizers Work Well on Global Englishes?	Apr 20, 2024	named-entity-recognition, Named Entity Recognition	Code Available	5
Sample Design Engineering: An Empirical Study of What Makes Good Downstream Fine-Tuning Samples for LLMs	Apr 19, 2024	Event Extraction, In-Context Learning	Code Available	5
Lean Copilot: Large Language Models as Copilots for Theorem Proving in Lean	Apr 18, 2024	Automated Theorem Proving, Hallucination	Code Available	5
Gaussian Opacity Fields: Efficient Adaptive Surface Reconstruction in Unbounded Scenes	Apr 16, 2024	3DGS, Novel View Synthesis	Code Available	5
Magic Clothing: Controllable Garment-Driven Image Synthesis	Apr 15, 2024	Image Generation	Code Available	5
SQUAT: Stateful Quantization-Aware Training in Recurrent Spiking Neural Networks	Apr 15, 2024	Quantization	Code Available	5
Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization	Apr 15, 2024	Audio Generation	Code Available	5
MING-MOE: Enhancing Medical Multi-Task Learning in Large Language Models with Sparse Mixture of Low-Rank Adapter Experts	Apr 13, 2024	Diversity, Language Modeling	Code Available	5
The Path To Autonomous Cyber Defense	Apr 12, 2024		Code Available	5
LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models	Apr 10, 2024	Decision Making	Code Available	5
LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders	Apr 9, 2024	Contrastive Learning, Decoder	Code Available	5
SpeechAlign: Aligning Speech Generation to Human Preferences	Apr 8, 2024	Language Modeling, Language Modelling	Code Available	5
Humanoid-Gym: Reinforcement Learning for Humanoid Robot with Zero-Shot Sim2Real Transfer	Apr 8, 2024	MuJoCo, Physical Simulations	Code Available	5
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators	Apr 7, 2024	Text-to-Video Generation, Video Generation	Code Available	5
Length-Controlled AlpacaEval: A Simple Way to Debias Automatic Evaluators	Apr 6, 2024	Chatbot, counterfactual	Code Available	5
SpatialTracker: Tracking Any 2D Pixels in 3D Space	Apr 5, 2024		Code Available	5
ReFT: Representation Finetuning for Language Models	Apr 4, 2024	Arithmetic Reasoning	Code Available	5
Masked Completion via Structured Diffusion with White-Box Transformers	Apr 3, 2024	Representation Learning	Code Available	5
Long-context LLMs Struggle with Long In-context Learning	Apr 2, 2024	2k, In-Context Learning	Code Available	5
CityGaussian: Real-time High-quality Large-Scale Scene Rendering with Gaussians	Apr 1, 2024	3DGS, 3D Scene Reconstruction	Code Available	5
Measuring Taiwanese Mandarin Language Understanding	Mar 29, 2024		Code Available	5
TFB: Towards Comprehensive and Fair Benchmarking of Time Series Forecasting Methods	Mar 29, 2024	Benchmarking, Multivariate Time Series Forecasting	Code Available	5
InstantSplat: Sparse-view SfM-free Gaussian Splatting in Seconds	Mar 29, 2024	3D Reconstruction, Novel View Synthesis	Code Available	5
GauStudio: A Modular Framework for 3D Gaussian Splatting and Beyond	Mar 28, 2024	3DGS, Novel View Synthesis	Code Available	5
UniDepth: Universal Monocular Metric Depth Estimation	Mar 27, 2024	Depth Estimation, Monocular Depth Estimation	Code Available	5
ChatDBG: Augmenting Debugging with Large Language Models	Mar 25, 2024	C++ code, Navigate	Code Available	5
StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text	Mar 21, 2024	Text-to-Video Generation, Video Generation	Code Available	5
MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images	Mar 21, 2024	3D Reconstruction, Generalizable Novel View Synthesis	Code Available	5
Mora: Enabling Generalist Video Generation via A Multi-Agent Framework	Mar 20, 2024	Image to Video Generation, Text-to-Video Generation	Code Available	5
Evolutionary Optimization of Model Merging Recipes	Mar 19, 2024	Evolutionary Algorithms, Math	Code Available	5
FeatUp: A Model-Agnostic Framework for Features at Any Resolution	Mar 15, 2024	Depth Estimation, Depth Prediction	Code Available	5
Automatic Interactive Evaluation for Large Language Models with State Aware Patient Simulator	Mar 13, 2024		Code Available	5
Fundamental Components of Deep Learning: A category-theoretic approach	Mar 13, 2024	Deep Learning, Descriptive	Code Available	5
WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks?	Mar 12, 2024	Language Modeling, Language Modelling	Code Available	5
Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation	Mar 12, 2024	Image Generation, Language Modelling	Code Available	5
pyvene: A Library for Understanding and Improving PyTorch Models via Interventions	Mar 12, 2024	Model Editing	Code Available	5