The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

658,356 papers · 258,319 code links · 4,818 tasks

Papers


Showing 451–500 of 658,356 papers

Title	Date	Tasks	Status	Hype
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding	May 14, 2024	Image Generation, Language Modeling	Code Available	7
MambaOut: Do We Really Need Mamba for Vision?	May 13, 2024	image-classification, Image Classification	Code Available	7
AIOS Compiler: LLM as Interpreter for Natural Language Programming and Flow Programming of AI Agents	May 11, 2024		Code Available	7
Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers	May 9, 2024		Code Available	7
Mirage: A Multi-Level Superoptimizer for Tensor Programs	May 9, 2024	GPU, Navigate	Code Available	7
xLSTM: Extended Long Short-Term Memory	May 7, 2024	Language Modeling, Language Modelling	Code Available	7
Labeling supervised fine-tuning data with the scaling law	May 5, 2024	coreference-resolution, Coreference Resolution	Code Available	7
PuLID: Pure and Lightning ID Customization via Contrastive Alignment	Apr 24, 2024	Image Generation, Text to Image Generation	Code Available	7
Semantic Routing for Enhanced Performance of LLM-Assisted Intent-Based 5G Core Network Management and Orchestration	Apr 24, 2024	Management, Prompt Engineering	Code Available	7
Better Synthetic Data by Retrieving and Transforming Existing Datasets	Apr 22, 2024	Dataset Generation, Diversity	Code Available	7
CyberSecEval 2: A Wide-Ranging Cybersecurity Evaluation Suite for Large Language Models	Apr 19, 2024		Code Available	7
MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents	Apr 16, 2024	Fact Checking, Retrieval-augmented Generation	Code Available	7
Long-form music generation with latent diffusion	Apr 16, 2024	Audio Generation, Form	Code Available	7
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments	Apr 11, 2024	Benchmarking	Code Available	7
Interactive Prompt Debugging with Sequence Salience	Apr 11, 2024	Sentence, text-classification	Code Available	7
InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models	Apr 10, 2024	Image to 3D	Code Available	7
AutoCodeRover: Autonomous Program Improvement	Apr 8, 2024	Bug fixing, Code Search	Code Available	7
LLM Reasoners: New Evaluation, Library, and Analysis of Step-by-Step Reasoning with Large Language Models	Apr 8, 2024		Code Available	7
Streamlining Ocean Dynamics Modeling with Fourier Neural Operators: A Multiobjective Hyperparameter and Architecture Optimization Approach	Apr 7, 2024	Efficient Exploration, Hyperparameter Optimization	Code Available	7
InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation	Apr 3, 2024	Image Generation, Text to Image Generation	Code Available	7
Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models	Mar 27, 2024	Image Classification, Image Comprehension	Code Available	7
2D Gaussian Splatting for Geometrically Accurate Radiance Fields	Mar 26, 2024	3DGS, Novel View Synthesis	Code Available	7
Metric3Dv2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation	Mar 22, 2024	Depth Estimation, Surface Normal Estimation	Code Available	7
InternVideo2: Scaling Foundation Models for Multimodal Video Understanding	Mar 22, 2024	Action Classification, Action Recognition	Code Available	7
T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy	Mar 21, 2024	Contrastive Learning, Descriptive	Code Available	7
Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance	Mar 21, 2024	Animated GIF Generation, Image Animation	Code Available	7
Foundation Models for Time Series Analysis: A Tutorial and Survey	Mar 21, 2024	Survey, Time Series	Code Available	7
One-Step Image Translation with Text-to-Image Models	Mar 18, 2024	Denoising, Translation	Code Available	7
DSP: Dynamic Sequence Parallelism for Multi-Dimensional Transformers	Mar 15, 2024	Text Generation, Video Generation	Code Available	7
CodeUltraFeedback: An LLM-as-a-Judge Dataset for Aligning Large Language Models to Coding Preferences	Mar 14, 2024	HumanEval	Code Available	7
GenAD: Generalized Predictive Model for Autonomous Driving	Mar 14, 2024	Autonomous Driving, model	Code Available	7
DialogGen: Multi-modal Interactive Dialogue System for Multi-turn Text-to-Image Generation	Mar 13, 2024	Image Generation, Prompt Engineering	Code Available	7
DragAnything: Motion Control for Anything using Entity Representation	Mar 12, 2024	Object, Video Generation	Code Available	7
Chronos: Learning the Language of Time Series	Mar 12, 2024	Gaussian Processes, Language Modeling	Code Available	7
Better than classical? The subtle art of benchmarking quantum machine learning models	Mar 11, 2024	Benchmarking, Binary Classification	Code Available	7
DeepSeek-VL: Towards Real-World Vision-Language Understanding	Mar 8, 2024	Chatbot, Language Modelling	Code Available	7
Improving Diffusion Models for Authentic Virtual Try-on in the Wild	Mar 8, 2024	Virtual Try-on	Code Available	7
Symmetry Considerations for Learning Task Symmetric Robot Policies	Mar 7, 2024	Data Augmentation, Deep Reinforcement Learning	Code Available	7
Cradle: Empowering Foundation Agents Towards General Computer Control	Mar 5, 2024	Efficient Exploration	Code Available	7
Disaggregated Multi-Tower: Topology-aware Modeling Technique for Efficient Large-Scale Recommendation	Mar 1, 2024		Code Available	7
SoftTiger: A Clinical Foundation Model for Healthcare Workflows	Mar 1, 2024	Language Modelling, Large Language Model	Code Available	7
TimeXer: Empowering Transformers for Time Series Forecasting with Exogenous Variables	Feb 29, 2024	Time Series, Time Series Forecasting	Code Available	7
StarCoder 2 and The Stack v2: The Next Generation	Feb 29, 2024	Code Completion, Code Generation	Code Available	7
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models	Feb 29, 2024	Language Modelling, Mamba	Code Available	7
Transparent Image Layer Diffusion using Latent Transparency	Feb 27, 2024		Code Available	7
Dynamic Evaluation of Large Language Models by Meta Probing Agents	Feb 21, 2024	Data Augmentation	Code Available	7
Revisiting Feature Prediction for Learning Visual Representations from Video	Feb 15, 2024	Prediction	Code Available	7
On the Vulnerability of LLM/VLM-Controlled Robotics	Feb 15, 2024	Language Modelling, Robot Manipulation	Code Available	7
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models	Feb 8, 2024	Benchmarking, Diversity	Code Available	7
Fast Timing-Conditioned Latent Audio Diffusion	Feb 7, 2024	Audio Generation, GPU	Code Available	7