The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers, 248,104 code links, 4,818 tasks

Papers


Showing 3051–3100 of 659,983 papers

Title	Date	Tasks	Status	Hype
DPLM-2: A Multimodal Diffusion Protein Language Model	Oct 17, 2024	Language Modeling, Language Modelling	Code Available	3
Automatically Interpreting Millions of Features in Large Language Models	Oct 17, 2024	Semantic Similarity, Semantic Textual Similarity	Code Available	3
Movie Gen: A Cast of Media Foundation Models	Oct 17, 2024	Audio Generation, Video Editing	Code Available	3
MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models	Oct 16, 2024	Diagnostic, Hallucination	Code Available	3
The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio	Oct 16, 2024	Hallucination	Code Available	3
Graph-constrained Reasoning: Faithful Reasoning on Knowledge Graphs with Large Language Models	Oct 16, 2024	Hallucination, Knowledge Graphs	Code Available	3
Meta-Chunking: Learning Text Segmentation and Semantic Completion via Logical Perception	Oct 16, 2024	Binary Classification, Chunking	Code Available	3
3DIS: Depth-Driven Decoupled Instance Synthesis for Text-to-Image Generation	Oct 16, 2024	Attribute, Image Generation	Code Available	3
PRefLexOR: Preference-based Recursive Language Modeling for Exploratory Optimization of Reasoning and Agentic Thinking	Oct 16, 2024	Language Modeling, Language Modelling	Code Available	3
Learning Smooth Humanoid Locomotion through Lipschitz-Constrained Policies	Oct 15, 2024		Code Available	3
Latent Action Pretraining from Videos	Oct 15, 2024	Quantization, Robot Manipulation	Code Available	3
GIFT-Eval: A Benchmark For General Time Series Forecasting Model Evaluation	Oct 14, 2024	Time Series, Time Series Forecasting	Code Available	3
LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory	Oct 14, 2024	Benchmarking, Large Language Model	Code Available	3
Predicting from Strings: Language Model Embeddings for Bayesian Optimization	Oct 14, 2024	Bayesian Optimization, Experimental Design	Code Available	3
LoLCATs: On Low-Rank Linearizing of Large Language Models	Oct 14, 2024	MMLU	Code Available	3
UniMatch V2: Pushing the Limit of Semi-Supervised Semantic Segmentation	Oct 14, 2024	Semantic Segmentation, Semi-supervised Change Detection	Code Available	3
Large-Scale 3D Medical Image Pre-training with Geometric Context Priors	Oct 13, 2024	Contrastive Learning, Medical Image Analysis	Code Available	3
FlatQuant: Flatness Matters for LLM Quantization	Oct 12, 2024	Quantization	Code Available	3
MMAD: The First-Ever Comprehensive Benchmark for Multimodal Large Language Models in Industrial Anomaly Detection	Oct 12, 2024	Anomaly Detection	Code Available	3
C-Adapter: Adapting Deep Classifiers for Efficient Conformal Prediction Sets	Oct 12, 2024	Conformal Prediction, Prediction	Code Available	3
CtrLoRA: An Extensible and Efficient Framework for Controllable Image Generation	Oct 12, 2024	Conditional Image Generation, GPU	Code Available	3
SceneCraft: Layout-Guided 3D Scene Generation	Oct 11, 2024	3D Generation, Image Generation	Code Available	3
Baichuan-Omni Technical Report	Oct 11, 2024	Language Modeling, Language Modelling	Code Available	3
Parameter-Efficient Fine-Tuning in Spectral Domain for Point Cloud Learning	Oct 10, 2024	3D Parameter-Efficient Fine-Tuning for Classification, 3D Point Cloud Classification	Code Available	3
Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis	Oct 10, 2024	Feature Compression, Image Generation	Code Available	3
Towards Next-Generation LLM-based Recommender Systems: A Survey and Beyond	Oct 10, 2024	Large Language Model, Recommendation Systems	Code Available	3
Fast Feedforward 3D Gaussian Splatting Compression	Oct 10, 2024	3DGS, Novel View Synthesis	Code Available	3
Rectified Diffusion: Straightness Is Not Your Need in Rectified Flow	Oct 9, 2024		Code Available	3
Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making	Oct 9, 2024	Benchmarking, Decision Making	Code Available	3
TopoTune : A Framework for Generalized Combinatorial Complex Neural Networks	Oct 9, 2024	Graph Neural Network	Code Available	3
Rethinking the Evaluation of Visible and Infrared Image Fusion	Oct 9, 2024	object-detection, Object Detection	Code Available	3
AP-LDM: Attentive and Progressive Latent Diffusion Model for Training-Free High-Resolution Image Generation	Oct 8, 2024	Denoising, Image Generation	Code Available	3
T2V-Turbo-v2: Enhancing Video Generation Model Post-Training through Data, Reward, and Conditional Guidance Design	Oct 8, 2024	Video Alignment, Video Generation	Code Available	3
AgentSquare: Automatic LLM Agent Search in Modular Design Space	Oct 8, 2024		Code Available	3
Residual Kolmogorov-Arnold Network for Enhanced Deep Learning	Oct 7, 2024	Computational Efficiency, Deep Learning	Code Available	3
Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents	Oct 7, 2024	Natural Language Visual Grounding, Navigate	Code Available	3
SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference	Oct 6, 2024	Language Modeling, Language Modelling	Code Available	3
High-Speed Stereo Visual SLAM for Low-Powered Computing Devices	Oct 5, 2024	GPU	Code Available	3
Accelerating Diffusion Transformers with Token-wise Feature Caching	Oct 5, 2024	Video Generation	Code Available	3
Neuron-Level Sequential Editing for Large Language Models	Oct 5, 2024	Model Editing	Code Available	3
MELODI: Exploring Memory Compression for Long Contexts	Oct 4, 2024		Code Available	3
CLoSD: Closing the Loop between Simulation and Diffusion for multi-task character control	Oct 4, 2024	Motion Generation, Reinforcement Learning (RL)	Code Available	3
SwiftKV: Fast Prefill-Optimized Inference with Knowledge-Preserving Model Transformation	Oct 4, 2024	16k, Code Generation	Code Available	3
AlphaEdit: Null-Space Constrained Knowledge Editing for Language Models	Oct 3, 2024	knowledge editing, Model Editing	Code Available	3
HELMET: How to Evaluate Long-Context Language Models Effectively and Thoroughly	Oct 3, 2024	RAG	Code Available	3
AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs	Oct 3, 2024	Red Teaming	Code Available	3
RepoGraph: Enhancing AI Software Engineering with Repository-level Code Graph	Oct 3, 2024	Code Generation	Code Available	3
Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models	Oct 3, 2024		Code Available	3
Planning in Strawberry Fields: Evaluating and Improving the Planning and Scheduling Capabilities of LRM o1	Oct 3, 2024	Scheduling	Code Available	3
ControlAR: Controllable Image Generation with Autoregressive Models	Oct 3, 2024	Image Generation	Code Available	3