The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers


Showing 3,301–3,350 of 659,983 papers

Title	Date	Tasks	Status	Hype
Jumping Ahead: Improving Reconstruction Fidelity with JumpReLU Sparse Autoencoders	Jul 19, 2024		Code Available	3
Scikit-fingerprints: easy and efficient computation of molecular fingerprints in Python	Jul 18, 2024	Molecular Property Prediction, Property Prediction	Code Available	3
NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models	Jul 17, 2024	Instruction Following, Vision and Language Navigation	Code Available	3
E5-V: Universal Embeddings with Multimodal Large Language Models	Jul 17, 2024		Code Available	3
AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases	Jul 17, 2024	Autonomous Driving, Backdoor Attack	Code Available	3
Relation DETR: Exploring Explicit Position Relation Prior for Object Detection	Jul 16, 2024	2D Object Detection, object-detection	Code Available	3
VISA: Reasoning Video Object Segmentation via Large Language Models	Jul 16, 2024	Decoder, Object	Code Available	3
TCFormer: Visual Recognition via Token Clustering Transformer	Jul 16, 2024	Clustering, image-classification	Code Available	3
Scaling Diffusion Transformers to 16 Billion Parameters	Jul 16, 2024	Attribute, Conditional Image Generation	Code Available	3
The Oscars of AI Theater: A Survey on Role-Playing with Language Models	Jul 16, 2024	Survey	Code Available	3
OVLW-DETR: Open-Vocabulary Light-Weighted Detection Transformer	Jul 15, 2024	Language Modeling, Language Modelling	Code Available	3
PutnamBench: Evaluating Neural Theorem-Provers on the Putnam Mathematical Competition	Jul 15, 2024	Automated Theorem Proving	Code Available	3
Evaluating Large Language Models with fmeval	Jul 15, 2024	Question Answering	Code Available	3
An Actionable Framework for Assessing Bias and Fairness in Large Language Model Use Cases	Jul 15, 2024	Attribute, counterfactual	Code Available	3
Fast Matrix Multiplications for Lookup Table-Quantized LLMs	Jul 15, 2024	Quantization	Code Available	3
Learning Dynamics of LLM Finetuning	Jul 15, 2024	Hallucination	Code Available	3
Restoring Images in Adverse Weather Conditions via Histogram Transformer	Jul 14, 2024	Image Restoration	Code Available	3
Human-like Episodic Memory for Infinite Context LLMs	Jul 12, 2024	Computational Efficiency, Event Segmentation	Code Available	3
A Unified Anomaly Synthesis Strategy with Gradient Ascent for Industrial Anomaly Detection and Localization	Jul 12, 2024	Anomaly Detection, Defect Detection	Code Available	3
LightenDiffusion: Unsupervised Low-Light Image Enhancement with Latent-Retinex Diffusion Models	Jul 12, 2024	Image Enhancement, Low-Light Image Enhancement	Code Available	3
Single-Image Shadow Removal Using Deep Learning: A Comprehensive Survey	Jul 11, 2024	Deep Learning, Image Restoration	Code Available	3
A Comprehensive Survey on Human Video Generation: Challenges, Methods, and Insights	Jul 11, 2024	Motion Generation, Survey	Code Available	3
Unifying 3D Representation and Control of Diverse Robots with a Single Camera	Jul 11, 2024		Code Available	3
WildGaussians: 3D Gaussian Splatting in the Wild	Jul 11, 2024	3DGS, 3D Scene Reconstruction	Code Available	3
Video Diffusion Alignment via Reward Gradients	Jul 11, 2024		Code Available	3
OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion	Jul 10, 2024	Object Detection, Zero-Shot Object Detection	Code Available	3
Inference Performance Optimization for Large Language Models on CPUs	Jul 10, 2024	CPU, GPU	Code Available	3
Neural Localizer Fields for Continuous 3D Human Pose and Shape Estimation	Jul 10, 2024	3D human pose and shape estimation	Code Available	3
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models	Jul 10, 2024	GPU, Quantization	Code Available	3
BiGym: A Demo-Driven Mobile Bi-Manual Manipulation Benchmark	Jul 10, 2024	Imitation Learning	Code Available	3
Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models	Jul 9, 2024	Vision and Language Navigation	Code Available	3
Robust Neural Information Retrieval: An Adversarial and Out-of-distribution Perspective	Jul 9, 2024	Information Retrieval, Retrieval	Code Available	3
Chat-Edit-3D: Interactive 3D Scene Editing via Text Prompts	Jul 9, 2024	3D Object Editing, 3D Reconstruction	Code Available	3
Scaling Retrieval-Based Language Models with a Trillion-Token Datastore	Jul 9, 2024	Language Modeling, Language Modelling	Code Available	3
Revisiting, Benchmarking and Understanding Unsupervised Graph Domain Adaptation	Jul 9, 2024	Benchmarking, Domain Adaptation	Code Available	3
A Survey on LoRA of Large Language Models	Jul 8, 2024	Federated Learning, parameter-efficient fine-tuning	Code Available	3
WorkArena++: Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks	Jul 7, 2024	Arithmetic Reasoning	Code Available	3
Unified Approach for Hedging Impermanent Loss of Liquidity Provision	Jul 6, 2024		Code Available	3
LoRA-GA: Low-Rank Adaptation with Gradient Approximation	Jul 6, 2024	GSM8K, parameter-efficient fine-tuning	Code Available	3
LaRa: Efficient Large-Baseline Radiance Fields	Jul 5, 2024	3D Reconstruction, Novel View Synthesis	Code Available	3
CountGD: Multi-Modal Open-World Counting	Jul 5, 2024	Object Counting, Open-vocabulary object counting	Code Available	3
Better by Default: Strong Pre-Tuned MLPs and Boosted Trees on Tabular Data	Jul 5, 2024	Classification, regression	Code Available	3
YourMT3+: Multi-instrument Music Transcription with Enhanced Transformer Architectures and Cross-dataset Stem Augmentation	Jul 5, 2024	Drum Transcription, Drum Transcription in Music (DTM)	Code Available	3
Simplifying Deep Temporal Difference Learning	Jul 5, 2024	Q-Learning, Reinforcement Learning (RL)	Code Available	3
OneRestore: A Universal Restoration Framework for Composite Degradation	Jul 5, 2024	Image Dehazing, Image Restoration	Code Available	3
On the Workflows and Smells of Leaderboard Operations (LBOps): An Exploratory Study of Foundation Model Leaderboards	Jul 4, 2024	Code Completion	Code Available	3
A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models	Jul 2, 2024	Navigate	Code Available	3
Consistency Flow Matching: Defining Straight Flows with Velocity Consistency	Jul 2, 2024	Image Generation	Code Available	3
What We Talk About When We Talk About LMs: Implicit Paradigm Shifts and the Ship of Language Models	Jul 2, 2024		Code Available	3
TokenPacker: Efficient Visual Projector for Multimodal LLM	Jul 2, 2024	Language Modelling, Large Language Model	Code Available	3