The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 11,251–11,300 of 661,570 papers

Title	Date	Status	Hype
Vibe Code Bench: Evaluating AI Models on End-to-End Web Application Development	Mar 4, 2026	Unverified	0
Using Vision + Language Models to Predict Item Difficulty	Mar 4, 2026	Unverified	0
Category-Level Object Shape and Pose Estimation in Less Than a Millisecond	Mar 4, 2026	Code Available	0
LadderSym: A Multimodal Interleaved Transformer for Music Practice Error Detection	Mar 4, 2026	Code Available	0
LabelBuddy: An Open Source Music and Audio Language Annotation Tagging Tool Using AI Assistance	Mar 4, 2026	Code Available	0
PTOPOFL: Privacy-Preserving Personalised Federated Learning via Persistent Homology	Mar 4, 2026	Code Available	0
BeyondBench: Contamination-Resistant Evaluation of Reasoning in Language Models	Mar 4, 2026	Code Available	0
Beyond Prefixes: Graph-as-Memory Cross-Attention for Knowledge Graph Completion with Large Language Models	Mar 4, 2026	Code Available	0
EgoCampus: Egocentric Pedestrian Eye Gaze Model and Dataset	Mar 4, 2026	Code Available	0
Still Fresh? Evaluating Temporal Drift in Retrieval Benchmarks	Mar 4, 2026	Code Available	0
Bootstrapping Exploration with Group-Level Natural Language Feedback in Reinforcement Learning	Mar 4, 2026	Code Available	0
On Imbalanced Regression with Hoeffding Trees	Mar 4, 2026	Code Available	0
Parallel Token Prediction for Language Models	Mar 4, 2026	Code Available	0
AbAffinity: A Large Language Model for Predicting Antibody Binding Affinity against SARS-CoV-2	Mar 4, 2026	Code Available	0
Do We Need All the Synthetic Data? Targeted Image Augmentation via Diffusion Models	Mar 4, 2026	Code Available	0
NeuCLIP: Efficient Large-Scale CLIP Training with Neural Normalizer Optimization	Mar 4, 2026	Code Available	0
Generalization of RLVR Using Causal Reasoning as a Testbed	Mar 4, 2026	Code Available	0
DeNuC: Decoupling Nuclei Detection and Classification in Histopathology	Mar 4, 2026	Code Available	0
MOO: A Multi-view Oriented Observations Dataset for Viewpoint Analysis in Cattle Re-Identification	Mar 4, 2026	Code Available	0
Hold-One-Shot-Out (HOSO) for Validation-Free Few-Shot CLIP Adapters	Mar 4, 2026	Code Available	0
TabStruct: Measuring Structural Fidelity of Tabular Data	Mar 4, 2026	Code Available	0
Optimizing Language Models for Crosslingual Knowledge Consistency	Mar 4, 2026	Code Available	0
Improving Multi-View Reconstruction via Texture-Guided Gaussian-Mesh Joint Optimization	Mar 4, 2026	Code Available	0
EmbodiedSplat: Online Feed-Forward Semantic 3DGS for Open-Vocabulary 3D Scene Understanding	Mar 4, 2026	Unverified	2
RoboMME: Benchmarking and Understanding Memory for Robotic Generalist Policies	Mar 4, 2026	Unverified	2
V_1: Unifying Generation and Self-Verification for Parallel Reasoners	Mar 4, 2026	Unverified	1
Helios: Real Real-Time Long Video Generation Model	Mar 4, 2026	Unverified	5
CubeComposer: Spatio-Temporal Autoregressive 4K 360° Video Generation from Perspective Video	Mar 4, 2026	Unverified	2
Discovering mathematical concepts through a multi-agent system	Mar 4, 2026	Unverified	0
SELDON: Supernova Explosions Learned by Deep ODE Networks	Mar 4, 2026	Unverified	0
Code Fingerprints: Disentangled Attribution of LLM-Generated Code	Mar 4, 2026	Code Available	0
Scriboora: Rethinking Human Pose Forecasting	Mar 4, 2026	Unverified	0
A Review of Reward Functions for Reinforcement Learning in the context of Autonomous Driving	Mar 4, 2026	Unverified	0
Implicit Bias of Per-sample Adam on Separable Data: Departure from the Full-batch Regime	Mar 4, 2026	Unverified	0
Reducing hyperparameter sensitivity in measurement-feedback based Ising machines	Mar 4, 2026	Unverified	0
NeuroProlog: Multi-Task Fine-Tuning for Neurosymbolic Mathematical Reasoning via the Cocktail Effect	Mar 4, 2026	Unverified	0
Learning Approximate Nash Equilibria in Cooperative Multi-Agent Reinforcement Learning via Mean-Field Subsampling	Mar 4, 2026	Unverified	0
Context Biasing for Pronunciation-Orthography Mismatch in Automatic Speech Recognition	Mar 4, 2026	Unverified	0
CaRe-BN: Precise Moving Statistics for Stabilizing Spiking Neural Networks in Reinforcement Learning	Mar 4, 2026	Code Available	0
Invariance-Based Dynamic Regret Minimization	Mar 4, 2026	Unverified	0
FastWave: Optimized Diffusion Model for Audio Super-Resolution	Mar 4, 2026	Unverified	0
Towards Generalized Multimodal Homography Estimation	Mar 4, 2026	Unverified	0
Learning in Markov Decision Processes with Exogenous Dynamics	Mar 4, 2026	Unverified	0
On the Limits of Sparse Autoencoders: A Theoretical Framework and Reweighted Remedy	Mar 4, 2026	Unverified	0
Enhancing Feature Fusion of U-like Networks with Dynamic Skip Connections	Mar 4, 2026	Unverified	0
InfinityStory: Unlimited Video Generation with World Consistency and Character-Aware Shot Transitions	Mar 4, 2026	Unverified	0
Machine Pareidolia: Protecting Facial Image with Emotional Editing	Mar 4, 2026	Unverified	0
Structure-aware Prompt Adaptation from Seen to Unseen for Open-Vocabulary Compositional Zero-Shot Learning	Mar 4, 2026	Unverified	0
Generalized non-exponential Gaussian splatting	Mar 4, 2026	Unverified	0
Index-Preserving Lightweight Token Pruning for Efficient Document Understanding in Vision-Language Models	Mar 4, 2026	Unverified	0