The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 8901–8925 of 474,278 papers

Title	Date	Status
STEAM: A Semantic-Level Knowledge Editing Framework for Large Language Models	Oct 12, 2025	Code Available
Graph Your Own Prompt	Oct 12, 2025	Code Available
Multi-Task Learning with Feature-Similarity Laplacian Graphs for Predicting Alzheimer's Disease Progression	Oct 12, 2025	Code Available
Anchor-based Maximum Discrepancy for Relative Similarity Testing	Oct 12, 2025	Code Available
A Simple and Better Baseline for Visual Grounding	Oct 12, 2025	Code Available
Preserving LLM Capabilities through Calibration Data Curation: From Analysis to Optimization	Oct 12, 2025	Code Available
ProteinAE: Protein Diffusion Autoencoders for Structure Encoding	Oct 12, 2025	Code Available
Bhasha-Rupantarika: Algorithm-Hardware Co-design approach for Multilingual Neural Machine Translation	Oct 12, 2025	Code Available
Are Language Models Consequentialist or Deontological Moral Reasoners?	Oct 12, 2025	Code Available
Talk Less, Call Right: Enhancing Role-Play LLM Agents with Automatic Prompt Optimization and Role Prompting	Oct 12, 2025	Code Available
Probing the Difficulty Perception Mechanism of Large Language Models	Oct 12, 2025	Code Available
RECON: Reasoning with Condensation for Efficient Retrieval-Augmented Generation	Oct 12, 2025	Code Available
Towards Self-Refinement of Vision-Language Models with Triangular Consistency	Oct 12, 2025	Code Available
MSM-Seg: A Modality-and-Slice Memory Framework with Category-Agnostic Prompting for Multi-Modal Brain Tumor Segmentation	Oct 12, 2025	Code Available
RePro: Training Language Models to Faithfully Recycle the Web for Pretraining	Oct 12, 2025	Code Available
Fast and Interpretable Protein Substructure Alignment via Optimal Transport	Oct 12, 2025	Code Available
RobotFleet: An Open-Source Framework for Centralized Multi-Robot Task Planning	Oct 12, 2025	Code Available
RAG-IGBench: Innovative Evaluation for RAG-based Interleaved Generation in Open-domain Question Answering	Oct 11, 2025	Code Available
EditCast3D: Single-Frame-Guided 3D Editing with Video Propagation and View Selection	Oct 11, 2025	Code Available
Multi-Scale Diffusion Transformer for Jointly Simulating User Mobility and Mobile Traffic Pattern	Oct 11, 2025	Code Available
VL Norm: Rethink Loss Aggregation in RLVR	Oct 11, 2025	Code Available
Latent Reasoning via Sentence Embedding Prediction	Oct 11, 2025	Unverified
Language Surgery in Multilingual Large Language Models	Oct 11, 2025	Unverified
Re:Form -- Reducing Human Priors in Scalable Formal Software Verification with RL in LLMs: A Preliminary Study on Dafny	Oct 11, 2025	Unverified
Bridging Graph and State-Space Modeling for Intensive Care Unit Length of Stay Prediction	Oct 11, 2025	Code Available