The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers


Showing 8001–8025 of 474,278 papers

Title	Date	Status
Buffer layers for Test-Time Adaptation	Oct 30, 2025	Code Available
StructLayoutFormer: Conditional Structured Layout Generation via Structure Serialization and Disentanglement	Oct 30, 2025	Code Available
Agent Skills Enable a New Class of Realistic and Trivially Simple Prompt Injections	Oct 30, 2025	Code Available
SA^2Net: Scale-Adaptive Structure-Affinity Transformation for Spine Segmentation from Ultrasound Volume Projection Imaging	Oct 30, 2025	Code Available
Curly Flow Matching for Learning Non-gradient Field Dynamics	Oct 30, 2025	Code Available
Accurate Target Privacy Preserving Federated Learning Balancing Fairness and Utility	Oct 30, 2025	Code Available
Towards Realistic Earth-Observation Constellation Scheduling: Benchmark and Methodology	Oct 30, 2025	Code Available
TEXT2DB: Integration-Aware Information Extraction with Large Language Model Agents	Oct 30, 2025	Code Available
ALMGuard: Safety Shortcuts and Where to Find Them as Guardrails for Audio-Language Models	Oct 30, 2025	Code Available
SAFE: A Novel Approach to AI Weather Evaluation through Stratified Assessments of Forecasts over Earth	Oct 30, 2025	Code Available
Angular Steering: Behavior Control via Rotation in Activation Space	Oct 30, 2025	Code Available
MisSynth: Improving MISSCI Logical Fallacies Classification with Synthetic Data	Oct 30, 2025	Code Available
UnifiedFL: A Dynamic Unified Learning Framework for Equitable Federation	Oct 30, 2025	Code Available
Simulating and Experimenting with Social Media Mobilization Using LLM Agents	Oct 30, 2025	Code Available
Emu3.5: Native Multimodal Models are World Learners	Oct 30, 2025	Code Available
BRIQA: Balanced Reweighting in Image Quality Assessment of Pediatric Brain MRI	Oct 30, 2025	Code Available
BhashaBench V1: A Comprehensive Benchmark for the Quadrant of Indic Domains	Oct 30, 2025	Unverified
GMTRouter: Personalized LLM Router over Multi-turn User Interactions	Oct 29, 2025	Code Available
MaGNet: A Mamba Dual-Hypergraph Network for Stock Prediction via Temporal-Causal and Global Relational Learning	Oct 29, 2025	Code Available
H3M-SSMoEs: Hypergraph-based Multimodal Learning with LLM Reasoning and Style-Structured Mixture of Experts	Oct 29, 2025	Code Available
Seeing Clearly and Deeply: An RGBD Imaging Approach with a Bio-inspired Monocentric Design	Oct 29, 2025	Code Available
Modular Linear Tokenization (MLT)	Oct 29, 2025	Code Available
Precise In-Parameter Concept Erasure in Large Language Models	Oct 29, 2025	Unverified
OpenGuardrails: A Configurable, Unified, and Scalable Guardrails Platform for Large Language Models	Oct 29, 2025	Unverified
NoisyGRPO: Incentivizing Multimodal CoT Reasoning via Noise Injection and Bayesian Estimation	Oct 29, 2025	Unverified