The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers


Showing 15,651–15,700 of 474,278 papers

Title	Date	Tasks	Status	Hype
DB-KSVD: Scalable Alternating Optimization for Disentangling High-Dimensional Embedding Spaces	May 24, 2025	Dictionary Learning	Code Available	1
MSLAU-Net: A Hybird CNN-Transformer Network for Medical Image Segmentation	May 24, 2025	Image Segmentation, Medical Image Segmentation	Code Available	1
Cross-Lingual Pitfalls: Automatic Probing Cross-Lingual Weakness of Multilingual Large Language Models	May 24, 2025		Code Available	1
OmniGenBench: A Benchmark for Omnipotent Multimodal Generation across 50+ Tasks	May 24, 2025	Image Generation, Instruction Following	Code Available	1
MonarchAttention: Zero-Shot Conversion to Fast, Hardware-Aware Structured Attention	May 24, 2025	16k, 4k	Code Available	1
MLRan: A Behavioural Dataset for Ransomware Analysis and Detection	May 24, 2025	feature selection	Code Available	1
Audio Jailbreak Attacks: Exposing Vulnerabilities in SpeechGPT in a White-Box Framework	May 24, 2025	Adversarial Attack, Speech Tokenization	Code Available	1
Flex-Judge: Think Once, Judge Anywhere	May 24, 2025		Code Available	1
VORTA: Efficient Video Diffusion via Routing Sparse Attention	May 24, 2025	Video Generation	Code Available	1
Smoothie: Smoothing Diffusion on Token Embeddings for Text Generation	May 24, 2025	Semantic Similarity, Semantic Textual Similarity	Code Available	1
Removal of Hallucination on Hallucination: Debate-Augmented RAG	May 24, 2025	Hallucination, RAG	Code Available	1
Signal, Image, or Symbolic: Exploring the Best Input Representation for Electrocardiogram-Language Models Through a Unified Framework	May 24, 2025		Code Available	1
GainRAG: Preference Alignment in Retrieval-Augmented Generation through Gain Signal Synthesis	May 24, 2025	RAG, Retrieval	Code Available	1
Breaking Silos: Adaptive Model Fusion Unlocks Better Time Series Forecasting	May 24, 2025	Time Series, Time Series Forecasting	Code Available	1
ThanoRA: Task Heterogeneity-Aware Multi-Task Low-Rank Adaptation	May 24, 2025	Mixture-of-Experts	Code Available	1
Generative Distribution Embeddings	May 23, 2025		Code Available	1
VEAttack: Downstream-agnostic Vision Encoder Attack against Large Vision Language Models	May 23, 2025	Question Answering, Visual Question Answering	Code Available	1
PoseBH: Prototypical Multi-Dataset Training Beyond Human Pose Estimation	May 23, 2025	Pose Estimation	Code Available	1
IDA-Bench: Evaluating LLMs on Interactive Guided Data Analysis	May 23, 2025	Instruction Following	Code Available	1
UniTTS: An end-to-end TTS system without decoupling of acoustic and semantic information	May 23, 2025	Large Language Model, Quantization	Code Available	1
BEDI: A Comprehensive Benchmark for Evaluating Embodied Agents on UAVs	May 23, 2025	Model Optimization, Task Planning	Code Available	1
Co-Reinforcement Learning for Unified Multimodal Understanding and Generation	May 23, 2025	Image Generation, reinforcement-learning	Code Available	1
Value-Guided Search for Efficient Chain-of-Thought Reasoning	May 23, 2025	Math	Code Available	1
RaDeR: Reasoning-aware Dense Retrieval Models	May 23, 2025	Math, Mathematical Problem-Solving	Code Available	1
Beyond Prompt Engineering: Robust Behavior Control in LLMs via Steering Target Atoms	May 23, 2025	Language Modeling, Language Modelling	Code Available	1
Frankentext: Stitching random text fragments into long-form narratives	May 23, 2025	Form	Code Available	1
MetaGen Blended RAG: Higher Accuracy for Domain-Specific Q&A Without Fine-Tuning	May 23, 2025	Few-Shot Learning, Question Answering	Code Available	1
HRSim: An agent-based simulation platform for high-capacity ride-sharing services	May 23, 2025		Code Available	1
RePrompt: Reasoning-Augmented Reprompting for Text-to-Image Generation via Reinforcement Learning	May 23, 2025	Image Generation, Language Modeling	Code Available	1
Taming Diffusion for Dataset Distillation with High Representativeness	May 23, 2025	Dataset Distillation, Image Generation	Code Available	1
The Cell Must Go On: Agar.io for Continual Reinforcement Learning	May 23, 2025	Continual Learning, Deep Reinforcement Learning	Code Available	1
CENet: Context Enhancement Network for Medical Image Segmentation	May 23, 2025	Decoder, Image Segmentation	Code Available	1
Benchmarking Recommendation, Classification, and Tracing Based on Hugging Face Knowledge Graph	May 23, 2025	Benchmarking, Management	Code Available	1
Towards Revealing the Effectiveness of Small-Scale Fine-tuning in R1-style Reinforcement Learning	May 23, 2025	Math, Reinforcement Learning (RL)	Code Available	1
Revisiting Feature Interactions from the Perspective of Quadratic Neural Networks for Click-through Rate Prediction	May 23, 2025	Click-Through Rate Prediction, Recommendation Systems	Code Available	1
Object-level Cross-view Geo-localization with Location Enhancement and Multi-Head Cross Attention	May 23, 2025	Few-Shot Learning, geo-localization	Code Available	1
T2VUnlearning: A Concept Erasing Method for Text-to-Video Diffusion Models	May 23, 2025		Code Available	1
Structured Linear CDEs: Maximally Expressive and Parallel-in-Time Sequence Models	May 23, 2025	Mamba, Time Series Classification	Code Available	1
Reinforcement Learning for Ballbot Navigation in Uneven Terrain	May 23, 2025	MuJoCo, reinforcement-learning	Code Available	1
REN: Fast and Efficient Region Encodings from Patch-Based Image Encoders	May 23, 2025	Semantic Segmentation	Code Available	1
Universal Biological Sequence Reranking for Improved De Novo Peptide Sequencing	May 23, 2025	de novo peptide sequencing, Reranking	Code Available	1
Center-aware Residual Anomaly Synthesis for Multi-class Industrial Anomaly Detection	May 23, 2025	Anomaly Detection, Multi-class Anomaly Detection	Code Available	1
Knot So Simple: A Minimalistic Environment for Spatial Reasoning	May 23, 2025	Model Predictive Control, Spatial Reasoning	Code Available	1
Daily-Omni: Towards Audio-Visual Reasoning with Temporal Alignment across Modalities	May 23, 2025	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	1
Towards Dynamic Theory of Mind: Evaluating LLM Adaptation to Temporal Evolution of Human States	May 23, 2025	Theory of Mind Modeling	Code Available	1
Semantic Correspondence: Unified Benchmarking and a Strong Baseline	May 23, 2025	Benchmarking, Semantic correspondence	Code Available	1
ManuSearch: Democratizing Deep Search in Large Language Models with a Transparent and Open Multi-Agent Framework	May 23, 2025		Code Available	1
The Origins of Representation Manifolds in Large Language Models	May 23, 2025		Code Available	1
Twin-2K-500: A dataset for building digital twins of over 2,000 people based on their answers to over 500 questions	May 23, 2025	2k, Benchmarking	Code Available	1
Think or Not? Exploring Thinking Efficiency in Large Reasoning Models via an Information-Theoretic Lens	May 23, 2025	Large Language Model	Code Available	1