The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 13,251–13,300 of 474,278 papers

Title	Date	Tasks	Status	Hype
Is Reasoning All You Need? Probing Bias in the Age of Reasoning Language Models	Jul 3, 2025	Adversarial Robustness, All	Unverified	0
VeFIA: An Efficient Inference Auditing Framework for Vertical Federated Collaborative Software	Jul 3, 2025	Federated Learning, Vertical Federated Learning	Unverified	0
High-Order Deep Meta-Learning with Category-Theoretic Interpretation	Jul 3, 2025	Meta-Learning, Transfer Learning	Unverified	0
Automated Grading of Students' Handwritten Graphs: A Comparison of Meta-Learning and Vision-Large Language Models	Jul 3, 2025	Meta-Learning	Unverified	0
MC-INR: Efficient Encoding of Multivariate Scientific Simulation Data using Meta-Learning and Clustered Implicit Neural Representations	Jul 3, 2025	Clustering, Meta-Learning	Unverified	0
Determination Of Structural Cracks Using Deep Learning Frameworks	Jul 3, 2025	Deep Learning	Unverified	0
Evaluating Language Models For Threat Detection in IoT Security Logs	Jul 3, 2025	Anomaly Detection	Code Available	0
S2FGL: Spatial Spectral Federated Graph Learning	Jul 3, 2025	Federated Learning, Graph Learning	Code Available	0
Visual Contextual Attack: Jailbreaking MLLMs with Image-Driven Context Injection	Jul 3, 2025		Code Available	1
Beyond Spatial Frequency: Pixel-wise Temporal Frequency-based Deepfake Video Detection	Jul 3, 2025	Face Swapping	Code Available	1
Embedding-Based Federated Data Sharing via Differentially Private Conditional VAEs	Jul 3, 2025	Federated Learning	Code Available	0
Meta SecAlign: A Secure Foundation LLM Against Prompt Injection Attacks	Jul 3, 2025	Instruction Following	Code Available	2
GDC Cohort Copilot: An AI Copilot for Curating Cohorts from the Genomic Data Commons	Jul 3, 2025		Code Available	0
DeSTA2.5-Audio: Toward General-Purpose Large Audio Language Model with Self-Generated Cross-Modal Alignment	Jul 3, 2025	cross-modal alignment, Instruction Following	Code Available	2
Federated Learning for ICD Classification with Lightweight Models and Pretrained Embeddings	Jul 3, 2025	Code Classification, Federated Learning	Unverified	0
Fluid Democracy in Federated Data Aggregation	Jul 3, 2025	Federated Learning	Unverified	0
Weakly-supervised Contrastive Learning with Quantity Prompts for Moving Infrared Small Target Detection	Jul 3, 2025	Contrastive Learning, object-detection	Code Available	0
MateInfoUB: A Real-World Benchmark for Testing LLMs in Competitive, Multilingual, and Multimodal Educational Tasks	Jul 3, 2025	Fairness, Multiple-choice	Unverified	0
FMOcc: TPV-Driven Flow Matching for 3D Occupancy Prediction with Selective State Space Model	Jul 3, 2025	3D Semantic Occupancy Prediction, Autonomous Driving	Unverified	0
MemAgent: Reshaping Long-Context LLM with Multi-Conv RL-based Memory Agent	Jul 3, 2025	8k	Unverified	0
De-AntiFake: Rethinking the Protective Perturbations Against Voice Cloning Attacks	Jul 3, 2025	Voice Cloning	Unverified	0
UniMC: Taming Diffusion Transformer for Unified Keypoint-Guided Multi-Class Image Generation	Jul 3, 2025	Image Generation	Unverified	0
Reconstructing Close Human Interaction with Appearance and Proxemics Reasoning	Jul 3, 2025	Pose Estimation	Unverified	0
LMPNet for Weakly-supervised Keypoint Discovery	Jul 3, 2025	Object, Pose Estimation	Unverified	0
LANTERN: A Machine Learning Framework for Lipid Nanoparticle Transfection Efficiency Prediction	Jul 3, 2025	Benchmarking	Code Available	0
DeepGesture: A conversational gesture synthesis system based on emotions and semantics	Jul 3, 2025	Gesture Generation, Motion Synthesis	Code Available	0
SIU3R: Simultaneous Scene Understanding and 3D Reconstruction Beyond Feature Alignment	Jul 3, 2025	3D Reconstruction, Scene Understanding	Code Available	2
Hita: Holistic Tokenizer for Autoregressive Image Generation	Jul 3, 2025	Image Generation, Style Transfer	Code Available	0
Knowledge Protocol Engineering: A New Paradigm for AI in Domain-Specific Knowledge Work	Jul 3, 2025	RAG, Retrieval-augmented Generation	Unverified	0
Continual Gradient Low-Rank Projection Fine-Tuning for LLMs	Jul 3, 2025	Continual Learning	Code Available	0
Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation	Jul 3, 2025	Diversity, Video Generation	Code Available	2
AnyI2V: Animating Any Conditional Image with Motion Control	Jul 3, 2025	Style Transfer, Video Generation	Unverified	0
Fast and Simplex: 2-Simplicial Attention in Triton	Jul 3, 2025	valid	Unverified	0
IndianBailJudgments-1200: A Multi-Attribute Dataset for Legal NLP on Indian Bail Orders	Jul 3, 2025	Attribute, Fairness	Unverified	0
RLHGNN: Reinforcement Learning-driven Heterogeneous Graph Neural Network for Next Activity Prediction in Business Processes	Jul 3, 2025	Activity Prediction, Graph Neural Network	Code Available	0
Early Signs of Steganographic Capabilities in Frontier LLMs	Jul 3, 2025	Large Language Model	Code Available	0
No time to train! Training-Free Reference-Based Instance Segmentation	Jul 3, 2025	Cross-Domain Few-Shot Object Detection, Few-Shot Object Detection	Code Available	3
Latent Thermodynamic Flows: Unified Representation Learning and Generative Modeling of Temperature-Dependent Behaviors from Limited Data	Jul 3, 2025	Benchmarking, Representation Learning	Code Available	1
CyberRAG: An agentic RAG cyber attack classification and reporting tool	Jul 3, 2025	Intrusion Detection, RAG	Unverified	0
MAC-Lookup: Multi-Axis Conditional Lookup Model for Underwater Image Enhancement	Jul 3, 2025	Image Enhancement	Code Available	0
Understanding and Improving Length Generalization in Recurrent Models	Jul 3, 2025	2k, State Space Models	Unverified	0
Detection of Rail Line Track and Human Beings Near the Track to Avoid Accidents	Jul 3, 2025	Line Detection, object-detection	Unverified	0
CORE: Benchmarking LLMs Code Reasoning Capabilities through Static Analysis Tasks	Jul 3, 2025	Benchmarking, Code Generation	Unverified	0
Exploring Gender Bias Beyond Occupational Titles	Jul 3, 2025		Code Available	0
Adversarial Manipulation of Reasoning Models using Internal Representations	Jul 3, 2025		Code Available	0
WebSailor: Navigating Super-human Reasoning for Web Agent	Jul 3, 2025		Code Available	11
Physics-informed Ground Reaction Dynamics from Human Motion Capture	Jul 2, 2025		Code Available	0
Confidence and Stability of Global and Pairwise Scores in NLP Evaluation	Jul 2, 2025		Code Available	0
Optimizing Methane Detection On Board Satellites: Speed, Accuracy, and Low-Power Solutions for Resource-Constrained Hardware	Jul 2, 2025		Code Available	0
Just Noticeable Difference for Large Multimodal Models	Jul 2, 2025		Code Available	0