The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers


Showing 15,201–15,250 of 474,278 papers

Title	Date	Tasks	Status	Hype
A Systematic Replicability and Comparative Study of BSARec and SASRec for Sequential Recommendation	Jun 17, 2025	Recommendation Systems, Sequential Recommendation	Unverified	0
Vela: Scalable Embeddings with Voice Large Language Models for Multimodal Retrieval	Jun 17, 2025	In-Context Learning, Retrieval	Unverified	0
Call To Speak To Someone At Expedia Through Various Contact Options: The Ultimate Step Guide	Jun 17, 2025	Navigate, TAG	Unverified	0
Light Aircraft Game : Basic Implementation and training results analysis	Jun 17, 2025	Multi-agent Reinforcement Learning	Code Available	0
TriGuard: Testing Model Safety with Attribution Entropy, Verification, and Drift	Jun 17, 2025		Code Available	0
Pushing the Performance of Synthetic Speech Detection with Kolmogorov-Arnold Networks and Self-Supervised Learning Models	Jun 17, 2025	Kolmogorov-Arnold Networks, Self-Supervised Learning	Code Available	0
Thunder-NUBench: A Benchmark for LLMs' Sentence-Level Negation Understanding	Jun 17, 2025	Multiple-choice, Natural Language Inference	Unverified	0
LexiMark: Robust Watermarking via Lexical Substitutions to Enhance Membership Verification of an LLM's Textual Training Data	Jun 17, 2025	Memorization	Code Available	0
Accurate and scalable exchange-correlation with deep learning	Jun 17, 2025	Computational Efficiency, Deep Learning	Unverified	0
Doppelganger Method: Breaking Role Consistency in LLM Agent via Prompt-based Transferable Adversarial Attack	Jun 17, 2025	Adversarial Attack, Prompt Engineering	Unverified	0
M3SD: Multi-modal, Multi-scenario and Multi-language Speaker Diarization Dataset	Jun 17, 2025	Domain Adaptation, speaker-diarization	Unverified	0
Uncertainty-Driven Radar-Inertial Fusion for Instantaneous 3D Ego-Velocity Estimation	Jun 17, 2025	Autonomous Navigation, Motion Estimation	Unverified	0
Fair for a few: Improving Fairness in Doubly Imbalanced Datasets	Jun 17, 2025	Attribute, Decision Making	Unverified	0
How to Speak to a Real Person at Singapore Airlines®: 15 Easy Methods Explained	Jun 17, 2025	Navigate, TAG	Unverified	0
Call To Speak To Someone At Frontier™️ Airlines Through Various Contact Options: The Ultimate Step Guide	Jun 17, 2025	Navigate, TAG	Unverified	0
Acoustic scattering AI for non-invasive object classifications: A case study on hair assessment	Jun 17, 2025	Classification, Deep Learning	Unverified	0
Steering Robots with Inference-Time Interactions	Jun 17, 2025	Imitation Learning	Unverified	0
Exploring Speaker Diarization with Mixture of Experts	Jun 17, 2025	Mixture-of-Experts, speaker-diarization	Unverified	0
RMIT-ADM+S at the SIGIR 2025 LiveRAG Challenge	Jun 17, 2025	Answer Generation, Language Modeling	Code Available	1
VideoMAR: Autoregressive Video Generation with Continuous Tokens	Jun 17, 2025	GPU, Image Generation	Unverified	0
Convergence-Privacy-Fairness Trade-Off in Personalized Federated Learning	Jun 17, 2025	Fairness, Federated Learning	Unverified	0
Human-Centered Editable Speech-to-Sign-Language Generation via Streaming Conformer-Transformer and Resampling Hook	Jun 17, 2025	Motion Generation, Text Generation	Unverified	0
Image Segmentation with Large Language Models: A Survey with Perspectives for Intelligent Transportation Systems	Jun 17, 2025	Autonomous Driving, Image Segmentation	Unverified	0
RadFabric: Agentic AI System with Reasoning Capability for Radiology	Jun 17, 2025	Diagnostic, Multimodal Reasoning	Unverified	0
Expressive Score-Based Priors for Distribution Matching with Geometry-Preserving Regularization	Jun 17, 2025	Computational Efficiency, Denoising	Code Available	0
Interpreting Biomedical VLMs on High-Imbalance Out-of-Distributions: An Insight into BiomedCLIP on Radiology	Jun 17, 2025	Language Modeling, Language Modelling	Code Available	0
LingoLoop Attack: Trapping MLLMs via Linguistic Context and State Entrapment into Endless Loops	Jun 17, 2025	POS, TAG	Unverified	0
GAMORA: A Gesture Articulated Meta Operative Robotic Arm for Hazardous Material Handling in Containment-Level Environments	Jun 17, 2025	Motion Planning, Unity	Unverified	0
Guaranteed Guess: A Language Modeling Approach for CISC-to-RISC Transpilation with Testing Guarantees	Jun 17, 2025	Code Translation, HumanEval	Unverified	0
Sampling from Your Language Model One Byte at a Time	Jun 17, 2025	Code Generation, Language Modeling	Code Available	1
Comprehensive Verilog Design Problems: A Next-Generation Benchmark Dataset for Evaluating Large Language Models and Agents on RTL Design and Verification	Jun 17, 2025	Code Generation	Code Available	2
A Variational Framework for Improving Naturalness in Generative Spoken Language Models	Jun 17, 2025		Code Available	1
Navigating the growing field of research on AI for software testing -- the taxonomy for AI-augmented software testing and an ontology-driven literature survey	Jun 17, 2025	software testing	Code Available	0
SceneAware: Scene-Constrained Pedestrian Trajectory Prediction with LLM-Guided Walkability	Jun 17, 2025	Pedestrian Trajectory Prediction, Scene Understanding	Code Available	0
WISVA: Generative AI for 5G Network Optimization in Smart Warehouses	Jun 16, 2025	Denoising	Unverified	0
Dynamic Graph Condensation	Jun 16, 2025	Graph Learning	Unverified	0
LLM2Rec: Large Language Models Are Powerful Embedding Models for Sequential Recommendation	Jun 16, 2025	Collaborative Filtering, Sequential Recommendation	Code Available	2
Deep Learning-Based Multi-Object Tracking: A Comprehensive Survey from Foundations to State-of-the-Art	Jun 16, 2025	Deep Learning, Multi-Object Tracking	Unverified	0
A Comprehensive Survey on Deep Learning Solutions for 3D Flood Mapping	Jun 16, 2025	Computational Efficiency, Deep Learning	Unverified	0
Leveraging In-Context Learning for Language Model Agents	Jun 16, 2025	In-Context Learning, Language Modeling	Unverified	0
STAGE: A Stream-Centric Generative World Model for Long-Horizon Driving-Scene Simulation	Jun 16, 2025	Autonomous Driving, Denoising	Unverified	0
A Comprehensive Survey on Continual Learning in Generative Models	Jun 16, 2025	Continual Learning, Survey	Code Available	2
RealHiTBench: A Comprehensive Realistic Hierarchical Table Benchmark for Evaluating LLM-Based Table Analysis	Jun 16, 2025		Code Available	1
SASep: Saliency-Aware Structured Separation of Geometry and Feature for Open Set Learning on Point Clouds	Jun 16, 2025	3D Object Recognition, Object Recognition	Code Available	0
Seq2Bind Webserver for Decoding Binding Hotspots directly from Sequences using Fine-Tuned Protein Language Models	Jun 16, 2025	Blind Docking	Unverified	0
Performance Analysis of Communication Signals for Localization in Underwater Sensor Networks	Jun 16, 2025	Integrated sensing and communication, ISAC	Unverified	0
Joint Spectrum Sensing and Resource Allocation for OFDMA-based Underwater Acoustic Communications	Jun 16, 2025	Deep Reinforcement Learning	Unverified	0
CBTOPE2: An improved method for predicting of conformational B-cell epitopes in an antigen from its primary sequence	Jun 16, 2025	Hyperparameter Optimization	Unverified	0
Beyond Black Boxes: Enhancing Interpretability of Transformers Trained on Neural Data	Jun 16, 2025	Decision Making	Unverified	0
BlastDiffusion: A Latent Diffusion Model for Generating Synthetic Embryo Images to Address Data Scarcity in In Vitro Fertilization	Jun 16, 2025	Data Augmentation, Diagnostic	Unverified	0