The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers


Showing 17,601–17,650 of 474,278 papers

Title	Date	Tasks	Status	Hype
Gumbel-max List Sampling for Distribution Coupling with Multiple Samples	Jun 5, 2025	LEMMA	Unverified	0
Efficient Robust Conformal Prediction via Lipschitz-Bounded Networks	Jun 5, 2025	Adversarial Attack, Computational Efficiency	Code Available	0
Noninvasive precision modulation of high-level neural population activity via natural vision perturbations	Jun 5, 2025		Code Available	0
Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models	Jun 5, 2025	All, Math	Unverified	0
MTPNet: Multi-Grained Target Perception for Unified Activity Cliff Prediction	Jun 5, 2025	Drug Discovery, Prediction	Code Available	1
An SCMA Receiver for 6G NTN based on Multi-Task Learning	Jun 5, 2025	Edge-computing, Multi-Task Learning	Unverified	0
Joint Beamforming and Integer User Association using a GNN with Gumbel-Softmax Reparameterizations	Jun 5, 2025	Graph Neural Network	Unverified	0
MonkeyOCR: Document Parsing with a Structure-Recognition-Relation Triplet Paradigm	Jun 5, 2025	GPU, Relation	Code Available	9
Direct Numerical Layout Generation for 3D Indoor Scene Synthesis via Spatial Reasoning	Jun 5, 2025	In-Context Learning, Indoor Scene Synthesis	Unverified	0
LLM-First Search: Self-Guided Exploration of the Solution Space	Jun 5, 2025		Code Available	1
Demonstrations of Integrity Attacks in Multi-Agent Systems	Jun 5, 2025	Code Generation, Natural Language Understanding	Unverified	0
LogicPuzzleRL: Cultivating Robust Mathematical Reasoning in LLMs via Reinforcement Learning	Jun 5, 2025	Mathematical Reasoning, reinforcement-learning	Code Available	0
PixCell: A generative foundation model for digital histopathology images	Jun 5, 2025	Cell Segmentation, Data Augmentation	Unverified	0
Reasoning or Overthinking: Evaluating Large Language Models on Financial Sentiment Analysis	Jun 5, 2025	Sentiment Analysis, Sentiment Classification	Unverified	0
Adaptive Preconditioners Trigger Loss Spikes in Adam	Jun 5, 2025	Attribute	Unverified	0
Mathematical Reasoning for Unmanned Aerial Vehicles: A RAG-Based Approach for Complex Arithmetic Reasoning	Jun 5, 2025	Arithmetic Reasoning, Math	Code Available	0
DM-SegNet: Dual-Mamba Architecture for 3D Medical Image Segmentation with Global Context Modeling	Jun 5, 2025	Anatomy, Brain Tumor Segmentation	Unverified	0
SUCEA: Reasoning-Intensive Retrieval for Adversarial Fact-checking through Claim Decomposition and Editing	Jun 5, 2025	Fact Checking, Misinformation	Code Available	0
Knowledgeable-r1: Policy Optimization for Knowledge Exploration in Retrieval-Augmented Generation	Jun 5, 2025	counterfactual, RAG	Code Available	0
Counterfactual reasoning: an analysis of in-context emergence	Jun 5, 2025	counterfactual, Counterfactual Reasoning	Code Available	0
DACN: Dual-Attention Convolutional Network for Hyperspectral Image Super-Resolution	Jun 5, 2025	Hyperspectral Image Super-Resolution, Image Super-Resolution	Code Available	0
MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning	Jun 5, 2025	Math, Mathematical Reasoning	Code Available	2
SparseMM: Head Sparsity Emerges from Visual Concept Responses in MLLMs	Jun 5, 2025		Code Available	2
On the Comprehensibility of Multi-structured Financial Documents using LLMs and Pre-processing Tools	Jun 5, 2025		Code Available	0
Can Foundation Models Generalise the Presentation Attack Detection Capabilities on ID Cards?	Jun 5, 2025	Diversity	Unverified	0
TextVidBench: A Benchmark for Long Video Scene Text Understanding	Jun 5, 2025	Prompt Engineering, Question Answering	Unverified	0
Neural Inverse Rendering from Propagating Light	Jun 5, 2025	3D Reconstruction, Inverse Rendering	Unverified	0
Revisiting Depth Representations for Feed-Forward 3D Gaussian Splatting	Jun 5, 2025	3DGS, Novel View Synthesis	Unverified	0
ContentV: Efficient Training of Video Generation Models with Limited Compute	Jun 5, 2025	Image Generation, Video Generation	Unverified	0
Robustness as Architecture: Designing IQA Models to Withstand Adversarial Perturbations	Jun 5, 2025	Image Quality Assessment	Unverified	0
APVR: Hour-Level Long Video Understanding with Adaptive Pivot Visual Information Retrieval	Jun 5, 2025	Information Retrieval, Retrieval	Unverified	0
Bringing SAM to new heights: Leveraging elevation data for tree crown segmentation from drone imagery	Jun 5, 2025	Instance Segmentation, Semantic Segmentation	Unverified	0
Multi-scale Image Super Resolution with a Single Auto-Regressive Model	Jun 5, 2025	Image Super-Resolution, Super-Resolution	Unverified	0
Interpretable Multimodal Framework for Human-Centered Street Assessment: Integrating Visual-Language Models for Perceptual Urban Diagnostics	Jun 5, 2025	Large Language Model	Unverified	0
PATS: Proficiency-Aware Temporal Sampling for Multi-View Sports Skill Assessment	Jun 5, 2025		Unverified	0
Beyond Cropped Regions: New Benchmark and Corresponding Baseline for Chinese Scene Text Retrieval in Diverse Layouts	Jun 5, 2025	Retrieval, Text Retrieval	Unverified	0
Structure-Aware Radar-Camera Depth Estimation	Jun 5, 2025	Depth Estimation, Monocular Depth Estimation	Unverified	0
Point Cloud Segmentation of Agricultural Vehicles using 3D Gaussian Splatting	Jun 5, 2025	3DGS, Point Cloud Segmentation	Unverified	0
UAV4D: Dynamic Neural Rendering of Human-Centric UAV Imagery using Gaussian Splatting	Jun 5, 2025	Neural Rendering, Novel View Synthesis	Unverified	0
A Survey on Vietnamese Document Analysis and Recognition: Challenges and Future Directions	Jun 5, 2025	Computational Efficiency, document understanding	Unverified	0
FG 2025 TrustFAA: the First Workshop on Towards Trustworthy Facial Affect Analysis: Advancing Insights of Fairness, Explainability, and Safety (TrustFAA)	Jun 5, 2025	Action Unit Detection, Depression Detection	Unverified	0
DIMCIM: A Quantitative Evaluation Framework for Default-mode Diversity and Generalization in Text-to-Image Generative Models	Jun 5, 2025	Benchmarking, Diversity	Unverified	0
CIVET: Systematic Evaluation of Understanding in VLMs	Jun 5, 2025	Object	Unverified	0
FRED: The Florence RGB-Event Drone Dataset	Jun 5, 2025	Benchmarking, Trajectory Forecasting	Unverified	0
Track Any Anomalous Object: A Granular Video Anomaly Detection Pipeline	Jun 5, 2025	Anomaly Detection, Anomaly Localization	Unverified	0
Vision-Based Autonomous MM-Wave Reflector Using ArUco-Driven Angle-of-Arrival Estimation	Jun 5, 2025	Raspberry Pi 4	Unverified	0
EOC-Bench: Can MLLMs Identify, Recall, and Forecast Objects in an Egocentric World?	Jun 5, 2025	Object	Unverified	0
Does Your 3D Encoder Really Work? When Pretrain-SFT from 2D VLMs Meets 3D VLMs	Jun 5, 2025	cross-modal alignment, Dense Captioning	Unverified	0
Unleashing Hour-Scale Video Training for Long Video-Language Understanding	Jun 5, 2025	Instruction Following, Language Modeling	Unverified	0
Refer to Anything with Vision-Language Prompts	Jun 5, 2025	Benchmarking, Generalized Referring Expression Segmentation	Unverified	0