The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 20051–20100 of 474278 papers

Title	Date	Tasks	Status	Hype
SocialGPT: Prompting LLMs for Social Relation Reasoning via Greedy Segment Optimization	Oct 28, 2024	Relation, Visual Social Relationship Recognition	Code Available	1
SciER: An Entity and Relation Extraction Dataset for Datasets, Methods, and Tasks in Scientific Documents	Oct 28, 2024	Articles, Relation	Code Available	1
Fine-Grained and Multi-Dimensional Metrics for Document-Level Machine Translation	Oct 28, 2024	Document Level Machine Translation, Machine Translation	Code Available	1
Shopping MMLU: A Massive Multi-Task Online Shopping Benchmark for Large Language Models	Oct 28, 2024	Few-Shot Learning, MMLU	Code Available	1
Autoformalize Mathematical Statements by Symbolic Equivalence and Semantic Consistency	Oct 28, 2024	Math	Code Available	1
BLAPose: Enhancing 3D Human Pose Estimation with Bone Length Adjustment	Oct 28, 2024	3D Human Pose Estimation, Pose Estimation	Code Available	1
Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics	Oct 28, 2024	Arithmetic Reasoning, Math	Code Available	1
Neuro-symbolic Learning Yielding Logical Constraints	Oct 28, 2024	Logical Reasoning	Code Available	1
LLMCBench: Benchmarking Large Language Model Compression for Efficient Deployment	Oct 28, 2024	Benchmarking, Language Modeling	Code Available	1
Shallow Diffuse: Robust and Invisible Watermarking through Low-Dimensional Subspaces in Diffusion Models	Oct 28, 2024	Image Generation, Misinformation	Code Available	1
Toward Conditional Distribution Calibration in Survival Prediction	Oct 27, 2024	Conformal Prediction, Decision Making	Code Available	1
Point-PRC: A Prompt Learning Based Regulation Framework for Generalizable Point Cloud Analysis	Oct 27, 2024	Domain Generalization, General Knowledge	Code Available	1
LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization	Oct 27, 2024	GSM8K, HellaSwag	Code Available	1
FoldMark: Protecting Protein Generative Models with Watermarking	Oct 27, 2024	Drug Discovery, Protein Structure Prediction	Code Available	1
CloudCast -- Total Cloud Cover Nowcasting with Machine Learning	Oct 27, 2024	Optical Flow Estimation	Code Available	1
ProtSCAPE: Mapping the landscape of protein conformations in molecular dynamics	Oct 27, 2024		Code Available	1
A Cosmic-Scale Benchmark for Symmetry-Preserving Data Processing	Oct 27, 2024		Code Available	1
UTSRMorph: A Unified Transformer and Superresolution Network for Unsupervised Medical Image Registration	Oct 27, 2024	Decoder, Image Registration	Code Available	1
Unlocking Comics: The AI4VA Dataset for Visual Understanding	Oct 27, 2024	Depth Estimation, Saliency Detection	Code Available	1
Depth Attention for Robust RGB Tracking	Oct 27, 2024	Depth Estimation, Monocular Depth Estimation	Code Available	1
Automatic Estimation of Singing Voice Musical Dynamics	Oct 27, 2024		Code Available	1
Referring Human Pose and Mask Estimation in the Wild	Oct 27, 2024	Decoder	Code Available	1
Symbotunes: unified hub for symbolic music generative models	Oct 27, 2024	Music Generation	Code Available	1
MidiTok Visualizer: a tool for visualization and analysis of tokenized MIDI symbolic music	Oct 27, 2024		Code Available	1
FuseFL: One-Shot Federated Learning through the Lens of Causality with Progressive Model Fusion	Oct 27, 2024	Federated Learning	Code Available	1
NT-VOT211: A Large-Scale Benchmark for Night-time Visual Object Tracking	Oct 27, 2024	Object Tracking, Video Object Tracking	Code Available	1
SPICEPilot: Navigating SPICE Code Generation and Simulation with AI Guidance	Oct 27, 2024	Benchmarking, Code Generation	Code Available	1
Vector Quantization Prompting for Continual Learning	Oct 27, 2024	Continual Learning, Quantization	Code Available	1
Sebica: Lightweight Spatial and Efficient Bidirectional Channel Attention Super Resolution Network	Oct 27, 2024	Image Super-Resolution, object-detection	Code Available	1
TrajAgent: An Agent Framework for Unified Trajectory Modelling	Oct 27, 2024	Future prediction, Language Modeling	Code Available	1
Agentic Feedback Loop Modeling Improves Recommendation and User Simulation	Oct 26, 2024	Large Language Model, User Simulation	Code Available	1
LLMs Can Evolve Continually on Modality for X-Modal Reasoning	Oct 26, 2024	Continual Learning, multimodal interaction	Code Available	1
ISDNN: A Deep Neural Network for Channel Estimation in Massive MIMO systems	Oct 26, 2024		Code Available	1
Securing Healthcare with Deep Learning: A CNN-Based Model for medical IoT Threat Detection	Oct 26, 2024		Code Available	1
Model Equality Testing: Which Model Is This API Serving?	Oct 26, 2024	model, Two-sample testing	Code Available	1
Transferable Adversarial Attacks on SAM and Its Downstream Models	Oct 26, 2024	Adversarial Attack	Code Available	1
MMM-RS: A Multi-modal, Multi-GSD, Multi-scene Remote Sensing Dataset and Benchmark for Text-to-Image Generation	Oct 26, 2024	Image Generation, Text to Image Generation	Code Available	1
FedSSP: Federated Graph Learning with Spectral Knowledge and Personalized Preference	Oct 26, 2024	Graph Learning	Code Available	1
AdaNeg: Adaptive Negative Proxy Guided OOD Detection with Vision-Language Models	Oct 26, 2024		Code Available	1
UniHGKR: Unified Instruction-aware Heterogeneous Knowledge Retrievers	Oct 26, 2024	Information Retrieval, Retrieval	Code Available	1
DMT-HI: MOE-based Hyperbolic Interpretable Deep Manifold Transformation for Unspervised Dimensionality Reduction	Oct 25, 2024	Dimensionality Reduction, Mixture-of-Experts	Code Available	1
Multi-view biomedical foundation models for molecule-target and property prediction	Oct 25, 2024	Drug Discovery, molecular representation	Code Available	1
GeoLLaVA: Efficient Fine-Tuned Vision-Language Models for Temporal Change Detection in Remote Sensing	Oct 25, 2024	Change Detection	Code Available	1
Fusion-then-Distillation: Toward Cross-modal Positive Distillation for Domain Adaptive 3D Semantic Segmentation	Oct 25, 2024	3D Semantic Segmentation, Domain Adaptation	Code Available	1
Improving the prediction of protein stability changes upon mutations by geometric learning and a pre-training strategy	Oct 25, 2024	Prediction, Protein Stability Prediction	Code Available	1
AgentSense: Benchmarking Social Intelligence of Language Agents through Interactive Scenarios	Oct 25, 2024	Benchmarking, Diversity	Code Available	1
Unified Cross-Modal Image Synthesis with Hierarchical Mixture of Product-of-Experts	Oct 25, 2024	Image Generation	Code Available	1
Enhancing Battery Storage Energy Arbitrage with Deep Reinforcement Learning and Time-Series Forecasting	Oct 25, 2024	Deep Reinforcement Learning, Time Series	Code Available	1
Offline Reinforcement Learning with OOD State Correction and OOD Action Suppression	Oct 25, 2024	Offline RL, Reinforcement Learning (RL)	Code Available	1
Context-Based Visual-Language Place Recognition	Oct 25, 2024	Semantic Segmentation, Visual Place Recognition	Code Available	1