The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Recently Added | Most Hyped | Most Active | Needs Verification | Most Verified

Showing 12601–12650 of 474278 papers

Title	Date	Tasks	Status	Hype
Graph-ToolFormer: To Empower LLMs with Graph Reasoning Ability via Prompt Augmented by ChatGPT	Apr 10, 2023	Graph Learning, Knowledge Graphs	Code Available	2
RecBole: Towards a Unified, Comprehensive and Efficient Framework for Recommendation Algorithms	Nov 3, 2020	Collaborative Filtering, GPU	Code Available	2
CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution	Oct 21, 2024	All, model	Code Available	2
Trusted Multi-View Classification with Dynamic Evidential Fusion	Apr 25, 2022	Classification, MULTI-VIEW LEARNING	Code Available	2
Reading Between the Frames: Multi-Modal Depression Detection in Videos from Non-Verbal Cues	Jan 5, 2024	Depression Detection	Code Available	2
Deep Differentiable Logic Gate Networks	Oct 15, 2022	CPU, Efficient Neural Network	Code Available	2
SALAD-Bench: A Hierarchical and Comprehensive Safety Benchmark for Large Language Models	Feb 7, 2024	Diversity, Multiple-choice	Code Available	2
The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A"	Sep 21, 2023	Data Augmentation, Sentence	Code Available	2
Prometheus-Vision: Vision-Language Model as a Judge for Fine-Grained Evaluation	Jan 12, 2024	Language Modeling, Language Modelling	Code Available	2
Efficient Heatmap-Guided 6-Dof Grasp Detection in Cluttered Scenes	Mar 27, 2024	Grasp Generation	Code Available	2
Epidemiology-Aware Neural ODE with Continuous Disease Transmission Graph	Sep 28, 2024	Epidemiology	Code Available	2
BixBench: a Comprehensive Benchmark for LLM-based Agents in Computational Biology	Feb 28, 2025	Multiple-choice, scientific discovery	Code Available	2
Synthetic continued pretraining	Sep 11, 2024	Data Augmentation, Language Modelling	Code Available	2
OpenLane-V2: A Topology Reasoning Benchmark for Unified 3D HD Mapping	Apr 20, 2023	3D Lane Detection, Autonomous Vehicles	Code Available	2
Towards a Unified Copernicus Foundation Model for Earth Vision	Mar 14, 2025	Earth Observation	Code Available	2
Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs	Mar 4, 2025		Code Available	2
DOVE: Efficient One-Step Diffusion Model for Real-World Video Super-Resolution	May 22, 2025	Super-Resolution, Video Super-Resolution	Code Available	2
Reproducibility Study of "Cooperate or Collapse: Emergence of Sustainable Cooperation in a Society of LLM Agents"	May 14, 2025		Code Available	2
E2E-MFD: Towards End-to-End Synchronous Multimodal Fusion Detection	Mar 14, 2024	Autonomous Driving, Object	Code Available	2
MuggleMath: Assessing the Impact of Query and Response Augmentation on Math Reasoning	Oct 9, 2023	Arithmetic Reasoning, Data Augmentation	Code Available	2
The Language Interpretability Tool: Extensible, Interactive Visualizations and Analysis for NLP Models	Aug 12, 2020	counterfactual, Sentiment Analysis	Code Available	2
Neural Combinatorial Optimization Algorithms for Solving Vehicle Routing Problems: A Comprehensive Survey with Perspectives	Jun 1, 2024	Combinatorial Optimization	Code Available	2
Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models	Feb 22, 2024	All, Mixture-of-Experts	Code Available	2
Jailbreaking Attack against Multimodal Large Language Model	Feb 4, 2024	Language Modeling, Language Modelling	Code Available	2
IFRNet: Intermediate Feature Refine Network for Efficient Frame Interpolation	May 29, 2022	Decoder, Optical Flow Estimation	Code Available	2
AnnaAgent: Dynamic Evolution Agent System with Multi-Session Memory for Realistic Seeker Simulation	May 31, 2025		Code Available	2
PromptIR: Prompting for All-in-One Blind Image Restoration	Jun 22, 2023	All, Blind All-in-One Image Restoration	Code Available	2
Convolutional Neural Operators for robust and accurate learning of PDEs	Feb 2, 2023	Operator learning, PDE Surrogate Modeling	Code Available	2
Grappa -- A Machine Learned Molecular Mechanics Force Field	Mar 25, 2024	Computational Efficiency	Code Available	2
The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants	Aug 31, 2023	Belebele, Cross-Lingual Transfer	Code Available	2
A Machine Learning Approach That Beats Large Rubik's Cubes	Feb 18, 2025	Rubik's Cube	Code Available	2
Orbeez-SLAM: A Real-time Monocular Visual SLAM with ORB Features and NeRF-realized Mapping	Sep 27, 2022	NeRF, Visual Odometry	Code Available	2
CAPO: Cost-Aware Prompt Optimization	Apr 22, 2025	Arithmetic Reasoning, AutoML	Code Available	2
BakedAvatar: Baking Neural Fields for Real-Time Head Avatar Synthesis	Nov 9, 2023	Face Reenactment, NeRF	Code Available	2
MV-FCOS3D++: Multi-View Camera-Only 4D Object Detection with Pretrained Monocular Backbones	Jul 26, 2022	object-detection, Object Detection	Code Available	2
Artificial Intelligence of Things: A Survey	Oct 25, 2024	Survey	Code Available	2
BianCang: A Traditional Chinese Medicine Large Language Model	Nov 17, 2024	Diagnostic, Language Modeling	Code Available	2
Fast Dynamic Radiance Fields with Time-Aware Neural Voxels	May 30, 2022	NeRF	Code Available	2
Automatically Bounding the Taylor Remainder Series: Tighter Bounds and New Applications	Dec 22, 2022	global-optimization, Numerical Integration	Code Available	2
LAION-SG: An Enhanced Large-Scale Dataset for Training Complex Image-Text Models with Structural Annotations	Dec 11, 2024	Attribute, Image Generation	Code Available	2
Fraud Dataset Benchmark and Applications	Aug 30, 2022	AutoML, Feature Engineering	Code Available	2
Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision	Jul 8, 2024	Action Quality Assessment, Descriptive	Code Available	2
DeBERTa: Decoding-enhanced BERT with Disentangled Attention	Jun 5, 2020	Common Sense Reasoning, Coreference Resolution	Code Available	2
Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models	May 29, 2024	Instruction Following, Language Modeling	Code Available	2
Streaming Active Learning with Deep Neural Networks	Mar 5, 2023	Active Learning, Diversity	Code Available	2
StyleCLIPDraw: Coupling Content and Style in Text-to-Drawing Translation	Feb 24, 2022	Style Transfer, Translation	Code Available	2
Frequency-Adaptive Dilated Convolution for Semantic Segmentation	Mar 8, 2024	object-detection, Object Detection	Code Available	2
LIR-LIVO: A Lightweight,Robust LiDAR/Vision/Inertial Odometry with Illumination-Resilient Deep Features	Feb 12, 2025	Pose Estimation, Visual Odometry	Code Available	2
On Meta-Prompting	Dec 11, 2023	In-Context Learning	Code Available	2
Reducing Hallucinations in Vision-Language Models via Latent Space Steering	Oct 21, 2024	Hallucination	Code Available	2