The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 20601–20650 of 474278 papers

Title	Date	Tasks	Status	Hype
Reliable Probabilistic Human Trajectory Prediction for Autonomous Applications	Oct 9, 2024	Prediction, Trajectory Prediction	Code Available	1
To Preserve or To Compress: An In-Depth Study of Connector Selection in Multimodal Large Language Models	Oct 9, 2024	MME	Code Available	1
HFH-Font: Few-shot Chinese Font Synthesis with Higher Quality, Faster Speed, and Higher Resolution	Oct 9, 2024	Super-Resolution	Code Available	1
Continual Learning in the Frequency Domain	Oct 9, 2024	Continual Learning	Code Available	1
ST-WebAgentBench: A Benchmark for Evaluating Safety and Trustworthiness in Web Agents	Oct 9, 2024	Autonomous Web Navigation	Code Available	1
SEAL: Safety-enhanced Aligned LLM Fine-tuning via Bilevel Data Selection	Oct 9, 2024	Bilevel Optimization	Code Available	1
Mitigating Time Discretization Challenges with WeatherODE: A Sandwich Physics-Driven Neural ODE for Weather Forecasting	Oct 9, 2024	Weather Forecasting	Code Available	1
Learning Evolving Tools for Large Language Models	Oct 9, 2024		Code Available	1
Cluster-wise Graph Transformer with Dual-granularity Kernelized Attention	Oct 9, 2024	Graph Learning, Node Clustering	Code Available	1
DiffGAD: A Diffusion-based Unsupervised Graph Anomaly Detector	Oct 9, 2024	Anomaly Detection, Graph Anomaly Detection	Code Available	1
InstructG2I: Synthesizing Images from Multimodal Attributed Graphs	Oct 9, 2024	Denoising, Re-Ranking	Code Available	1
Does Spatial Cognition Emerge in Frontier Models?	Oct 9, 2024		Code Available	1
Iterative Optimization Annotation Pipeline and ALSS-YOLO-Seg for Efficient Banana Plantation Segmentation in UAV Imagery	Oct 9, 2024	Segmentation	Code Available	1
Is C4 Dataset Optimal for Pruning? An Investigation of Calibration Data for LLM Pruning	Oct 9, 2024	In-Context Learning, Network Pruning	Code Available	1
Rejecting Hallucinated State Targets during Planning	Oct 9, 2024	Decision Making, Out-of-Distribution Generalization	Code Available	1
Steering Large Language Models using Conceptors: Improving Addition-Based Activation Engineering	Oct 9, 2024	In-Context Learning	Code Available	1
BiC-MPPI: Goal-Pursuing, Sampling-Based Bidirectional Rollout Clustering Path Integral for Trajectory Optimization	Oct 9, 2024	Autonomous Navigation, Trajectory Planning	Code Available	1
LLM Embeddings Improve Test-time Adaptation to Tabular Y|X-Shifts	Oct 9, 2024	Test-time Adaptation, World Knowledge	Code Available	1
ING-VP: MLLMs cannot Play Easy Vision-based Games Yet	Oct 9, 2024	Spatial Reasoning	Code Available	1
Enhancing Legal Case Retrieval via Scaling High-quality Synthetic Query-Candidate Pairs	Oct 9, 2024	Fairness, Retrieval	Code Available	1
Toward Physics-guided Time Series Embedding	Oct 9, 2024	Time Series, Time Series Analysis	Code Available	1
TinyLidarNet: 2D LiDAR-based End-to-End Deep Learning Model for F1TENTH Autonomous Racing	Oct 9, 2024	Autonomous Racing, Deep Learning	Code Available	1
A Gentle Introduction and Tutorial on Deep Generative Models in Transportation Research	Oct 9, 2024		Code Available	1
Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning	Oct 9, 2024	Language Modeling, Language Modelling	Code Available	1
Adaptive High-Frequency Transformer for Diverse Wildlife Re-Identification	Oct 9, 2024	Domain Generalization	Code Available	1
Bridge the Points: Graph-based Few-shot Segment Anything Semantically	Oct 9, 2024	Few-Shot Semantic Segmentation, Semantic Segmentation	Code Available	1
IterGen: Iterative Semantic-aware Structured LLM Generation with Backtracking	Oct 9, 2024	ARC, Code Generation	Code Available	1
Equi-GSPR: Equivariant SE(3) Graph Network Model for Sparse Point Cloud Registration	Oct 8, 2024	Graph Neural Network, Point Cloud Registration	Code Available	1
GlucoBench: Curated List of Continuous Glucose Monitoring Datasets with Prediction Benchmarks	Oct 8, 2024	Management, Trajectory Prediction	Code Available	1
Generative Artificial Intelligence (GAI) for Mobile Communications: A Diffusion Model Perspective	Oct 8, 2024	Deep Reinforcement Learning, Management	Code Available	1
ToolBridge: An Open-Source Dataset to Equip LLMs with External Tool Capabilities	Oct 8, 2024		Code Available	1
Evaluating Performance and Bias of Negative Sampling in Large-Scale Sequential Recommendation Models	Oct 8, 2024	Hyperparameter Optimization, Sequential Recommendation	Code Available	1
Efficient Few-shot Learning for Multi-label Classification of Scientific Documents with Many Classes	Oct 8, 2024	Articles, Classification	Code Available	1
FACMIC: Federated Adaptative CLIP Model for Medical Image Classification	Oct 8, 2024	Domain Adaptation, Federated Learning	Code Available	1
NegMerge: Consensual Weight Negation for Strong Machine Unlearning	Oct 8, 2024	image-classification, Image Classification	Code Available	1
SeeClear: Semantic Distillation Enhances Pixel Condensation for Video Super-Resolution	Oct 8, 2024	Super-Resolution, Video Generation	Code Available	1
UnSeGArmaNet: Unsupervised Image Segmentation using Graph Neural Networks with Convolutional ARMA Filters	Oct 8, 2024	Graph Neural Network, Image Segmentation	Code Available	1
Less is More: Making Smaller Language Models Competent Subgraph Retrievers for Multi-hop KGQA	Oct 8, 2024	Knowledge Graphs, RAG	Code Available	1
QT-DoG: Quantization-aware Training for Domain Generalization	Oct 8, 2024	Domain Generalization, Model Compression	Code Available	1
Coevolving with the Other You: Fine-Tuning LLM with Sequential Cooperative Multi-Agent Reinforcement Learning	Oct 8, 2024	GSM8K, Multi-agent Reinforcement Learning	Code Available	1
Multi-Behavioral Sequential Recommendation	Oct 8, 2024	Multibehavior Recommendation, Recommendation Systems	Code Available	1
Underwater Object Detection in the Era of Artificial Intelligence: Current, Challenge, and Future	Oct 8, 2024	object-detection, Object Detection	Code Available	1
Estimating the Number of HTTP/3 Responses in QUIC Using Deep Learning	Oct 8, 2024		Code Available	1
Continuous Contrastive Learning for Long-Tailed Semi-Supervised Recognition	Oct 8, 2024	Contrastive Learning, Density Estimation	Code Available	1
Entering Real Social World! Benchmarking the Social Intelligence of Large Language Models from a First-person Perspective	Oct 8, 2024	Attribute, Benchmarking	Code Available	1
Amortized Control of Continuous State Space Feynman-Kac Model for Irregular Time Series	Oct 8, 2024	Computational Efficiency, Irregular Time Series	Code Available	1
MEXA: Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment	Oct 8, 2024	ARC, Belebele	Code Available	1
DataEnvGym: Data Generation Agents in Teacher Environments with Student Feedback	Oct 8, 2024	Math, Sequential Decision Making	Code Available	1
A mechanistically interpretable neural network for regulatory genomics	Oct 8, 2024		Code Available	1
Tackling the Abstraction and Reasoning Corpus with Vision Transformers: the Importance of 2D Representation, Positions, and Objects	Oct 8, 2024	ARC, Program Synthesis	Code Available	1