The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers

Showing 8,126–8,150 of 474,278 papers

Title	Date	Tasks	Status	Hype
Efficient World Models with Context-Aware Tokenization	Jun 27, 2024	Deep Reinforcement Learning, Reinforcement Learning (RL)	Code Available	2
T-FREE: Subword Tokenizer-Free Generative LLMs via Sparse Representations for Memory-Efficient Embeddings	Jun 27, 2024	Cross-Lingual Transfer, Transfer Learning	Code Available	2
Human-Aware Vision-and-Language Navigation: Bridging Simulation to Reality with Dynamic Human Interactions	Jun 27, 2024	Navigate, Vision and Language Navigation	Code Available	2
Chat AI: A Seamless Slurm-Native Solution for HPC-Based Services	Jun 27, 2024	Scheduling	Code Available	2
DiffuseHigh: Training-free Progressive High-Resolution Image Synthesis through Structure Guidance	Jun 26, 2024	Image Generation	Code Available	2
ResumeAtlas: Revisiting Resume Classification with Large-Scale Datasets and Large Language Models	Jun 26, 2024	Classification	Code Available	2
A Stem-Agnostic Single-Decoder System for Music Source Separation Beyond Four Stems	Jun 26, 2024	Audio Source Separation, Decoder	Code Available	2
CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs	Jun 26, 2024	Chart Understanding	Code Available	2
KAGNNs: Kolmogorov-Arnold Networks meet Graph Learning	Jun 26, 2024	Graph Classification, Graph Learning	Code Available	2
JailbreakZoo: Survey, Landscapes, and Horizons in Jailbreaking Large Language and Vision-Language Models	Jun 26, 2024	LLM Jailbreak, Survey	Code Available	2
RetroGFN: Diverse and Feasible Retrosynthesis using GFlowNets	Jun 26, 2024	Retrosynthesis, Single-step retrosynthesis	Code Available	2
LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context Inference	Jun 26, 2024	multimodal interaction	Code Available	2
MathOdyssey: Benchmarking Mathematical Problem-Solving Skills in Large Language Models Using Odyssey Math Data	Jun 26, 2024	Benchmarking, Math	Code Available	2
The Surprising Effectiveness of Multimodal Large Language Models for Video Moment Retrieval	Jun 26, 2024	Action Localization, Moment Retrieval	Code Available	2
GenRL: Multimodal-foundation world models for generalization in embodied agents	Jun 26, 2024	Benchmarking, Reinforcement Learning (RL)	Code Available	2
MatchTime: Towards Automatic Soccer Game Commentary Generation	Jun 26, 2024		Code Available	2
WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models	Jun 26, 2024	Chatbot, Red Teaming	Code Available	2
Stable Diffusion Segmentation for Biomedical Images with Single-step Reverse Process	Jun 26, 2024	Image Segmentation, Medical Image Segmentation	Code Available	2
Denoising as Adaptation: Noise-Space Domain Adaptation for Image Restoration	Jun 26, 2024	Contrastive Learning, Deblurring	Code Available	2
WildGuard: Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs	Jun 26, 2024		Code Available	2
A Closer Look into Mixture-of-Experts in Large Language Models	Jun 26, 2024	Computational Efficiency, Diversity	Code Available	2
SynRS3D: A Synthetic Dataset for Global 3D Semantic Understanding from Monocular Remote Sensing Imagery	Jun 26, 2024	Domain Adaptation, Earth Observation	Code Available	2
EmT: A Novel Transformer for Generalized Cross-subject EEG Emotion Recognition	Jun 26, 2024	EEG, EEG Emotion Recognition	Code Available	2
EgoVideo: Exploring Egocentric Foundation Model and Downstream Adaptation	Jun 26, 2024	Action Anticipation, Action Recognition	Code Available	2
Dynamic Gaussian Marbles for Novel View Synthesis of Casual Monocular Videos	Jun 26, 2024	Novel View Synthesis, Point Tracking	Code Available	2