The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers, 248,326 code links, 4,818 tasks

Papers


Showing 8101–8150 of 661570 papers

Title	Date	Tasks	Status	Hype
Learning Formal Mathematics From Intrinsic Motivation	Jun 30, 2024	Automated Theorem Proving, Language Modeling	Code Available	2
Hyperparameter Optimization for Randomized Algorithms: A Case Study on Random Features	Jun 30, 2024	GPR, Hyperparameter Optimization	Code Available	2
Diffusion Models and Representation Learning: A Survey	Jun 30, 2024	Denoising, Representation Learning	Code Available	2
InstantStyle-Plus: Style Transfer with Content-Preserving in Text-to-Image Generation	Jun 30, 2024	Image Generation, Style Transfer	Code Available	2
Teola: Towards End-to-End Optimization of LLM-based Applications	Jun 29, 2024	Language Modeling, Language Modelling	Code Available	2
PerAct2: Benchmarking and Learning for Robotic Bimanual Manipulation Tasks	Jun 29, 2024	Diversity	Code Available	2
UDC: A Unified Neural Divide-and-Conquer Framework for Large-Scale Combinatorial Optimization Problems	Jun 29, 2024	Combinatorial Optimization, Graph Neural Network	Code Available	2
Diving Deeper Into Pedestrian Behavior Understanding: Intention Estimation, Action Prediction, and Event Risk Assessment	Jun 29, 2024	Prediction	Code Available	2
Efficient Large Multi-modal Models via Visual Context Compression	Jun 28, 2024	Question Answering, Visual Question Answering	Code Available	2
Text2Robot: Evolutionary Robot Design from Text Descriptions	Jun 28, 2024	Navigate, Text to 3D	Code Available	2
ShortcutsBench: A Large-Scale Real-world Benchmark for API-based Agents	Jun 28, 2024		Code Available	2
Multimodal Prototyping for cancer survival prediction	Jun 28, 2024	Prediction, Survival Prediction	Code Available	2
InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management	Jun 28, 2024	Management, Text Generation	Code Available	2
PathGen-1.6M: 1.6 Million Pathology Image-text Pairs Generation through Multi-agent Collaboration	Jun 28, 2024	image-classification, Image Classification	Code Available	2
Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs	Jun 28, 2024	Code Generation, Code Translation	Code Available	2
Odd-One-Out: Anomaly Detection by Comparing with Neighbors	Jun 28, 2024	8k, Anomaly Detection	Code Available	2
UniGen: A Unified Framework for Textual Dataset Generation Using Large Language Models	Jun 27, 2024	Attribute, Benchmarking	Code Available	2
Efficient World Models with Context-Aware Tokenization	Jun 27, 2024	Deep Reinforcement Learning, Reinforcement Learning (RL)	Code Available	2
T-FREE: Subword Tokenizer-Free Generative LLMs via Sparse Representations for Memory-Efficient Embeddings	Jun 27, 2024	Cross-Lingual Transfer, Transfer Learning	Code Available	2
DEX-TTS: Diffusion-based EXpressive Text-to-Speech with Style Modeling on Time Variability	Jun 27, 2024	Speech Synthesis, text-to-speech	Code Available	2
RoboUniView: Visual-Language Model with Unified View Representation for Robotic Manipulation	Jun 27, 2024	Language Modeling, Language Modelling	Code Available	2
Correspondence-Free Non-Rigid Point Set Registration Using Unsupervised Clustering Analysis	Jun 27, 2024	Clustering	Code Available	2
Human-Aware Vision-and-Language Navigation: Bridging Simulation to Reality with Dynamic Human Interactions	Jun 27, 2024	Navigate, Vision and Language Navigation	Code Available	2
AnyControl: Create Your Artwork with Versatile Control on Text-to-Image Generation	Jun 27, 2024	Image Generation, Text to Image Generation	Code Available	2
Taming Data and Transformers for Audio Generation	Jun 27, 2024	Audio captioning, Audio Generation	Code Available	2
Computational Life: How Well-formed, Self-replicating Programs Emerge from Simple Interaction	Jun 27, 2024	Artificial Life	Code Available	2
Chat AI: A Seamless Slurm-Native Solution for HPC-Based Services	Jun 27, 2024	Scheduling	Code Available	2
On Discrete Prompt Optimization for Diffusion Models	Jun 27, 2024	Adversarial Attack, Prompt Engineering	Code Available	2
CORE4D: A 4D Human-Object-Human Interaction Dataset for Collaborative Object REarrangement	Jun 27, 2024	Human-Object Interaction Detection, Human-Object Interaction Generation	Code Available	2
GenRL: Multimodal-foundation world models for generalization in embodied agents	Jun 26, 2024	Benchmarking, Reinforcement Learning (RL)	Code Available	2
A Closer Look into Mixture-of-Experts in Large Language Models	Jun 26, 2024	Computational Efficiency, Diversity	Code Available	2
Understand What LLM Needs: Dual Preference Alignment for Retrieval-Augmented Generation	Jun 26, 2024	Hallucination, Knowledge Base Question Answering	Code Available	2
WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models	Jun 26, 2024	Chatbot, Red Teaming	Code Available	2
RetroGFN: Diverse and Feasible Retrosynthesis using GFlowNets	Jun 26, 2024	Retrosynthesis, Single-step retrosynthesis	Code Available	2
CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs	Jun 26, 2024	Chart Understanding	Code Available	2
MatchTime: Towards Automatic Soccer Game Commentary Generation	Jun 26, 2024		Code Available	2
ResumeAtlas: Revisiting Resume Classification with Large-Scale Datasets and Large Language Models	Jun 26, 2024	Classification	Code Available	2
A Stem-Agnostic Single-Decoder System for Music Source Separation Beyond Four Stems	Jun 26, 2024	Audio Source Separation, Decoder	Code Available	2
Stable Diffusion Segmentation for Biomedical Images with Single-step Reverse Process	Jun 26, 2024	Image Segmentation, Medical Image Segmentation	Code Available	2
EgoVideo: Exploring Egocentric Foundation Model and Downstream Adaptation	Jun 26, 2024	Action Anticipation, Action Recognition	Code Available	2
EmT: A Novel Transformer for Generalized Cross-subject EEG Emotion Recognition	Jun 26, 2024	EEG, EEG Emotion Recognition	Code Available	2
LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context Inference	Jun 26, 2024	multimodal interaction	Code Available	2
WildGuard: Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs	Jun 26, 2024		Code Available	2
Dynamic Gaussian Marbles for Novel View Synthesis of Casual Monocular Videos	Jun 26, 2024	Novel View Synthesis, Point Tracking	Code Available	2
SynRS3D: A Synthetic Dataset for Global 3D Semantic Understanding from Monocular Remote Sensing Imagery	Jun 26, 2024	Domain Adaptation, Earth Observation	Code Available	2
MathOdyssey: Benchmarking Mathematical Problem-Solving Skills in Large Language Models Using Odyssey Math Data	Jun 26, 2024	Benchmarking, Math	Code Available	2
DiffuseHigh: Training-free Progressive High-Resolution Image Synthesis through Structure Guidance	Jun 26, 2024	Image Generation	Code Available	2
KAGNNs: Kolmogorov-Arnold Networks meet Graph Learning	Jun 26, 2024	Graph Classification, Graph Learning	Code Available	2
JailbreakZoo: Survey, Landscapes, and Horizons in Jailbreaking Large Language and Vision-Language Models	Jun 26, 2024	LLM Jailbreak, Survey	Code Available	2
Denoising as Adaptation: Noise-Space Domain Adaptation for Image Restoration	Jun 26, 2024	Contrastive Learning, Deblurring	Code Available	2