The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers


Showing 10,901–10,950 of 661,570 papers

Title	Date	Tasks	Status	Hype
Harnessing Administrative Data Inventories to Create a Reliable Transnational Reference Database for Crop Type Monitoring	Oct 10, 2023	Earth Observation	Code Available	2
MuggleMath: Assessing the Impact of Query and Response Augmentation on Math Reasoning	Oct 9, 2023	Arithmetic Reasoning, Data Augmentation	Code Available	2
Compressing Context to Enhance Inference Efficiency of Large Language Models	Oct 9, 2023	Articles, Question Answering	Code Available	2
Causal structure learning with momentum: Sampling distributions over Markov Equivalence Classes of DAGs	Oct 9, 2023	Causal Discovery, Graph Sampling	Code Available	2
Distributional Soft Actor-Critic with Three Refinements	Oct 9, 2023	Decision Making, Reinforcement Learning (RL)	Code Available	2
OptiMUS: Optimization Modeling Using MIP Solvers and large language models	Oct 9, 2023	Language Modeling, Language Modelling	Code Available	2
DiffuSeq-v2: Bridging Discrete and Continuous Text Spaces for Accelerated Seq2Seq Diffusion Models	Oct 9, 2023		Code Available	2
FLATTEN: optical FLow-guided ATTENtion for consistent text-to-video editing	Oct 9, 2023	Optical Flow Estimation, Text-to-Video Editing	Code Available	2
Generative Judge for Evaluating Alignment	Oct 9, 2023		Code Available	2
HyperLips: Hyper Control Lips with High Resolution Decoder for Talking Face Generation	Oct 9, 2023	Decoder, Face Generation	Code Available	2
Humanoid Agents: Platform for Simulating Human-like Generative Agents	Oct 9, 2023	Unity	Code Available	2
Colmap-PCD: An Open-source Tool for Fine Image-to-point cloud Registration	Oct 9, 2023	Image to Point Cloud Registration, Point Cloud Registration	Code Available	2
Interpreting CLIP's Image Representation via Text-Based Decomposition	Oct 9, 2023		Code Available	2
ZooPFL: Exploring Black-box Foundation Models for Personalized Federated Learning	Oct 8, 2023	Federated Learning, Personalized Federated Learning	Code Available	2
ZSC-Eval: An Evaluation Toolkit and Benchmark for Multi-agent Zero-shot Coordination	Oct 8, 2023	Diversity, Multi-agent Reinforcement Learning	Code Available	2
Fast protein backbone generation with SE(3) flow matching	Oct 8, 2023	Protein Design	Code Available	2
LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT	Oct 7, 2023	Audio captioning, Automatic Speech Recognition	Code Available	2
Crystal-GFN: sampling crystals with desirable properties and constraints	Oct 7, 2023	Formation Energy	Code Available	2
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models	Oct 6, 2023	Code Generation, Decision Making	Code Available	2
Towards Foundational Models for Molecular Learning on Large-Scale Multi-Task Datasets	Oct 6, 2023		Code Available	2
Towards Foundation Models for Knowledge Graph Reasoning	Oct 6, 2023	Knowledge Graphs, Link Prediction	Code Available	2
DISTFLASHATTN: Distributed Memory-efficient Attention for Long-context LLMs Training	Oct 5, 2023	GPU	Code Available	2
FreeReg: Image-to-Point Cloud Registration Leveraging Pretrained Diffusion Models and Monocular Depth Estimators	Oct 5, 2023	Image to Point Cloud Registration, Metric Learning	Code Available	2
GoLLIE: Annotation Guidelines improve Zero-Shot Information-Extraction	Oct 5, 2023	Event Argument Extraction, Event Extraction	Code Available	2
Aligning Text-to-Image Diffusion Models with Reward Backpropagation	Oct 5, 2023	Denoising, Image Generation	Code Available	2
Smoothing Methods for Automatic Differentiation Across Conditional Branches	Oct 5, 2023	Stochastic Optimization	Code Available	2
Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!	Oct 5, 2023	Red Teaming, Safety Alignment	Code Available	2
MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning	Oct 5, 2023	Arithmetic Reasoning, GSM8K	Code Available	2
MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation	Oct 5, 2023	Benchmarking, Decision Making	Code Available	2
FreshLLMs: Refreshing Large Language Models with Search Engine Augmentation	Oct 5, 2023	Hallucination, World Knowledge	Code Available	2
SweetDreamer: Aligning Geometric Priors in 2D Diffusion for Consistent Text-to-3D	Oct 4, 2023	3D Generation, Text to 3D	Code Available	2
LibriSpeech-PC: Benchmark for Evaluation of Punctuation and Capitalization Capabilities of end-to-end ASR Models	Oct 4, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	2
CoDA: Collaborative Novel Box Discovery and Cross-modal Alignment for Open-vocabulary 3D Object Detection	Oct 4, 2023	3D Object Detection, cross-modal alignment	Code Available	2
Ring Attention with Blockwise Transformers for Near-Infinite Context	Oct 3, 2023	Language Modeling, Language Modelling	Code Available	2
AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language Models	Oct 3, 2023	Decision Making	Code Available	2
SE(3)-Stochastic Flow Matching for Protein Backbone Generation	Oct 3, 2023		Code Available	2
ACE: A fast, skillful learned global atmospheric model for climate prediction	Oct 3, 2023		Code Available	2
Can large language models provide useful feedback on research papers? A large-scale empirical analysis	Oct 3, 2023		Code Available	2
Driving with LLMs: Fusing Object-Level Vector Modality for Explainable Autonomous Driving	Oct 3, 2023	Action Generation, Autonomous Driving	Code Available	2
MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens	Oct 3, 2023	Image Generation, multimodal generation	Code Available	2
MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts	Oct 3, 2023	Chatbot, Image Captioning	Code Available	2
Direct Inversion: Boosting Diffusion-based Editing with 3 Lines of Code	Oct 2, 2023	Image Generation, Text-based Image Editing	Code Available	2
Controlling Vision-Language Models for Multi-Task Image Restoration	Oct 2, 2023	Image Dehazing, Image Denoising	Code Available	2
Quantifying the Plausibility of Context Reliance in Neural Machine Translation	Oct 2, 2023	Machine Translation, Translation	Code Available	2
You Only Look at Once for Real-time and Generic Multi-Task	Oct 2, 2023	Autonomous Driving, Drivable Area Detection	Code Available	2
GPT-Driver: Learning to Drive with GPT	Oct 2, 2023	Autonomous Driving, Autonomous Vehicles	Code Available	2
CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction	Oct 2, 2023	image-classification, Image Classification	Code Available	2
Making LLaMA SEE and Draw with SEED Tokenizer	Oct 2, 2023	multimodal generation	Code Available	2
GRID: A Platform for General Robot Intelligence Development	Oct 2, 2023		Code Available	2
GenSim: Generating Robotic Simulation Tasks via Large Language Models	Oct 2, 2023	Code Generation, Diversity	Code Available	2