The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers


Showing 10,001–10,025 of 474,278 papers

Title	Date	Tasks	Status	Hype
Verif.ai: Towards an Open-Source Scientific Generative Question-Answering System with Referenced and Verifiable Answers	Feb 9, 2024	Generative Question Answering, Information Retrieval	Code Available	2
Debating with More Persuasive LLMs Leads to More Truthful Answers	Feb 9, 2024	Persuasiveness	Code Available	2
Diffusion-ES: Gradient-free Planning with Diffusion for Autonomous Driving and Zero-Shot Instruction Following	Feb 9, 2024	Autonomous Driving, Denoising	Code Available	2
CLIPZyme: Reaction-Conditioned Virtual Screening of Enzymes	Feb 9, 2024		Code Available	2
On the Efficacy of Eviction Policy for Key-Value Constrained Generative Language Model Inference	Feb 9, 2024	GPU, Language Modeling	Code Available	2
DiffSpeaker: Speech-Driven 3D Facial Animation with Diffusion Transformer	Feb 8, 2024		Code Available	2
Sandwiched Compression: Repurposing Standard Codecs with Neural Network Wrappers	Feb 8, 2024	Video Compression	Code Available	2
How Well Can LLMs Negotiate? NegotiationArena Platform and Analysis	Feb 8, 2024		Code Available	2
Scalable Diffusion Models with State Space Backbone	Feb 8, 2024	Conditional Image Generation, Image Generation	Code Available	2
Get What You Want, Not What You Don't: Image Content Suppression for Text-to-Image Diffusion Models	Feb 8, 2024		Code Available	2
Accurate LoRA-Finetuning Quantization of LLMs via Information Retention	Feb 8, 2024	MMLU, Quantization	Code Available	2
CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion	Feb 8, 2024	Computational Efficiency, Multimodal Reasoning	Code Available	2
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning	Feb 8, 2024	GSM8K, reinforcement-learning	Code Available	2
PLAPT: Protein-Ligand Binding Affinity Prediction Using Pretrained Transformers	Feb 8, 2024	Drug Discovery, Prediction	Code Available	2
Let Your Graph Do the Talking: Encoding Structured Data for LLMs	Feb 8, 2024		Code Available	2
Time Series Diffusion in the Frequency Domain	Feb 8, 2024	Denoising, Inductive Bias	Code Available	2
Paralinguistics-Aware Speech-Empowered Large Language Models for Natural Conversation	Feb 8, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	2
Dirichlet Flow Matching with Applications to DNA Sequence Design	Feb 8, 2024		Code Available	2
Learning to Route Among Specialized Experts for Zero-Shot Generalization	Feb 8, 2024	parameter-efficient fine-tuning, Zero-shot Generalization	Code Available	2
Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data	Feb 8, 2024	Action Recognition, Mamba	Code Available	2
JailbreakRadar: Comprehensive Assessment of Jailbreak Attacks Against LLMs	Feb 8, 2024	Ethics	Code Available	2
InstructScene: Instruction-Driven 3D Indoor Scene Synthesis with Semantic Graph Prior	Feb 7, 2024	Benchmarking, Decoder	Code Available	2
Closing the Gap Between SGP4 and High-Precision Propagation via Differentiable Programming	Feb 7, 2024		Code Available	2
A Comprehensive Survey of Cross-Domain Policy Transfer for Embodied Agents	Feb 7, 2024		Code Available	2
Can Large Language Model Agents Simulate Human Trust Behavior?	Feb 7, 2024	Language Modeling, Language Modelling	Code Available	2