The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 3901–3925 of 661570 papers

Title	Date	Tasks	Status	Hype
ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning	Feb 19, 2024		Code Available	3
3D Diffuser Actor: Policy Diffusion with 3D Scene Representations	Feb 18, 2024	Denoising, Robot Manipulation	Code Available	3
ALLaVA: Harnessing GPT4V-Synthesized Data for Lite Vision-Language Models	Feb 18, 2024	Language Modelling, Question Answering	Code Available	3
EventRL: Enhancing Event Extraction with Outcome Supervision for Large Language Models	Feb 18, 2024	Event Extraction, Hallucination	Code Available	3
GenAD: Generative End-to-End Autonomous Driving	Feb 18, 2024	Autonomous Driving, Bench2Drive	Code Available	3
OneBit: Towards Extremely Low-bit Large Language Models	Feb 17, 2024	Quantization	Code Available	3
LLMDFA: Analyzing Dataflow in Code with Large Language Models	Feb 16, 2024	Hallucination	Code Available	3
3D Diffuser Actor: Policy Diffusion with 3D Scene Representations	Feb 16, 2024	Denoising, Robot Manipulation	Code Available	3
Discovering and exploring cases of educational source code plagiarism with Dolos	Feb 16, 2024		Code Available	3
Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning	Feb 15, 2024	Data Augmentation, Instruction Following	Code Available	3
Spike-driven Transformer V2: Meta Spiking Neural Network Architecture Inspiring the Design of Next-generation Neuromorphic Chips	Feb 15, 2024		Code Available	3
OptiMUS: Scalable Optimization Modeling with (MI)LP Solvers and Large Language Models	Feb 15, 2024	Language Modeling, Language Modelling	Code Available	3
GES: Generalized Exponential Splatting for Efficient Radiance Field Rendering	Feb 15, 2024	3D Reconstruction, Novel View Synthesis	Code Available	3
Data Engineering for Scaling Language Models to 128K Context	Feb 15, 2024	4k, Continual Pretraining	Code Available	3
BitDelta: Your Fine-Tune May Only Be Worth One Bit	Feb 15, 2024	GPU	Code Available	3
QuRating: Selecting High-Quality Data for Training Language Models	Feb 15, 2024	In-Context Learning	Code Available	3
Magic-Me: Identity-Specific Video Customized Diffusion	Feb 14, 2024	Image Generation, Text to Image Generation	Code Available	3
Traj-LIO: A Resilient Multi-LiDAR Multi-IMU State Estimator Through Sparse Gaussian Process	Feb 14, 2024		Code Available	3
PreFLMR: Scaling Up Fine-Grained Late-Interaction Multi-modal Retrievers	Feb 13, 2024	Question Answering, Retrieval	Code Available	3
VerMCTS: Synthesizing Multi-Step Programs using a Verifier, a Large Language Model, and Tree Search	Feb 13, 2024	Language Modeling, Language Modelling	Code Available	3
Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models	Feb 12, 2024	Language Modeling, Language Modelling	Code Available	3
SPO: Sequential Monte Carlo Policy Optimisation	Feb 12, 2024	Decision Making, Model-based Reinforcement Learning	Code Available	3
PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models	Feb 12, 2024	Answer Generation, Hallucination	Code Available	3
Scaling Laws for Fine-Grained Mixture of Experts	Feb 12, 2024	Mixture-of-Experts	Code Available	3
X-LoRA: Mixture of Low-Rank Adapter Experts, a Flexible Framework for Large Language Models with Applications in Protein Mechanics and Molecular Design	Feb 11, 2024	graph construction, Knowledge Graphs	Code Available	3