The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 2026–2050 of 661,570 papers

Title	Date	Tasks	Status	Hype
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs	Sep 11, 2023	Quantization	Code Available	4
Don't Ignore Dual Logic Ability of LLMs while Privatizing: A Data-Intensive Analysis in Medical Domain	Sep 8, 2023	Fact Checking, Knowledge Graphs	Code Available	4
Knowledge-tuning Large Language Models with Structured Medical Knowledge Bases for Reliable Response Generation in Chinese	Sep 8, 2023	Domain Adaptation, Hallucination	Code Available	4
Cognitive Architectures for Language Agents	Sep 5, 2023	Decision Making	Code Available	4
DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior	Aug 29, 2023	Blind Face Restoration, Denoising	Code Available	4
Prompt2Model: Generating Deployable Models from Natural Language Instructions	Aug 23, 2023	Data-free Knowledge Distillation, Dataset Generation	Code Available	4
A Survey on Large Language Model based Autonomous Agents	Aug 22, 2023	Language Modeling, Language Modelling	Code Available	4
AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors	Aug 21, 2023		Code Available	4
GPFL: Simultaneously Learning Global and Personalized Feature Information for Personalized Federated Learning	Aug 20, 2023	Fairness, Federated Learning	Code Available	4
ChatHaruhi: Reviving Anime Character in Reality via Large Language Model	Aug 18, 2023	Language Modeling, Language Modelling	Code Available	4
Dynamic 3D Gaussians: Tracking by Persistent Dynamic View Synthesis	Aug 18, 2023	Dynamic Reconstruction, Novel View Synthesis	Code Available	4
Graph of Thoughts: Solving Elaborate Problems with Large Language Models	Aug 18, 2023		Code Available	4
AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining	Aug 10, 2023	Audio Generation, In-Context Learning	Code Available	4
OpenProteinSet: Training data for structural biology at scale	Aug 10, 2023	Protein Design, Protein Structure Prediction	Code Available	4
"Do Anything Now": Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models	Aug 7, 2023	Community Detection	Code Available	4
AgentBench: Evaluating LLMs as Agents	Aug 7, 2023	Decision Making, Instruction Following	Code Available	4
TOPIQ: A Top-down Approach from Semantics to Distortions for Image Quality Assessment	Aug 6, 2023	Image Quality Assessment, Local Distortion	Code Available	4
OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models	Aug 2, 2023	Visual Question Answering, Visual Question Answering (VQA)	Code Available	4
From Discrete Tokens to High-Fidelity Audio Using Multi-Band Diffusion	Aug 2, 2023		Code Available	4
LISA: Reasoning Segmentation via Large Language Model	Aug 1, 2023	Language Modeling, Language Modelling	Code Available	4
Do LLMs Possess a Personality? Making the MBTI Test an Amazing Evaluation for Large Language Models	Jul 30, 2023	Hallucination, Prompt Engineering	Code Available	4
Effective Whole-body Pose Estimation with Two-stages Distillation	Jul 29, 2023	2D Human Pose Estimation, Knowledge Distillation	Code Available	4
Universal and Transferable Adversarial Attacks on Aligned Language Models	Jul 27, 2023	Adversarial Attack, Ingenuity	Code Available	4
Guaranteed Approximation Bounds for Mixed-Precision Neural Operators	Jul 27, 2023	GPU, Operator learning	Code Available	4
Turning Whisper into Real-Time Transcription System	Jul 27, 2023	speech-recognition, Speech Recognition	Code Available	4