The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

658,356 papers · 258,216 code links · 4,818 tasks

Papers


Showing 401–450 of 658,356 papers

Title	Date	Tasks	Status	Hype
PowerPM: Foundation Model for Power Systems	Aug 7, 2024	Contrastive Learning, model	Code Available	7
CodexGraph: Bridging Large Language Models and Code Repositories via Code Graph Databases	Aug 7, 2024	HumanEval, mbpp	Code Available	7
Segment Anything in Medical Images and Videos: Benchmark and Deployment	Aug 6, 2024	Benchmarking, Segmentation	Code Available	7
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining	Aug 5, 2024	Decoder, Depth Estimation	Code Available	7
Global Structure-from-Motion Revisited	Jul 29, 2024	16k	Code Available	7
RT-DETRv2: Improved Baseline with Bag-of-Freebies for Real-Time Detection Transformer	Jul 24, 2024	Data Augmentation, Decoder	Code Available	7
ECCO: Can We Improve Model-Generated Code Efficiency Without Sacrificing Functional Correctness?	Jul 19, 2024	Benchmarking, Code Generation	Code Available	7
Stable Audio Open	Jul 19, 2024	Audio Generation, Text-to-Music Generation	Code Available	7
MMAU: A Holistic Benchmark of Agent Capabilities Across Diverse Domains	Jul 18, 2024		Code Available	7
Qwen2-Audio Technical Report	Jul 15, 2024	Instruction Following, Language Modelling	Code Available	7
EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditions	Jul 11, 2024	Image Animation	Code Available	7
LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models	Jul 10, 2024	Video Question Answering, Zero-Shot Video Question Answer	Code Available	7
MambaVision: A Hybrid Mamba-Transformer Vision Backbone	Jul 10, 2024	Image Classification, Instance Segmentation	Code Available	7
PEER: Expertizing Domain-Specific Tasks with a Multi-Agent Framework and Tuning Methods	Jul 9, 2024	Information Retrieval, LEMMA	Code Available	7
Agentless: Demystifying LLM-based Software Engineering Agents	Jul 1, 2024	Program Repair	Code Available	7
ColPali: Efficient Document Retrieval with Vision Language Models	Jun 27, 2024	document understanding, RAG	Code Available	7
RouteLLM: Learning to Route LLMs with Preference Data	Jun 26, 2024	Data Augmentation, Transfer Learning	Code Available	7
BricksRL: A Platform for Democratizing Robotics and Reinforcement Learning Research and Education with LEGO	Jun 25, 2024	reinforcement-learning, Reinforcement Learning	Code Available	7
Mooncake: A KVCache-centric Disaggregated Architecture for LLM Serving	Jun 24, 2024	CPU, GPU	Code Available	7
EAGLE-2: Faster Inference of Language Models with Dynamic Draft Trees	Jun 24, 2024		Code Available	7
Segment Any Text: A Universal Approach for Robust, Efficient and Adaptable Sentence Segmentation	Jun 24, 2024	parameter-efficient fine-tuning, Sentence	Code Available	7
NAVSIM: Data-Driven Non-Reactive Autonomous Vehicle Simulation and Benchmarking	Jun 21, 2024	Autonomous Driving, Benchmarking	Code Available	7
Grants4Companies: Applying Declarative Methods for Recommending and Reasoning About Business Grants in the Austrian Public Administration (System Description)	Jun 21, 2024		Code Available	7
DataComp-LM: In search of the next generation of training sets for language models	Jun 17, 2024	Language Modelling, MMLU	Code Available	7
Grounding Image Matching in 3D with MASt3R	Jun 14, 2024	3D Reconstruction	Code Available	7
MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers	Jun 14, 2024	Decoder	Code Available	7
Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback	Jun 13, 2024	Instruction Following, Math	Code Available	7
TextGrad: Automatic "Differentiation" via Text	Jun 11, 2024	Question Answering, Specificity	Code Available	7
Mixture-of-Agents Enhances Large Language Model Capabilities	Jun 7, 2024	Language Modeling, Language Modelling	Code Available	7
M&M VTO: Multi-Garment Virtual Try-On and Editing	Jun 6, 2024	Denoising, Super-Resolution	Code Available	7
The Prompt Report: A Systematic Survey of Prompting Techniques	Jun 6, 2024	Prompt Engineering, Survey	Code Available	7
Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT	Jun 5, 2024	Image Generation, Point Cloud Generation	Code Available	7
Scalable MatMul-free Language Modeling	Jun 4, 2024	GPU, Language Modeling	Code Available	7
Seed-TTS: A Family of High-Quality Versatile Speech Generation Models	Jun 4, 2024	In-Context Learning, Language Modelling	Code Available	7
The Scandinavian Embedding Benchmarks: Comprehensive Assessment of Multilingual and Monolingual Text Embedding	Jun 4, 2024		Code Available	7
Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image	May 30, 2024	Image to 3D, Single-View 3D Reconstruction	Code Available	7
TotalSegmentator MRI: Robust Sequence-independent Segmentation of Multiple Anatomic Structures in MRI	May 29, 2024	MRI segmentation	Code Available	7
Adaptive In-conversation Team Building for Language Model Agents	May 29, 2024	Diversity, Language Modeling	Code Available	7
EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture	May 29, 2024	Image Generation, Video Generation	Code Available	7
PromptWizard: Task-Aware Prompt Optimization Framework	May 28, 2024	Computational Efficiency, Diversity	Code Available	7
Efficient multi-prompt evaluation of LLMs	May 27, 2024	MMLU	Code Available	7
Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability	May 27, 2024	Autonomous Driving, Video Generation	Code Available	7
The Road Less Scheduled	May 24, 2024	Scheduling	Code Available	7
HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models	May 23, 2024	Hippocampus, Knowledge Graphs	Code Available	7
Revisiting MoE and Dense Speed-Accuracy Comparisons for LLM Training	May 23, 2024	GSM8K, Mixture-of-Experts	Code Available	7
Learning Multi-dimensional Human Preference for Text-to-Image Generation	May 23, 2024	Image Generation, Text to Image Generation	Code Available	7
Dynamic data sampler for cross-language transfer learning in large language models	May 17, 2024	Language Modeling, Language Modelling	Code Available	7
Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection	May 16, 2024	Edge-computing, Few-Shot Object Detection	Code Available	7
Chameleon: Mixed-Modal Early-Fusion Foundation Models	May 16, 2024	Image Captioning, Image Generation	Code Available	7
When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models	May 16, 2024	In-Context Learning, Question Answering	Code Available	7