The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers

Showing 1,051–1,100 of 659,983 papers

Title	Date	Tasks	Status	Hype
WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models	Jan 25, 2024		Code Available	5
SpeechGPT-Gen: Scaling Chain-of-Information Speech Generation	Jan 24, 2024	text-to-speech; Text to Speech	Code Available	5
Differentiable Tree Search Network	Jan 22, 2024	Decision Making; Inductive Bias	Code Available	5
Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs	Jan 22, 2024	Diffusion Personalization Tuning Free; Image Generation	Code Available	5
Large Language Model based Multi-Agents: A Survey of Progress and Challenges	Jan 21, 2024	Decision Making; Language Modeling	Code Available	5
OMG-Seg: Is One Model Good Enough For All Segmentation?	Jan 18, 2024	All; Decoder	Code Available	5
Scalable Pre-training of Large Autoregressive Image Models	Jan 16, 2024	Image Classification	Code Available	5
SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers	Jan 16, 2024	Image Generation	Code Available	5
Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis	Jan 16, 2024	3D Reconstruction; Face Generation	Code Available	5
Unlocking Efficiency in Large Language Model Inference: A Comprehensive Survey of Speculative Decoding	Jan 15, 2024	Language Modeling; Language Modelling	Code Available	5
Secrets of RLHF in Large Language Models Part II: Reward Modeling	Jan 11, 2024	Contrastive Learning; Meta-Learning	Code Available	5
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models	Jan 11, 2024	Language Modelling; Large Language Model	Code Available	5
Extreme Compression of Large Language Models via Additive Quantization	Jan 11, 2024	CPU; GPU	Code Available	5
Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security	Jan 10, 2024	Task Planning	Code Available	5
Exploring Large Language Model based Intelligent Agents: Definitions, Methods, and Prospects	Jan 7, 2024	Language Modeling; Language Modelling	Code Available	5
Segment Anything Model for Medical Image Segmentation: Current Applications and Future Directions	Jan 7, 2024	Benchmarking; Image Segmentation	Code Available	5
Latte: Latent Diffusion Transformer for Video Generation	Jan 5, 2024	Text-to-Video Generation; Video Generation	Code Available	5
Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively	Jan 5, 2024	image-classification; Image Classification	Code Available	5
Street Gaussians: Modeling Dynamic Urban Scenes with Gaussian Splatting	Jan 2, 2024	Autonomous Driving; NeRF	Code Available	5
A Comprehensive Study of Knowledge Editing for Large Language Models	Jan 2, 2024	knowledge editing; Model Editing	Code Available	5
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models	Jan 2, 2024		Code Available	5
UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio Video Point Cloud Time-Series and Image Recognition	Jan 1, 2024	Time Series; Time Series Forecasting	Code Available	5
Astraios: Parameter-Efficient Instruction Tuning Code Large Language Models	Jan 1, 2024	Code Generation; parameter-efficient fine-tuning	Code Available	5
Point Transformer V3: Simpler Faster Stronger	Jan 1, 2024	Representation Learning	Code Available	5
VGGSfM: Visual Geometry Grounded Deep Structure From Motion	Jan 1, 2024	Camera Calibration; Point Tracking	Code Available	5
Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling	Jan 1, 2024	NeRF	Code Available	5
GenCast: Diffusion-based ensemble forecasting for medium-range weather	Dec 25, 2023	Decision Making; Weather Forecasting	Code Available	5
DUSt3R: Geometric 3D Vision Made Easy	Dec 21, 2023	3D Reconstruction; Camera Calibration	Code Available	5
AppAgent: Multimodal Agents as Smartphone Users	Dec 21, 2023	Navigate	Code Available	5
StarVector: Generating Scalable Vector Graphics Code from Images and Text	Dec 17, 2023	Code Generation; Language Modeling	Code Available	5
PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU	Dec 16, 2023	CPU; GPU	Code Available	5
MobileSAMv2: Faster Segment Anything to Everything	Dec 15, 2023	Decoder; Knowledge Distillation	Code Available	5
CogAgent: A Visual Language Model for GUI Agents	Dec 14, 2023	Language Modeling	Code Available	5
Weakly Supervised Detection of Hallucinations in LLM Activations	Dec 5, 2023	Hallucination; Language Modeling	Code Available	5
TaskWeaver: A Code-First Agent Framework	Nov 29, 2023	Natural Language Understanding	Code Available	5
Human Gaussian Splatting: Real-time Rendering of Animatable Avatars	Nov 28, 2023		Code Available	5
Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine	Nov 28, 2023	Electrical Engineering; Experimental Design	Code Available	5
Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following	Nov 28, 2023	Attribute; Denoising	Code Available	5
MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI	Nov 27, 2023	Complex Query Answering; Logical Reasoning	Code Available	5
Structure-Aware Sparse-View X-ray 3D Reconstruction	Nov 18, 2023	3D Reconstruction; CT Reconstruction	Code Available	5
Instruction-Following Evaluation for Large Language Models	Nov 14, 2023	Instruction Following	Code Available	5
LongQLoRA: Efficient and Effective Method to Extend Context Length of Large Language Models	Nov 8, 2023	8k; GPU	Code Available	5
CogVLM: Visual Expert for Pretrained Language Models	Nov 6, 2023	1 Image, 2*2 Stitching; FS-MEVQA	Code Available	5
VideoCrafter1: Open Diffusion Models for High-Quality Video Generation	Oct 30, 2023	Text-to-Video Generation; Video Generation	Code Available	5
Zephyr: Direct Distillation of LM Alignment	Oct 25, 2023	2D Cyclist Detection; Few-Shot Learning	Code Available	5
MuSR: Testing the Limits of Chain-of-thought with Multistep Soft Reasoning	Oct 24, 2023		Code Available	5
Wonder3D: Single Image to 3D using Cross-Domain Diffusion	Oct 23, 2023	3D geometry; Image to 3D	Code Available	5
NeMo Guardrails: A Toolkit for Controllable and Safe LLM Applications with Programmable Rails	Oct 16, 2023	Dialogue Management; Management	Code Available	5
CacheGen: KV Cache Compression and Streaming for Fast Large Language Model Serving	Oct 11, 2023	Language Modeling; Language Modelling	Code Available	5
Ferret: Refer and Ground Anything Anywhere at Any Granularity	Oct 11, 2023	Hallucination; Language Modeling	Code Available	5