The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers


Showing 901–950 of 659,983 papers

Title	Date	Tasks	Status	Hype
LiveBench: A Challenging, Contamination-Limited LLM Benchmark	Jun 27, 2024	Articles, Instruction Following	Code Available	5
OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding	Jun 27, 2024	Decoder, Segmentation	Code Available	5
Mamba or RWKV: Exploring High-Quality and High-Efficiency Segment Anything Model	Jun 27, 2024	Mamba, Segmentation	Code Available	5
ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation	Jun 26, 2024	Text-to-Video Generation, Video Generation	Code Available	5
MedCare: Advancing Medical LLMs through Decoupling Clinical Alignment and Knowledge Aggregation	Jun 25, 2024	Diversity, Natural Language Understanding	Code Available	5
MixTex: Unambiguous Recognition Should Not Rely Solely on Real Data	Jun 24, 2024	Data Augmentation, Optical Character Recognition (OCR)	Code Available	5
Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs	Jun 24, 2024	Representation Learning, Visual Grounding	Code Available	5
LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training	Jun 24, 2024	Mixture-of-Experts	Code Available	5
ESC-Eval: Evaluating Emotion Support Conversations in Large Language Models	Jun 21, 2024		Code Available	5
Uni-Mol2: Exploring Molecular Pretraining Model at Scale	Jun 21, 2024	model	Code Available	5
aeon: a Python toolkit for learning from time series	Jun 20, 2024	Anomaly Detection, Model Selection	Code Available	5
EvTexture: Event-driven Texture Enhancement for Video Super-Resolution	Jun 19, 2024	Event-based vision, Super-Resolution	Code Available	5
Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts	Jun 18, 2024	Language Modeling, Language Modelling	Code Available	5
Improving Text-To-Audio Models with Synthetic Captions	Jun 18, 2024	AudioCaps, Audio captioning	Code Available	5
Autoregressive Image Generation without Vector Quantization	Jun 17, 2024	Image Generation, Quantization	Code Available	5
τ-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains	Jun 17, 2024		Code Available	5
From Crowdsourced Data to High-Quality Benchmarks: Arena-Hard and BenchBuilder Pipeline	Jun 17, 2024	Chatbot	Code Available	5
PyramidMamba: Rethinking Pyramid Feature Fusion with Selective Space State Model for Semantic Segmentation of Remote Sensing Imagery	Jun 16, 2024	Decoder, Earth Observation	Code Available	5
4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities	Jun 13, 2024	Instance Segmentation, multimodal generation	Code Available	5
EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts	Jun 13, 2024	Conditional Image Generation, Image Generation	Code Available	5
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks	Jun 12, 2024	Image Generation, Language Modeling	Code Available	5
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs	Jun 11, 2024	Multiple-choice, Question Answering	Code Available	5
FLUX: Fast Software-based Communication Overlap On GPUs Through Kernel Fusion	Jun 11, 2024	GPU	Code Available	5
Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B	Jun 11, 2024	Decision Making, GSM8K	Code Available	5
Zero-shot Image Editing with Reference Imitation	Jun 11, 2024	Semantic correspondence	Code Available	5
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation	Jun 10, 2024	Conditional Image Generation, Image Generation	Code Available	5
PatchRefiner: Leveraging Synthetic Data for Real-Domain High-Resolution Monocular Metric Depth Estimation	Jun 10, 2024	3D Reconstruction, Autonomous Driving	Code Available	5
The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models	Jun 9, 2024	Instruction Following	Code Available	5
Matching Anything by Segmenting Anything	Jun 6, 2024	Domain Generalization, Multiple Object Tracking	Code Available	5
ShareGPT4Video: Improving Video Understanding and Generation with Better Captions	Jun 6, 2024	Video Captioning, Video Generation	Code Available	5
Text-to-Image Rectified Flow as Plug-and-Play Priors	Jun 5, 2024	3D Generation, Text to 3D	Code Available	5
Wings: Learning Multimodal LLMs without Text-only Forgetting	Jun 5, 2024	Question Answering, Visual Question Answering	Code Available	5
StreamSpeech: Simultaneous Speech-to-Speech Translation with Multi-task Learning	Jun 5, 2024	Automatic Speech Recognition (ASR), de-en	Code Available	5
Parrot: Multilingual Visual Instruction Tuning	Jun 4, 2024	Mixture-of-Experts	Code Available	5
PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling	Jun 4, 2024		Code Available	5
AudioLCM: Text-to-Audio Generation with Latent Consistency Models	Jun 1, 2024	Audio Generation, Audio Synthesis	Code Available	5
Ovis: Structural Embedding Alignment for Multimodal Large Language Model	May 31, 2024	Language Modeling, Multimodal Large Language Model	Code Available	5
Enhancing Efficiency of Safe Reinforcement Learning via Sample Manipulation	May 31, 2024	MuJoCo, reinforcement-learning	Code Available	5
Very Low Complexity Speech Synthesis Using Framewise Autoregressive GAN (FARGAN) with Pitch Prediction	May 31, 2024	Speech Synthesis	Code Available	5
Xwin-LM: Strong and Scalable Alignment Practice for LLMs	May 30, 2024		Code Available	5
SpinQuant: LLM quantization with learned rotations	May 26, 2024	Quantization	Code Available	5
CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling	May 26, 2024		Code Available	5
DeTikZify: Synthesizing Graphics Programs for Scientific Figures and Sketches with TikZ	May 24, 2024	Language Modeling, Language Modelling	Code Available	5
Focus Anywhere for Fine-grained Multi-page Document Understanding	May 23, 2024	document understanding, Optical Character Recognition (OCR)	Code Available	5
TimeMixer: Decomposable Multiscale Mixing for Time Series Forecasting	May 23, 2024	Future prediction, Time Series	Code Available	5
Improved Distribution Matching Distillation for Fast Image Synthesis	May 23, 2024	Image Generation	Code Available	5
PV-Tuning: Beyond Straight-Through Estimation for Extreme LLM Compression	May 23, 2024	Quantization	Code Available	5
Awesome Multi-modal Object Tracking	May 23, 2024	Autonomous Driving, Knowledge Distillation	Code Available	5
Diffusion for World Modeling: Visual Details Matter in Atari	May 20, 2024	Image Generation, reinforcement-learning	Code Available	5
Uni-Mol Docking V2: Towards Realistic and Accurate Binding Pose Prediction	May 20, 2024	Drug Design, Molecular Docking	Code Available	5