The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 2,101–2,150 of 659,983 papers

Title	Date	Tasks	Status	Hype
Images Speak in Images: A Generalist Painter for In-Context Visual Learning	Dec 5, 2022	In-Context Learning, Keypoint Detection	Code Available	4
DreamGen: Unlocking Generalization in Robot Learning through Video World Models	May 19, 2025	Video Generation	Code Available	4
MM-Eureka: Exploring Visual Aha Moment with Rule-based Large-scale Reinforcement Learning	Mar 10, 2025	Multimodal Reasoning, Reinforcement Learning (RL)	Code Available	4
Cognitive Architectures for Language Agents	Sep 5, 2023	Decision Making	Code Available	4
AnimateLCM: Computation-Efficient Personalized Style Video Generation without Personalized Video Data	Feb 1, 2024	Conditional Image Generation, Denoising	Code Available	4
Ignore This Title and HackAPrompt: Exposing Systemic Vulnerabilities of LLMs through a Global Scale Prompt Hacking Competition	Oct 24, 2023		Code Available	4
Parameter-Efficient Prompt Tuning Makes Generalized and Calibrated Neural Text Retrievers	Jul 14, 2022	Retrieval, Text Retrieval	Code Available	4
Mamba YOLO: A Simple Baseline for Object Detection with State Space Model	Jun 9, 2024	GPU, Mamba	Code Available	4
Evaluate & Evaluation on the Hub: Better Best Practices for Data and Model Measurements	Sep 30, 2022		Code Available	4
Compressible-composable NeRF via Rank-residual Decomposition	May 30, 2022	NeRF	Code Available	4
Structured Pruning for Deep Convolutional Neural Networks: A survey	Mar 1, 2023	Network Pruning, Neural Architecture Search	Code Available	4
From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge	Nov 25, 2024		Code Available	4
AnyV2V: A Tuning-Free Framework For Any Video-to-Video Editing Tasks	Mar 21, 2024	Image to Video Generation, Style Transfer	Code Available	4
Proactive Agent: Shifting LLM Agents from Reactive Responses to Active Assistance	Oct 16, 2024	Human Agent Collaboration	Code Available	4
Orb: A Fast, Scalable Neural Network Potential	Oct 29, 2024		Code Available	4
Spirit LM: Interleaved Spoken and Written Language Model	Feb 8, 2024	Language Modeling, Language Modelling	Code Available	4
When AI Meets Finance (StockAgent): Large Language Model-based Stock Trading in Simulated Real-world Environments	Jul 15, 2024	Language Modeling, Language Modelling	Code Available	4
SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights	Oct 11, 2024	GSM8K, Math	Code Available	4
I Think, Therefore I am: Benchmarking Awareness of Large Language Models Using AwareBench	Jan 31, 2024	Benchmarking, Multiple-choice	Code Available	4
MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens	Jun 17, 2024		Code Available	4
Modern Neighborhood Components Analysis: A Deep Tabular Baseline Two Decades Later	Jul 3, 2024		Code Available	4
DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection	Mar 7, 2022	Object Detection, Real-Time Object Detection	Code Available	4
TabM: Advancing Tabular Deep Learning with Parameter-Efficient Ensembling	Oct 31, 2024	Deep Learning, Retrieval	Code Available	4
INT2.1: Towards Fine-Tunable Quantized Large Language Models with Error Correction through Low-Rank Adaptation	Jun 13, 2023	Language Modeling, Language Modelling	Code Available	4
SegGPT: Segmenting Everything In Context	Apr 6, 2023	Few-Shot Semantic Segmentation, In-Context Learning	Code Available	4
TinyLLaVA: A Framework of Small-scale Large Multimodal Models	Feb 22, 2024	Visual Question Answering	Code Available	4
Building reliable sim driving agents by scaling self-play	Feb 20, 2025	Autonomous Vehicles, Benchmarking	Code Available	4
Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts	Mar 13, 2024	Image Animation, Image to Video Generation	Code Available	4
Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN	May 27, 2022	Image Classification, Instance Segmentation	Code Available	4
SkyReels-A2: Compose Anything in Video Diffusion Transformers	Apr 3, 2025	Human-Domain Subject-to-Video, Open-Domain Subject-to-Video	Code Available	4
Croissant: A Metadata Format for ML-Ready Datasets	Mar 28, 2024	Friction, Management	Code Available	4
DeepRetrieval: Hacking Real Search Engines and Retrievers with Large Language Models via Reinforcement Learning	Feb 28, 2025	Information Retrieval, reinforcement-learning	Code Available	4
LLMMapReduce-V2: Entropy-Driven Convolutional Test-Time Scaling for Generating Long-Form Articles from Extremely Long Resources	Apr 8, 2025	Articles, Form	Code Available	4
KISS-Matcher: Fast and Robust Point Cloud Registration Revisited	Sep 23, 2024	Point Cloud Registration	Code Available	4
Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control	Mar 18, 2025		Code Available	4
Prototypical Verbalizer for Prompt-based Few-shot Tuning	Mar 18, 2022	Contrastive Learning, Entity Typing	Code Available	4
OmniDrive: A Holistic Vision-Language Dataset for Autonomous Driving with Counterfactual Reasoning	May 2, 2024	Autonomous Driving, counterfactual	Code Available	4
NUWA-Infinity: Autoregressive over Autoregressive Generation for Infinite Visual Synthesis	Jul 20, 2022	Image Outpainting, Text-to-Image Generation	Code Available	4
Autoregressive Video Generation without Vector Quantization	Dec 18, 2024	Image Generation, Prediction	Code Available	4
Best-of-N Jailbreaking	Dec 4, 2024		Code Available	4
InternLM2.5-StepProver: Advancing Automated Theorem Proving via Expert Iteration on Large-Scale LEAN Problems	Oct 21, 2024	Automated Theorem Proving, CPU	Code Available	4
Continual Learning of Large Language Models: A Comprehensive Survey	Apr 25, 2024	Continual Learning, Survey	Code Available	4
KTO: Model Alignment as Prospect Theoretic Optimization	Feb 2, 2024	Attribute, model	Code Available	4
Evaluating Pre-trained Convolutional Neural Networks and Foundation Models as Feature Extractors for Content-based Medical Image Retrieval	Sep 14, 2024	Contrastive Learning, Image Retrieval	Code Available	4
Text2SQL is Not Enough: Unifying AI and Databases with TAG	Aug 27, 2024	RAG, Retrieval-augmented Generation	Code Available	4
Towards No.1 in CLUE Semantic Matching Challenge: Pre-trained Language Model Erlangshen with Propensity-Corrected Loss	Aug 5, 2022	Language Modeling, Language Modelling	Code Available	4
Convolutional Differentiable Logic Gate Networks	Nov 7, 2024		Code Available	4
Billion-scale similarity search with GPUs	Feb 28, 2017	GPU, Image Similarity Search	Code Available	4
Scaling Proprioceptive-Visual Learning with Heterogeneous Pre-trained Transformers	Sep 30, 2024		Code Available	4
Faster Neighborhood Attention: Reducing the O(n^2) Cost of Self Attention at the Threadblock Level	Mar 7, 2024		Code Available	4