The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers

Showing 11,251–11,300 of 177,340 papers

Title	Date	Tasks	Status	Hype	Score
Boosting the Generalization and Reasoning of Vision Language Models with Curriculum Reinforcement Learning	Mar 10, 2025		Code Available	2	5
Generative AI for Character Animation: A Comprehensive Survey of Techniques, Applications, and Future Directions	Apr 27, 2025	Image Generation, Motion Synthesis	Code Available	2	5
VMA: Divide-and-Conquer Vectorized Map Annotation System for Large-Scale Driving Scene	Apr 19, 2023	Autonomous Driving	Code Available	2	5
Accelerating Certifiable Estimation with Preconditioned Eigensolvers	Jul 12, 2022		Code Available	2	5
Efficient Video Object Segmentation via Modulated Cross-Attention Memory	Mar 26, 2024	GPU, Object	Code Available	2	5
Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!	Oct 5, 2023	Red Teaming, Safety Alignment	Code Available	2	5
Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding	Nov 6, 2024	ARC, GSM8K	Code Available	2	5
Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training	Jan 20, 2025	Language Modeling, Language Modelling	Code Available	2	5
Concept Induction: Analyzing Unstructured Text with High-Level Concepts Using LLooM	Apr 18, 2024	Topic Models	Code Available	2	5
Evaluating LLM Reasoning in the Operations Research Domain with ORQA	Dec 22, 2024	Question Answering	Code Available	2	5
Knowledge Conflicts for LLMs: A Survey	Mar 13, 2024	Misinformation, Survey	Code Available	2	5
DualPrompt: Complementary Prompting for Rehearsal-free Continual Learning	Apr 10, 2022	Continual Learning	Code Available	2	5
GenAI Content Detection Task 3: Cross-Domain Machine-Generated Text Detection Challenge	Jan 15, 2025	Text Detection	Code Available	2	5
AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shield Prompting	Mar 14, 2024		Code Available	2	5
EfficientZero V2: Mastering Discrete and Continuous Control with Limited Data	Mar 1, 2024	continuous-control, Continuous Control	Code Available	2	5
Making Large Language Models Perform Better in Knowledge Graph Completion	Oct 10, 2023	In-Context Learning, Knowledge Graph Completion	Code Available	2	5
SSCBench: A Large-Scale 3D Semantic Scene Completion Benchmark for Autonomous Driving	Jun 15, 2023	3D Semantic Scene Completion, 3D Semantic Scene Completion from a single 2D image	Code Available	2	5
SUNet: Swin Transformer UNet for Image Denoising	Feb 28, 2022	Denoising, Image Denoising	Code Available	2	5
GSM-Infinite: How Do Your LLMs Behave over Infinitely Increasing Context Length and Reasoning Complexity?	Feb 7, 2025	8k, Information Retrieval	Code Available	2	5
SinNeRF: Training Neural Radiance Fields on Complex Scenes from a Single Image	Apr 2, 2022	NeRF, Novel View Synthesis	Code Available	2	5
Towards Knowledge-driven Autonomous Driving	Dec 7, 2023	Autonomous Driving, Neural Rendering	Code Available	2	5
Ring Attention with Blockwise Transformers for Near-Infinite Context	Oct 3, 2023	Language Modeling, Language Modelling	Code Available	2	5
Interactive Evolution: A Neural-Symbolic Self-Training Framework For Large Language Models	Jun 17, 2024		Code Available	2	5
TokenSHAP: Interpreting Large Language Models with Monte Carlo Shapley Value Estimation	Jul 14, 2024	Computational Efficiency, Prompt Engineering	Code Available	2	5
Language models scale reliably with over-training and on downstream tasks	Mar 13, 2024	Language Modelling	Code Available	2	5
Large Language Model Psychometrics: A Systematic Review of Evaluation, Validation, and Enhancement	May 13, 2025	Benchmarking, Language Modeling	Code Available	2	5
Editing Language Model-based Knowledge Graph Embeddings	Jan 25, 2023	EDIT Task, knowledge editing	Code Available	2	5
Exploring the Roles of Large Language Models in Reshaping Transportation Systems: A Survey, Framework, and Roadmap	Mar 27, 2025	Autonomous Driving, In-Context Learning	Code Available	2	5
STAMP: Scalable Task And Model-agnostic Collaborative Perception	Jan 24, 2025	Autonomous Driving	Code Available	2	5
Dual Diffusion Implicit Bridges for Image-to-Image Translation	Mar 16, 2022	Image-to-Image Translation, Translation	Code Available	2	5
PartGS: Learning Part-aware 3D Representations by Fusing 2D Gaussians and Superquadrics	Aug 20, 2024	3D Reconstruction	Code Available	2	5
SAFDNet: A Simple and Effective Network for Fully Sparse 3D Object Detection	Mar 9, 2024	3D Object Detection, Autonomous Driving	Code Available	2	5
Diffusion Model as a Noise-Aware Latent Reward Model for Step-Level Preference Optimization	Feb 3, 2025	model	Code Available	2	5
Simple Online and Realtime Tracking	Feb 2, 2016	Multi-Object Tracking, Multiple Object Tracking	Code Available	2	5
Forecasting Global Weather with Graph Neural Networks	Feb 15, 2022		Code Available	2	5
Towards Generating Realistic 3D Semantic Training Data for Autonomous Driving	Mar 27, 2025	3D Semantic Segmentation, Autonomous Driving	Code Available	2	5
Learning representations of learning representations	Apr 12, 2024	Sentence	Code Available	2	5
DanmakuTPPBench: A Multi-modal Benchmark for Temporal Point Process Modeling and Understanding	May 23, 2025	Language Modeling, Language Modelling	Code Available	2	5
Non-stationary Diffusion For Probabilistic Time Series Forecasting	May 7, 2025	Denoising, Probabilistic Time Series Forecasting	Code Available	2	5
Rethinking Efficient Lane Detection via Curve Modeling	Mar 4, 2022	Lane Detection	Code Available	2	5
Generative Auto-Bidding with Value-Guided Explorations	Apr 20, 2025	Reinforcement Learning (RL)	Code Available	2	5
MonoCD: Monocular 3D Object Detection with Complementary Depths	Apr 4, 2024	3D Object Detection, Depth Estimation	Code Available	2	5
ChatTime: A Unified Multimodal Time Series Foundation Model Bridging Numerical and Textual Data	Dec 16, 2024	Language Modeling, Language Modelling	Code Available	2	5
Distilled Decoding 1: One-step Sampling of Image Auto-regressive Models with Flow Matching	Dec 22, 2024	Image Generation, Text to Image Generation	Code Available	2	5
OSSO: Obtaining Skeletal Shape from Outside	Apr 21, 2022		Code Available	2	5
Composed Video Retrieval via Enriched Context and Discriminative Embeddings	Mar 25, 2024	Composed Video Retrieval (CoVR), Retrieval	Code Available	2	5
Improved Few-Shot Jailbreaking Can Circumvent Aligned Language Models and Their Defenses	Jun 3, 2024		Code Available	2	5
BRIO: Bringing Order to Abstractive Summarization	Mar 31, 2022	Abstractive Text Summarization, Text Summarization	Code Available	2	5
Towards Measuring and Modeling "Culture" in LLMs: A Survey	Mar 5, 2024	Survey	Code Available	2	5
Vript: A Video Is Worth Thousands of Words	Jun 10, 2024	Video Captioning, Video Understanding	Code Available	2	5