The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers

Showing 9,876–9,900 of 474,278 papers

Title	Date	Tasks	Status	Hype
Revitalizing Multivariate Time Series Forecasting: Learnable Decomposition with Inter-Series Dependencies and Intra-Series Variations Modeling	Feb 20, 2024	Multivariate Time Series Forecasting, Time Series	Code Available	2
RealCompo: Balancing Realism and Compositionality Improves Text-to-Image Diffusion Models	Feb 20, 2024	Denoising, Image Generation	Code Available	2
A Touch, Vision, and Language Dataset for Multimodal Alignment	Feb 20, 2024	Language Modeling, Language Modelling	Code Available	2
StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing	Feb 20, 2024	Voice Cloning	Code Available	2
Advancing Large Language Models to Capture Varied Speaking Styles and Respond Properly in Spoken Conversations	Feb 20, 2024	Sentence	Code Available	2
Me LLaMA: Foundation Large Language Models for Medical Applications	Feb 20, 2024	Few-Shot Learning, GPU	Code Available	2
ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs	Feb 19, 2024	Safety Alignment	Code Available	2
Universal Physics Transformers: A Framework For Efficiently Scaling Neural Operators	Feb 19, 2024		Code Available	2
Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs	Feb 19, 2024	Question Answering	Code Available	2
EmoBench: Evaluating the Emotional Intelligence of Large Language Models	Feb 19, 2024	Emotional Intelligence, Emotion Recognition	Code Available	2
Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models	Feb 19, 2024	Adversarial Defense, Multimodal Deep Learning	Code Available	2
UnlearnCanvas: Stylized Image Dataset for Enhanced Machine Unlearning Evaluation in Diffusion Models	Feb 19, 2024	Image Generation, Machine Unlearning	Code Available	2
Event-Based Motion Magnification	Feb 19, 2024	Benchmarking, Motion Detection	Code Available	2
Class-incremental Learning for Time Series: Benchmark and Evaluation	Feb 19, 2024	Activity Recognition, Benchmarking	Code Available	2
Open3DSG: Open-Vocabulary 3D Scene Graphs from Point Clouds with Queryable Objects and Open-Set Relationships	Feb 19, 2024	3d scene graph generation, Object	Code Available	2
CausalGym: Benchmarking causal interpretability methods on linguistic tasks	Feb 19, 2024	Benchmarking, Interpretability Techniques for Deep Learning	Code Available	2
EVOR: Evolving Retrieval for Code Generation	Feb 19, 2024	Code Generation, RAG	Code Available	2
Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models	Feb 19, 2024		Code Available	2
Generative Semi-supervised Graph Anomaly Detection	Feb 19, 2024	Anomaly Detection, Graph Anomaly Detection	Code Available	2
Pan-Mamba: Effective pan-sharpening with State Space Model	Feb 19, 2024	Mamba, Pansharpening	Code Available	2
The Revolution of Multimodal Large Language Models: A Survey	Feb 19, 2024	Image Generation, Instruction Following	Code Available	2
Language Models are Homer Simpson! Safety Re-Alignment of Fine-tuned Language Models through Task Arithmetic	Feb 19, 2024	Instruction Following, Math	Code Available	2
Spatio-Temporal Few-Shot Learning via Diffusive Neural Network Generation	Feb 19, 2024	Denoising, Few-Shot Learning	Code Available	2
Reformatted Alignment	Feb 19, 2024	GSM8K, Hallucination	Code Available	2
A Critical Evaluation of AI Feedback for Aligning Large Language Models	Feb 19, 2024	Instruction Following, reinforcement-learning	Code Available	2