The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 1651–1700 of 659983 papers

Title	Date	Tasks	Status	Hype
Tarsier: Recipes for Training and Evaluating Large Video Description Models	Jun 30, 2024	Video Captioning, Video Description	Code Available	4
YuLan: An Open-source Large Language Model	Jun 28, 2024	Language Modeling, Language Modelling	Code Available	4
TabReD: Analyzing Pitfalls and Filling the Gaps in Tabular Deep Learning Benchmarks	Jun 27, 2024	Feature Engineering, Model Selection	Code Available	4
On Scaling Up 3D Gaussian Splatting Training	Jun 26, 2024	3DGS, 3D Reconstruction	Code Available	4
T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge	Jun 25, 2024	Computational Efficiency, CPU	Code Available	4
Long Context Transfer from Language to Vision	Jun 24, 2024	Language Modeling, Language Modelling	Code Available	4
PVUW 2024 Challenge on Complex Video Understanding: Methods and Results	Jun 24, 2024	Segmentation, Semantic Segmentation	Code Available	4
RaTEScore: A Metric for Radiology Report Generation	Jun 24, 2024	Diagnostic, Entity Embeddings	Code Available	4
Enabling more efficient and cost-effective AI/ML systems with Collective Mind, virtualized MLOps, MLPerf, Collective Knowledge Playground and reproducible optimization tournaments	Jun 24, 2024	Benchmarking	Code Available	4
Trace is the Next AutoDiff: Generative Optimization with Rich Feedback, Execution Traces, and LLMs	Jun 23, 2024		Code Available	4
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions	Jun 22, 2024	Benchmarking, Code Generation	Code Available	4
Convolutional Kolmogorov-Arnold Networks	Jun 19, 2024	Kolmogorov-Arnold Networks	Code Available	4
Improving Multi-modal Recommender Systems by Denoising and Aligning Multi-modal Content and User Feedback	Jun 18, 2024	Denoising, Recommendation Systems	Code Available	4
Nemotron-4 340B Technical Report	Jun 17, 2024	Synthetic Data Generation	Code Available	4
Graspness Discovery in Clutters for Fast and Accurate Grasp Detection	Jun 17, 2024		Code Available	4
Diffusion Models in Low-Level Vision: A Survey	Jun 17, 2024	Denoising, Survey	Code Available	4
MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens	Jun 17, 2024		Code Available	4
Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning	Jun 17, 2024	Emotion Recognition, Multimodal Emotion Recognition	Code Available	4
A Comprehensive Survey of Scientific Large Language Models and Their Applications in Scientific Discovery	Jun 16, 2024	scientific discovery, Survey	Code Available	4
Panoptic-FlashOcc: An Efficient Baseline to Marry Semantic Occupancy with Panoptic via Instance Center	Jun 15, 2024		Code Available	4
Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs	Jun 14, 2024	Language Modeling, Language Modelling	Code Available	4
Gender Representation in TV and Radio: Automatic Information Extraction methods versus Manual Analyses	Jun 14, 2024		Code Available	4
MMScan: A Multi-Modal 3D Scene Dataset with Hierarchical Grounded Language Annotations	Jun 13, 2024	3D visual grounding, Attribute	Code Available	4
HelpSteer2: Open-source dataset for training top-performing reward models	Jun 12, 2024	Attribute	Code Available	4
One-Step Effective Diffusion Network for Real-World Image Super-Resolution	Jun 12, 2024	Image Restoration, Image Super-Resolution	Code Available	4
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing	Jun 12, 2024		Code Available	4
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling	Jun 11, 2024	4k, Language Modeling	Code Available	4
AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising	Jun 11, 2024	Denoising	Code Available	4
Simple and Effective Masked Diffusion Language Models	Jun 11, 2024	Language Modeling, Language Modelling	Code Available	4
PufferLib: Making Reinforcement Learning Libraries and Environments Play Nice	Jun 11, 2024	NetHack, reinforcement-learning	Code Available	4
Mamba YOLO: A Simple Baseline for Object Detection with State Space Model	Jun 9, 2024	GPU, Mamba	Code Available	4
MotionClone: Training-Free Motion Cloning for Controllable Video Generation	Jun 8, 2024	Denoising, Motion Generation	Code Available	4
The CLRS-Text Algorithmic Reasoning Language Benchmark	Jun 6, 2024		Code Available	4
Lean Workbook: A large-scale Lean problem set formalized from natural language math problems	Jun 6, 2024	Automated Theorem Proving, Math	Code Available	4
ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search	Jun 6, 2024		Code Available	4
Nomic Embed Vision: Expanding the Latent Space	Jun 6, 2024		Code Available	4
Bench2Drive: Towards Multi-Ability Benchmarking of Closed-Loop End-To-End Autonomous Driving	Jun 6, 2024	Autonomous Driving, Bench2Drive	Code Available	4
AgentGym: Evolving Large Language Model-based Agents across Diverse Environments	Jun 6, 2024	Language Modeling, Language Modelling	Code Available	4
Scaling and evaluating sparse autoencoders	Jun 6, 2024	Language Modelling	Code Available	4
DenoDet: Attention as Deformable Multi-Subspace Feature Denoising for Target Detection in SAR Images	Jun 5, 2024	2D Object Detection, Denoising	Code Available	4
Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models	Jun 4, 2024	Common Sense Reasoning	Code Available	4
Flash Diffusion: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation	Jun 4, 2024	Face Swapping, GPU	Code Available	4
Guiding a Diffusion Model with a Bad Version of Itself	Jun 4, 2024	Image Generation	Code Available	4
RaDe-GS: Rasterizing Depth in Gaussian Splatting	Jun 3, 2024	Computational Efficiency, Novel View Synthesis	Code Available	4
UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation	Jun 3, 2024	Image Animation, Video Generation	Code Available	4
Self-Supervised Geometry-Guided Initialization for Robust Monocular Visual Odometry	Jun 3, 2024	Depth Estimation, Monocular Depth Estimation	Code Available	4
Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models	Jun 3, 2024	Language Modeling, Language Modelling	Code Available	4
COS-Mix: Cosine Similarity and Distance Fusion for Improved Information Retrieval	Jun 2, 2024	Information Retrieval, RAG	Code Available	4
End-to-End Hybrid Refractive-Diffractive Lens Design with Differentiable Ray-Wave Model	Jun 2, 2024	Image Reconstruction	Code Available	4
R^2-Gaussian: Rectifying Radiative Gaussian Splatting for Tomographic Reconstruction	May 31, 2024	3DGS, NeRF	Code Available	4