The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers, 248,326 code links, 4,818 tasks

Papers

Showing 1,651–1,675 of 661,570 papers

Title	Date	Tasks	Status	Hype
Tarsier: Recipes for Training and Evaluating Large Video Description Models	Jun 30, 2024	Video Captioning, Video Description	Code Available	4
YuLan: An Open-source Large Language Model	Jun 28, 2024	Language Modeling, Language Modelling	Code Available	4
TabReD: Analyzing Pitfalls and Filling the Gaps in Tabular Deep Learning Benchmarks	Jun 27, 2024	Feature Engineering, Model Selection	Code Available	4
On Scaling Up 3D Gaussian Splatting Training	Jun 26, 2024	3DGS, 3D Reconstruction	Code Available	4
T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge	Jun 25, 2024	Computational Efficiency, CPU	Code Available	4
RaTEScore: A Metric for Radiology Report Generation	Jun 24, 2024	Diagnostic, Entity Embeddings	Code Available	4
Long Context Transfer from Language to Vision	Jun 24, 2024	Language Modeling, Language Modelling	Code Available	4
PVUW 2024 Challenge on Complex Video Understanding: Methods and Results	Jun 24, 2024	Segmentation, Semantic Segmentation	Code Available	4
Enabling more efficient and cost-effective AI/ML systems with Collective Mind, virtualized MLOps, MLPerf, Collective Knowledge Playground and reproducible optimization tournaments	Jun 24, 2024	Benchmarking	Code Available	4
Trace is the Next AutoDiff: Generative Optimization with Rich Feedback, Execution Traces, and LLMs	Jun 23, 2024		Code Available	4
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions	Jun 22, 2024	Benchmarking, Code Generation	Code Available	4
Convolutional Kolmogorov-Arnold Networks	Jun 19, 2024	Kolmogorov-Arnold Networks	Code Available	4
Improving Multi-modal Recommender Systems by Denoising and Aligning Multi-modal Content and User Feedback	Jun 18, 2024	Denoising, Recommendation Systems	Code Available	4
Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning	Jun 17, 2024	Emotion Recognition, Multimodal Emotion Recognition	Code Available	4
Diffusion Models in Low-Level Vision: A Survey	Jun 17, 2024	Denoising, Survey	Code Available	4
Nemotron-4 340B Technical Report	Jun 17, 2024	Synthetic Data Generation	Code Available	4
Graspness Discovery in Clutters for Fast and Accurate Grasp Detection	Jun 17, 2024		Code Available	4
MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens	Jun 17, 2024		Code Available	4
A Comprehensive Survey of Scientific Large Language Models and Their Applications in Scientific Discovery	Jun 16, 2024	scientific discovery, Survey	Code Available	4
Panoptic-FlashOcc: An Efficient Baseline to Marry Semantic Occupancy with Panoptic via Instance Center	Jun 15, 2024		Code Available	4
Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs	Jun 14, 2024	Language Modeling, Language Modelling	Code Available	4
Gender Representation in TV and Radio: Automatic Information Extraction methods versus Manual Analyses	Jun 14, 2024		Code Available	4
MMScan: A Multi-Modal 3D Scene Dataset with Hierarchical Grounded Language Annotations	Jun 13, 2024	3D visual grounding, Attribute	Code Available	4
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing	Jun 12, 2024		Code Available	4
One-Step Effective Diffusion Network for Real-World Image Super-Resolution	Jun 12, 2024	Image Restoration, Image Super-Resolution	Code Available	4