SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers, 248,326 code links, 4,818 tasks

Papers

Showing 1651-1675 of 661,570 papers

| Title | Status | Hype |
| --- | --- | --- |
| Tarsier: Recipes for Training and Evaluating Large Video Description Models | Code | 4 |
| YuLan: An Open-source Large Language Model | Code | 4 |
| TabReD: Analyzing Pitfalls and Filling the Gaps in Tabular Deep Learning Benchmarks | Code | 4 |
| On Scaling Up 3D Gaussian Splatting Training | Code | 4 |
| T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge | Code | 4 |
| RaTEScore: A Metric for Radiology Report Generation | Code | 4 |
| Long Context Transfer from Language to Vision | Code | 4 |
| PVUW 2024 Challenge on Complex Video Understanding: Methods and Results | Code | 4 |
| Enabling more efficient and cost-effective AI/ML systems with Collective Mind, virtualized MLOps, MLPerf, Collective Knowledge Playground and reproducible optimization tournaments | Code | 4 |
| Trace is the Next AutoDiff: Generative Optimization with Rich Feedback, Execution Traces, and LLMs | Code | 4 |
| BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions | Code | 4 |
| Convolutional Kolmogorov-Arnold Networks | Code | 4 |
| Improving Multi-modal Recommender Systems by Denoising and Aligning Multi-modal Content and User Feedback | Code | 4 |
| Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning | Code | 4 |
| Diffusion Models in Low-Level Vision: A Survey | Code | 4 |
| Nemotron-4 340B Technical Report | Code | 4 |
| Graspness Discovery in Clutters for Fast and Accurate Grasp Detection | Code | 4 |
| MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens | Code | 4 |
| A Comprehensive Survey of Scientific Large Language Models and Their Applications in Scientific Discovery | Code | 4 |
| Panoptic-FlashOcc: An Efficient Baseline to Marry Semantic Occupancy with Panoptic via Instance Center | Code | 4 |
| Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs | Code | 4 |
| Gender Representation in TV and Radio: Automatic Information Extraction methods versus Manual Analyses | Code | 4 |
| MMScan: A Multi-Modal 3D Scene Dataset with Hierarchical Grounded Language Annotations | Code | 4 |
| Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing | Code | 4 |
| One-Step Effective Diffusion Network for Real-World Image Super-Resolution | Code | 4 |