SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 3551-3575 of 177340 papers

Title | Status | Hype
CompSLAM: Complementary Hierarchical Multi-Modal Localization and Mapping for Robot Autonomy in Underground Environments | Code | 3
Neural Ordinary Differential Equations | Code | 3
LEADS: Lightweight Embedded Assisted Driving System | Code | 3
Fine-Tuning Language Models with Just Forward Passes | Code | 3
USB: A Unified Semi-supervised Learning Benchmark for Classification | Code | 3
ITBench: Evaluating AI Agents across Diverse Real-World IT Automation Tasks | Code | 3
ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning | Code | 3
The Breeze 2 Herd of Models: Traditional Chinese LLMs Based on Llama with Vision-Aware and Function-Calling Capabilities | Code | 3
The Role of Generative Systems in Historical Photography Management: A Case Study on Catalan Archives | Code | 3
ResumeFlow: An LLM-facilitated Pipeline for Personalized Resume Generation and Refinement | Code | 3
SpecInfer: Accelerating Generative Large Language Model Serving with Tree-based Speculative Inference and Verification | Code | 3
Detecting As Labeling: Rethinking LiDAR-camera Fusion in 3D Object Detection | Code | 3
VoiceFixer: A Unified Framework for High-Fidelity Speech Restoration | Code | 3
OmniGenBench: A Modular Platform for Reproducible Genomic Foundation Models Benchmarking | Code | 3
FlashDMoE: Fast Distributed MoE in a Single Kernel | Code | 3
RoHM: Robust Human Motion Reconstruction via Diffusion | Code | 3
Secrets of RLHF in Large Language Models Part I: PPO | Code | 3
CoMotion: Concurrent Multi-person 3D Motion | Code | 3
FinRL-DeepSeek: LLM-Infused Risk-Sensitive Reinforcement Learning for Trading Agents | Code | 3
AniMaker: Automated Multi-Agent Animated Storytelling with MCTS-Driven Clip Generation | Code | 3
Learning Neural PDE Solvers with Parameter-Guided Channel Attention | Code | 3
Is Mamba Effective for Time Series Forecasting? | Code | 3
Objaverse-XL: A Universe of 10M+ 3D Objects | Code | 3
Data Poisoning in LLMs: Jailbreak-Tuning and Scaling Laws | Code | 3
StopThePop: Sorted Gaussian Splatting for View-Consistent Real-time Rendering | Code | 3
Page 143 of 7094