SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 3551–3600 of 177340 papers

Title | Status | Hype
Neural Ordinary Differential Equations | Code | 3
LEADS: Lightweight Embedded Assisted Driving System | Code | 3
Fine-Tuning Language Models with Just Forward Passes | Code | 3
USB: A Unified Semi-supervised Learning Benchmark for Classification | Code | 3
ITBench: Evaluating AI Agents across Diverse Real-World IT Automation Tasks | Code | 3
ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning | Code | 3
The Breeze 2 Herd of Models: Traditional Chinese LLMs Based on Llama with Vision-Aware and Function-Calling Capabilities | Code | 3
The Role of Generative Systems in Historical Photography Management: A Case Study on Catalan Archives | Code | 3
ResumeFlow: An LLM-facilitated Pipeline for Personalized Resume Generation and Refinement | Code | 3
SpecInfer: Accelerating Generative Large Language Model Serving with Tree-based Speculative Inference and Verification | Code | 3
Detecting As Labeling: Rethinking LiDAR-camera Fusion in 3D Object Detection | Code | 3
VoiceFixer: A Unified Framework for High-Fidelity Speech Restoration | Code | 3
OmniGenBench: A Modular Platform for Reproducible Genomic Foundation Models Benchmarking | Code | 3
FlashDMoE: Fast Distributed MoE in a Single Kernel | Code | 3
RoHM: Robust Human Motion Reconstruction via Diffusion | Code | 3
Secrets of RLHF in Large Language Models Part I: PPO | Code | 3
CoMotion: Concurrent Multi-person 3D Motion | Code | 3
FinRL-DeepSeek: LLM-Infused Risk-Sensitive Reinforcement Learning for Trading Agents | Code | 3
AniMaker: Automated Multi-Agent Animated Storytelling with MCTS-Driven Clip Generation | Code | 3
Learning Neural PDE Solvers with Parameter-Guided Channel Attention | Code | 3
Is Mamba Effective for Time Series Forecasting? | Code | 3
Objaverse-XL: A Universe of 10M+ 3D Objects | Code | 3
Data Poisoning in LLMs: Jailbreak-Tuning and Scaling Laws | Code | 3
StopThePop: Sorted Gaussian Splatting for View-Consistent Real-time Rendering | Code | 3
Real-Time Packet Loss Concealment With Mixed Generative and Predictive Model | Code | 3
Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models | Code | 3
Deep learning tools for the measurement of animal behavior in neuroscience | Code | 3
Fast Sampling of Diffusion Models with Exponential Integrator | Code | 3
Video4DGen: Enhancing Video and 4D Generation through Mutual Optimization | Code | 3
High-Fidelity Audio Compression with Improved RVQGAN | Code | 3
BigVGAN: A Universal Neural Vocoder with Large-Scale Training | Code | 3
Accelerating Scientific Discovery with Generative Knowledge Extraction, Graph-Based Representation, and Multimodal Intelligent Graph Reasoning | Code | 3
Swin3D++: Effective Multi-Source Pretraining for 3D Indoor Scene Understanding | Code | 3
Towards Visual Grounding: A Survey | Code | 3
G-Memory: Tracing Hierarchical Memory for Multi-Agent Systems | Code | 3
Digitizing Touch with an Artificial Multimodal Fingertip | Code | 3
DrivAerNet++: A Large-Scale Multimodal Car Dataset with Computational Fluid Dynamics Simulations and Deep Learning Benchmarks | Code | 3
CORL: Research-oriented Deep Offline Reinforcement Learning Library | Code | 3
Data Filtering Networks | Code | 3
FastMap: Revisiting Dense and Scalable Structure from Motion | Code | 3
ToRL: Scaling Tool-Integrated RL | Code | 3
Safety of Multimodal Large Language Models on Images and Texts | Code | 3
Low-Rank Few-Shot Adaptation of Vision-Language Models | Code | 3
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text | Code | 3
Large Spatial Model: End-to-end Unposed Images to Semantic 3D | Code | 3
FlatQuant: Flatness Matters for LLM Quantization | Code | 3
Optimal Stepsize for Diffusion Sampling | Code | 3
DOGS: Distributed-Oriented Gaussian Splatting for Large-Scale 3D Reconstruction Via Gaussian Consensus | Code | 3
MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model | Code | 3
Benchmarking LLMs via Uncertainty Quantification | Code | 3