SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 5451-5475 of 474,278 papers

Title | Status | Hype
Sailing AI by the Stars: A Survey of Learning from Rewards in Post-Training and Test-Time Scaling of Large Language Models | Code | 2
Towards Cross-Modality Modeling for Time Series Analytics: A Survey in the LLM Era | Code | 2
T2S: High-resolution Time Series Generation with Text-to-Series Diffusion Models | Code | 2
No Other Representation Component Is Needed: Diffusion Transformers Can Provide Representation Guidance by Themselves | Code | 2
SuperEdit: Rectifying and Facilitating Supervision for Instruction-Based Image Editing | Code | 2
Detect, Classify, Act: Categorizing Industrial Anomalies with Multi-Modal Large Language Models | Code | 2
SkillMimic-V2: Learning Robust and Generalizable Interaction Skills from Sparse and Noisy Demonstrations | Code | 2
MemEngine: A Unified and Modular Library for Developing Advanced Memory of LLM-based Agents | Code | 2
Efficient Multivariate Time Series Forecasting via Calibrated Language Models with Privileged Knowledge Distillation | Code | 2
An Empirical Study of Qwen3 Quantization | Code | 2
PoseX: AI Defeats Physics Approaches on Protein-Ligand Cross Docking | Code | 2
A Survey on Inference Engines for Large Language Models: Perspectives on Optimization and Efficiency | Code | 2
Don't be lazy: CompleteP enables compute-efficient deep transformers | Code | 2
CostFilter-AD: Enhancing Anomaly Detection through Matching Cost Filtering | Code | 2
CAMELTrack: Context-Aware Multi-cue ExpLoitation for Online Multi-Object Tracking | Code | 2
LightEMMA: Lightweight End-to-End Multimodal Model for Autonomous Driving | Code | 2
Explainable AI in Spatial Analysis | Code | 2
MINERVA: Evaluating Complex Video Reasoning | Code | 2
Vision Mamba in Remote Sensing: A Comprehensive Survey of Techniques, Applications and Outlook | Code | 2
GPU Performance Portability needs Autotuning | Code | 2
HoloTime: Taming Video Diffusion Models for Panoramic 4D Scene Generation | Code | 2
Visual Text Processing: A Comprehensive Review and Unified Evaluation | Code | 2
mAIstro: an open-source multi-agentic system for automated end-to-end development of radiomics and deep learning models for medical imaging | Code | 2
Noise Modeling in One Hour: Minimizing Preparation Efforts for Self-supervised Low-Light RAW Image Denoising | Code | 2
RWKV-X: A Linear Complexity Hybrid Language Model | Code | 2