SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers, 248,326 code links, 4,818 tasks

Papers

Showing 6351–6400 of 661,570 papers

Title | Status | Hype
ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning | Code | 2
TakuNet: an Energy-Efficient CNN for Real-Time Inference on Embedded UAV systems in Emergency Response Scenarios | Code | 2
Do we actually understand the impact of renewables on electricity prices? A causal inference approach | Code | 2
Test-time Alignment of Diffusion Models without Reward Over-optimization | Code | 2
xLSTM-SENet: xLSTM for Single-Channel Speech Enhancement | Code | 2
Russian Financial Statements Database: A firm-level collection of the universe of financial statements | Code | 2
VideoRAG: Retrieval-Augmented Generation over Video Corpus | Code | 2
AI-powered virtual tissues from spatial proteomics for clinical diagnostics and biomedical discovery | Code | 2
FOCUS: Towards Universal Foreground Segmentation | Code | 2
V2C-CBM: Building Concept Bottlenecks with Vision-to-Concept Tokenizer | Code | 2
OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding? | Code | 2
FLowHigh: Towards Efficient and High-Quality Audio Super-Resolution with Single-Step Flow Matching | Code | 2
UAV-VLA: Vision-Language-Action System for Large Scale Aerial Mission Generation | Code | 2
CellViT++: Energy-Efficient and Adaptive Cell Segmentation and Classification Using Foundation Models | Code | 2
Mechanistic understanding and validation of large AI models with SemanticLens | Code | 2
ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding | Code | 2
MambaHSI: Spatial-Spectral Mamba for Hyperspectral Image Classification | Code | 2
OpenOmni: Large Language Models Pivot Zero-shot Omnimodal Alignment across Language with Real-time Self-Aware Emotional Speech Synthesis | Code | 2
A Plug-and-Play Bregman ADMM Module for Inferring Event Branches in Temporal Point Processes | Code | 2
Stable Derivative Free Gaussian Mixture Variational Inference for Bayesian Inverse Problems | Code | 2
TREAD: Token Routing for Efficient Architecture-agnostic Diffusion Training | Code | 2
MB-TaylorFormer V2: Improved Multi-branch Linear Transformer Expanded by Taylor Formula for Image Restoration | Code | 2
URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics | Code | 2
Generative AI for Cel-Animation: A Survey | Code | 2
FatesGS: Fast and Accurate Sparse-View Surface Reconstruction using Gaussian Splatting with Depth-Feature Consistency | Code | 2
FrontierNet: Learning Visual Cues to Explore | Code | 2
Grokking at the Edge of Numerical Stability | Code | 2
LLM4SR: A Survey on Large Language Models for Scientific Research | Code | 2
InfiGUIAgent: A Multimodal Generalist GUI Agent with Native Reasoning and Reflection | Code | 2
Graph-Based Multimodal and Multi-view Alignment for Keystep Recognition | Code | 2
Realistic Test-Time Adaptation of Vision-Language Models | Code | 2
Deep Learning-based Compression Detection for explainable Face Image Quality Assessment | Code | 2
MTRAG: A Multi-Turn Conversational Benchmark for Evaluating Retrieval-Augmented Generation Systems | Code | 2
Magic Mirror: ID-Preserved Video Generation in Video Diffusion Transformers | Code | 2
LiMoE: Mixture of LiDAR Representation Learners from Automotive Scenes | Code | 2
LightGNN: Simple Graph Neural Network for Recommendation | Code | 2
Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction | Code | 2
PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models | Code | 2
Sim-to-Real Transfer for Mobile Robots with Reinforcement Learning: from NVIDIA Isaac Sim to Gazebo and Real ROS 2 Robots | Code | 2
Qinco2: Vector Compression and Search with Improved Implicit Neural Codebooks | Code | 2
Revolutionizing Encrypted Traffic Classification with MH-Net: A Multi-View Heterogeneous Graph Model | Code | 2
LatteReview: A Multi-Agent Framework for Systematic Review Automation Using Large Language Models | Code | 2
Punch Out Model Synthesis: A Stochastic Algorithm for Constraint Based Tiling Generation | Code | 2
Test-time Computing: from System-1 Thinking to System-2 Thinking | Code | 2
DepthMaster: Taming Diffusion Models for Monocular Depth Estimation | Code | 2
DiffGraph: Heterogeneous Graph Diffusion Model | Code | 2
Graph-Aware Isomorphic Attention for Adaptive Dynamics in Transformers | Code | 2
What Kind of Visual Tokens Do We Need? Training-free Visual Token Pruning for Multi-modal Large Language Models from the Perspective of Graph | Code | 2
Optimizing Edge AI: A Comprehensive Survey on Data, Model, and System Strategies | Code | 2
GNSS/GPS Spoofing and Jamming Identification Using Machine Learning and Deep Learning | Code | 2
Page 128 of 13,232