SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 11651-11700 of 661570 papers

Title | Status | Hype
MAD-SmaAt-GNet: A Multimodal Advection-Guided Neural Network for Precipitation Nowcasting | | 0
Capability Thresholds and Manufacturing Topology: How Embodied Intelligence Triggers Phase Transitions in Economic Geography | | 0
VSPrefill: Vertical-Slash Sparse Attention with Lightweight Indexing for Long-Context Prefilling | | 0
Understanding the Dynamics of Demonstration Conflict in In-Context Learning | | 0
Act-Observe-Rewrite: Multimodal Coding Agents as In-Context Policy Learners for Robot Manipulation | | 0
Learning Unified Distance Metric for Heterogeneous Attribute Data Clustering | | 0
mHC-HSI: Clustering-Guided Hyper-Connection Mamba for Hyperspectral Image Classification | Code | 0
ACE-Brain-0: Spatial Intelligence as a Shared Scaffold for Universal Embodiments | | 0
MLV-Edit: Towards Consistent and Highly Efficient Editing for Minute-Level Videos | | 0
Higher Gauge Flow Models | | 0
Wasserstein Proximal Policy Gradient | | 0
Data-Driven Conditional Flexibility Index | | 0
Value Gradient Guidance for Flow Matching Alignment | | 0
MedXIAOHE: A Comprehensive Recipe for Building Medical MLLMs | | 0
Spilled Energy in Large Language Models | Code | 0
Demystifying Group Relative Policy Optimization: Its Policy Gradient is a U-Statistic | | 0
Design Generative AI for Practitioners: Exploring Interaction Approaches Aligned with Creative Practice | | 0
From Heuristic Selection to Automated Algorithm Design: LLMs Benefit from Strong Priors | | 0
Belief-Sim: Towards Belief-Driven Simulation of Demographic Misinformation Susceptibility | | 0
Seeing Clearly without Training: Mitigating Hallucinations in Multimodal LLMs for Remote Sensing | Code | 0
Gauge Flow Models | | 0
Towards a more realistic evaluation of machine learning models for bearing fault diagnosis | | 0
AgentAssay: Token-Efficient Regression Testing for Non-Deterministic AI Agent Workflows | | 0
Heterogeneous Agent Collaborative Reinforcement Learning | | 0
Combinatorial Sparse PCA Beyond the Spiked Identity Model | | 0
CoDAR: Continuous Diffusion Language Models are More Powerful Than You Think | | 0
On Discriminative vs. Generative classifiers: Rethinking MLLMs for Action Understanding | | 0
SemGS: Feed-Forward Semantic 3D Gaussian Splatting from Sparse Views for Generalizable Scene Understanding | | 0
Give me scissors: Collision-Free Dual-Arm Surgical Assistive Robot for Instrument Delivery | | 0
Real-Time Generative Policy via Langevin-Guided Flow Matching for Autonomous Driving | | 0
Minimal Computational Preconditions for Subjective Perspective in Artificial Agents | | 0
ForestPersons: A Large-Scale Dataset for Under-Canopy Missing Person Detection | | 0
Robust Heterogeneous Analog-Digital Computing for Mixture-of-Experts Models with Theoretical Generalization Guarantees | | 0
Detecting Structural Heart Disease from Electrocardiograms via a Generalized Additive Model of Interpretable Foundation-Model Predictors | | 0
Direct Reward Fine-Tuning on Poses for Single Image to 3D Human in the Wild | | 0
Contextualized Privacy Defense for LLM Agents | | 0
Any Resolution Any Geometry: From Multi-View To Multi-Patch | | 0
Interpreting Speaker Characteristics in the Dimensions of Self-Supervised Speech Features | | 0
TTT3R: 3D Reconstruction as Test-Time Training | | 4
CodecFlow: Efficient Bandwidth Extension via Conditional Flow Matching in Neural Codec Latent Space | | 0
Valet: A Standardized Testbed of Traditional Imperfect-Information Card Games | | 0
SOLAR: SVD-Optimized Lifelong Attention for Recommendation | | 0
EdgeFLow: Serverless Federated Learning via Sequential Model Migration in Edge Networks | | 0
FlashEvaluator: Expanding Search Space with Parallel Evaluation | | 0
Towards Parameter-Free Temporal Difference Learning | | 0
Agentic AI-based Coverage Closure for Formal Verification | | 0
Joint Optimization of Model Partitioning and Resource Allocation for Anti-Jamming Collaborative Inference Systems | | 0
Neural Electromagnetic Fields for High-Resolution Material Parameter Reconstruction | | 0
LiveAgentBench: Comprehensive Benchmarking of Agentic Systems Across 104 Real-World Challenges | | 0
OmniFashion: Towards Generalist Fashion Intelligence via Multi-Task Vision-Language Learning | | 0
Page 234 of 13232