SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 2601–2650 of 659983 papers

Title | Status | Hype
Beware Untrusted Simulators -- Reward-Free Backdoor Attacks in Reinforcement Learning | | 0
Rolling Sink: Bridging Limited-Horizon Training and Open-Ended Testing in Autoregressive Video Diffusion | | 2
Detecting Transportation Mode Using Dense Smartphone GPS Trajectories and Transformer Models | | 0
SWE-CI: Evaluating Agent Capabilities in Maintaining Codebases via Continuous Integration | | 0
A Hierarchical Error-Corrective Graph Framework for Autonomous Agents with LLM-Based Action Generation | | 0
Context-Nav: Context-Driven Exploration and Viewpoint-Aware 3D Spatial Reasoning for Instance Navigation | | 0
Exploiting Adaptive Channel Pruning for Communication-Efficient Split Learning | | 0
Coherent Human-Scene Reconstruction from Multi-Person Multi-View Video in a Single Pass | | 0
Human-AI Co-reasoning for Clinical Diagnosis with Evidence-Integrated Language Agent | | 0
Just-in-Time: Training-Free Spatial Acceleration for Diffusion Transformers | | 0
Multimodal Emotion Recognition via Bi-directional Cross-Attention and Temporal Modeling | | 0
Real-World AI Evaluation: How FRAME Generates Systematic Evidence to Resolve the Decision-Maker's Dilemma | | 0
Spatial Transcriptomics as Images for Large-Scale Pretraining | | 0
SAATT Nav: a Socially Aware Autonomous Transparent Transportation Navigation Framework for Wheelchairs | | 0
The Reasoning Bottleneck in Graph-RAG: Structured Prompting and Context Compression for Multi-Hop QA | | 0
AvatarForcing: One-Step Streaming Talking Avatars via Local-Future Sliding-Window Denoising | | 0
SemanticFace: Semantic Facial Action Estimation via Semantic Distillation in Interpretable Space | | 0
F2HDR: Two-Stage HDR Video Reconstruction via Flow Adapter and Physical Motion Modeling | | 0
Training-free Detection of Generated Videos via Spatial-Temporal Likelihoods | | 0
Open Biomedical Knowledge Graphs at Scale: Construction, Federation, and AI Agent Access with Samyama Graph Database | | 0
A Tutorial on ALOS2 SAR Utilization: Dataset Preparation, Self-Supervised Pretraining, and Semantic Segmentation | | 0
I Know What I Don't Know: Latent Posterior Factor Models for Multi-Evidence Probabilistic Reasoning | | 0
Theoretical Foundations of Latent Posterior Factors: Formal Guarantees for Multi-Evidence Reasoning | | 0
A Framework and Prototype for a Navigable Map of Datasets in Engineering Design and Systems Engineering | | 0
OMNIFLOW: A Physics-Grounded Multimodal Agent for Generalized Scientific Reasoning | | 0
100x Cost & Latency Reduction: Performance Analysis of AI Query Approximation using Lightweight Proxy Models | | 0
S-VAM: Shortcut Video-Action Model by Self-Distilling Geometric and Semantic Foresight | | 0
VIEW2SPACE: Studying Multi-View Visual Reasoning from Sparse Observations | | 0
Wasserstein-type Gaussian Process Regressions for Input Measurement Uncertainty | | 0
The Causal Uncertainty Principle: Manifold Tearing and the Topological Limits of Counterfactual Interventions | | 0
Gesture-Aware Pretraining and Token Fusion for 3D Hand Pose Estimation | | 0
Adaptive Anchor Policies for Efficient 4D Gaussian Streaming | | 0
From Drop-off to Recovery: A Mechanistic Analysis of Segmentation in MLLMs | | 0
Visual SLAM with DEM Anchoring for Lunar Surface Navigation | | 0
KANtize: Exploring Low-bit Quantization of Kolmogorov-Arnold Networks for Efficient Inference | | 0
Neuron-Level Emotion Control in Speech-Generative Large Audio-Language Models | | 0
Deployment and Evaluation of an EHR-integrated, Large Language Model-Powered Tool to Triage Surgical Patients | | 0
Neural Radiance Maps for Extraterrestrial Navigation and Path Planning | | 0
On the Cone Effect and Modality Gap in Medical Vision-Language Embeddings | | 0
Variational Rectification Inference for Learning with Noisy Labels | | 0
GigaWorld-Policy: An Efficient Action-Centered World--Action Model | | 2
LED: A Benchmark for Evaluating Layout Error Detection in Document Analysis | | 0
DANCE: Dynamic 3D CNN Pruning: Joint Frame, Channel, and Feature Adaptation for Energy Efficiency on the Edge | | 0
WINFlowNets: Warm-up Integrated Networks Training of Generative Flow Networks for Robotics and Machine Fault Adaptation | | 0
From Words to Worlds: Benchmarking Cross-Cultural Cultural Understanding in Machine Translation | | 0
Contrastive Reasoning Alignment: Reinforcement Learning from Hidden Representations | | 0
Towards Safer Large Reasoning Models by Promoting Safety Decision-Making before Chain-of-Thought Generation | | 0
ReLMXEL: Adaptive RL-Based Memory Controller with Explainable Energy and Latency Optimization | | 0
InfoDensity: Rewarding Information-Dense Traces for Efficient Reasoning | | 0
Deploying Semantic ID-based Generative Retrieval for Large-Scale Podcast Discovery at Spotify | | 0
Page 53 of 13200