SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 8551-8600 of 661,570 papers

Title | Status | Hype
Stepping VLMs onto the Court: Benchmarking Spatial Intelligence in Sports | | 2
Robot Control Stack: A Lean Ecosystem for Robot Learning at Scale | | 2
Reward Prediction with Factorized World States | | 1
ID-LoRA: Identity-Driven Audio-Video Personalization with In-Context LoRA | | 2
GNNs for Time Series Anomaly Detection: An Open-Source Framework and a Critical Evaluation | | 0
Temporal-Conditioned Normalizing Flows for Multivariate Time Series Anomaly Detection | | 0
TemporalDoRA: Temporal PEFT for Robust Surgical Video Question Answering | | 0
Model Merging in the Era of Large Language Models: Methods, Applications, and Future Directions | | 0
BiasBusters: Uncovering and Mitigating Tool Selection Bias in Large Language Models | Code | 0
Missing-by-Design: Certifiable Modality Deletion for Revocable Multimodal Sentiment Analysis | | 0
DOCFORGE-BENCH: A Comprehensive 0-shot Benchmark for Document Forgery Detection and Analysis | | 0
Distributed Convolutional Neural Networks for Object Recognition | | 0
NavSpace: How Navigation Agents Follow Spatial Intelligence Instructions | | 0
Robust Provably Secure Image Steganography via Latent Iterative Optimization | | 0
SCALAR: Learning and Composing Skills through LLM Guided Symbolic Planning and Deep RL Grounding | | 0
QUSR: Quality-Aware and Uncertainty-Guided Image Super-Resolution Diffusion Model | Code | 0
TableMind++: An Uncertainty-Aware Programmatic Agent for Tool-Augmented Table Reasoning | | 0
MM-Zero: Self-Evolving Multi-Model Vision Language Models From Zero Data | | 2
How Contrastive Decoding Enhances Large Audio Language Models? | | 0
Predictive Spectral Calibration for Source-Free Test-Time Regression | | 0
DenoiseSplat: Feed-Forward Gaussian Splatting for Noisy 3D Scene Reconstruction | | 0
Reinforcing Numerical Reasoning in LLMs for Tabular Prediction via Structural Priors | | 0
Exploiting the Final Component of Generator Architectures for AI-Generated Image Detection | | 0
Active Prompt Learning with Vision-Language Model Priors | | 0
OPENXRD: A Comprehensive Benchmark Framework for LLM/MLLM XRD Question Answering | | 0
Operator Learning for Consolidation: An Architectural Comparison for DeepONet Variants | | 0
Improving Large Vision-Language Models' Understanding for Flow Field Data | | 0
EgoCross: Benchmarking Multimodal Large Language Models for Cross-Domain Egocentric Video Question Answering | | 0
You Only Pose Once: A Minimalist's Detection Transformer for Monocular RGB Category-level 9D Multi-Object Pose Estimation | | 0
RF-Informed Graph Neural Networks for Accurate and Data-Efficient Circuit Performance Prediction | | 0
A Surrogate model for High Temperature Superconducting Magnets to Predict Current Distribution with Neural Network | | 0
VocSegMRI: Multimodal Learning for Precise Vocal Tract Segmentation in Real-time MRI | | 0
Automated Coral Spawn Monitoring for Reef Restoration: The Coral Spawn and Larvae Imaging Camera System (CSLICS) | | 0
RECODE: Reasoning Through Code Generation for Visual Question Answering | | 0
ZeroSiam: An Efficient Asymmetry for Test-Time Entropy Optimization without Collapse | | 0
VSSFlow: Unifying Video-conditioned Sound and Speech Generation via Joint Learning | | 0
v-HUB: A Benchmark for Video Humor Understanding from Vision and Sound | | 0
Real-Time Neural Video Compression with Unified Intra and Inter Coding | | 0
AlphaApollo: A System for Deep Agentic Reasoning | | 1
Does Scientific Writing Converge to U.S. English? Evidence from Generative AI-Assisted Publications | | 0
Lightweight Time Series Data Valuation on Time Series Foundation Models via In-Context Finetuning | | 0
When Robots Obey the Patch: Universal Transferable Patch Attacks on Vision-Language-Action Models | | 0
Multi-Agent Reinforcement Learning with Communication-Constrained Priors | | 0
SA^2GFM: Enhancing Robust Graph Foundation Models with Structure-Aware Semantic Augmentation | | 0
EMFusion: Conditional Diffusion Framework for Trustworthy Frequency Selective EMF Forecasting in Wireless Networks | | 0
Reinforcement Learning for Self-Improving Agent with Skill Library | | 0
DEER: A Benchmark for Evaluating Deep Research Agents on Expert Report Generation | | 0
An AI-powered Bayesian Generative Modeling Approach for Arbitrary Conditional Inference | Code | 0
Low-rank Orthogonal Subspace Intervention for Generalizable Face Forgery Detection | | 0
From Self-Evolving Synthetic Data to Verifiable-Reward RL: Post-Training Multi-turn Interactive Tool-Using Agents | | 0