SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 8251-8300 of 661570 papers

Title | Status | Hype
Compose Your Policies! Improving Diffusion-based or Flow-based Robot Policies via Test-time Distribution-level Composition |  | 1
REAP the Experts: Why Pruning Prevails for One-Shot MoE compression |  | 0
Proper Body Landmark Subset Enables More Accurate and 5X Faster Recognition of Isolated Signs in LIBRAS |  | 0
SynHLMA:Synthesizing Hand Language Manipulation for Articulated Object with Discrete Human Object Interaction Representation |  | 0
GraphKeeper: Graph Domain-Incremental Learning via Knowledge Disentanglement and Preservation |  | 0
SPAN: Spatial-Projection Alignment for Monocular 3D Object Detection |  | 0
PRISM of Opinions: A Persona-Reasoned Multimodal Framework for User-centric Conversational Stance Detection |  | 0
Mitigating Long-Tail Bias in HOI Detection via Adaptive Diversity Cache |  | 0
Bootstrap Dynamic-Aware 3D Visual Representation for Scalable Robot Learning |  | 0
Audio-Visual World Models: Towards Multisensory Imagination in Sight and Sound |  | 0
AVGGT: Rethinking Global Attention for Accelerating VGGT |  | 0
From Veracity to Diffusion: Adressing Operational Challenges in Moving From Fake-News Detection to Information Disorders |  | 0
ADHint: Adaptive Hints with Difficulty Priors for Reinforcement Learning |  | 0
Do Spatial Descriptors Improve Multi-DoF Finger Movement Decoding from HD sEMG? |  | 0
Taming Preference Mode Collapse via Directional Decoupling Alignment in Diffusion Reinforcement Learning |  | 0
Empowering All-in-Loop Health Management of Spacecraft Power System in the Mega-Constellation Era via Human-AI Collaboration |  | 0
Rewards as Labels: Revisiting RLVR from a Classification Perspective |  | 0
Energy-Aware Spike Budgeting for Continual Learning in Spiking Neural Networks for Neuromorphic Vision |  | 0
Continual uncertainty learning |  | 0
VLN-Cache: Enabling Token Caching for VLN Models with Visual/Semantic Dynamics Awareness |  | 0
OrthoAI: A Neurosymbolic Framework for Evidence-Grounded Biomechanical Reasoning in Clear Aligner Orthodontics |  | 0
DUEL: Exact Likelihood for Masked Diffusion via Deterministic Unmasking |  | 0
PonderLM-3: Adaptive Token-Wise Pondering with Differentiable Masking |  | 0
Censored LLMs as a Natural Testbed for Secret Knowledge Elicitation |  | 0
Latent Generative Models with Tunable Complexity for Compressed Sensing and other Inverse Problems |  | 0
PolyBlocks: A Compiler Infrastructure for AI Chips and Programming Frameworks |  | 0
Property-driven Protein Inverse Folding With Multi-Objective Preference Alignment |  | 0
VirtueBench: Evaluating Trustworthiness under Uncertainty in Long Video Understanding |  | 0
Fanar-Sadiq: A Multi-Agent Architecture for Grounded Islamic QA |  | 0
ICDAR 2025 Competition on End-to-End Document Image Machine Translation Towards Complex Layouts |  | 0
From Flow to One Step: Real-Time Multi-Modal Trajectory Policies via Implicit Maximum Likelihood Estimation-based Distribution Distillation |  | 0
YOLO-NAS-Bench: A Surrogate Benchmark with Self-Evolving Predictors for YOLO Architecture Search |  | 0
Reviving ConvNeXt for Efficient Convolutional Diffusion Models |  | 0
RiO-DETR: DETR for Real-time Oriented Object Detection |  | 0
Investigating Gender Stereotypes in Large Language Models via Social Determinants of Health |  | 0
You Didn't Have to Say It like That: Subliminal Learning from Faithful Paraphrases |  | 0
MetaDAT: Generalizable Trajectory Prediction via Meta Pre-training and Data-Adaptive Test-Time Updating |  | 0
CERES: A Probabilistic Early Warning System for Acute Food Insecurity |  | 0
AI Act Evaluation Benchmark: An Open, Transparent, and Reproducible Evaluation Dataset for NLP and RAG Systems |  | 0
From Weighting to Modeling: A Nonparametric Estimator for Off-Policy Evaluation |  | 0
GIIM: Graph-based Learning of Inter- and Intra-view Dependencies for Multi-view Medical Image Diagnosis |  | 0
A Guideline-Aware AI Agent for Zero-Shot Target Volume Auto-Delineation |  | 0
Declarative Scenario-based Testing with RoadLogic |  | 0
TopoOR: A Unified Topological Scene Representation for the Operating Room |  | 0
Evolving Prompt Adaptation for Vision-Language Models |  | 0
OmniEarth: A Benchmark for Evaluating Vision-Language Models in Geospatial Tasks |  | 0
Telogenesis: Goal Is All U Need |  | 0
Vibe-Creation: The Epistemology of Human-AI Emergent Cognition |  | 0
TrainDeeploy: Hardware-Accelerated Parameter-Efficient Fine-Tuning of Small Transformer Models at the Extreme Edge |  | 0
Probing the Reliability of Driving VLMs: From Inconsistent Responses to Grounded Temporal Reasoning |  | 0
Page 166 of 13232