SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 10351 to 10400 of 661570 papers

Title | Status | Hype
Towards Scalable Pre-training of Visual Tokenizers for Generation | Code | 0
Understanding and Improving Hyperbolic Deep Reinforcement Learning | Code | 0
(MGS)^2-Net: Unifying Micro-Geometric Scale and Macro-Geometric Structure for Cross-View Geo-Localization | Code | 0
Adaptive Radial Projection on Fourier Magnitude Spectrum for Document Image Skew Estimation | Code | 0
CASA: Cross-Attention over Self-Attention for Efficient Vision-Language Fusion | | 1
U6G XL-MIMO Radiomap Prediction: Multi-Config Dataset and Beam Map Approach | | 1
CaTok: Taming Mean Flows for One-Dimensional Causal Image Tokenization | | 1
LikePhys: Evaluating Intuitive Physics Understanding in Video Diffusion Models via Likelihood Preference | | 1
Omni-Diffusion: Unified Multimodal Understanding and Generation with Masked Discrete Diffusion | | 2
RAMoEA-QA: Hierarchical Specialization for Robust Respiratory Audio Question Answering | | 0
Symmetry-Constrained Language-Guided Program Synthesis for Discovering Governing Equations from Noisy and Partial Observations | | 0
Can LLMs Capture Expert Uncertainty? A Comparative Analysis of Value Alignment in Ethnographic Qualitative Research | | 0
Tiny but Mighty: A Software-Hardware Co-Design Approach for Efficient Multimodal Inference on Battery-Powered Small Devices | | 0
Exploiting Spatiotemporal Properties for Efficient Event-Driven Human Pose Estimation | | 0
Robust Sparse Signal Recovery with Outliers: A Hard Thresholding Pursuit Approach Based on LAD | | 0
Systematic Evaluation of Novel View Synthesis for Video Place Recognition | | 0
ThaiSafetyBench: Assessing Language Model Safety in Thai Cultural Contexts | Code | 0
Proof-of-Guardrail in AI Agents and What (Not) to Trust from It | Code | 0
Quantum parameter estimation with uncertainty quantification from continuous measurement data using neural network ensembles | | 0
Real-Time Learning of Predictive Dynamic Obstacle Models for Robotic Motion Planning | | 0
VisioMath: Benchmarking Figure-based Mathematical Reasoning in LMMs | | 0
Discerning What Matters: A Multi-Dimensional Assessment of Moral Competence in LLMs | | 0
Reasoned Safety Alignment: Ensuring Jailbreak Defense via Answer-Then-Check | | 0
Do We Really Need Permutations? Impact of Model Width on Linear Mode Connectivity | | 0
Phys2Real: Fusing VLM Priors with Interactive Online Adaptation for Uncertainty-Aware Sim-to-Real Manipulation | | 0
AURASeg: Attention-guided Upsampling with Residual-Assistive Boundary Refinement for Onboard Robot Drivable-Area Segmentation | | 0
Critical Confabulation: Can LLMs Hallucinate for Social Good? | | 0
SPARK: Jailbreaking T2V Models by Synergistically Prompting Auditory and Recontextualized Knowledge | | 0
SyncMV4D: Synchronized Multi-view Joint Diffusion of Appearance and Motion for Hand-Object Interaction Synthesis | | 0
UniTS: Unified Spatio-Temporal Generative Model for Remote Sensing | | 0
XR-DT: Extended Reality-Enhanced Digital Twin for Safe Motion Planning via Human-Aware Model Predictive Path Integral Control | | 0
Agent Tools Orchestration Leaks More: Dataset, Benchmark, and Mitigation | | 0
Purification Before Fusion: Toward Mask-Free Speech Enhancement for Robust Audio-Visual Speech Recognition | | 0
Online unsupervised Hebbian learning in deep photonic neuromorphic networks | | 0
COMI: Coarse-to-fine Context Compression via Marginal Information Gain | | 0
Uncertainty Quantification in LLM Agents: Foundations, Emerging Challenges, and Opportunities | | 0
Why Human Guidance Matters in Collaborative Vibe Coding | | 0
Coverage-Aware Web Crawling for Domain-Specific Supplier Discovery via a Web--Knowledge--Web Pipeline | | 0
MatRIS: Toward Reliable and Efficient Pretrained Machine Learning Interatomic Potentials | | 0
Phys4D: Fine-Grained Physics-Consistent 4D Modeling from Video Diffusion | | 0
Large-Language-Model-Guided State Estimation for Partially Observable Task and Motion Planning | | 0
MOOSEnger -- a Domain-Specific AI Agent for the MOOSE Ecosystem | | 0
Visual Words Meet BM25: Sparse Auto-Encoder Visual Word Scoring for Image Retrieval | | 0
Layer-wise Instance Binding for Regional and Occlusion Control in Text-to-Image Diffusion Transformers | | 0
First-Order Softmax Weighted Switching Gradient Method for Distributed Stochastic Minimax Optimization with Stochastic Constraints | | 0
Tutor Move Taxonomy: A Theory-Aligned Framework for Analyzing Instructional Moves in Tutoring | | 0
Balancing Domestic and Global Perspectives: Evaluating Dual-Calibration and LLM-Generated Nudges for Diverse News Recommendation | | 0
Spectral Probing of Feature Upsamplers in 2D-to-3D Scene Reconstruction | | 0
StreamWise: Serving Multi-Modal Generation in Real-Time at Scale | | 0
Ambiguity Collapse by LLMs: A Taxonomy of Epistemic Risks | | 0
Page 208 of 13232