SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 11901-11950 of 661570 papers

| Title | Status | Hype |
| --- | --- | --- |
| VIRGi: View-dependent Instant Recoloring of 3D Gaussians Splats | | 0 |
| MaBERT: A Padding Safe Interleaved Transformer Mamba Hybrid Encoder for Efficient Extended Context Masked Language Modeling | | 0 |
| SpatialText: A Pure-Text Cognitive Benchmark for Spatial Understanding in Large Language Models | | 0 |
| Breaking the Prototype Bias Loop: Confidence-Aware Federated Contrastive Learning for Highly Imbalanced Clients | | 0 |
| Safe and Robust Domains of Attraction for Discrete-Time Systems: A Set-Based Characterization and Certifiable Neural Network Estimation | | 0 |
| REGAL: A Registry-Driven Architecture for Deterministic Grounding of Agentic AI in Enterprise Telemetry | | 0 |
| SEHFS: Structural Entropy-Guided High-Order Correlation Learning for Multi-View Multi-Label Feature Selection | | 0 |
| TrustMH-Bench: A Comprehensive Benchmark for Evaluating the Trustworthiness of Large Language Models in Mental Health | | 0 |
| Step-Level Sparse Autoencoder for Reasoning Process Interpretation | Code | 0 |
| EduVQA: Benchmarking AI-Generated Video Quality Assessment for Education | | 0 |
| Using Learning Progressions to Guide AI Feedback for Science Learning | | 0 |
| From Reachability to Learnability: Geometric Design Principles for Quantum Neural Networks | | 0 |
| TikZilla: Scaling Text-to-TikZ with High-Quality Data and Reinforcement Learning | | 0 |
| TinyIceNet: Low-Power SAR Sea Ice Segmentation for On-Board FPGA Inference | | 0 |
| RAPO: Expanding Exploration for LLM Agents via Retrieval-Augmented Policy Optimization | | 0 |
| TAO-Attack: Toward Advanced Optimization-Based Jailbreak Attacks for Large Language Models | | 0 |
| Proactive Guiding Strategy for Item-side Fairness in Interactive Recommendation | | 0 |
| Odin: Multi-Signal Graph Intelligence for Autonomous Discovery in Knowledge Graphs | | 0 |
| Multi-Scale Adaptive Neighborhood Awareness Transformer For Graph Fraud Detection | | 0 |
| Evaluating Performance Drift from Model Switching in Multi-Turn LLM Systems | | 0 |
| Beyond Task Completion: Revealing Corrupt Success in LLM Agents through Procedure-Aware Evaluation | | 0 |
| AI Space Physics: Constitutive boundary semantics for open AI institutions | | 0 |
| Torus embeddings | | 0 |
| Channel-Adaptive Edge AI: Maximizing Inference Throughput by Adapting Computational Complexity to Channel States | | 0 |
| FEAST: Retrieval-Augmented Multi-Hierarchical Food Classification for the FoodEx2 System | | 0 |
| Kling-MotionControl Technical Report | | 0 |
| Conditioned Activation Transport for T2I Safety Steering | | 0 |
| Less Noise, Same Certificate: Retain Sensitivity for Unlearning | | 0 |
| Saarthi for AGI: Towards Domain-Specific General Intelligence for Formal Verification | | 0 |
| Type-Aware Retrieval-Augmented Generation with Dependency Closure for Solver-Executable Industrial Optimization Modeling | | 0 |
| Scalable Uncertainty Quantification for Black-Box Density-Based Clustering | | 0 |
| MoD-DPO: Towards Mitigating Cross-modal Hallucinations in Omni LLMs using Modality Decoupled Preference Optimization | | 0 |
| BeyondSWE: Can Current Code Agent Survive Beyond Single-Repo Bug Fixing? | | 1 |
| Learning When to Act or Refuse: Guarding Agentic Reasoning Models for Safe Multi-Step Tool Use | | 0 |
| Understanding and Mitigating Dataset Corruption in LLM Steering | | 0 |
| I-CAM-UV: Integrating Causal Graphs over Non-Identical Variable Sets Using Causal Additive Models with Unobserved Variables | | 0 |
| Shape Derivative-Informed Neural Operators with Application to Risk-Averse Shape Optimization | | 0 |
| NeuroSkill(tm): Proactive Real-Time Agentic System Capable of Modeling Human State of Mind | | 0 |
| Stabilized Adaptive Loss and Residual-Based Collocation for Physics-Informed Neural Networks | | 0 |
| Adaptive Methods Are Preferable in High Privacy Settings: An SDE Perspective | | 0 |
| Coalgebras for categorical deep learning: Representability and universal approximation | | 0 |
| SynthCharge: An Electric Vehicle Routing Instance Generator with Feasibility Screening to Enable Learning-Based Optimization and Benchmarking | | 0 |
| AI-for-Science Low-code Platform with Bayesian Adversarial Multi-Agent Framework | | 0 |
| Guiding Sparse Neural Networks with Neurobiological Principles to Elicit Biologically Plausible Representations | | 0 |
| Speculative Speculative Decoding | | 0 |
| COP-GEN: Latent Diffusion Transformer for Copernicus Earth Observation Data -- Generation Stochastic by Design | | 0 |
| Physics-informed post-processing of stabilized finite element solutions for transient convection-dominated problems | | 0 |
| DuoMo: Dual Motion Diffusion for World-Space Human Reconstruction | | 0 |
| Gravity Falls: A Comparative Analysis of Domain-Generation Algorithm (DGA) Detection Methods for Mobile Device Spearphishing | | 0 |
| Beyond Language Modeling: An Exploration of Multimodal Pretraining | | 0 |
Page 239 of 13232