SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 10901-10950 of 661570 papers

Title | Status | Hype
GhostEI-Bench: Do Mobile Agents Resilience to Environmental Injection in Dynamic On-Device Environments? | | 0
Parallel Diffusion Solver via Residual Dirichlet Policy Optimization | | 0
KARL: Knowledge Agents via Reinforcement Learning | | 0
ICHOR: A Robust Representation Learning Approach for ASL CBF Maps with Self-Supervised Masked Autoencoders | | 0
Conformal Graph Prediction with Z-Gromov Wasserstein Distances | | 0
SPOT: Single-Shot Positioning via Trainable Near-Field Rainbow Beamforming | | 0
Dropping Just a Handful of Preferences Can Change Top Large Language Model Rankings | | 0
Breaking and Fixing Defenses Against Control-Flow Hijacking in Multi-Agent Systems | | 0
EchoMind: An Interrelated Multi-level Benchmark for Evaluating Empathetic Speech Language Models | | 0
Towards Sharp Minimax Risk Bounds for Operator Learning | | 0
MorphAny3D: Unleashing the Power of Structured Latent in 3D Morphing | | 2
Count Bridges enable Modeling and Deconvolving Transcriptomic Data | | 0
When Priors Backfire: On the Vulnerability of Unlearnable Examples to Pretraining | | 0
Differential Privacy in Two-Layer Networks: How DP-SGD Harms Fairness and Robustness | | 0
Alignment Backfire: Language-Dependent Reversal of Safety Interventions Across 16 Languages in LLM Multi-Agent Systems | | 0
S5-SHB Agent: Society 5.0 enabled Multi-model Agentic Blockchain Framework for Smart Home | | 0
RepoLaunch: Automating Build&Test Pipeline of Code Repositories on ANY Language and ANY Platform | | 0
Federated Causal Discovery Across Heterogeneous Datasets under Latent Confounding | | 0
GCAgent: Enhancing Group Chat Communication through Dialogue Agents System | | 0
UAM: A Unified Attention-Mamba Backbone of Multimodal Framework for Tumor Cell Classification | | 0
Window-based Membership Inference Attacks Against Fine-tuned Large Language Models | | 0
Bias In, Bias Out? Finding Unbiased Subnetworks in Vanilla Models | | 0
Relational Semantic Reasoning on 3D Scene Graphs for Open World Interactive Object Search | | 0
The DSA's Blind Spot: Algorithmic Audit of Advertising and Minor Profiling on TikTok | | 0
SecureRAG-RTL: A Retrieval-Augmented, Multi-Agent, Zero-Shot LLM-Driven Framework for Hardware Vulnerability Detection | | 0
UniPAR: A Unified Framework for Pedestrian Attribute Recognition | | 0
Design Behaviour Codes (DBCs): A Taxonomy-Driven Layered Governance Benchmark for Large Language Models | | 0
Quadratic polarity and polar Fenchel-Young divergences from the canonical Legendre polarity | | 0
Inverse Reconstruction of Shock Time Series from Shock Response Spectrum Curves using Machine Learning | | 0
Non-Zipfian Distribution of Stopwords and Subset Selection Models | | 0
RelaxFlow: Text-Driven Amodal 3D Generation | | 0
What Topological and Geometric Structure Do Biological Foundation Models Learn? Evidence from 141 Hypotheses | | 0
MCP-SafetyBench: A Benchmark for Safety Evaluation of Large Language Models with Real-World MCP Servers | | 0
Spatiotemporal Heterogeneity of AI-Driven Traffic Flow Patterns and Land Use Interaction: A GeoAI-Based Analysis of Multimodal Urban Mobility | | 0
Revisiting Shape from Polarization in the Era of Vision Foundation Models | | 0
Cultural Perspectives and Expectations for Generative AI: A Global Survey Approach | | 0
Graph-Based Multi-Modal Light-weight Network for Adaptive Brain Tumor Segmentation | | 0
Elucidating the Design Space of Arbitrary-Noise-Based Diffusion Models | Code | 0
Beyond the Unit Hypersphere: Embedding Magnitude in Contrastive Learning | | 0
QTabGAN: A Hybrid Quantum-Classical GAN for Tabular Data Synthesis | | 0
Learning to Select Like Humans: Explainable Active Learning for Medical Imaging | | 0
UFO-4D: Unposed Feedforward 4D Reconstruction from Two Images | | 0
RADAR: Learning to Route with Asymmetry-aware DistAnce Representations | | 0
Why Is RLHF Alignment Shallow? A Gradient Analysis | | 0
A Benchmark Study of Neural Network Compression Methods for Hyperspectral Image Classification | | 0
From Offline to Periodic Adaptation for Pose-Based Shoplifting Detection in Real-world Retail Security | | 0
MADCrowner: Margin Aware Dental Crown Design with Template Deformation and Refinement | | 0
Meta-D: Metadata-Aware Architectures for Brain Tumor Analysis and Missing-Modality Segmentation | | 0
Osmosis Distillation: Model Hijacking with the Fewest Samples | | 0
Person Detection and Tracking from an Overhead Crane LiDAR | | 0