SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 4751–4800 of 661,570 papers

| Title | Status | Hype |
| --- | --- | --- |
| Transformers are Bayesian Networks | | 0 |
| TrackDeform3D: Markerless and Autonomous 3D Keypoint Tracking and Dataset Collection for Deformable Objects | | 0 |
| Large Reasoning Models Struggle to Transfer Parametric Knowledge Across Scripts | | 0 |
| PRISM: Demystifying Retention and Interaction in Mid-Training | | 0 |
| Evaluating LLM-Simulated Conversations in Modeling Inconsistent and Uncollaborative Behaviors in Human Social Interaction | | 0 |
| An End-to-End Framework for Functionality-Embedded Provenance Graph Construction and Threat Interpretation | | 0 |
| Knowledge Localization in Mixture-of-Experts LLMs Using Cross-Lingual Inconsistency | | 0 |
| When the Specification Emerges: Benchmarking Faithfulness Loss in Long-Horizon Coding Agents | | 0 |
| SENSE: Efficient EEG-to-Text via Privacy-Preserving Semantic Retrieval | | 0 |
| Pixel-level Counterfactual Contrastive Learning for Medical Image Segmentation | | 0 |
| Hidden Clones: Exposing and Fixing Family Bias in Vision-Language Model Ensembles | | 0 |
| MosaicMem: Hybrid Spatial Memory for Controllable Video World Models | | 0 |
| Security Assessment and Mitigation Strategies for Large Language Models: A Comprehensive Defensive Framework | | 0 |
| Topology-Preserving Deep Joint Source-Channel Coding for Semantic Communication | | 0 |
| Personalized Fall Detection by Balancing Data with Selective Feedback Using Contrastive Learning | | 0 |
| Intent Formalization: A Grand Challenge for Reliable Coding in the Age of AI Agents | | 0 |
| GazeOnce360: Fisheye-Based 360° Multi-Person Gaze Estimation with Global-Local Feature Fusion | | 0 |
| Quadratic Surrogate Attractor for Particle Swarm Optimization | | 0 |
| SLAM Adversarial Lab: An Extensible Framework for Visual SLAM Robustness Evaluation under Adverse Conditions | | 0 |
| PAuth - Precise Task-Scoped Authorization For Agents | | 0 |
| Detecting Data Poisoning in Code Generation LLMs via Black-Box, Vulnerability-Oriented Scanning | | 0 |
| Domain-informed explainable boosting machines for trustworthy lateral spread predictions | | 0 |
| Catching rationalization in the act: detecting motivated reasoning before and after CoT via activation probing | | 0 |
| Visual Product Search Benchmark | | 0 |
| Abstraction as a Memory-Efficient Inductive Bias for Continual Learning | | 0 |
| CODMAS: A Dialectic Multi-Agent Collaborative Framework for Structured RTL Optimization | | 0 |
| OPERA: Online Data Pruning for Efficient Retrieval Model Adaptation | | 0 |
| A scalable neural bundle map for multiphysics prediction in lithium-ion battery across varying configurations | | 0 |
| AI Scientist via Synthetic Task Scaling | | 0 |
| Alignment Makes Language Models Normative, Not Descriptive | | 0 |
| Multilingual, Multimodal Pipeline for Creating Authentic and Structured Fact-Checked Claim Dataset | | 0 |
| One-Shot Badminton Shuttle Detection for Mobile Robots | | 0 |
| Gradient Atoms: Unsupervised Discovery, Attribution and Steering of Model Behaviors via Sparse Decomposition of Training Gradients | Code | 0 |
| Manifold-Matching Autoencoders | | 0 |
| RaDAR: Relation-aware Diffusion-Asymmetric Graph Contrastive Learning for Recommendation | | 0 |
| PubTables-v2: A new large-scale dataset for full-page and multi-page table extraction | | 0 |
| Robust Physics-Guided Diffusion for Full-Waveform Inversion | | 0 |
| Optimal uncertainty bounds for multivariate kernel regression under bounded noise: A Gaussian process-based dual function | | 0 |
| Breaking the Chain: A Causal Analysis of LLM Faithfulness to Intermediate Structures | | 0 |
| VideoMatGen: PBR Materials through Joint Generative Modeling | | 0 |
| Shielded Reinforcement Learning Under Dynamic Temporal Logic Constraints | | 0 |
| Dual Stream Independence Decoupling for True Emotion Recognition under Masked Expressions | | 0 |
| REAL: Regression-Aware Reinforcement Learning for LLM-as-a-Judge | | 0 |
| On-Policy RL Meets Off-Policy Experts: Harmonizing Supervised Fine-Tuning and Reinforcement Learning via Dynamic Weighting | Code | 0 |
| Distilling Feedback into Memory-as-a-Tool | | 0 |
| A Lensless Polarization Camera | | 0 |
| When AI Navigates the Fog of War | | 0 |
| Exposing Blindspots: Cultural Bias Evaluation in Generative Image Models | | 0 |
| SHAMISA: SHAped Modeling of Implicit Structural Associations for Self-supervised No-Reference Image Quality Assessment | | 0 |
| Adaptive Contracts for Cost-Effective AI Delegation | | 0 |
Page 96 of 13232