SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 1951–2000 of 659,983 papers

| Title | Status | Hype |
| --- | --- | --- |
| From Topic to Transition Structure: Unsupervised Concept Discovery at Corpus Scale via Predictive Associative Memory | | 0 |
| Prune-then-Quantize or Quantize-then-Prune? Understanding the Impact of Compression Order in Joint Model Compression | | 0 |
| Adaptive Decoding via Test-Time Policy Learning for Self-Improving Generation | | 0 |
| Towards Noise-Resilient Quantum Multi-Armed and Stochastic Linear Bandits | | 0 |
| UT-ACA: Uncertainty-Triggered Adaptive Context Allocation for Long-Context Inference | | 0 |
| AS2 -- Attention-Based Soft Answer Sets: An End-to-End Differentiable Neuro-Soft-Symbolic Reasoning Architecture | | 0 |
| SODIUM: From Open Web Data to Queryable Databases | | 0 |
| Seeking Universal Shot Language Understanding Solutions | | 0 |
| MedQ-UNI: Toward Unified Medical Image Quality Assessment and Restoration via Vision-Language Modeling | | 0 |
| Recolour What Matters: Region-Aware Colour Editing via Token-Level Diffusion | | 0 |
| GAIN: A Benchmark for Goal-Aligned Decision-Making of Large Language Models under Imperfect Norms | | 0 |
| Cognitive Mismatch in Multimodal Large Language Models for Discrete Symbol Understanding | | 0 |
| Do Vision Language Models Understand Human Engagement in Games? | | 0 |
| T-QPM: Enabling Temporal Out-Of-Distribution Detection and Domain Generalization for Vision-Language Models in Open-World | | 0 |
| The Truncation Blind Spot: How Decoding Strategies Systematically Exclude Human-Like Token Choices | | 0 |
| Precise Performance of Linear Denoisers in the Proportional Regime | | 0 |
| TexEditor: Structure-Preserving Text-Driven Texture Editing | Code | 0 |
| Cross-Domain Demo-to-Code via Neurosymbolic Counterfactual Reasoning | | 0 |
| NymeriaPlus: Enriching Nymeria Dataset with Additional Annotations and Data | | 0 |
| OnlinePG: Online Open-Vocabulary Panoptic Mapping with 3D Gaussian Splatting | | 0 |
| From Snapshots to Symphonies: The Evolution of Protein Prediction from Static Structures to Generative Dynamics and Multimodal Interactions | | 0 |
| Expert Personas Improve LLM Alignment but Damage Accuracy: Bootstrapping Intent-Based Persona Routing with PRISM | | 0 |
| CAFlow: Adaptive-Depth Single-Step Flow Matching for Efficient Histopathology Super-Resolution | | 0 |
| Counting Circuits: Mechanistic Interpretability of Visual Reasoning in Large Vision-Language Models | | 0 |
| Correlation-Weighted Multi-Reward Optimization for Compositional Generation | | 0 |
| Data-efficient pre-training by scaling synthetic megadocs | | 0 |
| Remedying Target-Domain Astigmatism for Cross-Domain Few-Shot Object Detection | | 0 |
| HEP Statistical Inference for UAV Fault Detection: CLs, LRT, and SBI Applied to Blade Damage | | 0 |
| SINDy-KANs: Sparse identification of non-linear dynamics through Kolmogorov-Arnold networks | | 0 |
| CausalVAD: De-confounding End-to-End Autonomous Driving via Causal Intervention | | 0 |
| SpecForge: A Flexible and Efficient Open-Source Training Framework for Speculative Decoding | | 0 |
| CAPSUL: A Comprehensive Human Protein Benchmark for Subcellular Localization | | 0 |
| MedForge: Interpretable Medical Deepfake Detection via Forgery-aware Reasoning | | 0 |
| ICE: Intervention-Consistent Explanation Evaluation with Statistical Grounding for LLMs | | 0 |
| Breaking Hard Isomorphism Benchmarks with DRESS | | 0 |
| Color image restoration based on nonlocal saturation-value similarity | | 0 |
| Elastic Weight Consolidation Done Right for Continual Learning | | 0 |
| myMNIST: Benchmark of PETNN, KAN, and Classical Deep Learning Models for Burmese Handwritten Digit Recognition | | 0 |
| Complementary Text-Guided Attention for Zero-Shot Adversarial Robustness | | 0 |
| Beyond TVLA: Anderson-Darling Leakage Assessment for Neural Network Side-Channel Leakage Detection | | 0 |
| Improving Joint Audio-Video Generation with Cross-Modal Context Learning | | 0 |
| AutORAN: LLM-driven Natural Language Programming for Agile xApp Development | | 0 |
| DiscoPhon: Benchmarking the Unsupervised Discovery of Phoneme Inventories With Discrete Speech Units | | 0 |
| Cyber-Resilient Digital Twins: Discriminating Attacks for Safe Critical Infrastructure Control | | 0 |
| Benchmarking CNN-based Models against Transformer-based Models for Abdominal Multi-Organ Segmentation on the RATIC Dataset | | 0 |
| GenVideoLens: Where LVLMs Fall Short in AI-Generated Video Detection? | | 0 |
| Agentic Flow Steering and Parallel Rollout Search for Spatially Grounded Text-to-Image Generation | | 0 |
| An Onto-Relational-Sophic Framework for Governing Synthetic Minds | | 0 |
| SwiftGS: Episodic Priors for Immediate Satellite Surface Recovery | | 0 |
| PhysVideo: Physically Plausible Video Generation with Cross-View Geometry Guidance | | 0 |
Page 40 of 13,200