SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 2101-2150 of 659,983 papers

Title | Status | Hype
Learning to Predict, Discover, and Reason in High-Dimensional Event Sequences | - | 0
SCALE: Scalable Conditional Atlas-Level Endpoint transport for virtual cell perturbation prediction | - | 0
SAMA: Factorized Semantic Anchoring and Motion Alignment for Instruction-Guided Video Editing | - | 1
Enhancing the Parameterization of Reservoir Properties for Data Assimilation Using Deep VAE-GAN | - | 0
Cross-Lingual LLM-Judge Transfer via Evaluation Decomposition | - | 0
Balanced Thinking: Improving Chain of Thought Training in Vision Language Models | - | 0
Ontology-Guided Diffusion for Zero-Shot Visual Sim2Real Transfer | - | 0
Measuring 3D Spatial Geometric Consistency in Dynamic Generated Videos | - | 0
RADIUS: Ranking, Distribution, and Significance - A Comprehensive Alignment Suite for Survey Simulation | - | 0
Robustness, Cost, and Attack-Surface Concentration in Phishing Detection | - | 0
LVOmniBench: Pioneering Long Audio-Video Understanding Evaluation for Omnimodal LLMs | - | 1
A Model Ensemble-Based Post-Processing Framework for Fairness-Aware Prediction | - | 0
A Comparative Empirical Study of Catastrophic Forgetting Mitigation in Sequential Task Adaptation for Continual Natural Language Processing Systems | - | 0
Multiscale Switch for Semi-Supervised and Contrastive Learning in Medical Ultrasound Image Segmentation | Code | 0
Unmasking Algorithmic Bias in Predictive Policing: A GAN-Based Simulation Framework with Multi-City Temporal Analysis | - | 0
AlignMamba-2: Enhancing Multimodal Fusion and Sentiment Analysis with Modality-Aware Mamba | - | 0
CoDA: Exploring Chain-of-Distribution Attacks and Post-Hoc Token-Space Repair for Medical Vision-Language Models | - | 0
Model Order Reduction of Cerebrovascular Hemodynamics Using POD_Galerkin and Reservoir Computing_based Approach | - | 0
Beyond Passive Aggregation: Active Auditing and Topology-Aware Defense in Decentralized Federated Learning | - | 0
Single Agent Robust Deep Reinforcement Learning for Bus Fleet Control | - | 0
Transfer Learning for Neutrino Scattering: Domain Adaptation with GANs | - | 0
Multi-Preconditioned LBFGS for Training Finite-Basis PINNs | - | 0
Foundations and Architectures of Artificial Intelligence for Motor Insurance | - | 0
SRRM: Improving Recursive Transport Surrogates in the Small-Discrepancy Regime | - | 0
Measuring and Exploiting Confirmation Bias in LLM-Assisted Security Code Review | - | 0
Teleological Inference in Structural Causal Models via Intentional Interventions | - | 0
Zipper-LoRA: Dynamic Parameter Decoupling for Speech-LLM based Multilingual Speech Recognition | Code | 0
Evaluating Model-Free Policy Optimization in Masked-Action Environments via an Exact Blackjack Oracle | - | 0
Deep Expert Injection for Anchoring Retinal VLMs with Domain-Specific Knowledge | - | 0
HaltNav: Reactive Visual Halting over Lightweight Topological Priors for Robust Vision-Language Navigation | - | 0
Evaluating Counterfactual Strategic Reasoning in Large Language Models | - | 0
AIMER: Calibration-Free Task-Agnostic MoE Pruning | - | 0
Remove360: Benchmarking Residuals After Object Removal in 3D Gaussian Splatting | - | 0
LLM-Augmented Changepoint Detection: A Framework for Ensemble Detection and Automated Explanation | - | 0
BVSIMC: Bayesian Variable Selection-Guided Inductive Matrix Completion for Improved and Interpretable Drug Discovery | - | 0
HypeMed: Enhancing Medication Recommendations with Hypergraph-Based Patient Relationships | - | 0
Interpretable Prostate Cancer Detection using a Small Cohort of MRI Images | - | 0
NeuroGame Transformer: Gibbs-Inspired Attention Driven by Game Theory and Statistical Physics | Code | 0
Implicit Grading Bias in Large Language Models: How Writing Style Affects Automated Assessment Across Math, Programming, and Essay Tasks | - | 0
Progressive Training for Explainable Citation-Grounded Dialogue: Reducing Hallucination to Zero in English-Hindi LLMs | - | 0
DaPT: A Dual-Path Framework for Multilingual Multi-hop Question Answering | - | 0
GSMem: 3D Gaussian Splatting as Persistent Spatial Memory for Zero-Shot Embodied Exploration and Reasoning | - | 0
Meanings and Measurements: Multi-Agent Probabilistic Grounding for Vision-Language Navigation | - | 0
OS-Themis: A Scalable Critic Framework for Generalist GUI Rewards | - | 0
Evaluating Game Difficulty in Tetris Block Puzzle | - | 0
On Optimizing Multimodal Jailbreaks for Spoken Language Models | - | 0
Words at Play: Benchmarking Audio Pun Understanding in Large Audio-Language Models | - | 0
DSPO: Stable and Efficient Policy Optimization for Agentic Search and Reasoning | - | 0
DriveSplat: Unified Neural Gaussian Reconstruction for Dynamic Driving Scenes | - | 0
A Unified Generalization Framework for Model Merging: Trade-offs, Non-Linearity, and Scaling Laws | - | 0
Page 43 of 13,200