SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 10851-10900 of 661,570 papers

Title | Status | Hype
Interpretable Perception and Reasoning for Audiovisual Geolocation | | 0
The Rise of AI in Weather and Climate Information and its Impact on Global Inequality | | 0
Unsupervised domain adaptation for radioisotope identification in gamma spectroscopy | | 0
LTLGuard: Formalizing LTL Specifications with Compact Language Models and Lightweight Symbolic Reasoning | | 0
Unlocking ImageNet's Multi-Object Nature: Automated Large-Scale Multilabel Annotation | Code | 0
Revisiting the (Sub)Optimality of Best-of-N for Inference-Time Alignment | | 0
CodeScout: Contextual Problem Statement Enhancement for Software Agents | | 0
NERdME: a Named Entity Recognition Dataset for Indexing Research Artifacts in Code Repositories | | 0
Full Dynamic Range Sky-Modelling For Image Based Lighting | | 0
MIRACL: A Diverse Meta-Reinforcement Learning for Multi-Objective Multi-Echelon Combinatorial Supply Chain Optimisation | | 0
Score-Guided Proximal Projection: A Unified Geometric Framework for Rectified Flow Editing | | 0
Structured quantum learning via em algorithm for Boltzmann machines | | 0
Thinking with Spatial Code for Physical-World Video Reasoning | Code | 0
Why Depth Matters in Parallelizable Sequence Models: A Lie Algebraic View | | 0
Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology | Code | 0
Multilevel Training for Kolmogorov Arnold Networks | | 0
Particle-Guided Diffusion for Gas-Phase Reaction Kinetics | | 0
Evaluating the Search Agent in a Parallel World | | 0
Parallelization Strategies for Dense LLM Deployment: Navigating Through Application-Specific Tradeoffs and Bottlenecks | | 0
Why Reinforcement Fine-Tuning Enables MLLMs Preserve Prior Knowledge Better: A Data Perspective | | 0
Escaping the Hydrolysis Trap: An Agentic Workflow for Inverse Design of Durable Photocatalytic Covalent Organic Frameworks | | 0
SPyCer: Semi-Supervised Physics-Guided Contextual Attention for Near-Surface Air Temperature Estimation from Satellite Imagery | | 0
DAP: A Discrete-token Autoregressive Planner for Autonomous Driving | | 0
Controlled LLM Training on Spectral Sphere | | 0
FairFinGAN: Fairness-aware Synthetic Financial Data Generation | | 0
Replaying pre-training data improves fine-tuning | | 0
When Rubrics Fail: Error Enumeration as Reward in Reference-Free RL Post-Training for Virtual Try-On | | 0
Fusion-CAM: Integrating Gradient and Region-Based Class Activation Maps for Robust Visual Explanations | | 0
Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs | | 2
LMU-Based Sequential Learning and Posterior Ensemble Fusion for Cross-Domain Infant Cry Classification | | 0
SpineBench: A Clinically Salient, Level-Aware Benchmark Powered by the SpineMed-450k Corpus | | 0
Synchronization-based clustering on the unit hypersphere | | 0
On the Non-Identifiability of Steering Vectors in Large Language Models | | 0
LoRA-MME: Multi-Model Ensemble of LoRA-Tuned Encoders for Code Comment Classification | | 0
Testing Most Influential Sets | | 0
RA-QA: A Benchmarking System for Respiratory Audio Question Answering Under Real-World Heterogeneity | | 0
NOVA3R: Non-pixel-aligned Visual Transformer for Amodal 3D Reconstruction | | 0
A Simple Baseline for Unifying Understanding, Generation, and Editing via Vanilla Next-token Prediction | Code | 0
Diff-ES: Stage-wise Structural Diffusion Pruning via Evolutionary Search | | 0
SSR-GS: Separating Specular Reflection in Gaussian Splatting for Glossy Surface Reconstruction | | 0
Learning Optimal Individualized Decision Rules with Conditional Demographic Parity | | 0
Bayesian Supervised Causal Clustering | | 0
WavSLM: Single-Stream Speech Language Modeling via WavLM Distillation | | 0
Dissociating Direct Access from Inference in AI Introspection | | 0
MERLIN: Multi-Stage Curriculum Alignment for Multilingual Encoder-LLM Integration in Cross-Lingual Reasoning | | 0
CBR-to-SQL: Rethinking Retrieval-based Text-to-SQL using Case-based Reasoning in the Healthcare Domain | | 0
Bootstrapped Mixed Rewards for RL Post-Training: Injecting Canonical Action Order | | 0
Multi-Loss Learning for Speech Emotion Recognition with Energy-Adaptive Mixup and Frame-Level Attention | | 0
The Spike, the Sparse and the Sink: Anatomy of Massive Activations and Attention Sinks | | 0
AfriMTEB and AfriE5: Benchmarking and Adapting Text Embedding Models for African Languages | | 0
Page 218 of 13232