SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 9201–9250 of 661,570 papers

| Title | Status | Hype |
|---|---|---|
| Large Language Model-Assisted Superconducting Qubit Experiments | | 0 |
| VoxEmo: Benchmarking Speech Emotion Recognition with Speech LLMs | | 0 |
| How Far Can Unsupervised RLVR Scale LLM Training? | | 0 |
| Deterministic Differentiable Structured Pruning for Large Language Models | | 0 |
| Test-Time Modification: Inverse Domain Transformation for Robust Perception | | 0 |
| BEV-Patch-PF: Particle Filtering with BEV-Aerial Feature Matching for Off-Road Geo-Localization | | 0 |
| Computing Evolutionarily Stable Strategies in Multiplayer Games | | 0 |
| ExGS: Extreme 3D Gaussian Compression with Diffusion Priors | Code | 0 |
| Revisiting Unknowns: Towards Effective and Efficient Open-Set Active Learning | Code | 0 |
| FinToolBench: Evaluating LLM Agents for Real-World Financial Tool Use | | 1 |
| CoCo: Code as CoT for Text-to-Image Preview and Rare Concept Generation | Code | 0 |
| NetDiffuser: Deceiving DNN-Based Network Attack Detection Systems with Diffusion-Generated Adversarial Traffic | | 0 |
| Characterizing MARL for Energy Control: A Multi-KPI Benchmark on the CityLearn Environment | | 0 |
| Rethinking Discrete Speech Representation Tokens for Accent Generation | | 0 |
| Arbiter: Detecting Interference in LLM Agent System Prompts | | 0 |
| Detecting AI-Generated Images via Contextual Anomaly Estimation in Masked AutoEncoders | | 0 |
| Interpretable Motion-Attentive Maps: Spatio-Temporally Localizing Concepts in Video Diffusion Transformers | | 0 |
| Information Routing in Atomistic Foundation Models: How Task Alignment and Equivariance Shape Linear Disentanglement | | 0 |
| AI Agents, Language, Deep Learning and the Next Revolution in Science | | 0 |
| FedMomentum: Preserving LoRA Training Momentum in Federated Fine-Tuning | | 0 |
| RexDrug: Reliable Multi-Drug Combination Extraction through Reasoning-Enhanced LLMs | Code | 0 |
| Graph-Instructed Neural Networks for parametric problems with varying boundary conditions | | 0 |
| Retrieval-Augmented Anatomical Guidance for Text-to-CT Generation | | 0 |
| HDR-NSFF: High Dynamic Range Neural Scene Flow Fields | | 0 |
| AULLM++: Structural Reasoning with Large Language Models for Micro-Expression Recognition | | 0 |
| Can Vision-Language Models Solve the Shell Game? | | 1 |
| Don't Look Back in Anger: MAGIC Net for Streaming Continual Learning with Temporal Dependence | | 0 |
| Evaluating Financial Intelligence in Large Language Models: Benchmarking SuperInvesting AI with LLM Engines | | 0 |
| The Gaussian-Multinoulli Restricted Boltzmann Machine: A Potts Model Extension of the GRBM | | 0 |
| One Language, Two Scripts: Probing Script-Invariance in LLM Concept Representations | | 0 |
| MetricNet: Recovering Metric Scale in Generative Navigation Policies | | 0 |
| Integral Formulas for Vector Spherical Tensor Products | | 0 |
| A Unified Framework for Zero-Shot Reinforcement Learning | | 0 |
| SAIL: Test-Time Scaling for In-Context Imitation Learning with VLM | | 0 |
| Interactive World Simulator for Robot Policy Training and Evaluation | | 0 |
| TIDE: Text-Informed Dynamic Extrapolation with Step-Aware Temperature Control for Diffusion Transformers | | 0 |
| A Lightweight Traffic Map for Efficient Anytime LaCAM* | | 0 |
| Adaptive Entropy-Driven Sensor Selection in a Camera-LiDAR Particle Filter for Single-Vessel Tracking | | 0 |
| Trust via Reputation of Conviction | | 0 |
| Boosting MLLM Spatial Reasoning with Geometrically Referenced 3D Scene Representations | | 0 |
| Momentum SVGD-EM for Accelerated Maximum Marginal Likelihood Estimation | | 0 |
| Sign Identifiability of Causal Effects in Stationary Stochastic Dynamical Systems | | 0 |
| Not All Queries Need Deep Thought: CoFiCot for Adaptive Coarse-to-fine Stateful Refinement | | 0 |
| LycheeCluster: Efficient Long-Context Inference with Structure-Aware Chunking and Hierarchical KV Indexing | | 0 |
| Generative Adversarial Regression (GAR): Learning Conditional Risk Scenarios | | 0 |
| FOMO-3D: Using Vision Foundation Models for Long-Tailed 3D Object Detection | | 0 |
| SkipGS: Post-Densification Backward Skipping for Efficient 3DGS Training | | 0 |
| MERIT Feedback Elicits Better Bargaining in LLM Negotiators | | 0 |
| Scale Space Diffusion | | 1 |
| Automated Thematic Analysis for Clinical Qualitative Data: Iterative Codebook Refinement with Full Provenance | | 0 |