SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers, 248,326 code links, 4,818 tasks

Papers

Showing 12201–12250 of 661570 papers

Title | Status | Hype
Accelerating Video Generation Inference with Sequential-Parallel 3D Positional Encoding Using a Global Time Index | | 0
SJD-PV: Speculative Jacobi Decoding with Phrase Verification for Autoregressive Image Generation | | 0
Leveraging Model Soups to Classify Intangible Cultural Heritage Images from the Mekong Delta | | 0
EnsAug: Augmentation-Driven Ensembles for Human Motion Sequence Analysis | | 0
Graph-of-Mark: Promote Spatial Reasoning in Multimodal Language Models with Graph-Based Visual Prompting | | 0
Better Eyes, Better Thoughts: Why Vision Chain-of-Thought Fails in Medicine | Code | 0
Diagnosing Retrieval vs. Utilization Bottlenecks in LLM Agent Memory | Code | 0
Large Language Models as Bidding Agents in Repeated HetNet Auction | | 0
On the Reliability of AI Methods in Drug Discovery: Evaluation of Boltz-2 for Structure and Binding Affinity Prediction | | 0
A Neural Network-Based Real-time Casing Collar Recognition System for Downhole Instruments | | 0
What Helps---and What Hurts: Bidirectional Explanations for Vision Transformers | | 0
Extracting Training Dialogue Data from Large Language Model based Task Bots | | 0
Bridging the Reproducibility Divide: Open Source Software's Role in Standardizing Healthcare AI | | 0
Sleeper Cell: Injecting Latent Malice Temporal Backdoors into Tool-Using LLMs | | 0
The Theory behind UMAP? | | 0
Selecting Optimal Variable Order in Autoregressive Ising Models | | 0
Manifold Aware Denoising Score Matching (MAD) | | 0
Contextual Drag: How Errors in the Context Affect LLM Reasoning | | 0
Language steering in latent space to mitigate unintended code-switching | | 0
Federated Nonlinear System Identification | | 0
WhisperNet: A Scalable Solution for Bandwidth-Efficient Collaboration | | 0
Quantifying Conversational Reliability of Large Language Models under Multi-Turn Interaction | | 0
Practical Deep Heteroskedastic Regression | | 0
MobileMold: A Smartphone-Based Microscopy Dataset for Food Mold Detection | | 0
The Hidden Width of Deep ResNets: Tight Error Bounds and Phase Diagram | | 0
Proceedings for the Inaugural Meeting of the International Society for Tractography -- IST 2025 Bordeaux | | 0
Temporal Imbalance of Positive and Negative Supervision in Class-Incremental Learning | | 0
DRAGON: LLM-Driven Decomposition and Reconstruction Agents for Large-Scale Combinatorial Optimization | | 0
Post-training Large Language Models for Diverse High-Quality Responses | | 0
Ignore All Previous Instructions: Jailbreaking as a de-escalatory peace building practise to resist LLM social media bots | | 0
VMDNet: Temporal Leakage-Free Variational Mode Decomposition for Electricity Demand Forecasting | | 0
From Verbatim to Gist: Distilling Pyramidal Multimodal Memory via Semantic Information Bottleneck for Long-Horizon Video Agents | Code | 0
FireRed-OCR Technical Report | | 3
Hard-constraint physics-residual networks enable robust extrapolation for hydrogen crossover prediction in PEM water electrolyzers | | 0
AIRMap: AI-Generated Radio Maps for Wireless Digital Twins | | 0
Transform-Invariant Generative Ray Path Sampling for Efficient Radio Propagation Modeling | Code | 0
RA-Det: Towards Universal Detection of AI-Generated Images via Robustness Asymmetry | | 0
Uniform-in-time concentration in two-layer neural networks via transportation inequalities | | 0
TiledAttention: a CUDA Tile SDPA Kernel for PyTorch | | 0
Accelerating Single-Pass SGD for Generalized Linear Prediction | | 0
GLoRIA: Gated Low-Rank Interpretable Adaptation for Dialectal ASR | | 0
Recursive Models for Long-Horizon Reasoning | | 0
REMS: a unified solution representation, problem modeling and metaheuristic algorithm design for general combinatorial optimization problems | | 0
Beyond RLHF and NLHF: Population-Proportional Alignment under an Axiomatic Framework | | 0
Transmit Weights, Not Features: Orthogonal-Basis Aided Wireless Point-Cloud Transmission | | 0
Gender Bias in Emotion Recognition by Large Language Models | | 0
NAB: Neural Adaptive Binning for Sparse-View CT reconstruction | Code | 0
WAXAL: A Large-Scale Multilingual African Language Speech Corpus | | 0
nchellwig at SemEval-2026 Task 3: Self-Consistent Structured Generation (SCSG) for Dimensional Aspect-Based Sentiment Analysis using Large Language Models | | 0
CoVAE: correlated multimodal generative modeling | | 0
Page 245 of 13232