SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 3651–3700 of 659,983 papers

Title | Status | Hype
MobileLLM-Flash: Latency-Guided On-Device LLM Design for Industry Scale | | 0
AGCD: Agent-Guided Cross-Modal Decoding for Weather Forecasting | | 0
Understanding Reasoning in LLMs through Strategic Information Allocation under Uncertainty | | 0
Learnability with Partial Labels and Adaptive Nearest Neighbors | | 0
Semi-Autonomous Formalization of the Vlasov-Maxwell-Landau Equilibrium | | 1
EndoCoT: Scaling Endogenous Chain-of-Thought Reasoning in Diffusion Models | | 1
Revisiting Model Stitching In the Foundation Model Era | | 0
Neural Value Iteration | | 0
Self-Supervised ImageNet Representations for In Vivo Confocal Microscopy: Tortuosity Grading without Segmentation Maps | | 0
Pretraining and Benchmarking Modern Encoders for Latvian | | 0
Deep Reinforcement Learning for Fano Hypersurfaces | | 0
BIS Reasoning 1.0: The First Large-Scale Japanese Benchmark for Belief-Inconsistent Syllogistic Reasoning | | 0
Seismic full-waveform inversion based on a physics-driven generative adversarial network | | 0
SRL-MAD: Structured Residual Latents for One-Class Morphing Attack Detection | | 0
Establishing Construct Validity in LLM Capability Benchmarks Requires Nomological Networks | | 0
Predictive Uncertainty in Short-Term PV Forecasting under Missing Data: A Multiple Imputation Approach | | 0
Are LLMs Good Text Diacritizers? An Arabic and Yoruba Case Study | | 0
T-FIX: Text-Based Explanations with Features Interpretable to eXperts | | 0
Self Voice Conversion as an Attack against Neural Audio Watermarking | | 0
CRASH: Cognitive Reasoning Agent for Safety Hazards in Autonomous Driving | | 0
Generative Semantic HARQ: Latent-Space Text Retransmission and Combining | | 0
Towards Foundation Models for Consensus Rank Aggregation | | 0
Bridging National and International Legal Data: Two Projects Based on the Japanese Legal Standard XML Schema for Comparative Law Studies | | 0
Coherent Audio-Visual Editing via Conditional Audio Generation Following Video Edits | | 0
InterPol: De-anonymizing LM Arena via Interpolated Preference Learning | | 0
DS^2-Instruct: Domain-Specific Data Synthesis for Large Language Models Instruction Tuning | | 0
BayesBreak: Generalized Hierarchical Bayesian Segmentation with Irregular Designs, Multi-Sample Hierarchies, and Grouped/Latent-Group Designs | | 0
CLRNet: Targetless Extrinsic Calibration for Camera, Lidar and 4D Radar Using Deep Learning | | 0
Algorithmic Trading Strategy Development and Optimisation | | 0
DAST: A Dual-Stream Voice Anonymization Attacker with Staged Training | | 0
Fuz-RL: A Fuzzy-Guided Robust Framework for Safe Reinforcement Learning under Uncertainty | | 0
Efficient Story Point Estimation With Comparative Learning | | 0
LLM-Driven Instance-Specific Heuristic Generation and Selection | | 0
Multiresolution Analysis and Statistical Thresholding on Dynamic Networks | | 0
Convergence and clustering analysis for Mean Shift with radially symmetric, positive definite kernels | | 0
WaRA: Wavelet Low Rank Adaptation | Code | 0
Disentangled Feature Importance | | 0
SpatialViz-Bench: A Cognitively-Grounded Benchmark for Diagnosing Spatial Visualization in MLLMs | | 0
Data-Efficient ASR Personalization for Non-Normative Speech Using an Uncertainty-Based Phoneme Difficulty Score for Guided Sampling | | 0
Chart-R1: Chain-of-Thought Supervision and Reinforcement for Advanced Chart Reasoner | | 0
Cropping outperforms dropout as an augmentation strategy for self-supervised training of text embeddings | | 0
STEMTOX: From Social Tags to Fine-Grained Toxic Meme Detection via Entropy-Guided Multi-Task Learning | | 0
Towards Privacy-Preserving Machine Translation at the Inference Stage: A New Task and Benchmark | | 0
Benchmarking LLM-based agents for single-cell omics analysis | | 0
Surgical Video Understanding with Label Interpolation | | 0
EMMA: Generalizing Real-World Robot Manipulation via Generative Visual Transfer | | 0
Grounded Misunderstandings in Asymmetric Dialogue: A Perspectivist Annotation Scheme for MapTask | | 0
Collaborating Vision, Depth, and Thermal Signals for Multi-Modal Tracking: Dataset and Algorithm | | 0
YOLO26: Key Architectural Enhancements and Performance Benchmarking for Real-Time Object Detection | | 0
Convergence of Distributionally Robust Q-Learning with Linear Function Approximation | | 0