SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 7151 to 7200 of 661,570 papers

Title | Status | Hype
When LLM Judge Scores Look Good but Best-of-N Decisions Fail | | 0
Multi-Station WiFi CSI Sensing Framework Robust to Station-wise Feature Missingness and Limited Labeled Data | | 0
Fair Learning for Bias Mitigation and Quality Optimization in Paper Recommendation | | 0
KnowVal: A Knowledge-Augmented and Value-Guided Autonomous Driving System | | 0
Geometry-Aware Probabilistic Circuits via Voronoi Tessellations | | 0
RF4D: Neural Radar Fields for Novel View Synthesis in Outdoor Dynamic Scenes | | 0
Hope Speech Detection in code-mixed Roman Urdu tweets: A Positive Turn in Natural Language Processing | | 0
Adaptive Dual-Constrained Line Aggregation for Robust Generic and Wireframe Line Segment Detection | | 0
On the Theoretical Limitations of Embedding-Based Retrieval | | 4
Disentangling Slow and Fast Temporal Dynamics in Degradation Inference with Hierarchical Differential Models | | 0
ManiVID-3D: Generalizable View-Invariant Reinforcement Learning for Robotic Manipulation via Disentangled 3D Representations | | 0
NormGenesis: Multicultural Dialogue Generation via Exemplar-Guided Social Norm Modeling and Violation Recovery | | 0
Critique-Coder: Enhancing Coder Models by Critique Reinforcement Learning | | 1
Ultra-Fast Language Generation via Discrete Diffusion Divergence Instruct | | 2
Contrastive Diffusion Guidance for Spatial Inverse Problems | | 0
Refereed Learning | | 0
ReSplat: Learning Recurrent Gaussian Splatting | | 0
Understanding and Optimizing Attention-Based Sparse Matching for Diverse Local Features | | 0
DriveCritic: Towards Context-Aware, Human-Aligned Evaluation for Autonomous Driving with Vision-Language Models | | 0
See4D: Pose-Free 4D Generation via Auto-Regressive Video Inpainting | | 0
FrugalPrompt: Reducing Contextual Overhead in Large Language Models via Token Attribution | | 0
A Foundational Theory of Quantitative Abstraction: Adjunctions, Duality, and Logic for Probabilistic Systems | | 0
More Than Memory Savings: Zeroth-Order Optimization Mitigates Forgetting in Continual Learning | | 0
Belief Dynamics Reveal the Dual Nature of In-Context Learning and Activation Steering | | 0
Adaptive Hyperbolic Kernels: Modulated Embedding in de Branges-Rovnyak Spaces | | 0
Quality Assurance of LLM-generated Code: Addressing Non-Functional Quality Characteristics | | 0
Defending Unauthorized Model Merging via Dual-Stage Weight Protection | | 0
Mobile-Agent-RAG: Driving Smart Multi-Agent Coordination with Contextual Knowledge Empowerment for Long-Horizon Mobile Automation | | 0
ConCISE: A Reference-Free Conciseness Evaluation Metric for LLM-Generated Answers | | 0
Radiative-Structured Neural Operator for Continuous Spectral Super-Resolution | | 0
Decoupling Perception from Reasoning for Hallucination-Resistant Video Understanding | | 0
Beyond Description: Cognitively Benchmarking Fine-Grained Action for Embodied Agents | | 0
Contact-Aware Refinement of Human Pose Pseudo-Ground Truth via Bioimpedance Sensing | | 0
LoC-Path: Learning to Compress for Pathology Multimodal Large Language Models | | 0
Forests of Uncertaint(r)ees: Using tree-based ensembles to estimate probability distributions of future conflict | | 0
Value Under Ignorance in Universal Artificial Intelligence | | 0
SDUM: A Scalable Deep Unrolled Model for Universal MRI Reconstruction | | 0
Resurfacing Paralinguistic Awareness in Large Audio Language Models | | 0
Don't Mind the Gaps: Implicit Neural Representations for Resolution-Agnostic Retinal OCT Analysis | | 0
Beyond the Black Box: A Survey on the Theory and Mechanism of Large Language Models | | 0
Prompting Underestimates LLM Capability for Time Series Classification | | 0
Provably Finding a Hidden Dense Submatrix among Many Planted Dense Submatrices via Convex Programming | | 0
LLMTrack: Semantic Multi-Object Tracking with Multi-modal Large Language Models | | 0
Learning Through Dialogue: Engagement and Efficacy Matter More Than Explanations | | 0
PosIR: Position-Aware Heterogeneous Information Retrieval Benchmark | | 0
Energy-Aware Metaheuristics | | 0
A Learnable Wavelet Transformer for Long-Short Equity Trading and Risk-Adjusted Return Optimization | | 0
Do LLMs Truly Benefit from Longer Context in Automatic Post-Editing? | | 0
Generating a Paracosm for Training-Free Zero-Shot Composed Image Retrieval | | 0
BLOCK: An Open-Source Bi-Stage MLLM Character-to-Skin Pipeline for Minecraft | | 0
Page 144 of 13,232