SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 3226–3250 of 661,570 papers

Title | Status | Hype
Reasoning Gets Harder for LLMs Inside A Dialogue | | 0
Can Large Multimodal Models Inspect Buildings? A Hierarchical Benchmark for Structural Pathology Reasoning | | 0
Improving Generalization on Cybersecurity Tasks with Multi-Modal Contrastive Learning | | 0
Enhancing Hyperspace Analogue to Language (HAL) Representations via Attention-Based Pooling for Text Classification | | 0
Design-OS: A Specification-Driven Framework for Engineering System Design with a Control-Systems Design Case | | 0
Beyond Single Tokens: Distilling Discrete Diffusion Models via Discrete MMD | | 0
Semantic Token Clustering for Efficient Uncertainty Quantification in Large Language Models | | 0
Evaluating Evidence Grounding Under User Pressure in Instruction-Tuned Language Models | | 0
The Robot's Inner Critic: Self-Refinement of Social Behaviors through VLM-based Replanning | | 0
EgoForge: Goal-Directed Egocentric World Simulator | | 0
Learning Dynamic Belief Graphs for Theory-of-mind Reasoning | | 0
TinyML Enhances CubeSat Mission Capabilities | | 0
LagerNVS: Latent Geometry for Fully Neural Real-time Novel View Synthesis | | 0
AI Agents Can Already Autonomously Perform Experimental High Energy Physics | | 0
Adaptive Greedy Frame Selection for Long Video Understanding | | 0
VideoSeek: Long-Horizon Video Agent with Tool-Guided Seeking | | 0
Improving Image-to-Image Translation via a Rectified Flow Reformulation | | 0
MeanFlow Meets Control: Scaling Sampled-Data Control for Swarms | | 0
Deterministic Mode Proposals: An Efficient Alternative to Generative Sampling for Ambiguous Segmentation | | 0
LumosX: Relate Any Identities with Their Attributes for Personalized Video Generation | | 0
MME-CoF-Pro: Evaluating Reasoning Coherence in Video Generative Models with Text and Visual Hints | | 0
PFM-VEPAR: Prompting Foundation Models for RGB-Event Camera based Pedestrian Attribute Recognition | | 0
Graph-Informed Adversarial Modeling: Infimal Subadditivity of Interpolative Divergences | | 0
Layered Quantum Architecture Search for 3D Point Cloud Classification | | 0
Enhancing Alignment for Unified Multimodal Models via Semantically-Grounded Supervision | | 0
Page 130 of 26,463