SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 7976-8000 of 474,278 papers

Title | Status | Hype
MM-OPERA: Benchmarking Open-ended Association Reasoning for Large Vision-Language Models | Code | 0
Do Vision-Language Models Measure Up? Benchmarking Visual Measurement Reading with MeasureBench | | 0
Scaling Tractable Probabilistic Circuits: A Systems Perspective | Code | 0
Incremental Human-Object Interaction Detection with Invariant Relation Representation Learning | Code | 0
SpinalSAM-R1: A Vision-Language Multimodal Interactive System for Spine CT Segmentation | Code | 0
Loquetier: A Virtualized Multi-LoRA Framework for Unified LLM Fine-tuning and Serving | Code | 0
The Denario project: Deep knowledge AI agents for scientific discovery | Code | 0
Cross-view Localization and Synthesis -- Datasets, Challenges and Opportunities | Code | 0
D-HUMOR: Dark Humor Understanding via Multimodal Open-ended Reasoning -- A Benchmark Dataset and Method | Code | 0
Defeating the Training-Inference Mismatch via FP16 | | 0
From One to More: Contextual Part Latents for 3D Generation | | 0
Locality in Image Diffusion Models Emerges from Data Statistics | | 0
UNO-Bench: A Unified Benchmark for Exploring the Compositional Law Between Uni-modal and Omni-modal in Omni Models | | 0
Multi-Agent Evolve: LLM Self-Improve through Co-evolution | | 0
Lost in Tokenization: Context as the Key to Unlocking Biomolecular Understanding in Scientific LLMs | Code | 0
EgoExo-Con: Exploring View-Invariant Video Temporal Understanding | | 0
FullPart: Generating each 3D Part at Full Resolution | | 0
Spiking Patches: Asynchronous, Sparse, and Efficient Tokens for Event Cameras | | 0
AMO-Bench: Large Language Models Still Struggle in High School Math Competitions | | 0
OmniX: From Unified Panoramic Generation and Perception to Graphics-Ready 3D Scenes | | 0
Are Video Models Ready as Zero-Shot Reasoners? An Empirical Study with the MME-CoF Benchmark | | 0
The Quest for Generalizable Motion Generation: Data, Model, and Evaluation | | 0
C-LoRA: Contextual Low-Rank Adaptation for Uncertainty Estimation in Large Language Models | Code | 0
Smoothing Slot Attention Iterations and Recurrences | Code | 0
Holographic Transformers for Complex-Valued Signal Processing: Integrating Phase Interference into Self-Attention | Code | 0
Page 320 of 18972