SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 6151–6175 of 474,278 papers

Title | Status | Hype
SARChat-Bench-2M: A Multi-Task Vision-Language Benchmark for SAR Image Interpretation | Code | 2
Human-Centric Foundation Models: Perception, Generation and Agentic Modeling | Code | 2
A Systematic Review on the Evaluation of Large Language Models in Theory of Mind Tasks | Code | 2
ViLa-MIL: Dual-scale Vision-Language Multiple Instance Learning for Whole Slide Image Classification | Code | 2
The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks | Code | 2
WorldGUI: An Interactive Benchmark for Desktop GUI Automation from Any Starting Point | Code | 2
Brain Latent Progression: Individual-based Spatiotemporal Disease Progression on 3D Brain MRIs via Latent Diffusion | Code | 2
Fino1: On the Transferability of Reasoning Enhanced LLMs to Finance | Code | 2
LIR-LIVO: A Lightweight, Robust LiDAR/Vision/Inertial Odometry with Illumination-Resilient Deep Features | Code | 2
Cluster and Predict Latents Patches for Improved Masked Image Modeling | Code | 2
mmE5: Improving Multimodal Multilingual Embeddings via High-quality Synthetic Data | Code | 2
TLOB: A Novel Transformer Model with Dual Attention for Price Trend Prediction with Limit Order Book Data | Code | 2
MeshSplats: Mesh-Based Rendering with Gaussian Splatting Initialization | Code | 2
DPO-Shift: Shifting the Distribution of Direct Preference Optimization | Code | 2
LASP-2: Rethinking Sequence Parallelism for Linear Attention and Its Hybrid | Code | 2
Automated Capability Discovery via Model Self-Exploration | Code | 2
TextAtlas5M: A Large-scale Dataset for Dense Text Image Generation | Code | 2
Less is More: Masking Elements in Image Condition Features Avoids Content Leakages in Style Transfer Diffusion Models | Code | 2
Semi-Supervised Vision-Centric 3D Occupancy World Model for Autonomous Driving | Code | 2
Training Deep Learning Models with Norm-Constrained LMOs | Code | 2
RoboBERT: An End-to-end Multimodal Robotic Manipulation Model | Code | 2
On the Emergence of Thinking in LLMs I: Searching for the Right Intuition | Code | 2
MaterialFusion: High-Quality, Zero-Shot, and Controllable Material Transfer with Diffusion Models | Code | 2
SAMRefiner: Taming Segment Anything Model for Universal Mask Refinement | Code | 2
KARMA: Leveraging Multi-Agent LLMs for Automated Knowledge Graph Enrichment | Code | 2