SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 8476–8500 of 474,278 papers

Title | Status | Hype
VisuRiddles: Fine-grained Perception is a Primary Bottleneck for Multimodal Large Language Models in Abstract Visual Reasoning | Code | 0
Polyline Path Masked Attention for Vision Transformer | Code | 0
ComputerRL: Scaling End-to-End Online Reinforcement Learning for Computer Use Agents | Code | 0
Taming the Judge: Deconflicting AI Feedback for Stable Reinforcement Learning | Code | 0
EMA-SAM: Exponential Moving-average for SAM-based PTMC Segmentation | Code | 0
NTKMTL: Mitigating Task Imbalance in Multi-Task Learning from Neural Tangent Kernel Perspective | Code | 0
Beyond Single Models: Mitigating Multimodal Hallucinations via Adaptive Token Ensemble Decoding | Code | 0
Training Diverse Graph Experts for Ensembles: A Systematic Empirical Study | Code | 0
Learning with Dual-level Noisy Correspondence for Multi-modal Entity Alignment | Code | 0
Variance-Reduction Guidance: Sampling Trajectory Optimization for Diffusion Models | Code | 0
2D_3D Feature Fusion via Cross-Modal Latent Synthesis and Attention Guided Restoration for Industrial Anomaly Detection | Code | 0
OmniCast: A Masked Latent Diffusion Model for Weather Forecasting Across Time Scales | Code | 0
CourtGuard: A Local, Multiagent Prompt Injection Classifier | Code | 0
SSL4Eco: A Global Seasonal Dataset for Geospatial Foundation Models in Ecology | Code | 0
CARDIUM: Congenital Anomaly Recognition with Diagnostic Images and Unified Medical records | Code | 0
REACT-KD: Region-Aware Cross-modal Topological Knowledge Distillation for Interpretable Medical Image Classification | Code | 0
UniRL-Zero: Reinforcement Learning on Unified Models with Joint Language Model and Diffusion Model Experts | Code | 0
Is Multilingual LLM Watermarking Truly Multilingual? A Simple Back-Translation Solution | - | 0
Styl3R: Instant 3D Stylized Reconstruction for Arbitrary Scenes and Styles | - | 0
Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning | - | 0
ReDi: Rectified Discrete Flow | Code | 0
VimoRAG: Video-based Retrieval-augmented 3D Motion Generation for Motion Language Models | - | 0
From AI for Science to Agentic Science: A Survey on Autonomous Scientific Discovery | - | 0
Nearest-Class Mean and Logits Agreement for Wildlife Open-Set Recognition | Code | 0
Towards Mixed-Modal Retrieval for Universal Retrieval-Augmented Generation | - | 0