SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 6501-6525 of 474,278 papers

Title | Status | Hype
GaussianBlender: Instant Stylization of 3D Gaussians with Disentangled Latent Spaces | | 0
MemOS: A Memory OS for AI System | | 0
M3DR: Towards Universal Multilingual Multimodal Document Retrieval | | 0
FireSentry: A Multi-Modal Spatio-temporal Benchmark Dataset for Fine-Grained Wildfire Spread Forecasting | Code | 0
MathBode: Measuring the Stability of LLM Reasoning using Frequency Response | | 0
FantasyStyle: Controllable Stylized Distillation for 3D Gaussian Splatting | Code | 0
3D and 4D World Modeling: A Survey | Code | 0
Jupiter: Enhancing LLM Data Analysis Capabilities via Notebook and Inference-Time Value-Guided Search | Code | 0
Revisiting Data Challenges of Computational Pathology: A Pack-based Multiple Instance Learning Training Framework | Code | 0
AutoEnv: Automated Environments for Measuring Cross-Environment Agent Learning | Code | 0
ProtoEFNet: Dynamic Prototype Learning for Inherently Interpretable Ejection Fraction Estimation in Echocardiography | Code | 0
Multi-Aspect Knowledge-Enhanced Medical Vision-Language Pretraining with Multi-Agent Data Generation | Code | 0
HBFormer: A Hybrid-Bridge Transformer for Microtumor and Miniature Organ Segmentation | Code | 0
Optical Context Compression Is Just (Bad) Autoencoding | Code | 0
CuES: A Curiosity-driven and Environment-grounded Synthesis Framework for Agentic RL | Code | 0
S5: Scalable Semi-Supervised Semantic Segmentation in Remote Sensing | Code | 0
Context Cascade Compression: Exploring the Upper Limits of Text Compression | Code | 0
Cross-Stain Contrastive Learning for Paired Immunohistochemistry and Histopathology Slide Representation Learning | Code | 0
SELF: A Robust Singular Value and Eigenvalue Approach for LLM Fingerprinting | Code | 0
ToG-Bench: Task-Oriented Spatio-Temporal Grounding in Egocentric Videos | Code | 0
NavMapFusion: Diffusion-based Fusion of Navigation Maps for Online Vectorized HD Map Construction | Code | 0
Virtual Parameter Sharpening: Dynamic Low-Rank Perturbations for Inference-Time Reasoning Enhancement | Code | 0
RULER-Bench: Probing Rule-based Reasoning Abilities of Next-level Video Generation Models for Vision Foundation Intelligence | | 0
ASCIIBench: Evaluating Language-Model-Based Understanding of Visually-Oriented Text | Code | 0
Steering Vision-Language-Action Models as Anti-Exploration: A Test-Time Scaling Approach | | 0