SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 7726-7750 of 474,278 papers

Title | Status | Hype
Progressive Inference-Time Annealing of Diffusion Models for Sampling from Boltzmann Densities | Code | 0
Less Greedy Equivalence Search | Code | 0
KARMA: Efficient Structural Defect Segmentation via Kolmogorov-Arnold Representation Learning | Code | 0
AWEMixer: Adaptive Wavelet-Enhanced Mixer Network for Long-Term Time Series Forecasting | Code | 0
Data Efficiency and Transfer Robustness in Biomedical Image Segmentation: A Study of Redundancy and Forgetting with Cellpose | Code | 0
Learning from Online Videos at Inference Time for Computer-Use Agents | Code | 0
MIND: Material Interface Generation from UDFs for Non-Manifold Surface Reconstruction | Code | 0
Advanced Sign Language Video Generation with Compressed and Quantized Multi-Condition Tokenization | Code | 0
CorPipe at CRAC 2025: Evaluating Multilingual Encoders for Multilingual Coreference Resolution | Code | 0
SurgViVQA: Temporally-Grounded Video Question Answering for Surgical Scene Understanding | Code | 0
Tortoise and Hare Guidance: Accelerating Diffusion Model Inference with Multirate Integration | Code | 0
Text to Sketch Generation with Multi-Styles | Code | 0
MedSapiens: Taking a Pose to Rethink Medical Imaging Landmark Detection | Code | 0
An Active Learning Pipeline for Biomedical Image Instance Segmentation with Minimal Human Intervention | Code | 0
Adversarial and Score-Based CT Denoising: CycleGAN vs Noise2Score | Code | 0
Carousel: A High-Resolution Dataset for Multi-Target Automatic Image Cropping | Code | 0
Learning Interestingness in Automated Mathematical Theory Formation | Code | 0
ROSBag MCP Server: Analyzing Robot Data with LLMs for Agentic Embodied AI Applications | Code | 0
Towards Formalizing Reinforcement Learning Theory | Code | 0
Sparse-dLLM: Accelerating Diffusion LLMs with Dynamic Cache Eviction | Code | 0
SOLVE-Med: Specialized Orchestration for Leading Vertical Experts across Medical Specialties | Code | 0
Voost: A Unified and Scalable Diffusion Transformer for Bidirectional Virtual Try-On and Try-Off |  | 0
Generative View Stitching |  | 0
LiveTradeBench: Seeking Real-World Alpha with Large Language Models |  | 0
TIR-Bench: A Comprehensive Benchmark for Agentic Thinking-with-Images Reasoning |  | 0