SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 8076-8100 of 474,278 papers

Title | Status | Hype
VFIMamba: Video Frame Interpolation with State Space Models | Code | 2
Boosting Consistency in Story Visualization with Rich-Contextual Conditional Diffusion Models | Code | 2
Safety-Driven Deep Reinforcement Learning Framework for Cobots: A Sim2Real Approach | Code | 2
SeFlow: A Self-Supervised Scene Flow Method in Autonomous Driving | Code | 2
Benchmarking Predictive Coding Networks -- Made Simple | Code | 2
Centerline Boundary Dice Loss for Vascular Segmentation | Code | 2
E.T. the Exceptional Trajectories: Text-to-camera-trajectory generation with character awareness | Code | 2
SOOD++: Leveraging Unlabeled Data to Boost Oriented Object Detection | Code | 2
AutoFlow: Automated Workflow Generation for Large Language Model Agents | Code | 2
RegMix: Data Mixture as Regression for Language Model Pre-training | Code | 2
DCoM: Active Learning for All Learners | Code | 2
DiscoveryBench: Towards Data-Driven Discovery with Large Language Models | Code | 2
FairMedFM: Fairness Benchmarking for Medical Imaging Foundation Models | Code | 2
KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches | Code | 2
IBSEN: Director-Actor Agent Collaboration for Controllable and Interactive Drama Script Generation | Code | 2
Improving Diffusion Inverse Problem Solving with Decoupled Noise Annealing | Code | 2
Robust and Reliable Early-Stage Website Fingerprinting Attacks via Spatial-Temporal Distribution Analysis | Code | 2
DiffIR2VR-Zero: Zero-Shot Video Restoration with Diffusion-based Image Restoration Models | Code | 2
Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems | Code | 2
MMLongBench-Doc: Benchmarking Long-context Document Understanding with Visualizations | Code | 2
FORA: Fast-Forward Caching in Diffusion Transformer Acceleration | Code | 2
GalLoP: Learning Global and Local Prompts for Vision-Language Models | Code | 2
Learning 3D Gaussians for Extremely Sparse-View Cone-Beam CT Reconstruction | Code | 2
We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning? | Code | 2
Equivariant Diffusion Policy | Code | 2