SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 9451-9475 of 474278 papers

Title | Status | Hype
PuzzleVQA: Diagnosing Multimodal Reasoning Challenges of Language Models with Abstract Visual Patterns | Code | 2
Consistent Diffusion Meets Tweedie: Training Exact Ambient Diffusion Models with Noisy Data | Code | 2
H-vmunet: High-order Vision Mamba UNet for Medical Image Segmentation | Code | 2
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models | Code | 2
Chain-of-Spot: Interactive Reasoning Improves Large Vision-Language Models | Code | 2
VL-ICL Bench: The Devil in the Details of Multimodal In-Context Learning | Code | 2
JaxUED: A simple and useable UED library in Jax | Code | 2
Lifting Multi-View Detection and Tracking to the Bird's Eye View | Code | 2
Tuning-Free Image Customization with Image and Text Guidance | Code | 2
Pretraining Codomain Attention Neural Operators for Solving Multiphysics PDEs | Code | 2
Embodied LLM Agents Learn to Cooperate in Organized Teams | Code | 2
Better Call SAL: Towards Learning to Segment Anything in Lidar | Code | 2
You Only Sample Once: Taming One-Step Text-to-Image Synthesis by Self-Cooperative Diffusion GANs | Code | 2
ViTGaze: Gaze Following with Interaction Features in Vision Transformers | Code | 2
Cross-Domain Pre-training with Language Models for Transferable Time Series Representations | Code | 2
Equivariant Ensembles and Regularization for Reinforcement Learning in Map-based Path Planning | Code | 2
Discover and Mitigate Multiple Biased Subgroups in Image Classifiers | Code | 2
Optimal Flow Matching: Learning Straight Trajectories in Just One Step | Code | 2
Task-Customized Mixture of Adapters for General Image Fusion | Code | 2
Advancing Time Series Classification with Multimodal Language Modeling | Code | 2
Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data Flow and Per-Block Quantization | Code | 2
FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis | Code | 2
Generative Enhancement for 3D Medical Images | Code | 2
RouterBench: A Benchmark for Multi-LLM Routing System | Code | 2
GaussNav: Gaussian Splatting for Visual Navigation | Code | 2