SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 6101–6150 of 177340 papers

Title | Status | Hype
Disentangling Length from Quality in Direct Preference Optimization | Code | 2
WaveMixSR-V2: Enhancing Super-resolution with Higher Efficiency | Code | 2
Score identity Distillation: Exponentially Fast Distillation of Pretrained Diffusion Models for One-Step Generation | Code | 2
SymbolFit: Automatic Parametric Modeling with Symbolic Regression | Code | 2
Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios | Code | 2
An open dataset for oracle bone script recognition and decipherment | Code | 2
CLIP-DINOiser: Teaching CLIP a few DINO tricks for open-vocabulary semantic segmentation | Code | 2
Agent4Edu: Generating Learner Response Data by Generative Agents for Intelligent Education Systems | Code | 2
EyeCLIP: A visual-language foundation model for multi-modal ophthalmic image analysis | Code | 2
Doob's Lagrangian: A Sample-Efficient Variational Approach to Transition Path Sampling | Code | 2
LLM-ESR: Large Language Models Enhancement for Long-tailed Sequential Recommendation | Code | 2
VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos | Code | 2
Reward Design with Language Models | Code | 2
Vision-Language Model for Object Detection and Segmentation: A Review and Evaluation | Code | 2
Relevance-guided Supervision for OpenQA with ColBERT | Code | 2
Modern Evolution Strategies for Creativity: Fitting Concrete Images and Abstract Concepts | Code | 2
Mixed-curvature decision trees and random forests | Code | 2
XLB: A differentiable massively parallel lattice Boltzmann library in Python | Code | 2
OmniXAI: A Library for Explainable AI | Code | 2
Time-MMD: Multi-Domain Multimodal Dataset for Time Series Analysis | Code | 2
AI-Driven Automation Can Become the Foundation of Next-Era Science of Science Research | Code | 2
Tenrec: A Large-scale Multipurpose Benchmark Dataset for Recommender Systems | Code | 2
Learned Image Compression with Dictionary-based Entropy Model | Code | 2
Context Autoencoder for Self-Supervised Representation Learning | Code | 2
Cross-Domain Few-Shot Object Detection via Enhanced Open-Set Object Detector | Code | 2
SSL4EO-S12: A Large-Scale Multi-Modal, Multi-Temporal Dataset for Self-Supervised Learning in Earth Observation | Code | 2
Audio-FLAN: A Preliminary Release | Code | 2
FanOutQA: A Multi-Hop, Multi-Document Question Answering Benchmark for Large Language Models | Code | 2
VCoder: Versatile Vision Encoders for Multimodal Large Language Models | Code | 2
RectifID: Personalizing Rectified Flow with Anchored Classifier Guidance | Code | 2
3D Gaussian Splatting with Deferred Reflection | Code | 2
Centroid-Based Efficient Minimum Bayes Risk Decoding | Code | 2
VectorMapNet: End-to-end Vectorized HD Map Learning | Code | 2
SCTransNet: Spatial-channel Cross Transformer Network for Infrared Small Target Detection | Code | 2
Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data Flow and Per-Block Quantization | Code | 2
TinyLVLM-eHub: Towards Comprehensive and Efficient Evaluation for Large Vision-Language Models | Code | 2
Target-Driven Distillation: Consistency Distillation with Target Timestep Selection and Decoupled Guidance | Code | 2
Measuring Re-identification Risk | Code | 2
DiffuseVAE: Efficient, Controllable and High-Fidelity Generation from Low-Dimensional Latents | Code | 2
RingFormer: A Neural Vocoder with Ring Attention and Convolution-Augmented Transformer | Code | 2
Transformer-Based Visual Segmentation: A Survey | Code | 2
Scalable Multi-Temporal Remote Sensing Change Data Generation via Simulating Stochastic Change Process | Code | 2
MOMAland: A Set of Benchmarks for Multi-Objective Multi-Agent Reinforcement Learning | Code | 2
CrossFormer++: A Versatile Vision Transformer Hinging on Cross-scale Attention | Code | 2
YOLOPoint Joint Keypoint and Object Detection | Code | 2
chemtrain: Learning Deep Potential Models via Automatic Differentiation and Statistical Physics | Code | 2
VeriThinker: Learning to Verify Makes Reasoning Model Efficient | Code | 2
Colar: Effective and Efficient Online Action Detection by Consulting Exemplars | Code | 2
InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models | Code | 2
MeMOTR: Long-Term Memory-Augmented Transformer for Multi-Object Tracking | Code | 2