SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 7226–7250 of 177,340 papers

Title | Status | Hype
FanOutQA: A Multi-Hop, Multi-Document Question Answering Benchmark for Large Language Models | Code | 2
VCoder: Versatile Vision Encoders for Multimodal Large Language Models | Code | 2
RectifID: Personalizing Rectified Flow with Anchored Classifier Guidance | Code | 2
3D Gaussian Splatting with Deferred Reflection | Code | 2
Centroid-Based Efficient Minimum Bayes Risk Decoding | Code | 2
VectorMapNet: End-to-end Vectorized HD Map Learning | Code | 2
SCTransNet: Spatial-channel Cross Transformer Network for Infrared Small Target Detection | Code | 2
Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data Flow and Per-Block Quantization | Code | 2
TinyLVLM-eHub: Towards Comprehensive and Efficient Evaluation for Large Vision-Language Models | Code | 2
Target-Driven Distillation: Consistency Distillation with Target Timestep Selection and Decoupled Guidance | Code | 2
Measuring Re-identification Risk | Code | 2
DiffuseVAE: Efficient, Controllable and High-Fidelity Generation from Low-Dimensional Latents | Code | 2
RingFormer: A Neural Vocoder with Ring Attention and Convolution-Augmented Transformer | Code | 2
Transformer-Based Visual Segmentation: A Survey | Code | 2
Scalable Multi-Temporal Remote Sensing Change Data Generation via Simulating Stochastic Change Process | Code | 2
MOMAland: A Set of Benchmarks for Multi-Objective Multi-Agent Reinforcement Learning | Code | 2
CrossFormer++: A Versatile Vision Transformer Hinging on Cross-scale Attention | Code | 2
YOLOPoint Joint Keypoint and Object Detection | Code | 2
chemtrain: Learning Deep Potential Models via Automatic Differentiation and Statistical Physics | Code | 2
VeriThinker: Learning to Verify Makes Reasoning Model Efficient | Code | 2
Colar: Effective and Efficient Online Action Detection by Consulting Exemplars | Code | 2
InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models | Code | 2
MeMOTR: Long-Term Memory-Augmented Transformer for Multi-Object Tracking | Code | 2
GaussianWorld: Gaussian World Model for Streaming 3D Occupancy Prediction | Code | 2
ZAPBench: A Benchmark for Whole-Brain Activity Prediction in Zebrafish | Code | 2
Page 290 of 7094