SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 8251–8275 of 474,278 papers

Title | Status | Hype
SyncVSR: Data-Efficient Visual Speech Recognition with End-to-End Crossmodal Audio Token Synchronization | Code | 2
Holmes-VAD: Towards Unbiased and Explainable Video Anomaly Detection via Multi-modal LLM | Code | 2
PlanRAG: A Plan-then-Retrieval Augmented Generation for Generative Large Language Models as Decision Makers | Code | 2
SWT-Bench: Testing and Validating Real-World Bug-Fixes with Code Agents | Code | 2
Duoduo CLIP: Efficient 3D Understanding with Multi-View Images | Code | 2
DiffMM: Multi-Modal Diffusion Model for Recommendation | Code | 2
DistPred: A Distribution-Free Probabilistic Inference Method for Regression and Forecasting | Code | 2
Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning | Code | 2
Understanding Multi-Granularity for Open-Vocabulary Part Segmentation | Code | 2
Residual and bidirectional LSTM for epileptic seizure detection | Code | 2
Multimodal Needle in a Haystack: Benchmarking Long-Context Capability of Multimodal Large Language Models | Code | 2
Watch Every Step! LLM Agent Learning via Iterative Step-Level Process Refinement | Code | 2
A Robust Online Multi-Camera People Tracking System With Geometric Consistency and State-aware Re-ID Correction | Code | 2
Large Scale Transfer Learning for Tabular Data via Language Modeling | Code | 2
Transcoders Find Interpretable LLM Feature Circuits | Code | 2
Frozen CLIP: A Strong Backbone for Weakly Supervised Semantic Segmentation | Code | 2
Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging | Code | 2
MMDU: A Multi-Turn Multi-Image Dialog Understanding Benchmark and Instruction-Tuning Dataset for LVLMs | Code | 2
GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities | Code | 2
Zero-Shot Scene Change Detection | Code | 2
Interactive Evolution: A Neural-Symbolic Self-Training Framework For Large Language Models | Code | 2
OGNI-DC: Robust Depth Completion with Optimization-Guided Neural Iterations | Code | 2
GUICourse: From General Vision Language Models to Versatile GUI Agents | Code | 2
ISR-DPO: Aligning Large Multimodal Models for Videos by Iterative Self-Retrospective DPO | Code | 2
In-Context Editing: Learning Knowledge from Self-Induced Distributions | Code | 2
Page 331 of 18,972