SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 4051-4075 of 177,340 papers

Title | Status | Hype
Do generative video models understand physical principles? | Code | 3
Distance Adaptive Beam Search for Provably Accurate Graph-Based Nearest Neighbor Search | Code | 3
Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos | Code | 3
VISTA3D: Versatile Imaging SegmenTation and Annotation model for 3D Computed Tomography | Code | 3
Remote Sensing Temporal Vision-Language Models: A Comprehensive Survey | Code | 3
STG-Mamba: Spatial-Temporal Graph Learning via Selective State Space Model | Code | 3
Rethinking Evaluation Metrics of Open-Vocabulary Segmentaion | Code | 3
Trajectory Consistency Distillation: Improved Latent Consistency Distillation by Semi-Linear Consistency Function with Trajectory Mapping | Code | 3
ALLaVA: Harnessing GPT4V-Synthesized Data for Lite Vision-Language Models | Code | 3
ACE2: Accurately learning subseasonal to decadal atmospheric variability and forced responses | Code | 3
ESPnet-Codec: Comprehensive Training and Evaluation of Neural Codecs for Audio, Music, and Speech | Code | 3
KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents | Code | 3
Scaling Analysis of Interleaved Speech-Text Language Models | Code | 3
MM-Vet v2: A Challenging Benchmark to Evaluate Large Multimodal Models for Integrated Capabilities | Code | 3
GPU-accelerated Evolutionary Many-objective Optimization Using Tensorized NSGA-III | Code | 3
Docs2KG: Unified Knowledge Graph Construction from Heterogeneous Documents Assisted by Large Language Models | Code | 3
EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark | Code | 3
A Survey on LLM Test-Time Compute via Search: Tasks, LLM Profiling, Search Algorithms, and Relevant Frameworks | Code | 3
PyThaiNLP: Thai Natural Language Processing in Python | Code | 3
FRACTAL: An Ultra-Large-Scale Aerial Lidar Dataset for 3D Semantic Segmentation of Diverse Landscapes | Code | 3
A Survey of Large Language Models in Medicine: Progress, Application, and Challenge | Code | 3
MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation | Code | 3
Rule Based Rewards for Language Model Safety | Code | 3
Hyper-parameter tuning for text guided image editing | Code | 3
RepoGraph: Enhancing AI Software Engineering with Repository-level Code Graph | Code | 3
Page 163 of 7,094