SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 5951–5975 of 474,278 papers

Title | Status | Hype
MAS-GPT: Training LLMs to Build LLM-based Multi-Agent Systems | Code | 2
Find First, Track Next: Decoupling Identification and Propagation in Referring Video Object Segmentation | Code | 2
Golden Cudgel Network for Real-Time Semantic Segmentation | Code | 2
BEVDriver: Leveraging BEV Maps in LLMs for Robust Closed-Loop Driving | Code | 2
Collaborative Expert LLMs Guided Multi-Objective Molecular Optimization | Code | 2
LiteWebAgent: The Open-Source Suite for VLM-Based Web-Agent Applications | Code | 2
BHViT: Binarized Hybrid Vision Transformer | Code | 2
WMNav: Integrating Vision-Language Models into World Models for Object Goal Navigation | Code | 2
Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs | Code | 2
h-Edit: Effective and Flexible Diffusion-Based Editing via Doob's h-Transform | Code | 2
MM-OR: A Large Multimodal Operating Room Dataset for Semantic Understanding of High-Intensity Surgical Environments | Code | 2
DivPrune: Diversity-based Visual Token Pruning for Large Multimodal Models | Code | 2
MPO: Boosting LLM Agents with Meta Plan Optimization | Code | 2
ZAPBench: A Benchmark for Whole-Brain Activity Prediction in Zebrafish | Code | 2
Technique Inference Engine: A Recommender Model to Support Cyber Threat Hunting | Code | 2
Large-Scale Data Selection for Instruction Tuning | Code | 2
DifIISR: A Diffusion Model with Gradient Guidance for Infrared Image Super-Resolution | Code | 2
Composed Multi-modal Retrieval: A Survey of Approaches and Applications | Code | 2
Direct Discriminative Optimization: Your Likelihood-Based Visual Generative Model is Secretly a GAN Discriminator | Code | 2
Liger: Linearizing Large Language Models to Gated Recurrent Structures | Code | 2
MI-DETR: An Object Detection Model with Multi-time Inquiries Mechanism | Code | 2
Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas | Code | 2
Forgetting Transformer: Softmax Attention with a Forget Gate | Code | 2
Retrieval-Augmented Perception: High-Resolution Image Perception Meets Visual RAG | Code | 2
OptMetaOpenFOAM: Large Language Model Driven Chain of Thought for Sensitivity Analysis and Parameter Optimization based on CFD | Code | 2