SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 13451-13500 of 474,278 papers

Title | Status | Hype
MadCLIP: Few-shot Medical Anomaly Detection with CLIP | Code | 0
Consistent Time-of-Flight Depth Denoising via Graph-Informed Geometric Attention | Code | 0
GazeTarget360: Towards Gaze Target Estimation in 360-Degree for Robot Perception | Code | 0
OcRFDet: Object-Centric Radiance Fields for Multi-View 3D Object Detection in Autonomous Driving | Code | 0
Event-based Tiny Object Detection: A Benchmark Dataset and Baseline | Code | 0
MReg: A Novel Regression Model with MoE-based Video Feature Mining for Mitral Regurgitation Diagnosis | Code | 0
AutoEvoEval: An Automated Framework for Evolving Close-Ended LLM Evaluation Data | Code | 0
Interpretable Zero-Shot Learning with Locally-Aligned Vision-Language Model | Code | 0
How to Design and Train Your Implicit Neural Representation for Video Compression | Code | 0
State and Memory is All You Need for Robust and Reliable AI Agents | | 0
LineRetriever: Planning-Aware Observation Reduction for Web Agents | | 0
Supercm: Revisiting Clustering for Semi-Supervised Learning | | 0
Discovering the underlying analytic structure within Standard Model constants using artificial intelligence | Code | 0
A Data-Ensemble-Based Approach for Sample-Efficient LQ Control of Linear Time-Varying Systems | | 0
MMReason: An Open-Ended Multi-Modal Multi-Step Reasoning Benchmark for MLLMs Toward AGI | Code | 0
Robustness of Misinformation Classification Systems to Adversarial Examples Through BeamAttack | Code | 0
A Survey on Vision-Language-Action Models for Autonomous Driving | Code | 4
Advancing Learnable Multi-Agent Pathfinding Solvers with Active Fine-Tuning | Code | 2
GroundingDINO-US-SAM: Text-Prompted Multi-Organ Segmentation in Ultrasound with LoRA-Tuned Vision-Language Models | | 0
Foundation Models for Zero-Shot Segmentation of Scientific Images without AI-Ready Data | | 0
MedSAM-CA: A CNN-Augmented ViT with Attention-Enhanced Multi-Scale Fusion for Medical Image Segmentation | | 0
STACK: Adversarial Attacks on LLM Safeguard Pipelines | | 0
Flash-VStream: Efficient Real-Time Understanding for Long Video Streams | Code | 3
Consensus-based optimization for closed-box adversarial attacks and a connection to evolution strategies | Code | 0
Mamba-FETrack V2: Revisiting State Space Model for Frame-Event based Visual Object Tracking | Code | 1
Dataset Distillation via Vision-Language Category Prototype | Code | 1
Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers | Code | 5
Diffusion Model-based Data Augmentation Method for Fetal Head Ultrasound Segmentation | | 0
Visual and Memory Dual Adapter for Multi-Modal Object Tracking | Code | 0
HiNeuS: High-fidelity Neural Surface Mitigating Low-texture and Reflective Ambiguity | | 0
DenseWorld-1M: Towards Detailed Dense Grounded Caption in the Real World | Code | 2
Refine Any Object in Any Scene | Code | 1
Epona: Autoregressive Diffusion World Model for Autonomous Driving | Code | 3
MTADiffusion: Mask Text Alignment Diffusion Model for Object Inpainting | | 0
μ^2Tokenizer: Differentiable Multi-Scale Multi-Modal Tokenizer for Radiology Report Generation | | 0
Large Language Models Don't Make Sense of Word Problems. A Scoping Review from a Mathematics Education Perspective | | 0
Flow-Through Tensors: A Unified Computational Graph Architecture for Multi-Layer Transportation Network Optimization | | 0
Do Thinking Tokens Help or Trap? Towards More Efficient Large Reasoning Model | | 0
FADRM: Fast and Accurate Data Residual Matching for Dataset Distillation | Code | 1
The Trilemma of Truth in Large Language Models | Code | 0
Constructing Non-Markovian Decision Process via History Aggregator | Code | 0
Thought-Augmented Planning for LLM-Powered Interactive Recommender Agent | Code | 0
Auto-TA: Towards Scalable Automated Thematic Analysis (TA) via Multi-Agent Large Language Models with Reinforcement Learning | | 0
SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning | Code | 2
MDPG: Multi-domain Diffusion Prior Guidance for MRI Reconstruction | Code | 0
Self-Supervised Multiview Xray Matching | Code | 0
Seeding neural network quantum states with tensor network states | Code | 0
Real-World En Call Center Transcripts Dataset with PII Redaction | Code | 0
Computational Detection of Intertextual Parallels in Biblical Hebrew: A Benchmark Study Using Transformer-Based Language Models | | 0
Ella: Embodied Social Agents with Lifelong Memory | | 0
Page 270 of 9486