SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 101-150 of 474,278 papers

Title | Status | Hype
Are LLM-Enhanced Graph Neural Networks Robust against Poisoning Attacks? | | 0
LLM Benchmark-User Need Misalignment for Climate Change | | 0
Gaussian Shannon: High-Precision Diffusion Model Watermarking Based on Communication | | 0
OSA: Echocardiography Video Segmentation via Orthogonalized State Update and Anatomical Prior-aware Feature Enhancement | | 0
CALRK-Bench: Evaluating Context-Aware Legal Reasoning in Korean Law | | 0
HINT: Composed Image Retrieval with Dual-path Compositional Contextualized Network | | 0
From Static to Dynamic: Exploring Self-supervised Image-to-Video Representation Transfer Learning | | 0
MPDiT: Multi-Patch Global-to-Local Transformer Architecture For Efficient Flow Matching and Diffusion Model | | 0
KMM-CP: Practical Conformal Prediction under Covariate Shift via Selective Kernel Mean Matching | | 0
Analysing Calls to Order in German Parliamentary Debates | | 0
UNIFERENCE: A Discrete Event Simulation Framework for Developing Distributed AI Models | | 0
Make Geometry Matter for Spatial Reasoning | | 0
AFSS: Artifact-Focused Self-Synthesis for Mitigating Bias in Audio Deepfake Detection | | 0
Beyond Textual Knowledge-Leveraging Multimodal Knowledge Bases for Enhancing Vision-and-Language Navigation | | 0
Unified Number-Free Text-to-Motion Generation Via Flow Matching | | 0
PixelSmile: Toward Fine-Grained Facial Expression Editing | | 0
PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference | | 0
Can Users Specify Driving Speed? Bench2Drive-Speed: Benchmark and Baselines for Desired-Speed Conditioned Autonomous Driving | | 0
No Hard Negatives Required: Concept Centric Learning Leads to Compositionality without Degrading Zero-shot Capabilities of Contrastive Models | | 0
Focus-to-Perceive Representation Learning: A Cognition-Inspired Hierarchical Framework for Endoscopic Video Analysis | | 0
Density-aware Soft Context Compression with Semi-Dynamic Compression Ratio | | 0
Adapting Segment Anything Model 3 for Concept-Driven Lesion Segmentation in Medical Images: An Experimental Study | | 0
Low-Rank-Modulated Functa: Exploring the Latent Space of Implicit Neural Representations for Interpretable Ultrasound Video Analysis | | 0
Revisiting On-Policy Distillation: Empirical Failure Modes and Simple Fixes | | 0
MuRF: Unlocking the Multi-Scale Potential of Vision Foundation Models | | 0
ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling | | 0
RealChart2Code: Advancing Chart-to-Code Generation with Real Data and Multi-Task Evaluation | | 0
GazeQwen: Lightweight Gaze-Conditioned LLM Modulation for Streaming Video Understanding | | 0
Anchored-Branched Steady-state WInd Flow Transformer (AB-SWIFT): a metamodel for 3D atmospheric flow in urban environments | | 0
S2D2: Fast Decoding for Diffusion LLMs via Training-Free Self-Speculation | | 0
BEVMAPMATCH: Multimodal BEV Neural Map Matching for Robust Re-Localization of Autonomous Vehicles | | 0
Diffusion MRI Transformer with a Diffusion Space Rotary Positional Embedding (D-RoPE) | | 0
World Reasoning Arena | | 0
Dictionary-based Pathology Mining with Hard-instance-assisted Classifier Debiasing for Genetic Biomarker Prediction from WSIs | | 0
RS-SSM: Refining Forgotten Specifics in State Space Model for Video Semantic Segmentation | | 0
Can MLLMs Read Students' Minds? Unpacking Multimodal Error Analysis in Handwritten Math | | 0
Learning Explicit Continuous Motion Representation for Dynamic Gaussian Splatting from Monocular Videos | | 0
Robust Principal Component Completion | | 0
MSRL: Scaling Generative Multimodal Reward Modeling via Multi-Stage Reinforcement Learning | | 0
MCLMR: A Model-Agnostic Causal Learning Framework for Multi-Behavior Recommendation | | 0
UniAI-GraphRAG: Synergizing Ontology-Guided Extraction, Multi-Dimensional Clustering, and Dual-Channel Fusion for Robust Multi-Hop Reasoning | | 0
AG-EgoPose: Leveraging Action-Guided Motion and Kinematic Joint Encoding for Egocentric 3D Pose Estimation | | 0
VolDiT: Controllable Volumetric Medical Image Synthesis with Diffusion Transformers | | 0
WebTestBench: Evaluating Computer-Use Agents towards End-to-End Automated Web Testing | | 0
Activation Matters: Test-time Activated Negative Labels for OOD Detection with Vision-Language Models | | 0
EagleNet: Energy-Aware Fine-Grained Relationship Learning Network for Text-Video Retrieval | | 0
CRAFT: Grounded Multi-Agent Coordination Under Partial Information | | 0
V2U4Real: A Real-world Large-scale Dataset for Vehicle-to-UAV Cooperative Perception | | 0
HeSS: Head Sensitivity Score for Sparsity Redistribution in VGGT | | 0
From Intent to Evidence: A Categorical Approach for Structural Evaluation of Deep Research Agents | | 0
Page 3 of 9,486