SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers

Showing 16801-16850 of 474278 papers

| Title | Status | Hype |
| --- | --- | --- |
| Hyperbolic Dual Feature Augmentation for Open-Environment | | 0 |
| ORIDa: Object-centric Real-world Image Composition Dataset | | 0 |
| Biologically Inspired Deep Learning Approaches for Fetal Ultrasound Image Classification | | 0 |
| Optimization over Sparse Support-Preserving Sets: Two-Step Projection with Global Optimality Guarantees | Code | 0 |
| Effective Data Pruning through Score Extrapolation | Code | 0 |
| SECOND: Mitigating Perceptual Hallucination in Vision-Language Models via Selective and Contrastive Decoding | Code | 0 |
| A Sample Efficient Conditional Independence Test in the Presence of Discretization | Code | 0 |
| Inherently Faithful Attention Maps for Vision Transformers | Code | 0 |
| SSS: Semi-Supervised SAM-2 with Efficient Prompting for Medical Imaging Segmentation | Code | 0 |
| AstroCompress: A benchmark dataset for multi-purpose compression of astronomical data | Code | 0 |
| Mitigating Reward Over-optimization in Direct Alignment Algorithms with Importance Sampling | Code | 0 |
| CoMuMDR: Code-mixed Multi-modal Multi-domain corpus for Discourse paRsing in conversations | Code | 0 |
| Efficient Medical Vision-Language Alignment Through Adapting Masked Vision Models | Code | 1 |
| Differentially Private Relational Learning with Entity-level Privacy Guarantees | Code | 0 |
| Sample Efficient Demonstration Selection for In-Context Learning | Code | 0 |
| Enhancing generalizability of model discovery across parameter space with multi-experiment equation learning (ME-EQL) | Code | 0 |
| OpenRR-1k: A Scalable Dataset for Real-World Reflection Removal | Code | 0 |
| Normalized Radon Cumulative Distribution Transforms for Invariance and Robustness in Optimal Transport Based Image Classification | Code | 0 |
| Understanding Software Engineering Agents Through the Lens of Traceability: An Empirical Study | | 0 |
| HGFormer: A Hierarchical Graph Transformer Framework for Two-Stage Colonel Blotto Games via Reinforcement Learning | | 0 |
| Implementing Keyword Spotting on the MCUX947 Microcontroller with Integrated NPU | | 0 |
| NeurIPS 2024 ML4CFD Competition: Results and Retrospective Analysis | | 0 |
| Graph Prompting for Graph Learning Models: Recent Advances and Future Directions | | 0 |
| Diffusion-based Time Series Forecasting for Sewerage Systems | | 0 |
| Towards Robust Deep Reinforcement Learning against Environmental State Perturbation | | 0 |
| Real-Time Cascade Mitigation in Power Systems Using Influence Graph Improved by Reinforcement Learning | | 0 |
| Better Reasoning with Less Data: Enhancing VLMs Through Unified Modality Scoring | | 0 |
| From Pixels to Graphs: using Scene and Knowledge Graphs for HD-EPIC VQA Challenge | | 0 |
| Landsat-Bench: Datasets and Benchmarks for Landsat Foundation Models | Code | 1 |
| Diffusion Models for Safety Validation of Autonomous Driving Systems | | 0 |
| KP-PINNs: Kernel Packet Accelerated Physics Informed Neural Networks | Code | 0 |
| Bridging RDF Knowledge Graphs with Graph Neural Networks for Semantically-Rich Recommender Systems | Code | 0 |
| SkipVAR: Accelerating Visual Autoregressive Modeling via Adaptive Frequency-Aware Skipping | Code | 0 |
| AraReasoner: Evaluating Reasoning-Based LLMs for Arabic NLP | | 0 |
| Same Task, Different Circuits: Disentangling Modality-Specific Mechanisms in VLMs | Code | 1 |
| Measuring Data Science Automation: A Survey of Evaluation Tools for AI Assistants and Agents | | 0 |
| RadioDUN: A Physics-Inspired Deep Unfolding Network for Radio Map Estimation | | 0 |
| Improved LLM Agents for Financial Document Question Answering | | 0 |
| Bayesian Inverse Physics for Neuro-Symbolic Robot Learning | | 0 |
| Large Language Models Have Intrinsic Meta-Cognition, but Need a Good Lens | | 0 |
| Unlocking the Potential of Large Language Models in the Nuclear Industry with Synthetic Data | | 0 |
| Re4MPC: Reactive Nonlinear MPC for Multi-model Motion Planning via Deep Reinforcement Learning | Code | 1 |
| AsFT: Anchoring Safety During LLM Fine-Tuning Within Narrow Safety Basin | Code | 1 |
| SceneSplat++: A Large Dataset and Comprehensive Benchmark for Language Gaussian Splatting | | 0 |
| Step-Audio-AQAA: a Fully End-to-End Expressive Large Audio Language Model | Code | 7 |
| Know-MRI: A Knowledge Mechanisms Revealer&Interpreter for Large Language Models | Code | 1 |
| Why Masking Diffusion Works: Condition on the Jump Schedule for Improved Discrete Diffusion | Code | 1 |
| MoSiC: Optimal-Transport Motion Trajectory for Dense Self-Supervised Learning | Code | 1 |
| ConfPO: Exploiting Policy Model Confidence for Critical Token Selection in Preference Optimization | Code | 1 |
| MIRAGE: Multimodal foundation model and benchmark for comprehensive retinal OCT image analysis | Code | 1 |
Page 337 of 9486