SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 7901-7950 of 661,570 papers

Title | Status | Hype
Learning to Wander: Improving the Global Image Geolocation Ability of LMMs via Actionable Reasoning |  | 0
MoXaRt: Audio-Visual Object-Guided Sound Interaction for XR |  | 0
A Bipartite Graph Approach to U.S.-China Cross-Market Return Forecasting |  | 0
Modeling Stage-wise Evolution of User Interests for News Recommendation |  | 0
Aligning Large Language Models with Searcher Preferences |  | 0
Muscle Synergy Priors Enhance Biomechanical Fidelity in Predictive Musculoskeletal Locomotion Simulation |  | 0
VERI-DPO: Evidence-Aware Alignment for Clinical Summarization via Claim Verification and Direct Preference Optimization |  | 0
Learning to Negotiate: Multi-Agent Deliberation for Collective Value Alignment in LLMs |  | 0
Spatial self-supervised Peak Learning and correlation-based Evaluation of peak picking in Mass Spectrometry Imaging |  | 0
JEDI: Jointly Embedded Inference of Neural Dynamics |  | 0
An Event-Driven E-Skin System with Dynamic Binary Scanning and real time SNN Classification |  | 0
IMTBench: A Multi-Scenario Cross-Modal Collaborative Evaluation Benchmark for In-Image Machine Translation |  | 0
Naïve Exposure of Generative AI Capabilities Undermines Deepfake Detection |  | 0
Taking Shortcuts for Categorical VQA Using Super Neurons |  | 0
UHD Image Deblurring via Autoregressive Flow with Ill-conditioned Constraints |  | 0
IH-Challenge: A Training Dataset to Improve Instruction Hierarchy on Frontier LLMs |  | 0
AILS-NTUA at SemEval-2026 Task 8: Evaluating Multi-Turn RAG Conversations |  | 0
End-to-End Chatbot Evaluation with Adaptive Reasoning and Uncertainty Filtering |  | 0
Trajectory-Informed Memory Generation for Self-Improving Agent Systems |  | 0
UAV-MARL: Multi-Agent Reinforcement Learning for Time-Critical and Dynamic Medical Supply Delivery |  | 0
Tackling Length Inflation Without Trade-offs: Group Relative Reward Rescaling for Reinforcement Learning |  | 0
DSFlash: Comprehensive Panoptic Scene Graph Generation in Realtime |  | 0
Making Bielik LLM Reason (Better): A Field Report |  | 0
SCORE: Replacing Layer Stacking with Contractive Recurrent Depth |  | 0
Learning to Score: Tuning Cluster Schedulers through Reinforcement Learning |  | 0
Automatic End-to-End Data Integration using Large Language Models |  | 0
Towards Cognitive Defect Analysis in Active Infrared Thermography with Vision-Text Cues |  | 0
R4-CGQA: Retrieval-based Vision Language Models for Computer Graphics Image Quality Assessment |  | 0
PET-F2I: A Comprehensive Benchmark and Parameter-Efficient Fine-Tuning of LLMs for PET/CT Report Impression Generation |  | 0
Quantization Robustness of Monotone Operator Equilibrium Networks |  | 0
UniStitch: Unifying Semantic and Geometric Features for Image Stitching |  | 0
HAPEns: Hardware-Aware Post-Hoc Ensembling for Tabular Data |  | 0
Need for Speed: Zero-Shot Depth Completion with Single-Step Diffusion |  | 0
Does LLM Alignment Really Need Diversity? An Empirical Study of Adapting RLVR Methods for Moral Reasoning |  | 0
Gradient Flow Drifting: Generative Modeling via Wasserstein Gradient Flows of KDE-Approximated Divergences |  | 0
Speaker Verification with Speech-Aware LLMs: Evaluation and Augmentation |  | 0
Disentangling Similarity and Relatedness in Topic Models |  | 0
Prism-Δ: Differential Subspace Steering for Prompt Highlighting in Large Language Models |  | 0
Splat2Real: Novel-view Scaling for Physical AI with 3D Gaussian Splatting |  | 0
Are Video Reasoning Models Ready to Go Outside? |  | 0
How To Embed Matters: Evaluation of EO Embedding Design Choices |  | 0
A^2-Edit: Precise Reference-Guided Image Editing of Arbitrary Objects and Ambiguous Masks |  | 0
Spatio-Temporal Attention Graph Neural Network: Explaining Causalities With Attention |  | 0
Emulating Clinician Cognition via Self-Evolving Deep Clinical Research |  | 0
Surrogate models for nuclear fusion with parametric Shallow Recurrent Decoder Networks: applications to magnetohydrodynamics |  | 0
A Platform-Agnostic Multimodal Digital Human Modelling Framework: Neurophysiological Sensing in Game-Based Interaction |  | 0
MapGCLR: Geospatial Contrastive Learning of Representations for Online Vectorized HD Map Construction |  | 0
Repurposing Backdoors for Good: Ephemeral Intrinsic Proofs for Verifiable Aggregation in Cross-silo Federated Learning |  | 0
EvoSchema: Towards Text-to-SQL Robustness Against Schema Evolution |  | 0
Structured Linked Data as a Memory Layer for Agent-Orchestrated Retrieval |  | 0
Page 159 of 13,232