SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 15,851–15,900 of 474,278 papers

Title | Status | Hype
Simple Radiology VLLM Test-time Scaling with Thought Graph Traversal | Code | 0
FAME: A Lightweight Spatio-Temporal Network for Model Attribution of Face-Swap Deepfakes | Code | 0
ViSAGe: Video-to-Spatial Audio Generation | | 0
Self-supervised Learning of Echocardiographic Video Representations via Online Cluster Distillation | Code | 1
From Sharpness to Better Generalization for Speech Deepfake Detection | | 0
Semantic Scheduling for LLM Inference | Code | 0
Deep Symmetric Autoencoders from the Eckart-Young-Schmidt Perspective | Code | 0
DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents | Code | 1
SEC-bench: Automated Benchmarking of LLM Agents on Real-World Software Security Tasks | Code | 2
SSLAM: Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes | Code | 2
Efficient Long-Context LLM Inference via KV Cache Clustering | | 0
Efficient Speech Enhancement via Embeddings from Pre-trained Generative Audioencoders | Code | 2
GPLQ: A General, Practical, and Lightning QAT Method for Vision Transformers | | 0
Vectorized Sparse Second-Order Forward Automatic Differentiation for Optimal Control Direct Methods | Code | 1
Dual‑detector Re‑optimization for Federated Weakly Supervised Video Anomaly Detection Via Adaptive Dynamic Recursive Mapping | Code | 1
A Multi-Agent Probabilistic Inference Framework Inspired by Kairanban-Style CoT System with IdoBata Conversation for Debiasing | | 0
DiffPR: Diffusion-Based Phase Reconstruction via Frequency-Decoupled Learning | | 0
Design of 3D Beamforming and Deployment Strategies for ISAC-based HAPS Systems | | 0
The Sample Complexity of Parameter-Free Stochastic Convex Optimization | | 0
Convolutional method for data assimilation An improved method on neuronal electrophysiological data | | 0
Measuring multi-calibration | | 0
Multimodal Modeling of CRISPR-Cas12 Activity Using Foundation Models and Chromatin Accessibility Data | | 0
Optimal experiment design for practical parameter identifiability and model discrimination | | 0
HyBiomass: Global Hyperspectral Imagery Benchmark Dataset for Evaluating Geospatial Foundation Models in Forest Aboveground Biomass Estimation | | 0
Invocable APIs derived from NL2SQL datasets for LLM Tool-Calling Evaluation | | 0
LLM-as-a-Judge for Reference-less Automatic Code Validation and Refinement for Natural Language to Bash in IT Automation | | 0
Can Time-Series Foundation Models Perform Building Energy Management Tasks? | | 0
Conformal Safety Shielding for Imperfect-Perception Agents | | 0
Score-based Generative Diffusion Models to Synthesize Full-dose FDG Brain PET from MRI in Epilepsy Patients | | 0
A Hybrid Adaptive Nash Equilibrium Solver for Distributed Multi-Agent Systems with Game-Theoretic Jump Triggering | | 0
Polymorphism Crystal Structure Prediction with Adaptive Space Group Diversity Control | Code | 0
Joint Denoising of Cryo-EM Projection Images using Polar Transformers | | 0
FedNano: Toward Lightweight Federated Tuning for Pretrained Multimodal Large Language Models | | 0
Learning a Continue-Thinking Token for Enhanced Test-Time Scaling | Code | 0
Don't Pay Attention | | 0
BotTrans: A Multi-Source Graph Domain Adaptation Approach for Social Bot Detection | Code | 0
Improving Group Robustness on Spurious Correlation via Evidential Alignment | Code | 0
ClimateChat: Designing Data and Methods for Instruction Tuning LLMs to Answer Climate Change Queries | Code | 1
An Attention-based Spatio-Temporal Neural Operator for Evolving Physics | | 0
UCD: Unlearning in LLMs via Contrastive Decoding | | 0
Intelligent Automation for FDI Facilitation: Optimizing Tariff Exemption Processes with OCR And Large Language Models | | 0
Brain2Vec: A Deep Learning Framework for EEG-Based Stress Detection Using CNN-LSTM-Attention | | 0
Efficient Traffic Classification using HW-NAS: Advanced Analysis and Optimization for Cybersecurity on Resource-Constrained Devices | | 0
Gondola: Grounded Vision Language Planning for Generalizable Robotic Manipulation | | 0
Anti-Aliased 2D Gaussian Splatting | Code | 1
LLM Embedding-based Attribution (LEA): Quantifying Source Contributions to Generative Model's Response for Vulnerability Analysis | Code | 0
Advances in Small-Footprint Keyword Spotting: A Comprehensive Review of Efficient Models and Algorithms | Code | 0
LLM-as-a-Fuzzy-Judge: Fine-Tuning Large Language Models as a Clinical Evaluation Judge with Fuzzy Logic | Code | 0
SocialCredit+ | | 0
Poutine: Vision-Language-Trajectory Pre-Training and Reinforcement Learning Post-Training Enable Robust End-to-End Autonomous Driving | | 0
Page 318 of 9486