SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 18351–18400 of 474278 papers

Title | Status | Hype
CapSpeech: Enabling Downstream Applications in Style-Captioned Text-to-Speech | | 0
Enhancing Lyrics Transcription on Music Mixtures with Consistency Loss | | 0
Automated Web Application Testing: End-to-End Test Case Generation with Large Language Models and Screen Transition Graphs | | 0
Ensemble-MIX: Enhancing Sample Efficiency in Multi-Agent RL Using Ensemble Methods | | 0
Target Sensing Performance in Disaster-Specific ISAC Networks | | 0
Quantized Dissipative Uncertain Model for Fractional T_S Fuzzy systems with Time_Varying Delays Under Networked Control System | | 0
Recursive Privacy-Preserving Estimation Over Markov Fading Channels | | 0
Unit Commitment with Cost-Oriented Temporal Resolution | | 0
Dynamic Epsilon Scheduling: A Multi-Factor Adaptive Perturbation Budget for Adversarial Training | | 0
Grasp2Grasp: Vision-Based Dexterous Grasp Translation via Schrödinger Bridges | | 0
Adversarial Attacks on Robotic Vision Language Action Models | Code | 1
Solving the Pod Repositioning Problem with Deep Reinforced Adaptive Large Neighborhood Search | | 0
ThinkTank: A Framework for Generalizing Domain-Specific AI Agent Systems into Universal Collaborative Intelligence Platforms | Code | 1
NetPress: Dynamically Generated LLM Benchmarks for Network Applications | Code | 1
Accelerating Model-Based Reinforcement Learning using Non-Linear Trajectory Optimization | | 0
IP-Dialog: Evaluating Implicit Personalization in Dialogue Systems with Synthetic Data | | 0
The Reader is the Metric: How Textual Features and Reader Profiles Explain Conflicting Evaluations of AI Creative Writing | Code | 0
Towards Source Attribution of Singing Voice Deepfake with Multimodal Foundation Models | Code | 0
Grounded Vision-Language Interpreter for Integrated Task and Motion Planning | | 0
Adaptive Differential Denoising for Respiratory Sounds Classification | Code | 1
Prompt-Unseen-Emotion: Zero-shot Expressive Speech Synthesis with Prompt-LLM Contextual Knowledge for Mixed Emotions | | 0
Learned Controllers for Agile Quadrotors in Pursuit-Evasion Games | | 0
Rodrigues Network for Learning Robot Actions | | 0
How do Pre-Trained Models Support Software Engineering? An Empirical Study in Hugging Face | | 0
Backpressure-based Mean-field Type Game for Scheduling in Multi-Hop Wireless Sensor Networks | | 0
IllumiCraft: Unified Geometry and Illumination Diffusion for Controllable Video Generation | | 0
Overcoming Data Scarcity in Multi-Dialectal Arabic ASR via Whisper Fine-Tuning | | 0
Structural Vibration Monitoring with Diffractive Optical Processors | | 0
Axiomatics of Restricted Choices by Linear Orders of Sets with Minimum as Fallback | | 0
On the influence of language similarity in non-target speaker verification trials | | 0
Adaptive Graph Pruning for Multi-Agent Communication | Code | 0
MotionRAG-Diff: A Retrieval-Augmented Diffusion Framework for Long-Term Music-to-Dance Generation | | 0
Rethinking Machine Unlearning in Image Generation Models | Code | 1
TL;DR: Too Long, Do Re-weighting for Efficient LLM Reasoning Compression | Code | 1
EgoVLM: Policy Optimization for Egocentric Video Understanding | Code | 0
Towards Explicit Geometry-Reflectance Collaboration for Generalized LiDAR Segmentation in Adverse Weather | | 0
InterRVOS: Interaction-aware Referring Video Object Segmentation | | 0
Multi Layered Autonomy and AI Ecologies in Robotic Art Installations | | 0
Beyond Text Compression: Evaluating Tokenizers Across Scales | | 0
SurgVLM: A Large Vision-Language Model and Systematic Evaluation Benchmark for Surgical Intelligence | | 0
Improving Performance of Spike-based Deep Q-Learning using Ternary Neurons | | 0
Enriching Location Representation with Detailed Semantic Information | | 0
ATAG: AI-Agent Application Threat Assessment with Attack Graphs | | 0
Spatial Association Between Near-Misses and Accident Blackspots in Sydney, Australia: A Getis-Ord G_i^* Analysis | | 0
Corrigibility as a Singular Target: A Vision for Inherently Reliable Foundation Models | | 0
TestAgent: An Adaptive and Intelligent Expert for Human Assessment | | 0
Data Leakage and Deceptive Performance: A Critical Examination of Credit Card Fraud Detection Methodologies | | 0
Universal Reusability in Recommender Systems: The Case for Dataset- and Task-Independent Frameworks | | 0
A Learned Cost Model-based Cross-engine Optimizer for SQL Workloads | | 0
Rethinking Dynamic Networks and Heterogeneous Computing with Automatic Parallelization | | 0