SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 10751-10800 of 661570 papers

Title | Status | Hype
Exploring the potential and limitations of Model Merging for Multi-Domain Adaptation in ASR | | 0
Act, Think or Abstain: Complexity-Aware Adaptive Inference for Vision-Language-Action Models | | 0
Balancing Privacy-Quality-Efficiency in Federated Learning through Round-Based Interleaving of Protection Techniques | | 0
Generic Camera Calibration using Blurry Images | | 0
A Geometry-Adaptive Deep Variational Framework for Phase Discovery in the Landau-Brazovskii Model | | 0
C2-Faith: Benchmarking LLM Judges for Causal and Coverage Faithfulness in Chain-of-Thought Reasoning | | 0
Guidelines for the Annotation and Visualization of Legal Argumentation Structures in Chinese Judicial Decisions | | 0
Trainable Bitwise Soft Quantization for Input Feature Compression | | 0
Incentive Aware AI Regulations: A Credal Characterisation | | 0
Diffusion LLMs can think EoS-by-EoS | | 0
Distilling Formal Logic into Neural Spaces: A Kernel Alignment Approach for Signal Temporal Logic | | 0
Towards a data-scale independent regulariser for robust sparse identification of non-linear dynamics | | 0
Not All Trust is the Same: Effects of Decision Workflow and Explanations in Human-AI Decision Making | | 0
Core-based Hierarchies for Efficient GraphRAG | | 0
Balancing Coverage and Draft Latency in Vocabulary Trimming for Faster Speculative Decoding | | 0
Early Warning of Intraoperative Adverse Events via Transformer-Driven Multi-Label Learning | | 0
Digital Twin Driven Textile Classification and Foreign Object Recognition in Automated Sorting Systems | | 0
Boosting ASR Robustness via Test-Time Reinforcement Learning with Audio-Text Semantic Rewards | | 0
CATNet: Collaborative Alignment and Transformation Network for Cooperative Perception | | 0
Wiki-R1: Incentivizing Multimodal Reasoning for Knowledge-based VQA via Data and Sampling Curriculum | | 0
Visual-Informed Speech Enhancement Using Attention-Based Beamforming | | 0
Whispering to a Blackbox: Bootstrapping Frozen OCR with Visual Prompts | | 0
UniSTOK: Uniform Inductive Spatio-Temporal Kriging | | 0
X-RAY: Mapping LLM Reasoning Capability via Formalized and Calibrated Probes | | 0
Knowledge Divergence and the Value of Debate for Scalable Oversight | | 0
Latent Policy Steering through One-Step Flow Policies | | 0
Fusion4CA: Boosting 3D Object Detection via Comprehensive Image Exploitation | | 0
Bayes with No Shame: Admissibility Geometries of Predictive Inference | | 0
Med-V1: Small Language Models for Zero-shot and Scalable Biomedical Evidence Attribution | Code | 0
Frequency-Aware Error-Bounded Caching for Accelerating Diffusion Transformers | | 0
How important are the genes to explain the outcome - the asymmetric Shapley value as an honest importance metric for high-dimensional features | | 0
Ailed: A Psyche-Driven Chess Engine with Dynamic Emotional Modulation | | 0
DiSCTT: Consensus-Guided Self-Curriculum for Efficient Test-Time Adaptation in Reasoning | | 0
PACE: A Personalized Adaptive Curriculum Engine for 9-1-1 Call-taker Training | | 0
Learning Causal Structure of Time Series using Best Order Score Search | | 0
Robust Node Affinities via Jaccard-Biased Random Walks and Rank Aggregation | | 0
OpenFrontier: General Navigation with Visual-Language Grounded Frontiers | | 0
On the Necessity of Learnable Sheaf Laplacians | | 0
Harnessing Synthetic Data from Generative AI for Statistical Inference | | 0
An Exploration-Analysis-Disambiguation Reasoning Framework for Word Sense Disambiguation with Low-Parameter LLMs | | 0
The Spatial and Temporal Resolution of Motor Intention in Multi-Target Prediction | | 0
An interpretable prototype parts-based neural network for medical tabular data | | 0
Ensembling Language Models with Sequential Monte Carlo | | 0
Latent Wasserstein Adversarial Imitation Learning | | 0
NaiLIA: Multimodal Nail Design Retrieval Based on Dense Intent Descriptions and Palette Queries | | 0
Residual RL-MPC for Robust Microrobotic Cell Pushing Under Time-Varying Flow | | 0
NCTB-QA: A Large-Scale Bangla Educational Question Answering Dataset and Benchmarking Performance | | 0
FlashAttention-4: Algorithm and Kernel Pipelining Co-Design for Asymmetric Hardware Scaling | | 0
Beyond Scattered Acceptance: Fast and Coherent Inference for DLMs via Longest Stable Prefixes | | 0
HALP: Detecting Hallucinations in Vision-Language Models without Generating a Single Token | | 0
Page 216 of 13232