SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 19151–19200 of 474278 papers

Title | Status | Hype
Disentangling Granularity: An Implicit Inductive Bias in Factorized VAEs | | 0
Multi-Domain ABSA Conversation Dataset Generation via LLMs for Real-World Evaluation and Model Comparison | | 0
PhySense: Principle-Based Physics Reasoning Benchmarking for Large Language Models | | 0
Time Blindness: Why Video-Language Models Can't See What Humans Can? | | 0
Multilinguality Does not Make Sense: Investigating Factors Behind Zero-Shot Transfer in Sense-Aware Tasks | | 0
DeepBoost-AF: A Novel Unsupervised Feature Learning and Gradient Boosting Fusion for Robust Atrial Fibrillation Detection in Raw ECG Signals | | 0
Proxy Target: Bridging the Gap Between Discrete Spiking Neural Networks and Continuous Control | | 0
Quick-Draw Bandits: Quickly Optimizing in Nonstationary Environments with Extremely Many Arms | | 0
Proxy-FDA: Proxy-based Feature Distribution Alignment for Fine-tuning Vision Foundation Models without Forgetting | | 0
A SHAP-based explainable multi-level stacking ensemble learning method for predicting the length of stay in acute stroke | | 0
LightSAM: Parameter-Agnostic Sharpness-Aware Minimization | | 0
Rethinking Continual Learning with Progressive Neural Collapse | | 0
On Fairness of Task Arithmetic: The Role of Task Vectors | | 0
GradPower: Powering Gradients for Faster Language Model Pre-Training | | 0
On the Emergence of Weak-to-Strong Generalization: A Bias-Variance Perspective | | 0
Multi-task Learning for Heterogeneous Multi-source Block-Wise Missing Data | | 0
Advancing Compositional Awareness in CLIP with Efficient Fine-Tuning | | 0
Graph Flow Matching: Enhancing Image Generation with Neighbor-Aware Flow Fields | | 0
Smooth Model Compression without Fine-Tuning | | 0
Neuro-Symbolic Operator for Interpretable and Generalizable Characterization of Complex Piezoelectric Systems | | 0
Rethinking Neural Combinatorial Optimization for Vehicle Routing Problems with Different Constraint Tightness Degrees | | 0
Learning Distributions over Permutations and Rankings with Factorized Representations | | 0
QGAN-based data augmentation for hybrid quantum-classical neural networks | | 0
Cascading Adversarial Bias from Injection to Distillation in Language Models | | 0
Accelerated Sampling from Masked Diffusion Models via Entropy Bounded Unmasking | | 0
Performative Risk Control: Calibrating Models for Reliable Deployment under Performativity | | 0
Aligning Protein Conformation Ensemble Generation with Physical Feedback | | 0
MIRAGE: Assessing Hallucination in Multimodal Reasoning Chains of MLLM | | 0
Multi-task Learning for Heterogeneous Data via Integrating Shared and Task-Specific Encodings | | 0
Data Fusion for Partial Identification of Causal Effects | | 0
Distributed gradient methods under heavy-tailed communication noise | | 0
Geospatial Foundation Models to Enable Progress on Sustainable Development Goals | | 0
Interpretable phenotyping of Heart Failure patients with Dutch discharge letters | | 0
MoDoMoDo: Multi-Domain Data Mixtures for Multimodal LLM Reinforcement Learning | | 0
CrossICL: Cross-Task In-Context Learning via Unsupervised Demonstration Transfer | | 0
Beyond Exponential Decay: Rethinking Error Accumulation in Large Language Models | | 0
Benchmarking Large Language Models for Cryptanalysis and Mismatched-Generalization | | 0
Intuitionistic Fuzzy Sets for Large Language Model Data Annotation: A Novel Approach to Side-by-Side Preference Labeling | | 0
Automated Structured Radiology Report Generation | | 0
HiCaM: A Hierarchical-Causal Modification Framework for Long-Form Text Modification | | 0
Exploring the Impact of Occupational Personas on Domain-Specific QA | | 0
CaMMT: Benchmarking Culturally Aware Multimodal Machine Translation | | 0
Limited-Resource Adapters Are Regularizers, Not Linguists | | 0
Improving Language and Modality Transfer in Translation by Character-level Modeling | | 0
GATE: General Arabic Text Embedding for Enhanced Semantic Textual Similarity with Matryoshka Representation Learning and Hybrid Loss Training | | 0
When Harry Meets Superman: The Role of The Interlocutor in Persona-Based Dialogue Generation | | 0
A Simple Linear Patch Revives Layer-Pruned Large Language Models | | 0
Soft Reasoning: Navigating Solution Spaces in Large Language Models through Controlled Embedding Exploration | | 0
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time | | 0
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning | | 0