SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 20651-20700 of 474278 papers

Title | Status | Hype
Data-Distill-Net: A Data Distillation Approach Tailored for Reply-based Continual Learning | - | 0
Editing as Unlearning: Are Knowledge Editing Methods Strong Baselines for Large Language Model Unlearning? | - | 0
Task-Oriented Low-Label Semantic Communication With Self-Supervised Learning | - | 0
Which Data Attributes Stimulate Math and Code Reasoning? An Investigation via Influence Functions | - | 0
Learning Optimal Multimodal Information Bottleneck Representations | - | 0
Ankh3: Multi-Task Pretraining with Sequence Denoising and Completion Enhances Protein Representations | - | 0
Catoni-Style Change Point Detection for Regret Minimization in Non-Stationary Heavy-Tailed Bandits | - | 0
SafeDPO: A Simple Approach to Direct Preference Optimization with Enhanced Safety | - | 0
Tensorization is a powerful but underexplored tool for compression and interpretability of neural networks | - | 0
Research on feature fusion and multimodal patent text based on graph attention network | - | 0
Variational Deep Learning via Implicit Regularization | - | 0
DreamPRM: Domain-Reweighted Process Reward Model for Multimodal Reasoning | - | 0
Weighted Leave-One-Out Cross Validation | - | 0
Efficient Optimization Accelerator Framework for Multistate Ising Problems | - | 0
Toward Physics-Informed Machine Learning for Data Center Operations: A Tropical Case Study | - | 0
MSD-LLM: Predicting Ship Detention in Port State Control Inspections with Large Language Model | - | 0
CaseEdit: Enhancing Localized Commonsense Reasoning via Null-Space Constrained Knowledge Editing in Small Parameter Language Models | - | 0
Language Model-Enhanced Message Passing for Heterophilic Graph Learning | - | 0
Subtle Risks, Critical Failures: A Framework for Diagnosing Physical Safety of LLMs for Embodied Decision Making | - | 0
Curriculum-RLAIF: Curriculum Alignment with Reinforcement Learning from AI Feedback | - | 0
Hierarchical Tree Search-based User Lifelong Behavior Modeling on Large Language Model | - | 0
Leveraging Descriptions of Emotional Preferences in Recommender Systems | - | 0
Evaluating Large Language Models for Code Review | - | 0
LLMs as Better Recommenders with Natural Language Collaborative Signals: A Self-Assessing Retrieval Approach | - | 0
Improving Recommendation Fairness without Sensitive Attributes Using Multi-Persona LLMs | - | 0
One Model to Rank Them All: Unifying Online Advertising with End-to-End Learning | - | 0
Light distillation for Incremental Graph Convolution Collaborative Filtering | - | 0
Power allocation for cell-free MIMO integrated sensing and communication | - | 0
Continuous Learning for Children's ASR: Overcoming Catastrophic Forgetting with Elastic Weight Consolidation and Synaptic Intelligence | - | 0
On the Robustness of RSMA to Adversarial BD-RIS-Induced Interference | - | 0
Causality and "In-the-Wild" Video-Based Person Re-ID: A Survey | - | 0
RedAHD: Reduction-Based End-to-End Automatic Heuristic Design with Large Language Models | - | 0
Temporal Sampling for Forgotten Reasoning in LLMs | Code | 1
Can Compressed LLMs Truly Act? An Empirical Evaluation of Agentic Capabilities in LLM Compression | Code | 1
WINA: Weight Informed Neuron Activation for Accelerating Large Language Model Inference | Code | 2
Advanced long-term earth system forecasting by learning the small-scale nature | Code | 0
ExAnte: A Benchmark for Ex-Ante Inference in Large Language Models | Code | 0
Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression | Code | 2
Cut out and Replay: A Simple yet Versatile Strategy for Multi-Label Online Continual Learning | Code | 0
Deep Active Inference Agents for Delayed and Long-Horizon Environments | Code | 0
Learning to Trust Bellman Updates: Selective State-Adaptive Regularization for Offline RL | Code | 0
FunReason: Enhancing Large Language Models' Function Calling via Self-Refinement Multiscale Loss and Automated Data Refinement | Code | 5
Fine-grained List-wise Alignment for Generative Medication Recommendation | Code | 0
Causal-LLaVA: Causal Disentanglement for Mitigating Hallucination in Multimodal Large Language Models | Code | 0
Chain-of-Thought for Autonomous Driving: A Comprehensive Survey and Future Prospects | Code | 2
Capability-Based Scaling Laws for LLM Red-Teaming | Code | 0
LogiCoL: Logically-Informed Contrastive Learning for Set-based Dense Retrieval | Code | 0
Beyond Simple Concatenation: Fairly Assessing PLM Architectures for Multi-Chain Protein-Protein Interactions Prediction | - | 0
An Empirical Study on Strong-Weak Model Collaboration for Repo-level Code Generation | Code | 0
Rethinking Gating Mechanism in Sparse MoE: Handling Arbitrary Modality Inputs with Confidence-Guided Gate | Code | 0