SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 13601-13650 of 474278 papers

Title | Status | Hype
ARAG: Agentic Retrieval Augmented Generation for Personalized Recommendation | | 0
Embodied AI Agents: Modeling the World | | 0
PapersPlease: A Benchmark for Evaluating Motivational Values of Large Language Models Based on ERG Theory | Code | 0
Binned semiparametric Bayesian networks | Code | 0
JointRank: Rank Large Set with Single Pass | Code | 0
RetFiner: A Vision-Language Refinement Scheme for Retinal Foundation Models | Code | 0
Hitchhiking Rides Dataset: Two decades of crowd-sourced records on stochastic traveling | Code | 0
RExBench: Can coding agents autonomously implement AI research extensions? | Code | 0
Layer Importance for Mathematical Reasoning is Forged in Pre-Training and Invariant after Post-Training | | 0
Score-Based Model for Low-Rank Tensor Recovery | | 0
Interact2Vec -- An efficient neural network-based model for simultaneously learning users and items embeddings in recommender systems | | 0
Exploring Modularity of Agentic Systems for Drug Discovery | | 0
LLaVA-Scissor: Token Compression with Semantic Connected Components for Video LLMs | Code | 2
UnMix-NeRF: Spectral Unmixing Meets Neural Radiance Fields | Code | 1
Seg-R1: Segmentation Can Be Surprisingly Simple with Reinforcement Learning | Code | 2
MolProphecy: Bridging Medicinal Chemists' Knowledge and Molecular Pre-Trained Models via a Multi-Modal Framework | Code | 0
FreeDNA: Endowing Domain Adaptation of Diffusion-Based Dense Prediction with Training-Free Domain Noise Alignment | Code | 0
APO: Enhancing Reasoning Ability of MLLMs via Asymmetric Policy Optimization | Code | 0
Estimating Correctness Without Oracles in LLM-Based Code Generation | Code | 0
Large Language Model Agent for Modular Task Execution in Drug Discovery | | 0
BMFM-DNA: A SNP-aware DNA foundation model to capture variant effects | Code | 2
Elucidating and Endowing the Diffusion Training Paradigm for General Image Restoration | | 0
Adaptive Multipath-Based SLAM for Distributed MIMO Systems | | 0
ImplicitQA: Going beyond frames towards Implicit Video Reasoning | Code | 0
CAT-SG: A Large Dynamic Scene Graph Dataset for Fine-Grained Understanding of Cataract Surgery | | 0
Towards Transparent AI: A Survey on Explainable Large Language Models | | 0
Early Stopping Tabular In-Context Learning | | 0
AgentStealth: Reinforcing Large Language Model for Anonymizing User-generated Text | Code | 0
Interpretable Representation Learning for Additive Rule Ensembles | | 0
FeDa4Fair: Client-Level Federated Datasets for Fairness Evaluation | | 0
Interpretable Hierarchical Concept Reasoning through Attention-Guided Graph Learning | | 0
Progtuning: Progressive Fine-tuning Framework for Transformer-based Language Models | | 0
Robust Policy Switching for Antifragile Reinforcement Learning for UAV Deconfliction in Adversarial Environments | | 0
Curriculum-Guided Antifragile Reinforcement Learning for Secure UAV Deconfliction under Observation-Space Attacks | | 0
Temporal-Aware Graph Attention Network for Cryptocurrency Transaction Fraud Detection | | 0
Optimising 4th-Order Runge-Kutta Methods: A Dynamic Heuristic Approach for Efficiency and Low Storage | | 0
Potemkin Understanding in Large Language Models | | 0
Can Gradient Descent Simulate Prompting? | | 0
MT2-CSD: A New Dataset and Multi-Semantic Knowledge Fusion Method for Conversational Stance Detection | | 0
DALR: Dual-level Alignment Learning for Multimodal Sentence Representation Learning | | 0
Cat and Mouse -- Can Fake Text Generation Outpace Detector Systems? | | 0
Double-Checker: Enhancing Reasoning of Slow-Thinking LLMs via Self-Critical Fine-Tuning | | 0
Bridging Offline and Online Reinforcement Learning for LLMs | | 0
Explainable AI for Radar Resource Management: Modified LIME in Deep Reinforcement Learning | | 0
Data Efficacy for Language Model Training | | 0
TRIDENT: Tri-Modal Molecular Representation Learning with Taxonomic Annotations and Local Correspondence | | 0
Little By Little: Continual Learning via Self-Activated Sparse Mixture-of-Rank Adaptive Learning | | 0
FedDAA: Dynamic Client Clustering for Concept Drift Adaptation in Federated Learning | | 0
Generative Adversarial Evasion and Out-of-Distribution Detection for UAV Cyber-Attacks | | 0
Diverse Mini-Batch Selection in Reinforcement Learning for Efficient Chemical Exploration in de novo Drug Design | | 0
Page 273 of 9486