SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers

Showing 17701-17750 of 474,278 papers

Title | Status | Hype
K-Paths: Reasoning over Graph Paths for Drug Repurposing and Drug Interaction Prediction | Code | 1
Learning-Guided Rolling Horizon Optimization for Long-Horizon Flexible Job-Shop Scheduling | Code | 1
MSE-Adapter: A Lightweight Plugin Endowing LLMs with the Capability to Perform Multimodal Sentiment Analysis and Emotion Recognition | Code | 1
Toward Foundational Model for Sleep Analysis Using a Multimodal Hybrid Self-Supervised Learning Framework | Code | 1
tn4ml: Tensor Network Training and Customization for Machine Learning | Code | 1
Robust Adaptation of Large Multimodal Models for Retrieval Augmented Hateful Meme Detection | Code | 1
MaxSup: Overcoming Representation Collapse in Label Smoothing | Code | 1
WeedsGalore: A Multispectral and Multitemporal UAV-based Dataset for Crop and Weed Segmentation in Agricultural Maize Fields | Code | 1
Score-Based Diffusion Policy Compatible with Reinforcement Learning via Optimal Transport | Code | 1
PartSDF: Part-Based Implicit Neural Representation for Composite 3D Shape Parametrization and Optimization | Code | 1
PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models | Code | 1
Myna: Masking-Based Contrastive Learning of Musical Representations | Code | 1
A Cognitive Writing Perspective for Constrained Long-Form Text Generation | Code | 1
DemonAgent: Dynamically Encrypted Multi-Backdoor Implantation Attack on LLM-based Agent | Code | 1
Universal Embedding Function for Traffic Classification via QUIC Domain Recognition Pretraining: A Transfer Learning Success | Code | 1
CityEQA: A Hierarchical LLM Agent on Embodied Question Answering Benchmark in City Space | Code | 1
Scientific Machine Learning of Flow Resistance Using Universal Shallow Water Equations with Differentiable Programming | Code | 1
Uncertainty-Aware Graph Structure Learning | Code | 1
k-Graph: A Graph Embedding for Interpretable Time Series Clustering | Code | 1
Towards Text-Image Interleaved Retrieval | Code | 1
Demonstrating specification gaming in reasoning models | Code | 1
R2-KG: General-Purpose Dual-Agent Framework for Reliable Reasoning on Knowledge Graphs | Code | 1
Automating Prompt Leakage Attacks on Large Language Models Using Agentic Approach | Code | 1
Iron Sharpens Iron: Defending Against Attacks in Machine-Generated Text Detection with Adversarial Training | Code | 1
Reinforcement Learning for Dynamic Resource Allocation in Optical Networks: Hype or Hope? | Code | 1
Reasoning-to-Defend: Safety-Aware Reasoning Can Defend Large Language Models from Jailbreaking | Code | 1
Enhancing Audio-Visual Spiking Neural Networks through Semantic-Alignment and Cross-Modal Residual Learning | Code | 1
Learning More Effective Representations for Dense Retrieval through Deliberate Thinking Before Search | Code | 1
Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity | Code | 1
Fraud-R1 : A Multi-Round Benchmark for Assessing the Robustness of LLM Against Augmented Fraud and Phishing Inducements | Code | 1
Disentangling Long-Short Term State Under Unknown Interventions for Online Time Series Forecasting | Code | 1
MVCNet: Multi-View Contrastive Network for Motor Imagery Classification | Code | 1
RealSyn: An Effective and Scalable Multimodal Interleaved Document Transformation Paradigm | Code | 1
G-Refer: Graph Retrieval-Augmented Large Language Model for Explainable Recommendation | Code | 1
VidCapBench: A Comprehensive Benchmark of Video Captioning for Controllable Text-to-Video Generation | Code | 1
UniGuardian: A Unified Defense for Detecting Prompt Injection, Backdoor Attacks and Adversarial Attacks in Large Language Models | Code | 1
Positional Encoding in Transformer-Based Time Series Models: A Survey | Code | 1
video-SALMONN-o1: Reasoning-enhanced Audio-visual Large Language Model | Code | 1
Causal Inference for Qualitative Outcomes | Code | 1
Maximum Entropy Reinforcement Learning with Diffusion Policy | Code | 1
Towards Mechanistic Interpretability of Graph Transformers via Attention Graphs | Code | 1
A Physics-Informed Blur Learning Framework for Imaging Systems | Code | 1
VANPY: Voice Analysis Framework | Code | 1
SMART: Self-Aware Agent for Tool Overuse Mitigation | Code | 1
Deep Learning of Proteins with Local and Global Regions of Disorder | Code | 1
VRoPE: Rotary Position Embedding for Video Large Language Models | Code | 1
M-ABSA: A Multilingual Dataset for Aspect-Based Sentiment Analysis | Code | 1
Learning Dexterous Bimanual Catch Skills through Adversarial-Cooperative Heterogeneous-Agent Reinforcement Learning | Code | 1
Revisiting the Test-Time Scaling of o1-like Models: Do they Truly Possess Test-Time Scaling Capabilities? | Code | 1
Leveraging Labelled Data Knowledge: A Cooperative Rectification Learning Network for Semi-supervised 3D Medical Image Segmentation | Code | 1