SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 15651-15700 of 474278 papers

Title | Status | Hype
Training-free LLM Merging for Multi-task Learning | Code | 0
The SWE-Bench Illusion: When State-of-the-Art LLMs Remember Instead of Reason | | 0
Information fusion strategy integrating pre-trained language model and contrastive learning for materials knowledge mining | | 0
Model Merging for Knowledge Editing | Code | 0
Domain Generalization for Person Re-identification: A Survey Towards Domain-Agnostic Person Matching | Code | 1
Robust LLM Unlearning with MUDMAN: Meta-Unlearning with Disruption Masking And Normalization | Code | 0
TagRouter: Learning Route to LLMs through Tags for Open-Domain Text Generation Tasks | Code | 1
ConsistencyChecker: Tree-based Evaluation of LLM Generalization Capabilities | Code | 0
Delving into Instance-Dependent Label Noise in Graph Data: A Comprehensive Study and Benchmark | Code | 0
AgentOrchestra: A Hierarchical Multi-Agent Framework for General-Purpose Task Solving | Code | 7
A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications | Code | 3
Trust-MARL: Trust-Based Multi-Agent Reinforcement Learning Framework for Cooperative On-Ramp Merging Control in Heterogeneous Traffic Flow | | 0
GroupNL: Low-Resource and Robust CNN Design over Cloud and Device | | 0
BSA: Ball Sparse Attention for Large-scale Geometries | Code | 1
Mitigating Non-Target Speaker Bias in Guided Speaker Embedding | | 0
Optimized Spectral Fault Receptive Fields for Diagnosis-Informed Prognosis | | 0
Revisiting Clustering of Neural Bandits: Selective Reinitialization for Mitigating Loss of Plasticity | | 0
Restoring Gaussian Blurred Face Images for Deanonymization Attacks | | 0
QGuard:Question-based Zero-shot Guard for Multi-modal LLM Safety | | 0
Towards Fairness Assessment of Dutch Hate Speech Detection | | 0
Advances in LLMs with Focus on Reasoning, Adaptability, Efficiency and Ethics | | 0
Style-based Composer Identification and Attribution of Symbolic Music Scores: a Systematic Survey | | 0
Component Based Quantum Machine Learning Explainability | | 0
ReFrame: Layer Caching for Accelerated Inference in Real-Time Rendering | | 0
Semivalue-based data valuation is arbitrary and gameable | | 0
From Human to Machine Psychology: A Conceptual Framework for Understanding Well-Being in Large Language Model | | 0
Improving Factuality for Dialogue Response Generation via Graph-Based Knowledge Augmentation | | 0
Learning Best Paths in Quantum Networks | | 0
Deep Fusion of Ultra-Low-Resolution Thermal Camera and Gyroscope Data for Lighting-Robust and Compute-Efficient Rotational Odometry | | 0
AntiGrounding: Lifting Robotic Actions into VLM Representation Space for Decision Making | | 0
Phonikud: Hebrew Grapheme-to-Phoneme Conversion for Real-Time Text-to-Speech | | 0
OscNet v1.5: Energy Efficient Hopfield Network on CMOS Oscillators for Image Classification | Code | 0
CMI-Bench: A Comprehensive Benchmark for Evaluating Music Instruction Following | | 0
MatchPlant: An Open-Source Pipeline for UAV-Based Single-Plant Detection and Data Extraction | Code | 0
MEraser: An Effective Fingerprint Erasure Approach for Large Language Models | Code | 0
Pushing the Limits of Safety: A Technical Report on the ATLAS Challenge 2025 | Code | 0
Inference-Time Gaze Refinement for Micro-Expression Recognition: Enhancing Event-Based Eye Tracking with Motion-Aware Post-Processing | Code | 0
ANIRA: An Architecture for Neural Network Inference in Real-Time Audio Applications | Code | 3
SplashNet: Split-and-Share Encoders for Accurate and Efficient Typing with Surface Electromyography | Code | 0
Fairness Research For Machine Learning Should Integrate Societal Considerations | | 0
GSDNet: Revisiting Incomplete Multimodal-Diffusion from Graph Spectrum Perspective for Conversation Emotion Recognition | | 0
Information Suppression in Large Language Models: Auditing, Quantifying, and Characterizing Censorship in DeepSeek | | 0
DR-SAC: Distributionally Robust Soft Actor-Critic for Reinforcement Learning under Uncertainty | Code | 0
StreamMel: Real-Time Zero-shot Text-to-Speech via Interleaved Continuous Autoregressive Modeling | | 0
Optimizing Blood Transfusions and Predicting Shortages in Resource-Constrained Areas | | 0
Optimizing Federated Learning using Remote Embeddings for Graph Neural Networks | | 0
How Grounded is Wikipedia? A Study on Structured Evidential Support | Code | 0
Feeling Machines: Ethics, Culture, and the Rise of Emotional AI | | 0
IndoorWorld: Integrating Physical Task Solving and Social Simulation in A Heterogeneous Multi-Agent Environment | | 0
InverTune: Removing Backdoors from Multimodal Contrastive Learning Models via Trigger Inversion and Activation Tuning | | 0