SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 651-700 of 659,983 papers

Title | Status | Hype
LineMVGNN: Anti-Money Laundering with Line-Graph-Assisted Multi-View Graph Neural Networks | | 0
LongTail Driving Scenarios with Reasoning Traces: The KITScenes LongTail Dataset | | 0
LLMORPH: Automated Metamorphic Testing of Large Language Models | | 0
LLMLOOP: Improving LLM-Generated Code and Tests through Automated Iterative Feedback Loops | | 0
M3T: Discrete Multi-Modal Motion Tokens for Sign Language Production | | 0
Swiss-Bench SBP-002: A Frontier Model Comparison on Swiss Legal and Regulatory Tasks | | 0
λSplit: Self-Supervised Content-Aware Spectral Unmixing for Fluorescence Microscopy | | 0
Foundation Model Embeddings Meet Blended Emotions: A Multimodal Fusion Approach for the BLEMORE Challenge | | 0
Ethio-ASR: Joint Multilingual Speech Recognition and Language Identification for Ethiopian Languages | | 0
Boost Like a (Var)Pro: Trust-Region Gradient Boosting via Variable Projection | | 0
Probing Ethical Framework Representations in Large Language Models: Structure, Entanglement, and Methodological Challenges | | 0
GTO Wizard Benchmark | | 0
Echoes: A semantically-aligned music deepfake detection dataset | | 0
Energy Efficient Software Hardware CoDesign for Machine Learning: From TinyML to Large Language Models | | 0
Grounding Vision and Language to 3D Masks for Long-Horizon Box Rearrangement | | 0
Prototype Fusion: A Training-Free Multi-Layer Approach to OOD Detection | | 0
PLACID: Privacy-preserving Large language models for Acronym Clinical Inference and Disambiguation | | 0
Learning What Can Be Picked: Active Reachability Estimation for Efficient Robotic Fruit Harvesting | | 0
Assessment Design in the AI Era: A Method for Identifying Items Functioning Differentially for Humans and Chatbots | | 0
MoCHA: Denoising Caption Supervision for Motion-Text Retrieval | | 0
Dual-Gated Epistemic Time-Dilation: Autonomous Compute Modulation in Asynchronous MARL | | 0
Autoregressive Guidance of Deep Spatially Selective Filters using Bayesian Tracking for Efficient Extraction of Moving Speakers | | 0
Bi-CRCL: Bidirectional Conservative-Radical Complementary Learning with Pre-trained Foundation Models for Class-incremental Medical Image Analysis | | 0
An Adapter-free Fine-tuning Approach for Tuning 3D Foundation Models | | 0
Wasserstein Parallel Transport for Predicting the Dynamics of Statistical Systems | | 0
BXRL: Behavior-Explainable Reinforcement Learning | | 0
Detection and Classification of (Pre)Cancerous Cells in Pap Smears: An Ensemble Strategy for the RIVA Cervical Cytology Challenge | | 0
Kronecker-Structured Nonparametric Spatiotemporal Point Processes | | 0
Manifold Generalization Provably Proceeds Memorization in Diffusion Models | | 0
Sparse Autoencoders for Interpretable Medical Image Representation Learning | | 0
Parameter-Efficient Fine-Tuning for Medical Text Summarization: A Comparative Study of Lora, Prompt Tuning, and Full Fine-Tuning | | 0
Drop-In Perceptual Optimization for 3D Gaussian Splatting | | 0
CAPTCHA Solving for Native GUI Agents: Automated Reasoning-Action Data Generation and Self-Corrective Training | | 0
On the Direction of RLVR Updates for LLM Reasoning: Identification and Exploitation | | 0
Mamba-VMR: Multimodal Query Augmentation via Generated Videos for Precise Temporal Grounding | | 0
OpenEarth-Agent: From Tool Calling to Tool Creation for Open-Environment Earth Observation | | 0
More Isn't Always Better: Balancing Decision Accuracy and Conformity Pressures in Multi-AI Advice | | 0
dynActivation: A Trainable Activation Family for Adaptive Nonlinearity | | 0
RAMPAGE: RAndomized Mid-Point for debiAsed Gradient Extrapolation | | 0
Multimodal Survival Analysis with Locally Deployable Large Language Models | | 0
Data Curation for Machine Learning Interatomic Potentials by Determinantal Point Processes | | 0
DTVI: Dual-Stage Textual and Visual Intervention for Safe Text-to-Image Generation | | 0
SpatialBoost: Enhancing Visual Representation through Language-Guided Reasoning | | 0
On the Failure of Topic-Matched Contrast Baselines in Multi-Directional Refusal Abliteration | | 0
PreferRec: Learning and Transferring Pareto Preferences for Multi-objective Re-ranking | | 0
MIHT: A Hoeffding Tree for Time Series Classification using Multiple Instance Learning | | 0
Autoregressive vs. Masked Diffusion Language Models: A Controlled Comparison | | 0
A Context Engineering Framework for Improving Enterprise AI Agents based on Digital-Twin MDP | | 0
Multiperspectivity as a Resource for Narrative Similarity Prediction | | 0
Unveiling the Mechanism of Continuous Representation Full-Waveform Inversion: A Wave Based Neural Tangent Kernel Framework | | 0