SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 6901-6950 of 661570 papers

Title | Status | Hype
Cost-Efficient Multimodal LLM Inference via Cross-Tier GPU Heterogeneity | - | 0
AI Planning Framework for LLM-Based Web Agents | - | 0
Text-Phase Synergy Network with Dual Priors for Unsupervised Cross-Domain Image Retrieval | - | 0
TaoBench: Do Automated Theorem Prover LLMs Generalize Beyond MathLib? | - | 0
Show, Don't Tell: Detecting Novel Objects by Watching Human Videos | - | 0
FC-Track: Overlap-Aware Post-Association Correction for Online Multi-Object Tracking | - | 0
Catalyst4D: High-Fidelity 3D-to-4D Scene Editing via Dynamic Propagation | - | 0
Empowering Semantic-Sensitive Underwater Image Enhancement with VLM | - | 0
The RIGID Framework: Research-Integrated, Generative AI-Mediated Instructional Design | - | 0
Generalized Recognition of Basic Surgical Actions Enables Skill Assessment and Vision-Language-Model-based Surgical Planning | - | 0
What Makes VLMs Robust? Towards Reconciling Robustness and Accuracy in Vision-Language Models | - | 0
A Multi-task Large Reasoning Model for Molecular Science | - | 0
OARS: Process-Aware Online Alignment for Generative Real-World Image Super-Resolution | - | 0
Context is all you need: Towards autonomous model-based process design using agentic AI in flowsheet simulations | - | 0
Rethinking Multiple-Choice Questions for RLVR: Unlocking Potential via Distractor Design | - | 0
Hierarchical Dual-Change Collaborative Learning for UAV Scene Change Captioning | - | 0
Multimodal Protein Language Models for Enzyme Kinetic Parameters: From Substrate Recognition to Conformational Adaptation | - | 0
Team LEYA in 10th ABAW Competition: Multimodal Ambivalence/Hesitancy Recognition Approach | - | 0
Wear Classification of Abrasive Flap Wheels using a Hierarchical Deep Learning Approach | - | 0
Composing Driving Worlds through Disentangled Control for Adversarial Scenario Generation | - | 0
Surrogates for Physics-based and Data-driven Modelling of Parametric Systems: Review and New Perspectives | - | 0
TRACE: Structure-Aware Character Encoding for Robust and Generalizable Document Watermarking | - | 0
Test-time RL alignment exposes task familiarity artifacts in LLM benchmarks | - | 0
Explainable AI Using Inherently Interpretable Components for Wearable-based Health Monitoring | - | 0
Enhanced Drug-drug Interaction Prediction Using Adaptive Knowledge Integration | - | 0
Forecasting Epileptic Seizures from Contactless Camera via Cross-Species Transfer Learning | - | 0
A theory of learning data statistics in diffusion models, from easy to hard | - | 0
Learning from Child-Directed Speech in Two-Language Scenarios: A French-English Case Study | - | 0
ODRL Policy Comparison Through Normalisation | - | 0
VIRD: View-Invariant Representation through Dual-Axis Transformation for Cross-View Pose Estimation | - | 0
Retrieval-Enhanced Real Estate Appraisal | - | 0
Delta1 with LLM: symbolic and neural integration for credible and explainable reasoning | - | 0
Efficient Real-World Autonomous Racing via Attenuated Residual Policy Optimization | Code | 0
SCOPE: Semantic Coreset with Orthogonal Projection Embeddings for Federated learning | - | 0
Test-Time Attention Purification for Backdoored Large Vision Language Models | - | 0
Deconstructing the Failure of Ideal Noise Correction: A Three-Pillar Diagnosis | - | 0
Accelerating Stroke MRI with Diffusion Probabilistic Models through Large-Scale Pre-training and Target-Specific Fine-Tuning | - | 0
FraudFox: Adaptable Fraud Detection in the Real World | - | 0
Structured Distillation for Personalized Agent Memory: 11x Token Reduction with Retrieval Preservation | - | 0
From AI Weather Prediction to Infrastructure Resilience: A Correction-Downscaling Framework for Tropical Cyclone Impacts | - | 0
VIGS-SLAM: Visual Inertial Gaussian Splatting SLAM | - | 0
Colluding LoRA: A Composite Attack on LLM Safety Alignment | - | 0
SRAM-Based Compute-in-Memory Accelerator for Linear-decay Spiking Neural Networks | - | 0
OffTopicEval: When Large Language Models Enter the Wrong Chat, Almost Always! | - | 0
A Tutorial on Cognitive Biases in Agentic AI-Driven 6G Autonomous Networks | - | 0
Reinforcing the Weakest Links: Modernizing SIENA with Targeted Deep Learning Integration | Code | 0
On Linear Separability of the MNIST Handwritten Digits Dataset | - | 0
Scaling Laws and Pathologies of Single-Layer PINNs: Network Width and PDE Nonlinearity | - | 0
As Language Models Scale, Low-order Linear Depth Dynamics Emerge | - | 0
A Reduction Algorithm for Markovian Contextual Linear Bandits | - | 0
Page 139 of 13232