SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 1151-1200 of 659,983 papers

Title | Status | Hype
Safety as Computation: Certified Answer Reuse via Capability Closure in Task-Oriented Dialogue | | 0
SynPO: Synergizing Descriptiveness and Preference Optimization for Video Detailed Captioning | | 0
Joint Surrogate Learning of Objectives, Constraints, and Sensitivities for Efficient Multi-objective Optimization of Neural Dynamical Systems | | 0
Consistent but Dangerous: Per-Sample Safety Classification Reveals False Reliability in Medical Vision-Language Models | | 0
AutoMOOSE: An Agentic AI for Autonomous Phase-Field Simulation | | 0
OrbitStream: Training-Free Adaptive 360-degree Video Streaming via Semantic Potential Fields | | 0
SkinCLIP-VL: Consistency-Aware Vision-Language Learning for Multimodal Skin Cancer Diagnosis | | 0
KLDrive: Fine-Grained 3D Scene Reasoning for Autonomous Driving based on Knowledge Graph | | 0
Deep Attention-based Sequential Ensemble Learning for BLE-Based Indoor Localization in Care Facilities | | 0
TabPFN Extensions for Interpretable Geotechnical Modelling | | 0
Fuel Consumption Prediction: A Comparative Analysis of Machine Learning Paradigms | | 0
Reading Between the Lines: How Electronic Nonverbal Cues shape Emotion Decoding | | 0
Benchmarking Scientific Machine Learning Models for Air Quality Data | | 0
Statistical Learning for Latent Embedding Alignment with Application to Brain Encoding and Decoding | | 0
Confidence Freeze: Early Success Induces a Metastable Decoupling of Metacognition and Behaviour | | 0
A Two-stage Transformer Framework for Temporal Localization of Distracted Driver Behaviors | | 0
Harmful Visual Content Manipulation Matters in Misinformation Detection Under Multimedia Scenarios | | 0
SGAD-SLAM: Splatting Gaussians at Adjusted Depth for Better Radiance Fields in RGBD SLAM | | 0
Semi-Supervised Learning with Balanced Deep Representation Distributions | | 0
DGRNet: Disagreement-Guided Refinement for Uncertainty-Aware Brain Tumor Segmentation | | 0
Stochastic approximation in non-markovian environments revisited | | 0
Representation-Level Adversarial Regularization for Clinically Aligned Multitask Thyroid Ultrasound Assessment | | 0
Mixture of Chapters: Scaling Learnt Memory in Transformers | | 0
Learning to Optimize Joint Source and RIS-assisted Channel Encoding for Multi-User Semantic Communication Systems | | 0
Learning Progressive Adaptation for Multi-Modal Tracking | | 0
CounterScene: Counterfactual Causal Reasoning in Generative World Models for Safety-Critical Closed-Loop Evaluation | | 0
ResPrune: Text-Conditioned Subspace Reconstruction for Visual Token Pruning in Large Vision-Language Models | | 0
DMMRL: Disentangled Multi-Modal Representation Learning via Variational Autoencoders for Molecular Property Prediction | | 0
Frequency Switching Mechanism for Parameter-Efficient Multi-Task Learning | | 0
LiFR-Seg: Anytime High-Frame-Rate Segmentation via Event-Guided Propagation | | 0
ReDiffuse: Rotation Equivariant Diffusion Model for Multi-focus Image Fusion | | 0
Anatomical Prior-Driven Framework for Autonomous Robotic Cardiac Ultrasound Standard View Acquisition | | 0
One Pool Is Not Enough: Multi-Cluster Memory for Practical Test-Time Adaptation | | 0
Can LLMs Fool Graph Learning? Exploring Universal Adversarial Attacks on Text-Attributed Graphs | | 0
Beyond a Single Signal: SPECTREG2, A Unified MultiExpert Anomaly Detector for Unknown Unknowns | | 0
Revisiting Tree Search for LLMs: Gumbel and Sequential Halving for Budget-Scalable Reasoning | | 0
Many Dialects, Many Languages, One Cultural Lens: Evaluating Multilingual VLMs for Bengali Culture Understanding Across Historically Linked Languages and Regional Dialects | | 0
GIDE: Unlocking Diffusion LLMs for Precise Training-Free Image Editing | | 0
Prompt replay: speeding up grpo with on-policy reuse of high-signal prompts | | 0
LLM-based Automated Architecture View Generation: Where Are We Now? | | 0
ALMAB-DC: Active Learning, Multi-Armed Bandits, and Distributed Computing for Sequential Experimental Design and Black-Box Optimization | | 0
Architecture for Multi-Unmanned Aerial Vehicles based Autonomous Precision Agriculture Systems | | 0
Context Selection for Hypothesis and Statistical Evidence Extraction from Full-Text Scientific Articles | | 0
Is Monitoring Enough? Strategic Agent Selection For Stealthy Attack in Multi-Agent Discussions | | 0
Boundary-Aware Instance Segmentation in Microscopy Imaging | | 0
Pretrained Video Models as Differentiable Physics Simulators for Urban Wind Flows | | 0
A Large-Scale Remote Sensing Dataset and VLM-based Algorithm for Fine-Grained Road Hierarchy Classification | | 0
Does AI Homogenize Student Thinking? A Multi-Dimensional Analysis of Structural Convergence in AI-Augmented Essays | | 0
Plant Taxonomy Meets Plant Counting: A Fine-Grained, Taxonomic Dataset for Counting Hundreds of Plant Species | | 0
When Convenience Becomes Risk: A Semantic View of Under-Specification in Host-Acting Agents | | 0
Page 24 of 13200