SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 451–500 of 659,983 papers

Title | Status | Hype
Bio-Inspired Event-Based Visual Servoing for Ground Robots | | 0
AdvSplat: Adversarial Attacks on Feed-Forward Gaussian Splatting Models | | 0
CoRe: Joint Optimization with Contrastive Learning for Medical Image Registration | | 0
The Diminishing Returns of Early-Exit Decoding in Modern LLMs | | 0
An In-Depth Study of Filter-Agnostic Vector Search on a PostgreSQL Database System: [Experiments and Analysis] | | 0
Mind the Hitch: Dynamic Calibration and Articulated Perception for Autonomous Trucks | | 0
LLMs Do Not Grade Essays Like Humans | | 0
CDMT-EHR: A Continuous-Time Diffusion Framework for Generating Mixed-Type Time-Series Electronic Health Records | | 0
Semantic Iterative Reconstruction: One-Shot Universal Anomaly Detection | | 0
AI-driven Intent-Based Networking Approach for Self-configuration of Next Generation Networks | | 0
Human-in-the-Loop Pareto Optimization: Trade-off Characterization for Assist-as-Needed Training and Performance Evaluation | | 0
Lightweight Fairness for LLM-Based Recommendations via Kernelized Projection and Gated Adapters | | 0
Latent Algorithmic Structure Precedes Grokking: A Mechanistic Study of ReLU MLPs on Modular Arithmetic | | 0
Retinal Disease Classification from Fundus Images using CNN Transfer Learning | | 0
Digital Twin-Assisted Measurement Design and Channel Statistics Prediction | | 0
Re-Prompting SAM 3 via Object Retrieval: 3rd of the 5th PVUW MOSE Track | | 0
The Cognitive Firewall: Securing Browser Based AI Agents Against Indirect Prompt Injection Via Hybrid Edge Cloud Defense | | 0
PoiCGAN: A Targeted Poisoning Based on Feature-Label Joint Perturbation in Federated Learning | | 0
APreQEL: Adaptive Mixed Precision Quantization For Edge LLMs | | 0
Wafer-Level Etch Spatial Profiling for Process Monitoring from Time-Series with Time-LLM | | 0
AI Generalisation Gap In Comorbid Sleep Disorder Staging | | 0
ZeroFold: Protein-RNA Binding Affinity Predictions from Pre-Structural Embeddings | | 0
A Theory of LLM Information Susceptibility | | 0
Ukrainian Visual Word Sense Disambiguation Benchmark | | 0
Steering Code LLMs with Activation Directions for Language and Library Control | | 0
Can LLM Agents Be CFOs? A Benchmark for Resource Allocation in Dynamic Enterprise Environments | | 0
Flying Pigs, FaR and Beyond: Evaluating LLM Reasoning in Counterfactual Worlds | | 0
PRISM: Video Dataset Condensation with Progressive Refinement and Insertion for Sparse Motion | | 0
Decorrelation, Diversity, and Emergent Intelligence: The Isomorphism Between Social Insect Colonies and Ensemble Machine Learning | | 0
Inverting Neural Networks: New Methods to Generate Neural Network Inputs from Prescribed Outputs | | 0
When Models Judge Themselves: Unsupervised Self-Evolution for Multimodal Reasoning | | 0
Test-Time Adaptation via Cache Personalization for Facial Expression Recognition in Videos | | 0
TimeTox: An LLM-Based Pipeline for Automated Extraction of Time Toxicity from Clinical Trial Protocols | | 0
A transformer architecture alteration to incentivise externalised reasoning | | 0
Bounding Box Anomaly Scoring for simple and efficient Out-of-Distribution detection | | 0
Improving LLM Predictions via Inter-Layer Structural Encoders | | 0
Vision-based Deep Learning Analysis of Unordered Biomedical Tabular Datasets via Optimal Spatial Cartography | | 0
MuQ-Eval: An Open-Source Per-Sample Quality Metric for AI Music Generation Evaluation | | 0
Voice Privacy from an Attribute-based Perspective | | 0
PopResume: Causal Fairness Evaluation of LLM/VLM Resume Screeners with Population-Representative Dataset | | 0
SOUPLE: Enhancing Audio-Visual Localization and Segmentation with Learnable Prompt Contexts | | 0
Exposure-Normalized Bed and Chair Fall Rates via Continuous AI Monitoring | | 0
Conditionally Identifiable Latent Representation for Multivariate Time Series with Structural Dynamics | | 0
Stepwise Variational Inference with Vine Copulas | | 0
Asymptotic Learning Curves for Diffusion Models with Random Features Score and Manifold Data | | 0
A Critical Review on the Effectiveness and Privacy Threats of Membership Inference Attacks | | 0
Robustness Quantification and Uncertainty Quantification: Comparing Two Methods for Assessing the Reliability of Classifier Predictions | | 0
VLA-IAP: Training-Free Visual Token Pruning via Interaction Alignment for Vision-Language-Action Models | | 0
Minibal: Balanced Game-Playing Without Opponent Modeling | | 0
Efficient Benchmarking of AI Agents | | 0
Page 10 of 13,200