SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 4801–4850 of 661570 papers

| Title | Status | Hype |
| --- | --- | --- |
| From Natural Language to Executable Option Strategies via Large Language Models | | 0 |
| Tabular LLMs for Interpretable Few-Shot Alzheimer's Disease Prediction with Multimodal Biomedical Data | Code | 0 |
| Ethical Fairness without Demographics in Human-Centered AI | | 0 |
| The Cost of Reasoning: Chain-of-Thought Induces Overconfidence in Vision-Language Models | | 0 |
| Incongruent Positivity: When Miscalibrated Positivity Undermines Online Supportive Conversations | | 0 |
| Do Understanding and Generation Fight? A Diagnostic Study of DPO for Unified Multimodal Models | | 0 |
| SpecMoE: Spectral Mixture-of-Experts Foundation Model for Cross-Species EEG Decoding | | 0 |
| LUMINA: A Multi-Vendor Mammography Benchmark with Energy Harmonization Protocol | | 0 |
| Anonymous-by-Construction: An LLM-Driven Framework for Privacy-Preserving Text | | 0 |
| When the City Teaches the Car: Label-Free 3D Perception from Infrastructure | | 0 |
| Automated identification of Ichneumonoidea wasps via YOLO-based deep learning: Integrating HiresCam for Explainable AI | | 0 |
| Transformer-Encoder Trees for Efficient Multilingual Machine Translation and Speech Translation | | 0 |
| A Scalable Approach to Solving Simulation-Based Network Security Games | | 0 |
| Semantic One-Dimensional Tokenizer for Image Reconstruction and Generation | | 0 |
| Over-the-air White-box Attack on the Wav2Vec Speech Recognition Neural Network | | 0 |
| Edge-Efficient Two-Stream Multimodal Architecture for Non-Intrusive Bathroom Fall Detection | | 0 |
| CircuitBuilder: From Polynomials to Circuits via Reinforcement Learning | | 0 |
| ProgressiveAvatars: Progressive Animatable 3D Gaussian Avatars | | 0 |
| Data-driven generalized perimeter control: Zürich case study | | 0 |
| Can LLMs Detect Their Confabulations? Estimating Reliability in Uncertainty-Aware Language Models | | 0 |
| Transformers can do Bayesian Clustering | | 0 |
| Knowing What You Cannot Explain: Learning to Reject Low-Quality Explanations | | 0 |
| EdiVal-Agent: An Object-Centric Framework for Automated, Fine-Grained Evaluation of Multi-Turn Editing | | 0 |
| Accurate Shift Invariant Convolutional Neural Networks Using Gaussian-Hermite Moments | | 0 |
| Patient4D: Temporally Consistent Patient Body Mesh Recovery from Monocular Operating Room Video | | 0 |
| LLM-Powered Flood Depth Estimation from Social Media Imagery: A Vision-Language Model Framework with Mechanistic Interpretability for Transportation Resilience | | 0 |
| Self-Regularized Learning Methods | | 0 |
| Exploiting the English Grammar Profile for L2 grammatical analysis with LLMs | | 0 |
| Generalist Multimodal LLMs Gain Biometric Expertise via Human Salience | | 0 |
| CircuitLM: A Multi-Agent LLM-Aided Design Framework for Generating Circuit Schematics from Natural Language Prompts | | 0 |
| Formal verification of tree-based machine learning models for lateral spreading | | 0 |
| Integrating Inductive Biases in Transformers via Distillation for Financial Time Series Forecasting | | 0 |
| Ensemble Self-Training for Unsupervised Machine Translation | | 0 |
| SpokenUS: A Spoken User Simulator for Task-Oriented Dialogue | | 0 |
| Block-Recurrent Dynamics in Vision Transformers | | 1 |
| BAWSeg: A UAV Multispectral Benchmark for Barley Weed Segmentation | | 0 |
| Self-Aware Markov Models for Discrete Reasoning | | 0 |
| Linearized Bregman Iterations for Sparse Spiking Neural Networks | | 0 |
| VideoVerse: Does Your T2V Generator Have World Model Capability to Synthesize Videos? | | 0 |
| Readers Prefer Outputs of AI Trained on Copyrighted Books over Expert Human Writers | | 0 |
| Connecting Jensen-Shannon and Kullback-Leibler Divergences: A New Bound for Representation Learning | | 0 |
| Exploring Collatz Dynamics with Human-LLM Collaboration | | 0 |
| AgriPath: A Systematic Exploration of Architectural Trade-offs for Crop Disease Classification | | 0 |
| High-Fidelity Compression of Seismic Velocity Models via SIREN Auto-Decoders | | 0 |
| OpenHospital: A Thing-in-itself Arena for Evolving and Benchmarking LLM-based Collective Intelligence | | 0 |
| Advancing Visual Reliability: Color-Accurate Underwater Image Enhancement for Real-Time Underwater Missions | | 0 |
| InViC: Intent-aware Visual Cues for Medical Visual Question Answering | | 0 |
| Deep Reinforcement Learning-Assisted Automated Operator Portfolio for Constrained Multi-objective Optimization | | 0 |
| Near-light Photometric Stereo with Symmetric Lights | | 0 |
| An Efficient Heterogeneous Co-Design for Fine-Tuning on a Single GPU | | 0 |