SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

658,356 papers | 247,172 code links | 4,818 tasks

Papers

Showing 301-350 of 658,356 papers

Title | Status | Hype
Instruction-Free Tuning of Large Vision Language Models for Medical Instruction Following | - | 0
Any-Subgroup Equivariant Networks via Symmetry Breaking | - | 0
ICLAD: In-Context Learning for Unified Tabular Anomaly Detection Across Supervision Regimes | - | 0
Teaching an Agent to Sketch One Part at a Time | - | 0
Stochastic Sequential Decision Making over Expanding Networks with Graph Filtering | - | 0
Vision Tiny Recursion Model (ViTRM): Parameter-Efficient Image Classification via Recursive State Refinement | - | 0
Beyond the Desk: Barriers and Future Opportunities for AI to Assist Scientists in Embodied Physical Tasks | - | 0
Linear Social Choice with Few Queries: A Moment-Based Approach | - | 0
FedAgain: A Trust-Based and Robust Federated Learning Strategy for an Automated Kidney Stone Identification in Ureteroscopy | - | 0
Learning to Disprove: Formal Counterexample Generation with Large Language Models | - | 0
ItinBench: Benchmarking Planning Across Multiple Cognitive Dimensions with Large Language Models | - | 0
Gastric-X: A Multimodal Multi-Phase Benchmark Dataset for Advancing Vision-Language Models in Gastric Cancer Analysis | - | 0
ReXInTheWild: A Unified Benchmark for Medical Photograph Understanding | - | 0
Inducing Sustained Creativity and Diversity in Large Language Models | - | 0
Recognising BSL Fingerspelling in Continuous Signing Sequences | - | 0
SurfaceXR: Fusing Smartwatch IMUs and Egocentric Hand Pose for Seamless Surface Interactions | - | 0
AURORA: Adaptive Unified Representation for Robust Ultrasound Analysis | Code | 0
Cooperation and Exploitation in LLM Policy Synthesis for Sequential Social Dilemmas | Code | 0
TRACE: Trajectory Recovery with State Propagation Diffusion for Urban Mobility | Code | 0
End-to-End QGAN-Based Image Synthesis via Neural Noise Encoding and Intensity Calibration | - | 0
Detecting Basic Values in A Noisy Russian Social Media Text Data: A Multi-Stage Classification Framework | - | 0
Balancing Performance and Fairness in Explainable AI for Anomaly Detection in Distributed Power Plants Monitoring | - | 0
Sheaf Neural Networks and biomedical applications | - | 0
Image2Gcode: Image-to-G-code Generation for Additive Manufacturing Using Diffusion-Transformer Model | - | 0
Score Reversal Is Not Free for Quantum Diffusion Models | - | 0
To See or To Please: Uncovering Visual Sycophancy and Split Beliefs in VLMs | - | 0
Do VLMs Need Vision Transformers? Evaluating State Space Models as Vision Encoders | - | 0
Generalization of Long-Range Machine Learning Potentials in Complex Chemical Spaces | - | 0
All-in-One Slider for Attribute Manipulation in Diffusion Models | Code | 0
PlanTwin: Privacy-Preserving Planning Abstractions for Cloud-Assisted LLM Agents | - | 0
Language Model Maps for Prompt-Response Distributions via Log-Likelihood Vectors | - | 0
WarPGNN: A Parametric Thermal Warpage Analysis Framework with Physics-aware Graph Neural Network | - | 0
Bridging Network Fragmentation: A Semantic-Augmented DRL Framework for UAV-aided VANETs | - | 0
AU Codes, Language, and Synthesis: Translating Anatomy to Text for Facial Behavior Synthesis | - | 0
Student views in AI Ethics and Social Impact | - | 0
ITKIT: Feasible CT Image Analysis based on SimpleITK and MMEngine | - | 0
Investigating Faithfulness in Large Audio Language Models | - | 0
From Workflow Automation to Capability Closure: A Formal Framework for Safe and Revenue-Aware Customer Service AI | - | 0
Redundancy-as-Masking: Formalizing the Artificial Age Score (AAS) to Model Memory Aging in Generative AI | - | 0
Augmenting Rating-Scale Measures with Text-Derived Items Using the Information-Determined Scoring (IDS) Framework | - | 0
REST: Receding Horizon Explorative Steiner Tree for Zero-Shot Object-Goal Navigation | - | 0
Holter-to-Sleep: AI-Enabled Repurposing of Single-Lead ECG for Sleep Phenotyping | - | 0
Learning Consistent Temporal Grounding between Related Tasks in Sports Coaching | - | 0
Look Before You Fuse: 2D-Guided Cross-Modal Alignment for Robust 3D Detection | - | 0
AgroCoT: A Chain-of-Thought Benchmark for Evaluating Reasoning in Vision-Language Models for Agriculture | - | 0
Self-Tuning Sparse Attention: Multi-Fidelity Hyperparameter Optimization for Transformer Acceleration | - | 0
SJD-PAC: Accelerating Speculative Jacobi Decoding via Proactive Drafting and Adaptive Continuation | - | 0
Fast and Interpretable Autoregressive Estimation with Neural Network Backpropagation | - | 0
SignAgent: Agentic LLMs for Linguistically-Grounded Sign Language Annotation and Dataset Curation | - | 0
SOL-ExecBench: Speed-of-Light Benchmarking for Real-World GPU Kernels Against Hardware Limits | - | 0