SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 9,376 to 9,400 of 474,278 papers

Title | Status | Hype
Using KL-Divergence to Focus Frequency Information in Low-Light Image Enhancement | Code | 0
Knowledge Distillation Detection for Open-weights Models | Code | 0
Automated Model Evaluation for Object Detection via Prediction Consistency and Reliability | Code | 0
EMR-AGENT: Automating Cohort and Feature Extraction from EMR Databases | Code | 0
Guiding Multimodal Large Language Models with Blind and Low Vision People Visual Questions for Proactive Visual Interpretations | Code | 0
Microscaling Floating Point Formats for Large Language Models | Code | 0
SpurBreast: A Curated Dataset for Investigating Spurious Correlations in Real-world Breast MRI Classification | Code | 0
microCLIP: Unsupervised CLIP Adaptation via Coarse-Fine Token Fusion for Fine-Grained Image Classification | Code | 0
V2X-UniPool: Unifying Multimodal Perception and Knowledge Reasoning for Autonomous Driving | Code | 0
AudioStory: Generating Long-Form Narrative Audio with Large Language Models | Code | 0
NUMINA: A Natural Understanding Benchmark for Multi-dimensional Intelligence and Numerical Reasoning Abilities | Code | 0
SynCED-EnDe 2025: A Synthetic and Curated English - German Dataset for Critical Error Detection in Machine Translation | - | 0
NeuroTTT: Bridging Pretraining-Downstream Task Misalignment in EEG Foundation Models via Test-Time Training | Code | 0
Code2Video: A Code-centric Paradigm for Educational Video Generation | Code | 0
VisualOverload: Probing Visual Understanding of VLMs in Really Dense Scenes | - | 0
PCoreSet: Effective Active Learning through Knowledge Distillation from Vision-Language Models | Code | 0
Hierarchy-Aware Neural Subgraph Matching with Enhanced Similarity Measure | Code | 0
JoyAgent-JDGenie: Technical Report on the GAIA | - | 0
Instant4D: 4D Gaussian Splatting in Minutes | - | 0
LSPO: Length-aware Dynamic Sampling for Policy Optimization in LLM Reasoning | - | 0
PSScreen: Partially Supervised Multiple Retinal Disease Screening | Code | 0
DC-Gen: Post-Training Diffusion Acceleration with Deeply Compressed Latent Space | Code | 0
IC-Custom: Diverse Image Customization via In-Context Learning | - | 0
DepthLM: Metric Depth From Vision Language Models | - | 0
ReFACT: A Benchmark for Scientific Confabulation Detection with Positional Error Annotations | Code | 0
Page 376 of 18,972