SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 5501–5550 of 177340 papers

Title | Status | Hype
Reliable and Efficient Concept Erasure of Text-to-Image Diffusion Models | Code | 2
Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1 | Code | 2
Scattertext: a Browser-Based Tool for Visualizing how Corpora Differ | Code | 2
ScaleKD: Strong Vision Transformers Could Be Excellent Teachers | Code | 2
CheXpert Plus: Augmenting a Large Chest X-ray Dataset with Text Radiology Reports, Patient Demographics and Additional Image Formats | Code | 2
NeRF-RPN: A general framework for object detection in NeRFs | Code | 2
HallusionBench: An Advanced Diagnostic Suite for Entangled Language Hallucination and Visual Illusion in Large Vision-Language Models | Code | 2
Automatic Differentiation-based Full Waveform Inversion with Flexible Workflows | Code | 2
AirMorph: Topology-Preserving Deep Learning for Pulmonary Airway Analysis | Code | 2
Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey | Code | 2
SpatialScore: Towards Unified Evaluation for Multimodal Spatial Understanding | Code | 2
An Empirical Study of Qwen3 Quantization | Code | 2
One-shot Entropy Minimization | Code | 2
KTPFormer: Kinematics and Trajectory Prior Knowledge-Enhanced Transformer for 3D Human Pose Estimation | Code | 2
BiomedGPT: A Generalist Vision-Language Foundation Model for Diverse Biomedical Tasks | Code | 2
Safe Delta: Consistently Preserving Safety when Fine-Tuning LLMs on Diverse Datasets | Code | 2
VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling | Code | 2
ColorizeDiffusion v2: Enhancing Reference-based Sketch Colorization Through Separating Utilities | Code | 2
ReasonGen-R1: CoT for Autoregressive Image generation models through SFT and RL | Code | 2
nnWNet: Rethinking the Use of Transformers in Biomedical Image Segmentation and Calling for a Unified Evaluation Benchmark | Code | 2
X2I: Seamless Integration of Multimodal Understanding into Diffusion Transformer via Attention Distillation | Code | 2
A Transformer-Based Siamese Network for Change Detection | Code | 2
Focal Modulation Networks | Code | 2
An Embodied Generalist Agent in 3D World | Code | 2
Kick Back & Relax++: Scaling Beyond Ground-Truth Depth with SlowTV & CribsTV | Code | 2
JaxUED: A simple and useable UED library in Jax | Code | 2
AutoDefense: Multi-Agent LLM Defense against Jailbreak Attacks | Code | 2
Reliable, Reproducible, and Really Fast Leaderboards with Evalica | Code | 2
PillarNet: Real-Time and High-Performance Pillar-based 3D Object Detection | Code | 2
SAMformer: Unlocking the Potential of Transformers in Time Series Forecasting with Sharpness-Aware Minimization and Channel-Wise Attention | Code | 2
ReID5o: Achieving Omni Multi-modal Person Re-identification in a Single Model | Code | 2
Flows: Building Blocks of Reasoning and Collaborating AI | Code | 2
Supervised Contrastive Learning | Code | 2
All You Need to Know About Training Image Retrieval Models | Code | 2
Advancing Learnable Multi-Agent Pathfinding Solvers with Active Fine-Tuning | Code | 2
AERO: Audio Super Resolution in the Spectral Domain | Code | 2
MosaicBERT: A Bidirectional Encoder Optimized for Fast Pretraining | Code | 2
GLARE: Low Light Image Enhancement via Generative Latent Feature based Codebook Retrieval | Code | 2
A Rotation-Translation-Decoupled Solution for Robust and Efficient Visual-Inertial Initialization | Code | 2
Training Generative Adversarial Networks with Limited Data | Code | 2
GPT-Fathom: Benchmarking Large Language Models to Decipher the Evolutionary Path towards GPT-4 and Beyond | Code | 2
Rethinking Visual Geo-localization for Large-Scale Applications | Code | 2
Holmes-VAD: Towards Unbiased and Explainable Video Anomaly Detection via Multi-modal LLM | Code | 2
DiCo: Revitalizing ConvNets for Scalable and Efficient Diffusion Modeling | Code | 2
SUM: Saliency Unification through Mamba for Visual Attention Modeling | Code | 2
Masked Autoregressive Flow for Density Estimation | Code | 2
SegmentAnyBone: A Universal Model that Segments Any Bone at Any Location on MRI | Code | 2
Detecting, Explaining, and Mitigating Memorization in Diffusion Models | Code | 2
Generalized Inner Loop Meta-Learning | Code | 2
AutoManual: Constructing Instruction Manuals by LLM Agents via Interactive Environmental Learning | Code | 2