SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 6301-6325 of 474278 papers

Title | Status | Hype
Reasoning Language Models: A Blueprint | Code | 2
Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training | Code | 2
Avoiding Shortcuts: Enhancing Channel-Robust Specific Emitter Identification via Single-Source Domain Generalization | Code | 2
Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling | Code | 2
A Survey on Diffusion Models for Anomaly Detection | Code | 2
A generalizable 3D framework and model for self-supervised learning in medical imaging | Code | 2
Recurrent Diffusion for Large-Scale Parameter Generation | Code | 2
Investigating the Scalability of Approximate Sparse Retrieval Algorithms to Massive Datasets | Code | 2
Beyond Any-Shot Adaptation: Predicting Optimization Outcome for Robustness Gains without Extra Pay | Code | 2
ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario | Code | 2
FiLo++: Zero-/Few-Shot Anomaly Detection by Fused Fine-Grained Descriptions and Deformable Localization | Code | 2
Discrete Prior-based Temporal-coherent Content Prediction for Blind Face Video Restoration | Code | 2
Diffusion Models in Recommendation Systems: A Survey | Code | 2
Agent4Edu: Generating Learner Response Data by Generative Agents for Intelligent Education Systems | Code | 2
LWGANet: A Lightweight Group Attention Backbone for Remote Sensing Visual Tasks | Code | 2
Lossless Compression of Vector IDs for Approximate Nearest Neighbor Search | Code | 2
Scaling up self-supervised learning for improved surgical foundation models | Code | 2
Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key | Code | 2
Prompt-CAM: A Simpler Interpretable Transformer for Fine-Grained Analysis | Code | 2
CaPa: Carve-n-Paint Synthesis for Efficient 4K Textured Mesh Generation | Code | 2
Practical Continual Forgetting for Pre-trained Vision Models | Code | 2
AnyStory: Towards Unified Single and Multiple Subject Personalization in Text-to-Image Generation | Code | 2
A Simple Aerial Detection Baseline of Multimodal Language Models | Code | 2
The Devil is in Temporal Token: High Quality Video Reasoning Segmentation | Code | 2
What Limits LLM-based Human Simulation: LLMs or Our Design? | Code | 2