SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 13051 to 13100 of 177340 papers

Title | Status | Hype
Self-Consistent Recursive Diffusion Bridge for Medical Image Translation | Code | 2
HoliTom: Holistic Token Merging for Fast Video Large Language Models | Code | 2
Tuning-Free Image Customization with Image and Text Guidance | Code | 2
Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective | Code | 2
Variable Bitrate Neural Fields | Code | 2
LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge | Code | 2
Cross-lingual and Multilingual CLIP | Code | 2
A Diffusion Model Framework for Unsupervised Neural Combinatorial Optimization | Code | 2
Dreamer XL: Towards High-Resolution Text-to-3D Generation via Trajectory Score Matching | Code | 2
From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging | Code | 2
BMFM-DNA: A SNP-aware DNA foundation model to capture variant effects | Code | 2
Honeybee: Locality-enhanced Projector for Multimodal LLM | Code | 2
GigaPose: Fast and Robust Novel Object Pose Estimation via One Correspondence | Code | 2
MedTsLLM: Leveraging LLMs for Multimodal Medical Time Series Analysis | Code | 2
Density Estimation via Binless Multidimensional Integration | Code | 2
RenderIH: A Large-scale Synthetic Dataset for 3D Interacting Hand Pose Estimation | Code | 2
MVGamba: Unify 3D Content Generation as State Space Sequence Modeling | Code | 2
Sparse2DGS: Geometry-Prioritized Gaussian Splatting for Surface Reconstruction from Sparse Views | Code | 2
Autoregressive Pretraining with Mamba in Vision | Code | 2
pymdp: A Python library for active inference in discrete state spaces | Code | 2
3CAD: A Large-Scale Real-World 3C Product Dataset for Unsupervised Anomaly | Code | 2
ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline | Code | 2
Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models | Code | 2
StreamBench: Towards Benchmarking Continuous Improvement of Language Agents | Code | 2
MedAgentsBench: Benchmarking Thinking Models and Agent Frameworks for Complex Medical Reasoning | Code | 2
CLEAR: A Fully User-side Image Search System | Code | 2
Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging | Code | 2
Agent-SafetyBench: Evaluating the Safety of LLM Agents | Code | 2
LoRI: Reducing Cross-Task Interference in Multi-Task Low-Rank Adaptation | Code | 2
Logits-Based Finetuning | Code | 2
Accelerating Online Mapping and Behavior Prediction via Direct BEV Feature Attention | Code | 2
TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition | Code | 2
Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents | Code | 2
Generalized and Efficient 2D Gaussian Splatting for Arbitrary-scale Super-Resolution | Code | 2
VLSBench: Unveiling Visual Leakage in Multimodal Safety | Code | 2
O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning | Code | 2
xLSTM-Mixer: Multivariate Time Series Forecasting by Mixing via Scalar Memories | Code | 2
Self-Introspective Decoding: Alleviating Hallucinations for Large Vision-Language Models | Code | 2
Restoring and attributing ancient texts using deep neural networks | Code | 2
BAD: Bidirectional Auto-regressive Diffusion for Text-to-Motion Generation | Code | 2
Data-efficient Large Vision Models through Sequential Autoregression | Code | 2
Attention Calibration for Disentangled Text-to-Image Personalization | Code | 2
Multi-scale Quaternion CNN and BiGRU with Cross Self-attention Feature Fusion for Fault Diagnosis of Bearing | Code | 2
VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding | Code | 2
1M-Deepfakes Detection Challenge | Code | 2
Large Language Models Can Learn Temporal Reasoning | Code | 2
Learning Emergent Gaits with Decentralized Phase Oscillators: on the role of Observations, Rewards, and Feedback | Code | 2
DynRefer: Delving into Region-level Multimodal Tasks via Dynamic Resolution | Code | 2
Dynamic in Static: Hybrid Visual Correspondence for Self-Supervised Video Object Segmentation | Code | 2
Commonsense Prototype for Outdoor Unsupervised 3D Object Detection | Code | 2