SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers

Showing 12351-12400 of 474,278 papers

Title | Status | Hype
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time | Code | 2
FUSION: Fully Integration of Vision-Language Representations for Deep Cross-Modal Understanding | Code | 2
Diffusion Models for Tabular Data: Challenges, Current Progress, and Future Directions | Code | 2
WALL-E 2.0: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents | Code | 2
Evaluating Large Language Models: A Comprehensive Survey | Code | 2
Parsel: Algorithmic Reasoning with Language Models by Composing Decompositions | Code | 2
MolTC: Towards Molecular Relational Modeling In Language Models | Code | 2
FastReID: A Pytorch Toolbox for General Instance Re-identification | Code | 2
DEGAS: Detailed Expressions on Full-Body Gaussian Avatars | Code | 2
Self-Consistent Recursive Diffusion Bridge for Medical Image Translation | Code | 2
HoliTom: Holistic Token Merging for Fast Video Large Language Models | Code | 2
Tuning-Free Image Customization with Image and Text Guidance | Code | 2
Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective | Code | 2
Variable Bitrate Neural Fields | Code | 2
LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge | Code | 2
Cross-lingual and Multilingual CLIP | Code | 2
A Diffusion Model Framework for Unsupervised Neural Combinatorial Optimization | Code | 2
Dreamer XL: Towards High-Resolution Text-to-3D Generation via Trajectory Score Matching | Code | 2
From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging | Code | 2
BMFM-DNA: A SNP-aware DNA foundation model to capture variant effects | Code | 2
Honeybee: Locality-enhanced Projector for Multimodal LLM | Code | 2
GigaPose: Fast and Robust Novel Object Pose Estimation via One Correspondence | Code | 2
MedTsLLM: Leveraging LLMs for Multimodal Medical Time Series Analysis | Code | 2
Density Estimation via Binless Multidimensional Integration | Code | 2
RenderIH: A Large-scale Synthetic Dataset for 3D Interacting Hand Pose Estimation | Code | 2
MVGamba: Unify 3D Content Generation as State Space Sequence Modeling | Code | 2
Sparse2DGS: Geometry-Prioritized Gaussian Splatting for Surface Reconstruction from Sparse Views | Code | 2
Autoregressive Pretraining with Mamba in Vision | Code | 2
pymdp: A Python library for active inference in discrete state spaces | Code | 2
3CAD: A Large-Scale Real-World 3C Product Dataset for Unsupervised Anomaly | Code | 2
ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline | Code | 2
Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models | Code | 2
StreamBench: Towards Benchmarking Continuous Improvement of Language Agents | Code | 2
MedAgentsBench: Benchmarking Thinking Models and Agent Frameworks for Complex Medical Reasoning | Code | 2
CLEAR: A Fully User-side Image Search System | Code | 2
Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging | Code | 2
Agent-SafetyBench: Evaluating the Safety of LLM Agents | Code | 2
LoRI: Reducing Cross-Task Interference in Multi-Task Low-Rank Adaptation | Code | 2
Logits-Based Finetuning | Code | 2
Accelerating Online Mapping and Behavior Prediction via Direct BEV Feature Attention | Code | 2
TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition | Code | 2
Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents | Code | 2
Generalized and Efficient 2D Gaussian Splatting for Arbitrary-scale Super-Resolution | Code | 2
VLSBench: Unveiling Visual Leakage in Multimodal Safety | Code | 2
O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning | Code | 2
xLSTM-Mixer: Multivariate Time Series Forecasting by Mixing via Scalar Memories | Code | 2
Self-Introspective Decoding: Alleviating Hallucinations for Large Vision-Language Models | Code | 2
Restoring and attributing ancient texts using deep neural networks | Code | 2
BAD: Bidirectional Auto-regressive Diffusion for Text-to-Motion Generation | Code | 2
Data-efficient Large Vision Models through Sequential Autoregression | Code | 2
Page 248 of 9486