SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 1251-1300 of 177,339 papers

Title | Status | Hype
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search | Code | 4
Conditional Prompt Learning for Vision-Language Models | Code | 4
DiffEditor: Boosting Accuracy and Flexibility on Diffusion-based Image Editing | Code | 4
Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2 | Code | 4
FG-CLIP: Fine-Grained Visual and Textual Alignment | Code | 4
xLAM: A Family of Large Action Models to Empower AI Agent Systems | Code | 4
Self-Play Preference Optimization for Language Model Alignment | Code | 4
Spherical Channels for Modeling Atomic Interactions | Code | 4
Scaling and evaluating sparse autoencoders | Code | 4
AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct | Code | 4
GPT-4V(ision) is a Generalist Web Agent, if Grounded | Code | 4
Habitat 3.0: A Co-Habitat for Humans, Avatars and Robots | Code | 4
The GAN is dead; long live the GAN! A Modern GAN Baseline | Code | 4
TinyLLaVA Factory: A Modularized Codebase for Small-scale Large Multimodal Models | Code | 4
MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series | Code | 4
GigaAM: Efficient Self-Supervised Learner for Speech Recognition | Code | 4
AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining | Code | 4
A Survey of LLM DATA | Code | 4
Symbolic Prompt Program Search: A Structure-Aware Approach to Efficient Compile-Time Prompt Optimization | Code | 4
LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods | Code | 4
Multi-head Temporal Latent Attention | Code | 4
OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset | Code | 4
A Survey on Video Diffusion Models | Code | 4
MEDITRON-70B: Scaling Medical Pretraining for Large Language Models | Code | 4
Deep Residual Learning for Image Recognition | Code | 4
Multi-label Cluster Discrimination for Visual Representation Learning | Code | 4
Craw4LLM: Efficient Web Crawling for LLM Pretraining | Code | 4
Beyond Outlining: Heterogeneous Recursive Planning for Adaptive Long-form Writing with Language Models | Code | 4
MiMo-VL Technical Report | Code | 4
LightGlue: Local Feature Matching at Light Speed | Code | 4
Catastrophic Forgetting in Deep Learning: A Comprehensive Taxonomy | Code | 4
FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models | Code | 4
Deepfake Generation and Detection: A Benchmark and Survey | Code | 4
Easi3R: Estimating Disentangled Motion from DUSt3R Without Training | Code | 4
Pytorch-Wildlife: A Collaborative Deep Learning Framework for Conservation | Code | 4
Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents | Code | 4
InceptionNeXt: When Inception Meets ConvNeXt | Code | 4
Neural Network Diffusion | Code | 4
BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining | Code | 4
Hierarchically Coherent Multivariate Mixture Networks | Code | 4
Self-Supervised Prompt Optimization | Code | 4
Mamba-FETrack: Frame-Event Tracking via State Space Model | Code | 4
Accelerating Data Processing and Benchmarking of AI Models for Pathology | Code | 4
DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale | Code | 4
EasyRec: An easy-to-use, extendable and efficient framework for building industrial recommendation systems | Code | 4
On Path to Multimodal Historical Reasoning: HistBench and HistAgent | Code | 4
SEA-RAFT: Simple, Efficient, Accurate RAFT for Optical Flow | Code | 4
Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning | Code | 4
fVDB: A Deep-Learning Framework for Sparse, Large-Scale, and High-Performance Spatial Intelligence | Code | 4
SPHINX: The Joint Mixing of Weights, Tasks, and Visual Embeddings for Multi-modal Large Language Models | Code | 4