| DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search | Aug 15, 2024 | Automated Theorem Proving, Language Modeling | Code Available | 4 | 5 |
| Conditional Prompt Learning for Vision-Language Models | Mar 10, 2022 | Domain Generalization, Prompt Engineering | Code Available | 4 | 5 |
| DiffEditor: Boosting Accuracy and Flexibility on Diffusion-based Image Editing | Feb 4, 2024 | Image Generation | Code Available | 4 | 5 |
| Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2 | Nov 17, 2023 | | Code Available | 4 | 5 |
| FG-CLIP: Fine-Grained Visual and Textual Alignment | May 8, 2025 | Image-text Retrieval, object-detection | Code Available | 4 | 5 |
| xLAM: A Family of Large Action Models to Empower AI Agent Systems | Sep 5, 2024 | AI Agent | Code Available | 4 | 5 |
| Self-Play Preference Optimization for Language Model Alignment | May 1, 2024 | Language Modeling, Language Modelling | Code Available | 4 | 5 |
| Spherical Channels for Modeling Atomic Interactions | Jun 29, 2022 | 10-shot image generation, Computational chemistry | Code Available | 4 | 5 |
| Scaling and evaluating sparse autoencoders | Jun 6, 2024 | Language Modelling | Code Available | 4 | 5 |
| AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct | May 23, 2024 | Class-level Code Generation, Code Completion | Code Available | 4 | 5 |
| GPT-4V(ision) is a Generalist Web Agent, if Grounded | Jan 3, 2024 | Image Captioning, Question Answering | Code Available | 4 | 5 |
| Habitat 3.0: A Co-Habitat for Humans, Avatars and Robots | Oct 19, 2023 | Social Navigation | Code Available | 4 | 5 |
| The GAN is dead; long live the GAN! A Modern GAN Baseline | Jan 9, 2025 | Image Generation | Code Available | 4 | 5 |
| TinyLLaVA Factory: A Modularized Codebase for Small-scale Large Multimodal Models | May 20, 2024 | Philosophy | Code Available | 4 | 5 |
| MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series | May 29, 2024 | Language Modeling, Language Modelling | Code Available | 4 | 5 |
| GigaAM: Efficient Self-Supervised Learner for Speech Recognition | Jun 1, 2025 | Automatic Speech Recognition, Language Modeling | Code Available | 4 | 5 |
| AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining | Aug 10, 2023 | Audio Generation, In-Context Learning | Code Available | 4 | 5 |
| A Survey of LLM DATA | May 24, 2025 | Large Language Model, Management | Code Available | 4 | 5 |
| Symbolic Prompt Program Search: A Structure-Aware Approach to Efficient Compile-Time Prompt Optimization | Apr 2, 2024 | RAG, Retrieval | Code Available | 4 | 5 |
| LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods | Dec 7, 2024 | | Code Available | 4 | 5 |
| Multi-head Temporal Latent Attention | May 19, 2025 | GPU, speech-recognition | Code Available | 4 | 5 |
| OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset | Feb 15, 2024 | Arithmetic Reasoning, GSM8K | Code Available | 4 | 5 |
| A Survey on Video Diffusion Models | Oct 16, 2023 | Image Generation, Survey | Code Available | 4 | 5 |
| MEDITRON-70B: Scaling Medical Pretraining for Large Language Models | Nov 27, 2023 | Articles, Conditional Text Generation | Code Available | 4 | 5 |
| Deep Residual Learning for Image Recognition | Dec 10, 2015 | Classification | Code Available | 4 | 5 |
| Multi-label Cluster Discrimination for Visual Representation Learning | Jul 24, 2024 | Contrastive Learning, Image-text Retrieval | Code Available | 4 | 5 |
| Craw4LLM: Efficient Web Crawling for LLM Pretraining | Feb 19, 2025 | 10-shot image generation | Code Available | 4 | 5 |
| Beyond Outlining: Heterogeneous Recursive Planning for Adaptive Long-form Writing with Language Models | Mar 11, 2025 | Form, Information Retrieval | Code Available | 4 | 5 |
| MiMo-VL Technical Report | Jun 4, 2025 | Multimodal Reasoning | Code Available | 4 | 5 |
| LightGlue: Local Feature Matching at Light Speed | Jun 23, 2023 | 3D Reconstruction, Camera Pose Estimation | Code Available | 4 | 5 |
| Catastrophic Forgetting in Deep Learning: A Comprehensive Taxonomy | Dec 16, 2023 | Deep Learning, image-classification | Code Available | 4 | 5 |
| FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models | Dec 11, 2024 | | Code Available | 4 | 5 |
| Deepfake Generation and Detection: A Benchmark and Survey | Mar 26, 2024 | Attribute, Face Generation | Code Available | 4 | 5 |
| Easi3R: Estimating Disentangled Motion from DUSt3R Without Training | Mar 31, 2025 | 4D reconstruction, Camera Pose Estimation | Code Available | 4 | 5 |
| Pytorch-Wildlife: A Collaborative Deep Learning Framework for Conservation | May 21, 2024 | Deep Learning | Code Available | 4 | 5 |
| Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents | Aug 13, 2024 | Decision Making | Code Available | 4 | 5 |
| InceptionNeXt: When Inception Meets ConvNeXt | Mar 29, 2023 | Image Classification, Semantic Segmentation | Code Available | 4 | 5 |
| Neural Network Diffusion | Feb 20, 2024 | Decoder | Code Available | 4 | 5 |
| BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining | Oct 19, 2022 | Document Classification, Language Modelling | Code Available | 4 | 5 |
| Hierarchically Coherent Multivariate Mixture Networks | May 11, 2023 | Computational Efficiency, Time Series | Code Available | 4 | 5 |
| Self-Supervised Prompt Optimization | Feb 7, 2025 | | Code Available | 4 | 5 |
| Mamba-FETrack: Frame-Event Tracking via State Space Model | Apr 28, 2024 | GPU, Mamba | Code Available | 4 | 5 |
| Accelerating Data Processing and Benchmarking of AI Models for Pathology | Feb 10, 2025 | Benchmarking | Code Available | 4 | 5 |
| DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale | Jun 30, 2022 | CPU, GPU | Code Available | 4 | 5 |
| EasyRec: An easy-to-use, extendable and efficient framework for building industrial recommendation systems | Sep 26, 2022 | feature selection, Recommendation Systems | Code Available | 4 | 5 |
| On Path to Multimodal Historical Reasoning: HistBench and HistAgent | May 26, 2025 | Optical Character Recognition (OCR) | Code Available | 4 | 5 |
| SEA-RAFT: Simple, Efficient, Accurate RAFT for Optical Flow | May 23, 2024 | Optical Flow Estimation | Code Available | 4 | 5 |
| Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning | Jan 28, 2022 | | Code Available | 4 | 5 |
| fVDB: A Deep-Learning Framework for Sparse, Large-Scale, and High-Performance Spatial Intelligence | Jul 1, 2024 | GPU, Point cloud reconstruction | Code Available | 4 | 5 |
| SPHINX: The Joint Mixing of Weights, Tasks, and Visual Embeddings for Multi-modal Large Language Models | Nov 13, 2023 | Described Object Detection, Language Modeling | Code Available | 4 | 5 |