| Uni-Mol Docking V2: Towards Realistic and Accurate Binding Pose Prediction | May 20, 2024 | Drug Design, Molecular Docking | Code Available | 5 |
| Showing Many Labels in Multi-label Classification Models: An Empirical Study of Adversarial Examples | Sep 26, 2024 | Multi-Label Classification, MUlTI-LABEL-ClASSIFICATION | Code Available | 5 |
| IMAGDressing-v1: Customizable Virtual Dressing | Jul 17, 2024 | Denoising, Image Generation | Code Available | 5 |
| ChatDBG: Augmenting Debugging with Large Language Models | Mar 25, 2024 | C++ code, Navigate | Code Available | 5 |
| Enabling Novel Mission Operations and Interactions with ROSA: The Robot Operating System Agent | Oct 9, 2024 | | Code Available | 5 |
| RLHF Workflow: From Reward Modeling to Online RLHF | May 13, 2024 | Chatbot, HumanEval | Code Available | 5 |
| Generating Physically Stable and Buildable LEGO Designs from Text | May 8, 2025 | 3D Generation, Large Language Model | Code Available | 5 |
| A Survey on Knowledge Distillation of Large Language Models | Feb 20, 2024 | Data Augmentation, Knowledge Distillation | Code Available | 5 |
| Reservoir-enhanced Segment Anything Model for Subsurface Diagnosis | Apr 26, 2025 | Anomaly Detection, GPR | Code Available | 5 |
| Real-time Transformer-based Open-Vocabulary Detection with Efficient Fusion Head | Mar 11, 2024 | Object Detection, Open-vocabulary object detection | Code Available | 5 |
| Scalable Pre-training of Large Autoregressive Image Models | Jan 16, 2024 | Image Classification | Code Available | 5 |
| ReLoRA: High-Rank Training Through Low-Rank Updates | Jul 11, 2023 | GPU | Code Available | 5 |
| A Comprehensive Study of Knowledge Editing for Large Language Models | Jan 2, 2024 | knowledge editing, Model Editing | Code Available | 5 |
| BigDL 2.0: Seamless Scaling of AI Pipelines from Laptops to Distributed Cluster | Apr 3, 2022 | AutoML, Distributed Computing | Code Available | 5 |
| DeepSpeed-VisualChat: Multi-Round Multi-Image Interleave Chat via Multi-Modal Causal Attention | Sep 25, 2023 | Language Modeling, Language Modelling | Code Available | 5 |
| TaskWeaver: A Code-First Agent Framework | Nov 29, 2023 | Natural Language Understanding | Code Available | 5 |
| StarVector: Generating Scalable Vector Graphics Code from Images and Text | Dec 17, 2023 | Code Generation, Language Modeling | Code Available | 5 |
| APISR: Anime Production Inspired Real-World Anime Super-Resolution | Mar 3, 2024 | Super-Resolution | Code Available | 5 |
| Granite Code Models: A Family of Open Foundation Models for Code Intelligence | May 7, 2024 | Code Generation, Decoder | Code Available | 5 |
| VGGSfM: Visual Geometry Grounded Deep Structure From Motion | Jan 1, 2024 | Camera Calibration, Point Tracking | Code Available | 5 |
| Maia-2: A Unified Model for Human-AI Alignment in Chess | Sep 30, 2024 | Decision Making | Code Available | 5 |
| Mini-Omni2: Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities | Oct 15, 2024 | Language Modelling | Code Available | 5 |
| MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction | Apr 17, 2022 | Image Restoration, Spectral Reconstruction | Code Available | 5 |
| OpenMLDB: A Real-Time Relational Data Feature Computation System for Online ML | Jan 15, 2025 | | Code Available | 5 |
| IntellAgent: A Multi-Agent Framework for Evaluating Conversational AI Systems | Jan 19, 2025 | Navigate | Code Available | 5 |
| Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass | Jan 23, 2025 | 3D Reconstruction, Camera Pose Estimation | Code Available | 5 |
| LIMO: Less is More for Reasoning | Feb 5, 2025 | Math, Mathematical Reasoning | Code Available | 5 |
| The Role of World Models in Shaping Autonomous Driving: A Comprehensive Survey | Feb 14, 2025 | Autonomous Driving, Survey | Code Available | 5 |
| SciAgents: Automating scientific discovery through multi-agent intelligent graph reasoning | Sep 9, 2024 | AI Agent, Knowledge Graphs | Code Available | 5 |
| Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions | Nov 21, 2024 | Reinforcement Learning (RL) | Code Available | 5 |
| Fake News Detection: It's All in the Data! | Jul 2, 2024 | All, Diversity | Code Available | 5 |
| The BrowserGym Ecosystem for Web Agent Research | Dec 6, 2024 | Benchmarking | Code Available | 5 |
| SCBench: A KV Cache-Centric Analysis of Long-Context Methods | Dec 13, 2024 | Mamba, Quantization | Code Available | 5 |
| The 1st Solution for 4th PVUW MeViS Challenge: Unleashing the Potential of Large Multimodal Models for Referring Video Segmentation | Apr 7, 2025 | Inference Optimization, Referring Video Object Segmentation | Code Available | 5 |
| BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation | Jan 28, 2022 | Image Captioning, Image-text matching | Code Available | 5 |
| OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding | Jun 27, 2024 | Decoder, Segmentation | Code Available | 5 |
| Can Foundation Models Wrangle Your Data? | May 20, 2022 | Entity Resolution, Imputation | Code Available | 5 |
| Humanoid-Gym: Reinforcement Learning for Humanoid Robot with Zero-Shot Sim2Real Transfer | Apr 8, 2024 | MuJoCo, Physical Simulations | Code Available | 5 |
| Mini-Monkey: Alleviating the Semantic Sawtooth Effect for Lightweight MLLMs via Complementary Image Pyramid | Aug 4, 2024 | document understanding | Code Available | 5 |
| Tora: Trajectory-oriented Diffusion Transformer for Video Generation | Jul 31, 2024 | Video Compression, Video Generation | Code Available | 5 |
| FlashAudio: Rectified Flows for Fast and High-Fidelity Text-to-Audio Generation | Oct 16, 2024 | Audio Generation, GPU | Code Available | 5 |
| WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks? | Mar 12, 2024 | Language Modeling, Language Modelling | Code Available | 5 |
| SuperAnimal pretrained pose estimation models for behavioral analysis | Mar 14, 2022 | 2D Pose Estimation, Animal Pose Estimation | Code Available | 5 |
| Visual Identification of Problematic Bias in Large Label Spaces | Jan 17, 2022 | Fairness | Code Available | 5 |
| LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models | Jun 21, 2023 | | Code Available | 5 |
| Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers | Jun 30, 2025 | Multimodal Reasoning | Code Available | 5 |
| Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research | Jan 31, 2024 | Language Modeling, Language Modelling | Code Available | 5 |
| AgentCPM-GUI: Building Mobile-Use Agents with Reinforcement Fine-Tuning | Jun 2, 2025 | AI Agent, Diversity | Code Available | 5 |
| FeatUp: A Model-Agnostic Framework for Features at Any Resolution | Mar 15, 2024 | Depth Estimation, Depth Prediction | Code Available | 5 |
| DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding | Nov 21, 2024 | Long-tailed Object Detection, Object | Code Available | 5 |