| SAMPart3D: Segment Any Part in 3D Objects | Nov 11, 2024 | 3D Generation, 3D Part Segmentation | Code Available | 4 |
| Universal and Extensible Language-Vision Models for Organ Segmentation and Tumor Detection from Abdominal Computed Tomography | May 28, 2024 | Computational Efficiency, Computed Tomography (CT) | Code Available | 4 |
| Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey | Mar 16, 2025 | Autonomous Driving, multimodal generation | Code Available | 4 |
| RGBD GS-ICP SLAM | Mar 19, 2024 | 3DGS, Simultaneous Localization and Mapping | Code Available | 4 |
| Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models | Jun 4, 2024 | Common Sense Reasoning | Code Available | 4 |
| Exploring the Capabilities of Large Multimodal Models on Dense Text | May 9, 2024 | Prompt Engineering, Visual Question Answering (VQA) | Code Available | 4 |
| CameraCtrl: Enabling Camera Control for Text-to-Video Generation | Apr 2, 2024 | Text-to-Video Generation, Video Generation | Code Available | 4 |
| SemanticDraw: Towards Real-Time Interactive Content Creation from Image Diffusion Models | Mar 14, 2024 | Blocking, GPU | Code Available | 4 |
| Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers | Aug 12, 2024 | GSM8K, Math | Code Available | 4 |
| Data quality dimensions for fair AI | May 11, 2023 | Classification, Fairness | Code Available | 4 |
| AnyText: Multilingual Visual Text Generation And Editing | Nov 6, 2023 | Image Generation, Optical Character Recognition (OCR) | Code Available | 4 |
| Gaze-LLE: Gaze Target Estimation via Large-Scale Learned Encoders | Dec 12, 2024 | Gaze Target Estimation | Code Available | 4 |
| BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation | May 26, 2022 | 3D Multi-Object Tracking, 3D Object Detection | Code Available | 4 |
| SEED-Data-Edit Technical Report: A Hybrid Dataset for Instructional Image Editing | May 7, 2024 | Image Manipulation, Language Modeling | Code Available | 4 |
| TDMPBC: Self-Imitative Reinforcement Learning for Humanoid Robot Control | Feb 24, 2025 | reinforcement-learning, Reinforcement Learning | Code Available | 4 |
| CFG-Zero*: Improved Classifier-Free Guidance for Flow Matching Models | Mar 24, 2025 | | Code Available | 4 |
| Kubric: A scalable dataset generator | Mar 7, 2022 | Fairness, NeRF | Code Available | 4 |
| Medical Graph RAG: Towards Safe Medical Large Language Model via Graph Retrieval-Augmented Generation | Aug 8, 2024 | Chunking, Fact Checking | Code Available | 4 |
| R^2-Gaussian: Rectifying Radiative Gaussian Splatting for Tomographic Reconstruction | May 31, 2024 | 3DGS, NeRF | Code Available | 4 |
| AgentGym: Evolving Large Language Model-based Agents across Diverse Environments | Jun 6, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| RecBole 2.0: Towards a More Up-to-Date Recommendation Library | Jun 15, 2022 | Benchmarking, Data Augmentation | Code Available | 4 |
| ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates | Feb 10, 2025 | Hierarchical Reinforcement Learning, Language Modeling | Code Available | 4 |
| IGEV++: Iterative Multi-range Geometry Encoding Volumes for Stereo Matching | Sep 1, 2024 | Patch Matching, Stereo Matching | Code Available | 4 |
| Long Context Transfer from Language to Vision | Jun 24, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| RealisDance: Equip controllable character animation with realistic hands | Sep 10, 2024 | | Code Available | 4 |
| NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals | Jul 18, 2024 | Experimental Design, GPU | Code Available | 4 |
| Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning | May 6, 2025 | Image Generation | Code Available | 4 |
| TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters | Oct 30, 2024 | model | Code Available | 4 |
| A Closer Look at Deep Learning Methods on Tabular Datasets | Jul 1, 2024 | Attribute, Deep Learning | Code Available | 4 |
| Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking | Mar 14, 2024 | GSM8K, Language Modelling | Code Available | 4 |
| Magicoder: Empowering Code Generation with OSS-Instruct | Dec 4, 2023 | Code Generation, HumanEval | Code Available | 4 |
| Zero-Shot Whole-Body Humanoid Control via Behavioral Foundation Models | Apr 15, 2025 | Humanoid Control, Reinforcement Learning (RL) | Code Available | 4 |
| Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference | Oct 6, 2023 | GPU, Image Generation | Code Available | 4 |
| XiYan-SQL: A Novel Multi-Generator Framework For Text-to-SQL | Jul 7, 2025 | Text to SQL, Text-To-SQL | Code Available | 4 |
| VM-UNet: Vision Mamba UNet for Medical Image Segmentation | Feb 4, 2024 | Image Segmentation, Mamba | Code Available | 4 |
| FedCP: Separating Feature Information for Personalized Federated Learning via Conditional Policy | Jul 1, 2023 | Federated Learning, Personalized Federated Learning | Code Available | 4 |
| Chain-of-Discussion: A Multi-Model Framework for Complex Evidence-Based Question Answering | Feb 26, 2024 | Evidence Selection, Open-Ended Question Answering | Code Available | 4 |
| NExT-GPT: Any-to-Any Multimodal LLM | Sep 11, 2023 | AI Agent | Code Available | 4 |
| Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning | Jun 3, 2025 | Code Generation, reinforcement-learning | Code Available | 4 |
| Eliminating Domain Bias for Federated Learning in Representation Space | Nov 25, 2023 | Federated Learning, Privacy Preserving | Code Available | 4 |
| MotionClone: Training-Free Motion Cloning for Controllable Video Generation | Jun 8, 2024 | Denoising, Motion Generation | Code Available | 4 |
| Recent Advances in Large Langauge Model Benchmarks against Data Contamination: From Static to Dynamic Evaluation | Feb 23, 2025 | Benchmarking | Code Available | 4 |
| GIM: Learning Generalizable Image Matcher From Internet Videos | Feb 16, 2024 | 3D Reconstruction, Camera Pose Estimation | Code Available | 4 |
| Pixel-level and Semantic-level Adjustable Super-resolution: A Dual-LoRA Approach | Dec 4, 2024 | Image Super-Resolution, Super-Resolution | Code Available | 4 |
| Pearl: A Production-ready Reinforcement Learning Agent | Dec 6, 2023 | Benchmarking, reinforcement-learning | Code Available | 4 |
| Towards All-in-One Medical Image Re-Identification | Mar 11, 2025 | All | Code Available | 4 |
| LocAgent: Graph-Guided LLM Agents for Code Localization | Mar 12, 2025 | GitHub issue resolution, Navigate | Code Available | 4 |
| GPFL: Simultaneously Learning Global and Personalized Feature Information for Personalized Federated Learning | Aug 20, 2023 | Fairness, Federated Learning | Code Available | 4 |
| Data-centric Artificial Intelligence: A Survey | Mar 17, 2023 | Survey | Code Available | 4 |
| Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement | Mar 9, 2025 | Domain Generalization, Object Detection | Code Available | 4 |