| OpenHelix: A Short Survey, Empirical Analysis, and Open-Source Dual-System VLA Model for Robotic Manipulation | May 6, 2025 | Robot Manipulation, Vision-Language-Action | Code Available | 3 |
| DesignEdit: Multi-Layered Latent Decomposition and Fusion for Unified & Accurate Image Editing | Mar 21, 2024 | Image Generation, spatial-aware image editing | Code Available | 3 |
| SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation | Feb 18, 2025 | Voice Cloning | Code Available | 3 |
| Open-Source Web Service with Morphological Dictionary-Supplemented Deep Learning for Morphosyntactic Analysis of Czech | Jun 18, 2024 | Deep Learning, Dependency Parsing | Code Available | 3 |
| Model Inversion Attacks: A Survey of Approaches and Countermeasures | Nov 15, 2024 | Survey | Code Available | 3 |
| GIFT-Eval: A Benchmark For General Time Series Forecasting Model Evaluation | Oct 14, 2024 | Time Series, Time Series Forecasting | Code Available | 3 |
| CleanRL: High-quality Single-file Implementations of Deep Reinforcement Learning Algorithms | Nov 16, 2021 | Benchmarking, Deep Reinforcement Learning | Code Available | 3 |
| Depth Any Camera: Zero-Shot Metric Depth Estimation from Any Camera | Jan 5, 2025 | Data Augmentation, Depth Estimation | Code Available | 3 |
| Leveraging Self-Supervised Learning for Speaker Diarization | Sep 14, 2024 | Self-Supervised Learning, speaker-diarization | Code Available | 3 |
| ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation | Mar 13, 2024 | Simulated Gaussian Manipulation | Code Available | 3 |
| Advances in Multimodal Adaptation and Generalization: From Traditional Approaches to Foundation Models | Jan 30, 2025 | Action Recognition, Domain Adaptation | Code Available | 3 |
| REAL: Benchmarking Autonomous Agents on Deterministic Simulations of Real Websites | Apr 15, 2025 | Autonomous Web Navigation, Benchmarking | Code Available | 3 |
| VideoTetris: Towards Compositional Text-to-Video Generation | Jun 6, 2024 | Denoising, Text-to-Video Generation | Code Available | 3 |
| FlashGS: Efficient 3D Gaussian Splatting for Large-scale and High-resolution Rendering | Aug 15, 2024 | Computational Efficiency, Scheduling | Code Available | 3 |
| ReLiK: Retrieve and LinK, Fast and Accurate Entity Linking and Relation Extraction on an Academic Budget | Jul 31, 2024 | Document-level Closed Information Extraction, Entity Linking | Code Available | 3 |
| EasyVolcap: Accelerating Neural Volumetric Video Research | Dec 11, 2023 | | Code Available | 3 |
| SegEarth-OV: Towards Training-Free Open-Vocabulary Segmentation for Remote Sensing Images | Oct 2, 2024 | Open Vocabulary Semantic Segmentation, Open-Vocabulary Semantic Segmentation | Code Available | 3 |
| High-Speed Stereo Visual SLAM for Low-Powered Computing Devices | Oct 5, 2024 | GPU | Code Available | 3 |
| Swin-UMamba: Mamba-based UNet with ImageNet-based pretraining | Feb 5, 2024 | Image Segmentation, Mamba | Code Available | 3 |
| Uni4D: Unifying Visual Foundation Models for 4D Modeling from a Single Video | Mar 27, 2025 | Camera Pose Estimation, Depth Estimation | Code Available | 3 |
| Automated Movie Generation via Multi-Agent CoT Planning | Mar 10, 2025 | Video Generation | Code Available | 3 |
| OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data | May 24, 2025 | Image Stylization | Code Available | 3 |
| KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding | Mar 4, 2025 | HumanEval, mbpp | Code Available | 3 |
| EPRecon: An Efficient Framework for Real-Time Panoptic 3D Reconstruction from Monocular Video | Sep 3, 2024 | 3D Reconstruction, Scene Understanding | Code Available | 3 |
| UltraEval: A Lightweight Platform for Flexible and Comprehensive Evaluation for LLMs | Apr 11, 2024 | | Code Available | 3 |