| Listen, Denoise, Action! Audio-Driven Motion Synthesis with Diffusion Models | Nov 17, 2022 | Gesture Generation, Motion Synthesis | Code Available | 2 | 5 |
| Generalized Portrait Quality Assessment | Feb 14, 2024 | Face Image Quality Assessment | Code Available | 2 | 5 |
| Benchmarking Potential Based Rewards for Learning Humanoid Locomotion | Jul 19, 2023 | Benchmarking, Reinforcement Learning (RL) | Code Available | 2 | 5 |
| MACM: Utilizing a Multi-Agent System for Condition Mining in Solving Complex Mathematical Problems | Apr 6, 2024 | Logical Reasoning, Math | Code Available | 2 | 5 |
| FreeSplat++: Generalizable 3D Gaussian Splatting for Efficient Indoor Scene Reconstruction | Mar 29, 2025 | 3DGS, Indoor Scene Reconstruction | Code Available | 2 | 5 |
| MEM: Multi-Modal Elevation Mapping for Robotics and Learning | Sep 28, 2023 | Colorization, GPU | Code Available | 2 | 5 |
| OPUS: Occupancy Prediction Using a Sparse Set | Sep 14, 2024 | Autonomous Driving, Prediction | Code Available | 2 | 5 |
| NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation | Apr 17, 2025 | Data Augmentation, Diversity | Code Available | 2 | 5 |
| Vision Mamba: A Comprehensive Survey and Taxonomy | May 7, 2024 | Mamba, Medical Image Analysis | Code Available | 2 | 5 |
| Accelerated Methods for Deep Reinforcement Learning | Mar 7, 2018 | Atari Games, Deep Reinforcement Learning | Code Available | 2 | 5 |
| Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery | Feb 7, 2023 | | Code Available | 2 | 5 |
| Steering Large Language Models between Code Execution and Textual Reasoning | Oct 4, 2024 | Code Generation, Math | Code Available | 2 | 5 |
| Learning Occlusion-Robust Vision Transformers for Real-Time UAV Tracking | Apr 12, 2025 | Knowledge Distillation | Code Available | 2 | 5 |
| QUICK: Quantization-aware Interleaving and Conflict-free Kernel for efficient LLM inference | Feb 15, 2024 | GPU, Quantization | Code Available | 2 | 5 |
| Large Language Models as Urban Residents: An LLM Agent Framework for Personal Mobility Generation | Feb 22, 2024 | Retrieval | Code Available | 2 | 5 |
| MiniGPT-3D: Efficiently Aligning 3D Point Clouds with Large Language Models using 2D Priors | May 2, 2024 | 3D Object Captioning, 3D Object Classification | Code Available | 2 | 5 |
| Rethinking Prior Information Generation with CLIP for Few-Shot Segmentation | May 14, 2024 | Decoder | Code Available | 2 | 5 |
| MedMNIST v2 -- A large-scale lightweight benchmark for 2D and 3D biomedical image classification | Oct 27, 2021 | AutoML, image-classification | Code Available | 2 | 5 |
| DSNet: A Novel Way to Use Atrous Convolutions in Semantic Segmentation | Jun 6, 2024 | Real-Time Semantic Segmentation, Semantic Segmentation | Code Available | 2 | 5 |
| MemSAM: Taming Segment Anything Model for Echocardiography Video Segmentation | Jan 1, 2024 | Segmentation, Video Segmentation | Code Available | 2 | 5 |
| Joint Admission Control and Resource Allocation of Virtual Network Embedding via Hierarchical Deep Reinforcement Learning | Jun 25, 2024 | Combinatorial Optimization, Graph Neural Network | Code Available | 2 | 5 |
| Language as Queries for Referring Video Object Segmentation | Jan 3, 2022 | Object, Object Tracking | Code Available | 2 | 5 |
| ChartGemma: Visual Instruction-tuning for Chart Reasoning in the Wild | Jul 4, 2024 | Chart Understanding, Decision Making | Code Available | 2 | 5 |
| The Balanced-Pairwise-Affinities Feature Transform | Jun 25, 2024 | Few-Shot Image Classification, Image Clustering | Code Available | 2 | 5 |
| LMVD: A Large-Scale Multimodal Vlog Dataset for Depression Detection in the Wild | May 9, 2024 | Depression Detection, Navigate | Code Available | 2 | 5 |
| dlordinal: a Python package for deep ordinal classification | Jul 24, 2024 | Classification, Ordinal Classification | Code Available | 2 | 5 |
| Scribble-Supervised LiDAR Semantic Segmentation | Mar 16, 2022 | 3D Semantic Segmentation, LIDAR Semantic Segmentation | Code Available | 2 | 5 |
| Ruri: Japanese General Text Embeddings | Sep 12, 2024 | Knowledge Distillation | Code Available | 2 | 5 |
| Hier-SLAM: Scaling-up Semantics in SLAM with a Hierarchically Categorical Gaussian Splatting | Sep 19, 2024 | Scene Understanding, Semantic Segmentation | Code Available | 2 | 5 |
| Autoregressive Search Engines: Generating Substrings as Document Identifiers | Apr 22, 2022 | Information Retrieval, Retrieval | Code Available | 2 | 5 |
| Accelerating Sparse Deep Neural Networks | Apr 16, 2021 | GPU, Math | Code Available | 2 | 5 |
| HiCo: Hierarchical Controllable Diffusion Model for Layout-to-image Generation | Oct 18, 2024 | Disentanglement, Image Generation | Code Available | 2 | 5 |
| DEP-RL: Embodied Exploration for Reinforcement Learning in Overactuated and Musculoskeletal Systems | May 30, 2022 | Diversity, reinforcement-learning | Code Available | 2 | 5 |
| Let's Fuse Step by Step: A Generative Fusion Decoding Algorithm with LLMs for Multi-modal Text Recognition | May 23, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 2 | 5 |
| Zero-Shot Coreset Selection: Efficient Pruning for Unlabeled Data | Nov 22, 2024 | | Code Available | 2 | 5 |
| Autoregressive Distillation of Diffusion Transformers | Apr 15, 2025 | | Code Available | 2 | 5 |
| DNA-Rendering: A Diverse Neural Actor Repository for High-Fidelity Human-centric Rendering | Jul 19, 2023 | Camera Calibration, Novel View Synthesis | Code Available | 2 | 5 |
| MeshSplats: Mesh-Based Rendering with Gaussian Splatting Initialization | Feb 11, 2025 | | Code Available | 2 | 5 |
| Deep Learning Generates Synthetic Cancer Histology for Explainability and Education | Nov 12, 2022 | Deep Learning, Generative Adversarial Network | Code Available | 2 | 5 |
| MMRL: Multi-Modal Representation Learning for Vision-Language Models | Mar 11, 2025 | Prompt Engineering, Representation Learning | Code Available | 2 | 5 |
| AQUA-SLAM: Tightly-Coupled Underwater Acoustic-Visual-Inertial SLAM with Sensor Calibration | Mar 14, 2025 | Simultaneous Localization and Mapping | Code Available | 2 | 5 |
| Logits DeConfusion with CLIP for Few-Shot Learning | Apr 16, 2025 | Few-Shot Learning | Code Available | 2 | 5 |
| HyenaDNA: Long-Range Genomic Sequence Modeling at Single Nucleotide Resolution | Jun 27, 2023 | 4k, In-Context Learning | Code Available | 2 | 5 |
| CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets | Sep 29, 2023 | Language Modelling, Mathematical Reasoning | Code Available | 2 | 5 |
| Efficient Estimation of Word Representations in Vector Space | Jan 16, 2013 | Word Similarity | Code Available | 2 | 5 |
| Law of Vision Representation in MLLMs | Aug 29, 2024 | cross-modal alignment, Language Modeling | Code Available | 2 | 5 |
| CLUE: A Chinese Language Understanding Evaluation Benchmark | Apr 13, 2020 | General Classification, Machine Reading Comprehension | Code Available | 2 | 5 |
| MASS: Multi-Agent Simulation Scaling for Portfolio Construction | May 15, 2025 | | Code Available | 2 | 5 |
| Towards High-Resolution 3D Anomaly Detection: A Scalable Dataset and Real-Time Framework for Subtle Industrial Defects | Jul 10, 2025 | 3D Anomaly Detection, Anomaly Detection | Code Available | 2 | 5 |
| Massive Activations in Large Language Models | Feb 27, 2024 | | Code Available | 2 | 5 |