| Improved Modelling of Federated Datasets using Mixtures-of-Dirichlet-Multinomials | Jun 4, 2024 | Federated Learning | Code Available | 3 | 5 |
| LightGBM: A Highly Efficient Gradient Boosting Decision Tree | Dec 1, 2017 | | Code Available | 3 | 5 |
| Faster Diffusion via Temporal Attention Decomposition | Apr 3, 2024 | | Code Available | 3 | 5 |
| A Comprehensive Survey on Composed Image Retrieval | Feb 19, 2025 | Attribute, Image Retrieval | Code Available | 3 | 5 |
| Animatable and Relightable Gaussians for High-fidelity Human Avatar Modeling | Nov 27, 2023 | NeRF | Code Available | 3 | 5 |
| AISafetyLab: A Comprehensive Framework for AI Safety Evaluation and Improvement | Feb 24, 2025 | | Code Available | 3 | 5 |
| RefChecker: Reference-based Fine-grained Hallucination Checker and Benchmark for Large Language Models | May 23, 2024 | Hallucination, Sentence | Code Available | 3 | 5 |
| TextBox 2.0: A Text Generation Library with Pre-trained Language Models | Dec 26, 2022 | Abstractive Text Summarization, Data-to-Text Generation | Code Available | 3 | 5 |
| Representing 3D sparse map points and lines for camera relocalization | Feb 28, 2024 | Camera Relocalization, Indoor Localization | Code Available | 3 | 5 |
| AlphaDrive: Unleashing the Power of VLMs in Autonomous Driving via Reinforcement Learning and Reasoning | Mar 10, 2025 | Autonomous Driving, Common Sense Reasoning | Code Available | 3 | 5 |
| WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing | Oct 26, 2021 | Denoising, Self-Supervised Learning | Code Available | 3 | 5 |
| Dinomaly: The Less Is More Philosophy in Multi-Class Unsupervised Anomaly Detection | May 23, 2024 | Anomaly Detection, Multi-class Anomaly Detection | Code Available | 3 | 5 |
| LoRA+: Efficient Low Rank Adaptation of Large Models | Feb 19, 2024 | | Code Available | 3 | 5 |
| Benchmarking and Improving Bird's Eye View Perception Robustness in Autonomous Driving | May 27, 2024 | Autonomous Driving, Benchmarking | Code Available | 3 | 5 |
| ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models | Jan 30, 2024 | Self-Supervised Learning, Speaker Recognition | Code Available | 3 | 5 |
| Odyssey: Empowering Minecraft Agents with Open-World Skills | Jul 22, 2024 | Language Modelling, Large Language Model | Code Available | 3 | 5 |
| Learnable latent embeddings for joint behavioral and neural analysis | Apr 1, 2022 | | Code Available | 3 | 5 |
| Vision Transformers: From Semantic Segmentation to Dense Prediction | Jul 19, 2022 | image-classification, Image Classification | Code Available | 3 | 5 |
| ASE: Large-Scale Reusable Adversarial Skill Embeddings for Physically Simulated Characters | May 4, 2022 | GPU, Imitation Learning | Code Available | 3 | 5 |
| SEA-LION: Southeast Asian Languages in One Network | Apr 8, 2025 | | Code Available | 3 | 5 |
| ReNoise: Real Image Inversion Through Iterative Noising | Mar 21, 2024 | Denoising, Image Manipulation | Code Available | 3 | 5 |
| VastGaussian: Vast 3D Gaussians for Large Scene Reconstruction | Feb 27, 2024 | NeRF | Code Available | 3 | 5 |
| Deep Frequency Derivative Learning for Non-stationary Time Series Forecasting | Jun 29, 2024 | Time Series, Time Series Forecasting | Code Available | 3 | 5 |
| REPA-E: Unlocking VAE for End-to-End Tuning with Latent Diffusion Transformers | Apr 14, 2025 | | Code Available | 3 | 5 |
| Tree Search for Language Model Agents | Jul 1, 2024 | Language Modeling, Language Modelling | Code Available | 3 | 5 |
| SwiftKV: Fast Prefill-Optimized Inference with Knowledge-Preserving Model Transformation | Oct 4, 2024 | 16k, Code Generation | Code Available | 3 | 5 |
| StereoCrafter: Diffusion-based Generation of Long and High-fidelity Stereoscopic 3D from Monocular Videos | Sep 11, 2024 | Video Inpainting | Code Available | 3 | 5 |
| e3nn: Euclidean Neural Networks | Jul 18, 2022 | | Code Available | 3 | 5 |
| Latent Action Pretraining from Videos | Oct 15, 2024 | Quantization, Robot Manipulation | Code Available | 3 | 5 |
| LayerKV: Optimizing Large Language Model Serving with Layer-wise KV Cache Management | Oct 1, 2024 | GPU, Language Modeling | Code Available | 3 | 5 |
| Transcending Forgery Specificity with Latent Space Augmentation for Generalizable Deepfake Detection | Nov 19, 2023 | 2D Object Detection, DeepFake Detection | Code Available | 3 | 5 |
| CarLLaVA: Vision language models for camera-only closed-loop driving | Jun 14, 2024 | Autonomous Driving, Bench2Drive | Code Available | 3 | 5 |
| Mixed Precision Training | Oct 10, 2017 | | Code Available | 3 | 5 |
| Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting | Dec 19, 2019 | Interpretable Machine Learning, Time Series | Code Available | 3 | 5 |
| TokenFlow: Consistent Diffusion Features for Consistent Video Editing | Jul 19, 2023 | Video Editing | Code Available | 3 | 5 |
| Audio-Reasoner: Improving Reasoning Capability in Large Audio Language Models | Mar 4, 2025 | Language Modeling, Language Modelling | Code Available | 3 | 5 |
| ZeST: Zero-Shot Material Transfer from a Single Image | Apr 9, 2024 | Appearance Transfer, Object | Code Available | 3 | 5 |
| HybrIK-X: Hybrid Analytical-Neural Inverse Kinematics for Whole-body Mesh Recovery | Apr 12, 2023 | 3D Human Pose Estimation, 3D Human Reconstruction | Code Available | 3 | 5 |
| SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation | Feb 18, 2025 | Object Rearrangement, Robot Manipulation | Code Available | 3 | 5 |
| PixelHacker: Image Inpainting with Structural and Semantic Consistency | Apr 29, 2025 | Denoising, Image Generation | Code Available | 3 | 5 |
| Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models | Jun 13, 2024 | Math, object-detection | Code Available | 3 | 5 |
| Automatic Instruction Evolving for Large Language Models | Jun 2, 2024 | GSM8K, HumanEval | Code Available | 3 | 5 |
| Fact, Fetch, and Reason: A Unified Evaluation of Retrieval-Augmented Generation | Sep 19, 2024 | RAG, Retrieval | Code Available | 3 | 5 |
| LaRa: Efficient Large-Baseline Radiance Fields | Jul 5, 2024 | 3D Reconstruction, Novel View Synthesis | Code Available | 3 | 5 |
| Neural Message Passing Induced by Energy-Constrained Diffusion | Sep 13, 2024 | Inductive Bias | Code Available | 3 | 5 |
| Text2MDT: Extracting Medical Decision Trees from Medical Texts | Jan 4, 2024 | | Code Available | 3 | 5 |
| Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution | Dec 11, 2023 | Decoder, Super-Resolution | Code Available | 3 | 5 |
| SimLingo: Vision-Only Closed-Loop Autonomous Driving with Language-Action Alignment | Mar 12, 2025 | Autonomous Driving, Bench2Drive | Code Available | 3 | 5 |
| SVGDreamer++: Advancing Editability and Diversity in Text-Guided SVG Generation | Nov 26, 2024 | Diversity, Image Segmentation | Code Available | 3 | 5 |
| FlashFace: Human Image Personalization with High-fidelity Identity Preservation | Mar 25, 2024 | Face Swapping, Image Generation | Code Available | 3 | 5 |