| MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding | Apr 8, 2024 | GPU, Multiple-choice | Code Available | 3 |
| ASE: Large-Scale Reusable Adversarial Skill Embeddings for Physically Simulated Characters | May 4, 2022 | GPU, Imitation Learning | Code Available | 3 |
| Arctic Inference with Shift Parallelism: Fast and Efficient Open Source Inference System for Enterprise AI | Jul 16, 2025 | GPU | Code Available | 3 |
| EfficientQAT: Efficient Quantization-Aware Training for Large Language Models | Jul 10, 2024 | GPU, Quantization | Code Available | 3 |
| Arctic Long Sequence Training: Scalable And Efficient Training For Multi-Million Token Sequences | Jun 16, 2025 | Document Summarization, GPU | Code Available | 3 |
| Machine Learning in Python: Main developments and technology trends in data science, machine learning, and artificial intelligence | Feb 12, 2020 | BIG-bench Machine Learning, GPU | Code Available | 3 |
| Efficient and Generalizable Speaker Diarization via Structured Pruning of Self-Supervised Models | Jun 23, 2025 | Domain Adaptation, GPU | Code Available | 3 |
| LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via a Hybrid Architecture | Sep 4, 2024 | GPU, Mamba | Code Available | 3 |
| LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale | Aug 10, 2024 | GPU, Language Modelling | Code Available | 3 |
| APOLLO: SGD-like Memory, AdamW-level Performance | Dec 6, 2024 | GPU, Quantization | Code Available | 3 |