| Atom: Low-bit Quantization for Efficient and Accurate LLM Serving | Oct 29, 2023 | GPUQuantization | CodeCode Available | 2 |
| FP8-LM: Training FP8 Large Language Models | Oct 27, 2023 | GPU | CodeCode Available | 2 |
| QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models | Oct 25, 2023 | GPUMixture-of-Experts | CodeCode Available | 2 |
| ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models | Oct 16, 2023 | General Reinforcement LearningGPU | CodeCode Available | 2 |
| LAMP: Learn A Motion Pattern for Few-Shot-Based Video Generation | Oct 16, 2023 | GPUImage Animation | CodeCode Available | 2 |
| Im4D: High-Fidelity and Real-Time Novel View Synthesis for Dynamic Scenes | Oct 12, 2023 | GPUNovel View Synthesis | CodeCode Available | 2 |
| GaussianDreamer: Fast Generation from Text to 3D Gaussians by Bridging 2D and 3D Diffusion Models | Oct 12, 2023 | GPUText to 3D | CodeCode Available | 2 |
| DISTFLASHATTN: Distributed Memory-efficient Attention for Long-context LLMs Training | Oct 5, 2023 | GPU | CodeCode Available | 2 |
| MEM: Multi-Modal Elevation Mapping for Robotics and Learning | Sep 28, 2023 | ColorizationGPU | CodeCode Available | 2 |
| ModuLoRA: Finetuning 2-Bit LLMs on Consumer GPUs by Integrating with Modular Quantizers | Sep 28, 2023 | GPUInstruction Following | CodeCode Available | 2 |