| INT-FlashAttention: Enabling Flash Attention for INT8 Quantization | Sep 25, 2024 | GPU, Quantization | Code Available | 2 |
| Occupancy-Based Dual Contouring | Sep 20, 2024 | 3D Reconstruction, GPU | Code Available | 2 |
| Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization | Sep 19, 2024 | GPU, Language Modeling | Code Available | 2 |
| RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval | Sep 16, 2024 | CPU, GPU | Code Available | 2 |
| Super Monotonic Alignment Search | Sep 12, 2024 | CPU, GPU | Code Available | 2 |
| Enhancing Privacy in Federated Learning: Secure Aggregation for Real-World Healthcare Applications | Sep 2, 2024 | CPU, Federated Learning | Code Available | 2 |
| Follow-Your-Canvas: Higher-Resolution Video Outpainting with Extensive Content Generation | Sep 2, 2024 | GPU | Code Available | 2 |
| LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models | Aug 31, 2024 | 8k, GPU | Code Available | 2 |
| MemLong: Memory-Augmented Retrieval for Long Text Modeling | Aug 30, 2024 | 4k, Decoder | Code Available | 2 |
| deepmriprep: Voxel-based Morphometry (VBM) Preprocessing via Deep Neural Networks | Aug 20, 2024 | GPU, Image Registration | Code Available | 2 |