| Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction | Sep 25, 2024 | GPUToken Reduction | CodeCode Available | 2 |
| Occupancy-Based Dual Contouring | Sep 20, 2024 | 3D ReconstructionGPU | CodeCode Available | 2 |
| Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization | Sep 19, 2024 | GPULanguage Modeling | CodeCode Available | 2 |
| RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval | Sep 16, 2024 | CPUGPU | CodeCode Available | 2 |
| Super Monotonic Alignment Search | Sep 12, 2024 | CPUGPU | CodeCode Available | 2 |
| Follow-Your-Canvas: Higher-Resolution Video Outpainting with Extensive Content Generation | Sep 2, 2024 | GPU | CodeCode Available | 2 |
| Enhancing Privacy in Federated Learning: Secure Aggregation for Real-World Healthcare Applications | Sep 2, 2024 | CPUFederated Learning | CodeCode Available | 2 |
| LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models | Aug 31, 2024 | 8kGPU | CodeCode Available | 2 |
| MemLong: Memory-Augmented Retrieval for Long Text Modeling | Aug 30, 2024 | 4kDecoder | CodeCode Available | 2 |
| deepmriprep: Voxel-based Morphometry (VBM) Preprocessing via Deep Neural Networks | Aug 20, 2024 | GPUImage Registration | CodeCode Available | 2 |
| Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters | Aug 7, 2024 | GPU | CodeCode Available | 2 |
| Palu: Compressing KV-Cache with Low-Rank Projection | Jul 30, 2024 | GPUQuantization | CodeCode Available | 2 |
| HybridDepth: Robust Metric Depth Fusion by Leveraging Depth from Focus and Single-Image Priors | Jul 26, 2024 | Depth EstimationGPU | CodeCode Available | 2 |
| ESOD: Efficient Small Object Detection on High-Resolution Images | Jul 23, 2024 | GPUObject | CodeCode Available | 2 |
| Forecasting GPU Performance for Deep Learning Training and Inference | Jul 18, 2024 | Deep LearningGPU | CodeCode Available | 2 |
| Spectra: Surprising Effectiveness of Pretraining Ternary Language Models at Scale | Jul 17, 2024 | GPULAMBADA | CodeCode Available | 2 |
| From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients | Jul 15, 2024 | GPU | CodeCode Available | 2 |
| Differentiable Voxelization and Mesh Morphing | Jul 15, 2024 | GPU | CodeCode Available | 2 |
| Gradient Boosting Reinforcement Learning | Jul 11, 2024 | GPUreinforcement-learning | CodeCode Available | 2 |
| MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis | Jul 10, 2024 | GPUImage Generation | CodeCode Available | 2 |
| P2P: Part-to-Part Motion Cues Guide a Strong Tracking Framework for LiDAR Point Clouds | Jul 7, 2024 | 3D Single Object TrackingGPU | CodeCode Available | 2 |
| MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression | Jun 21, 2024 | GPULanguage Modeling | CodeCode Available | 2 |
| GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation | Jun 21, 2024 | 3D GenerationGPU | CodeCode Available | 2 |
| Duoduo CLIP: Efficient 3D Understanding with Multi-View Images | Jun 17, 2024 | GPUObject | CodeCode Available | 2 |
| GradeADreamer: Enhanced Text-to-3D Generation Using Gaussian Splatting and Multi-View Diffusion | Jun 14, 2024 | 3D GenerationGPU | CodeCode Available | 2 |