| Grass: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients | Jun 25, 2024 | GPU, Language Modeling | Code Available | 1 |
| Mooncake: A KVCache-centric Disaggregated Architecture for LLM Serving | Jun 24, 2024 | CPU, GPU | Code Available | 7 |
| GraphPipe: Improving Performance and Scalability of DNN Training with Graph Pipeline Parallelism | Jun 24, 2024 | GPU | Unverified | 0 |
| MLAAN: Scaling Supervised Local Learning with Multilaminar Leap Augmented Auxiliary Network | Jun 24, 2024 | GPU | Code Available | 0 |
| Video-Infinity: Distributed Long Video Generation | Jun 24, 2024 | GPU, Video Generation | Unverified | 0 |
| Hardware-Aware Neural Dropout Search for Reliable Uncertainty Prediction on FPGA | Jun 23, 2024 | Decision Making, GPU | Code Available | 0 |
| LaneSegNet Design Study | Jun 22, 2024 | Autonomous Vehicles, Decoder | Unverified | 0 |
| GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation | Jun 21, 2024 | 3D Generation, GPU | Code Available | 2 |
| MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression | Jun 21, 2024 | GPU, Language Modeling | Code Available | 2 |
| Enhancing Dropout-based Bayesian Neural Networks with Multi-Exit on FPGA | Jun 20, 2024 | Autonomous Driving, CPU | Code Available | 1 |
| ExVideo: Extending Video Diffusion Models via Parameter-Efficient Post-Tuning | Jun 20, 2024 | GPU, Video Generation | Code Available | 0 |
| Consistency Models Made Easy | Jun 20, 2024 | Computational Efficiency, GPU | Code Available | 3 |
| UpDLRM: Accelerating Personalized Recommendation using Real-World PIM Architecture | Jun 20, 2024 | CPU, GPU | Unverified | 0 |
| CE-SSL: Computation-Efficient Semi-Supervised Learning for ECG-based Cardiovascular Diseases Detection | Jun 20, 2024 | Computational Efficiency, Electrocardiography (ECG) | Code Available | 1 |
| GPU-Accelerated DCOPF using Gradient-Based Optimization | Jun 19, 2024 | CPU, GPU | Code Available | 0 |
| VisualRWKV: Exploring Recurrent Neural Networks for Visual Language Models | Jun 19, 2024 | GPU, Language Modeling | Code Available | 3 |
| Sparse High Rank Adapters | Jun 19, 2024 | CPU, GPU | Unverified | 0 |
| Under the Hood of Tabular Data Generation Models: Benchmarks with Extensive Tuning | Jun 18, 2024 | GPU, Hyperparameter Optimization | Unverified | 0 |
| LaMDA: Large Model Fine-Tuning via Spectrally Decomposed Low-Dimensional Adaptation | Jun 18, 2024 | GPU, Natural Language Understanding | Code Available | 1 |
| MCSD: An Efficient Language Model with Diverse Fusion | Jun 18, 2024 | GPU, Language Modeling | Unverified | 0 |
| Contraction rates for conjugate gradient and Lanczos approximate posteriors in Gaussian process regression | Jun 18, 2024 | GPU | Unverified | 0 |
| Compress then Serve: Serving Thousands of LoRA Adapters with Little Overhead | Jun 17, 2024 | GPU, Model Compression | Unverified | 0 |
| Multispectral Snapshot Image Registration Using Learned Cross Spectral Disparity Estimation and a Deep Guided Occlusion Reconstruction Network | Jun 17, 2024 | CPU, Data Augmentation | Code Available | 0 |
| Endor: Hardware-Friendly Sparse Format for Offloaded LLM Inference | Jun 17, 2024 | CPU, GPU | Unverified | 0 |
| VideoLLM-online: Online Video Large Language Model for Streaming Video | Jun 17, 2024 | GPU, Language Modeling | Unverified | 0 |