| Automated Text Scoring in the Age of Generative AI for the GPU-poor | Jul 2, 2024 | GPU | Unverified | 0 |
| MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention | Jul 2, 2024 | GPU, Language Modelling | Code Available | 9 |
| PQCache: Product Quantization-based KVCache for Long Context LLM Inference | Jul 1, 2024 | GPU, Quantization | Unverified | 0 |
| Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs | Jul 1, 2024 | GPU, Mixture-of-Experts | Code Available | 1 |
| fVDB: A Deep-Learning Framework for Sparse, Large-Scale, and High-Performance Spatial Intelligence | Jul 1, 2024 | GPU, Point cloud reconstruction | Code Available | 4 |
| Needle in the Haystack for Memory Based Large Language Models | Jul 1, 2024 | Decoder, GPU | Unverified | 0 |
| Badllama 3: removing safety finetuning from Llama 3 in minutes | Jul 1, 2024 | GPU | Unverified | 0 |
| SpectralKAN: Kolmogorov-Arnold Network for Hyperspectral Images Change Detection | Jul 1, 2024 | Change Detection, Computational Efficiency | Code Available | 0 |
| M^2IST: Multi-Modal Interactive Side-Tuning for Efficient Referring Expression Comprehension | Jul 1, 2024 | GPU, Referring Expression | Unverified | 0 |
| Parm: Efficient Training of Large Sparsely-Activated Models with Dedicated Schedules | Jun 30, 2024 | GPU | Code Available | 0 |
| Hierarchical Memory for Long Video QA | Jun 30, 2024 | GPU, Question Answering | Unverified | 0 |
| LASSI: An LLM-based Automated Self-Correcting Pipeline for Translating Parallel Scientific Codes | Jun 30, 2024 | GPU | Unverified | 0 |
| Explore as a Storm, Exploit as a Raindrop: On the Benefit of Fine-Tuning Kernel Schedulers with Coordinate Descent | Jun 28, 2024 | GPU, Scheduling | Code Available | 0 |
| LLMEasyQuant: Scalable Quantization for Parallel and Distributed LLM Inference | Jun 28, 2024 | GPU, Quantization | Code Available | 1 |
| Meta Large Language Model Compiler: Foundation Models of Compiler Optimization | Jun 27, 2024 | Compiler Optimization, GPU | Unverified | 0 |
| Graph Neural Network as Computationally Efficient Emulator of Ice-sheet and Sea-level System Model (ISSM) | Jun 26, 2024 | CPU, GPU | Unverified | 0 |
| Real-time Structure Flow | Jun 26, 2024 | Autonomous Vehicles, GPU | Unverified | 0 |
| DICE: End-to-end Deformation Capture of Hand-Face Interactions from a Single Image | Jun 26, 2024 | GPU | Unverified | 0 |
| SEED: Accelerating Reasoning Tree Construction via Scheduled Speculative Decoding | Jun 26, 2024 | GPU, Management | Code Available | 1 |
| MUMU: Bootstrapping Multimodal Image Generation from Text-to-Image Data | Jun 26, 2024 | Decoder, GPU | Unverified | 0 |
| ConStyle v2: A Strong Prompter for All-in-One Image Restoration | Jun 26, 2024 | All, GPU | Code Available | 1 |
| On Scaling Up 3D Gaussian Splatting Training | Jun 26, 2024 | 3DGS, 3D Reconstruction | Code Available | 4 |
| Grass: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients | Jun 25, 2024 | GPU, Language Modeling | Code Available | 1 |
| BlockLLM: Memory-Efficient Adaptation of LLMs by Selecting and Optimizing the Right Coordinate Blocks | Jun 25, 2024 | GPU | Code Available | 0 |
| Implicit-Zoo: A Large-Scale Dataset of Neural Implicit Functions for 2D Images and 3D Scenes | Jun 25, 2024 | GPU, image-classification | Code Available | 1 |