| DεpS: Delayed ε-Shrinking for Faster Once-For-All Training | Jul 8, 2024 | All, GPU | Unverified | 0 |
| HPFF: Hierarchical Locally Supervised Learning with Patch Feature Fusion | Jul 8, 2024 | GPU | Code Available | 1 |
| Momentum Auxiliary Network for Supervised Local Learning | Jul 8, 2024 | GPU, image-classification | Code Available | 1 |
| Fast On-device LLM Inference with NPUs | Jul 8, 2024 | CPU, GPU | Code Available | 5 |
| P2P: Part-to-Part Motion Cues Guide a Strong Tracking Framework for LiDAR Point Clouds | Jul 7, 2024 | 3D Single Object Tracking, GPU | Code Available | 2 |
| Accelerating MRI Uncertainty Estimation with Mask-based Bayesian Neural Network | Jul 7, 2024 | CPU, Diagnostic | Unverified | 0 |
| The Solution for the AIGC Inference Performance Optimization Competition | Jul 6, 2024 | Computational Efficiency, GPU | Unverified | 0 |
| SurgicalGaussian: Deformable 3D Gaussians for High-Fidelity Surgical Scene Reconstruction | Jul 6, 2024 | Dynamic Reconstruction, GPU | Code Available | 1 |
| PatchEX: High-Quality Real-Time Temporal Supersampling through Patch-based Parallel Extrapolation | Jul 5, 2024 | GPU | Unverified | 0 |
| Lazarus: Resilient and Elastic Training of Mixture-of-Experts Models with Adaptive Expert Placement | Jul 5, 2024 | GPU, Mixture-of-Experts | Unverified | 0 |
| Re-Tuning: Overcoming the Compositionality Limits of Large Language Models with Recursive Tuning | Jul 5, 2024 | GPU | Code Available | 0 |
| LoCo: Low-Bit Communication Adaptor for Large-scale Model Training | Jul 5, 2024 | GPU | Code Available | 0 |
| Every Pixel Has its Moments: Ultra-High-Resolution Unpaired Image-to-Image Translation via Dense Normalization | Jul 5, 2024 | GPU, Image-to-Image Translation | Code Available | 1 |
| GOALPlace: Begin with the End in Mind | Jul 5, 2024 | GPU | Unverified | 0 |
| Autoverse: An Evolvable Game Language for Learning Robust Embodied Agents | Jul 5, 2024 | GPU, Imitation Learning | Unverified | 0 |
| Learning Scalable Model Soup on a Single GPU: An Efficient Subspace Training Strategy | Jul 4, 2024 | GPU | Code Available | 0 |
| Green Multigrid Network | Jul 4, 2024 | GPU, Operator learning | Unverified | 0 |
| Benchmarking End-To-End Performance of AI-Based Chip Placement Algorithms | Jul 3, 2024 | Benchmarking, CPU | Unverified | 0 |
| M5: A Whole Genome Bacterial Encoder at Single Nucleotide Resolution | Jul 3, 2024 | GPU | Unverified | 0 |
| Achieving High Throughput with a Trainable Neural-Network-Based Equalizer for Communications on FPGA | Jul 3, 2024 | GPU | Unverified | 0 |
| Implementation and Analysis of GPU Algorithms for Vecchia Approximation | Jul 3, 2024 | Gaussian Processes, GPU | Code Available | 0 |
| LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control | Jul 3, 2024 | Computational Efficiency, Face Reenactment | Code Available | 11 |
| QSync: Quantization-Minimized Synchronous Distributed Training Across Hybrid Devices | Jul 2, 2024 | GPU, Quantization | Code Available | 1 |
| HRSAM: Efficient Interactive Segmentation in High-Resolution Images | Jul 2, 2024 | Data Augmentation, GPU | Code Available | 1 |
| SparseSSP: 3D Subcellular Structure Prediction from Sparse-View Transmitted Light Images | Jul 2, 2024 | GPU | Code Available | 0 |
| Automated Text Scoring in the Age of Generative AI for the GPU-poor | Jul 2, 2024 | GPU | Unverified | 0 |
| MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention | Jul 2, 2024 | GPU, Language Modelling | Code Available | 9 |
| PQCache: Product Quantization-based KVCache for Long Context LLM Inference | Jul 1, 2024 | GPU, Quantization | Unverified | 0 |
| Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs | Jul 1, 2024 | GPU, Mixture-of-Experts | Code Available | 1 |
| fVDB: A Deep-Learning Framework for Sparse, Large-Scale, and High-Performance Spatial Intelligence | Jul 1, 2024 | GPU, Point cloud reconstruction | Code Available | 4 |
| Needle in the Haystack for Memory Based Large Language Models | Jul 1, 2024 | Decoder, GPU | Unverified | 0 |
| Badllama 3: removing safety finetuning from Llama 3 in minutes | Jul 1, 2024 | GPU | Unverified | 0 |
| SpectralKAN: Kolmogorov-Arnold Network for Hyperspectral Images Change Detection | Jul 1, 2024 | Change Detection, Computational Efficiency | Code Available | 0 |
| M^2IST: Multi-Modal Interactive Side-Tuning for Efficient Referring Expression Comprehension | Jul 1, 2024 | GPU, Referring Expression | Unverified | 0 |
| Parm: Efficient Training of Large Sparsely-Activated Models with Dedicated Schedules | Jun 30, 2024 | GPU | Code Available | 0 |
| Hierarchical Memory for Long Video QA | Jun 30, 2024 | GPU, Question Answering | Unverified | 0 |
| LASSI: An LLM-based Automated Self-Correcting Pipeline for Translating Parallel Scientific Codes | Jun 30, 2024 | GPU | Unverified | 0 |
| Explore as a Storm, Exploit as a Raindrop: On the Benefit of Fine-Tuning Kernel Schedulers with Coordinate Descent | Jun 28, 2024 | GPU, Scheduling | Code Available | 0 |
| LLMEasyQuant: Scalable Quantization for Parallel and Distributed LLM Inference | Jun 28, 2024 | GPU, Quantization | Code Available | 1 |
| Meta Large Language Model Compiler: Foundation Models of Compiler Optimization | Jun 27, 2024 | Compiler Optimization, GPU | Unverified | 0 |
| Graph Neural Network as Computationally Efficient Emulator of Ice-sheet and Sea-level System Model (ISSM) | Jun 26, 2024 | CPU, GPU | Unverified | 0 |
| Real-time Structure Flow | Jun 26, 2024 | Autonomous Vehicles, GPU | Unverified | 0 |
| DICE: End-to-end Deformation Capture of Hand-Face Interactions from a Single Image | Jun 26, 2024 | GPU | Unverified | 0 |
| SEED: Accelerating Reasoning Tree Construction via Scheduled Speculative Decoding | Jun 26, 2024 | GPU, Management | Code Available | 1 |
| MUMU: Bootstrapping Multimodal Image Generation from Text-to-Image Data | Jun 26, 2024 | Decoder, GPU | Unverified | 0 |
| ConStyle v2: A Strong Prompter for All-in-One Image Restoration | Jun 26, 2024 | All, GPU | Code Available | 1 |
| On Scaling Up 3D Gaussian Splatting Training | Jun 26, 2024 | 3DGS, 3D Reconstruction | Code Available | 4 |
| Grass: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients | Jun 25, 2024 | GPU, Language Modeling | Code Available | 1 |
| BlockLLM: Memory-Efficient Adaptation of LLMs by Selecting and Optimizing the Right Coordinate Blocks | Jun 25, 2024 | GPU | Code Available | 0 |
| Implicit-Zoo: A Large-Scale Dataset of Neural Implicit Functions for 2D Images and 3D Scenes | Jun 25, 2024 | GPU, image-classification | Code Available | 1 |