| FastFace: Fast-converging Scheduler for Large-scale Face Recognition Training with One GPU | Apr 17, 2024 | Face Recognition, GPU | Code Available | 0 |
| KVPR: Efficient LLM Inference with I/O-Aware KV Cache Partial Recomputation | Nov 26, 2024 | CPU, GPU | Code Available | 0 |
| MG-WFBP: Merging Gradients Wisely for Efficient Communication in Distributed Deep Learning | Dec 18, 2019 | GPU, Playing the Game of 2048 | Code Available | 0 |
| Shavette: Low Power Neural Network Acceleration via Algorithm-level Error Detection and Undervolting | Oct 17, 2024 | GPU | Code Available | 0 |
| MG-GCN: Scalable Multi-GPU GCN Training Framework | Oct 17, 2021 | GPU | Code Available | 0 |
| METER: a mobile vision transformer architecture for monocular depth estimation | Mar 13, 2024 | CPU, Data Augmentation | Code Available | 0 |
| cito: An R package for training neural networks using torch | Mar 16, 2023 | CPU, Explainable artificial intelligence | Code Available | 0 |
| Meta Networks for Neural Style Transfer | Sep 13, 2017 | GPU, Style Transfer | Code Available | 0 |
| Message Scheduling for Performant, Many-Core Belief Propagation | Sep 24, 2019 | GPU, Protein Folding | Code Available | 0 |
| Posterior-Guided Neural Architecture Search | Jun 23, 2019 | GPU, image-classification | Code Available | 0 |
| On Exact Computation with an Infinitely Wide Neural Net | Apr 26, 2019 | Gaussian Processes, GPU | Code Available | 0 |
| Memory-efficient Segmentation of High-resolution Volumetric MicroCT Images | May 31, 2022 | GPU, Image Segmentation | Code Available | 0 |
| Advancing Video Self-Supervised Learning via Image Foundation Models | May 25, 2025 | GPU, Representation Learning | Code Available | 0 |
| Characterizing and Modeling Distributed Training with Transient Cloud GPU Servers | Apr 7, 2020 | GPU | Code Available | 0 |
| A Runtime-Adaptive Transformer Neural Network Accelerator on FPGAs | Nov 27, 2024 | Computational Efficiency, CPU | Code Available | 0 |
| Online Energy Optimization in GPUs: A Multi-Armed Bandit Approach | Oct 3, 2024 | energy management, GPU | Code Available | 0 |
| A Distributed Synchronous SGD Algorithm with Global Top-k Sparsification for Low Bandwidth Networks | Jan 14, 2019 | GPU | Code Available | 0 |
| Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM | Apr 9, 2021 | GPU, Language Modeling | Code Available | 0 |
| Online Tensor Methods for Learning Latent Variable Models | Sep 3, 2013 | Articles, Community Detection | Code Available | 0 |
| Memory-Efficient Implementation of DenseNets | Jul 21, 2017 | GPU | Code Available | 0 |
| Efficient Large-scale Approximate Nearest Neighbor Search on the GPU | Feb 20, 2017 | CPU, GPU | Code Available | 0 |
| Megapixel Image Generation with Step-Unrolled Denoising Autoencoders | Jun 24, 2022 | Denoising, GPU | Code Available | 0 |
| Measuring the Energy Consumption and Efficiency of Deep Neural Networks: An Empirical Analysis and Design Recommendations | Mar 13, 2024 | CPU, GPU | Code Available | 0 |
| maxDNN: An Efficient Convolution Kernel for Deep Learning with Maxwell GPUs | Jan 27, 2015 | Computational Efficiency, Deep Learning | Code Available | 0 |
| SHTOcc: Effective 3D Occupancy Prediction with Sparse Head and Tail Voxels | May 28, 2025 | Autonomous Driving, GPU | Code Available | 0 |