| A Comprehensive Summarization and Evaluation of Feature Refinement Modules for CTR Prediction | Nov 8, 2023 | Benchmarking, Click-Through Rate Prediction | Code Available | 0 | 5 |
| CoDiCast: Conditional Diffusion Model for Global Weather Prediction with Uncertainty Quantification | Sep 9, 2024 | Computational Efficiency, Denoising | Code Available | 0 | 5 |
| Factored Latent-Dynamic Conditional Random Fields for Single and Multi-label Sequence Modeling | Nov 9, 2019 | GPU, Model Selection | Code Available | 0 | 5 |
| Efficient approximation of Earth Mover's Distance Based on Nearest Neighbor Search | Jan 14, 2024 | GPU, image-classification | Code Available | 0 | 5 |
| MIOpen: An Open Source Library For Deep Learning Primitives | Sep 30, 2019 | Deep Learning, GPU | Code Available | 0 | 5 |
| Efficient and Robust Parallel DNN Training through Model Parallelism on Multi-GPU Platform | Sep 8, 2018 | GPU | Code Available | 0 | 5 |
| Efficient and generalizable nested Fourier-DeepONet for three-dimensional geological carbon sequestration | Sep 25, 2024 | GPU | Code Available | 0 | 5 |
| Mind the Memory Gap: Unveiling GPU Bottlenecks in Large-Batch LLM Inference | Mar 11, 2025 | GPU | Code Available | 0 | 5 |
| Efficient and Accurate Optimal Transport with Mirror Descent and Conjugate Gradients | Jul 17, 2023 | Benchmarking, GPU | Code Available | 0 | 5 |
| An Analysis of Neural Language Modeling at Multiple Scales | Mar 22, 2018 | GPU, Language Modeling | Code Available | 0 | 5 |
| M2-Encoder: Advancing Bilingual Image-Text Understanding by Large-scale Efficient Pretraining | Jan 29, 2024 | GPU, zero-shot-classification | Code Available | 0 | 5 |
| BMXNet: An Open-Source Binary Neural Network Implementation Based on MXNet | May 27, 2017 | CPU, GPU | Code Available | 0 | 5 |
| A Comprehensive Evaluation of Parameter-Efficient Fine-Tuning on Software Engineering Tasks | Dec 25, 2023 | GPU, parameter-efficient fine-tuning | Code Available | 0 | 5 |
| MG-WFBP: Merging Gradients Wisely for Efficient Communication in Distributed Deep Learning | Dec 18, 2019 | GPU, Playing the Game of 2048 | Code Available | 0 | 5 |
| MG-GCN: Scalable Multi-GPU GCN Training Framework | Oct 17, 2021 | GPU | Code Available | 0 | 5 |
| FastFace: Fast-converging Scheduler for Large-scale Face Recognition Training with One GPU | Apr 17, 2024 | Face Recognition, GPU | Code Available | 0 | 5 |
| Edge-Guided Occlusion Fading Reduction for a Light-Weighted Self-Supervised Monocular Depth Estimation | Nov 26, 2019 | Depth Estimation, GPU | Code Available | 0 | 5 |
| BlockSwap: Fisher-guided Block Substitution for Network Compression on a Budget | Jun 10, 2019 | GPU | Code Available | 0 | 5 |
| METER: a mobile vision transformer architecture for monocular depth estimation | Mar 13, 2024 | CPU, Data Augmentation | Code Available | 0 | 5 |
| BlockQNN: Efficient Block-wise Neural Network Architecture Generation | Aug 16, 2018 | GPU, image-classification | Code Available | 0 | 5 |
| PIM-Opt: Demystifying Distributed Optimization Algorithms on a Real-World Processing-In-Memory System | Apr 10, 2024 | CPU, Distributed Optimization | Code Available | 0 | 5 |
| Message Scheduling for Performant, Many-Core Belief Propagation | Sep 24, 2019 | GPU, Protein Folding | Code Available | 0 | 5 |
| BlockLLM: Memory-Efficient Adaptation of LLMs by Selecting and Optimizing the Right Coordinate Blocks | Jun 25, 2024 | GPU | Code Available | 0 | 5 |
| Meta Networks for Neural Style Transfer | Sep 13, 2017 | GPU, Style Transfer | Code Available | 0 | 5 |
| Memory-efficient Segmentation of High-resolution Volumetric MicroCT Images | May 31, 2022 | GPU, Image Segmentation | Code Available | 0 | 5 |