| Towards providing reliable job completion time predictions using PCS | Jan 18, 2024 | Fairness, GPU | Code Available | 0 |
| PIN-SLAM: LiDAR SLAM Using a Point-Based Implicit Neural Representation for Achieving Global Map Consistency | Jan 17, 2024 | GPU, Incremental Learning | Code Available | 4 |
| Dynamic DNNs and Runtime Management for Efficient Inference on Mobile/Embedded Devices | Jan 17, 2024 | Dynamic neural networks, GPU | Code Available | 1 |
| Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model | Jan 17, 2024 | GPU, Image Classification | Code Available | 2 |
| LoMA: Lossless Compressed Memory Attention | Jan 16, 2024 | GPU | Unverified | 0 |
| Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference | Jan 16, 2024 | GPU, Mixture-of-Experts | Code Available | 1 |
| Inferflow: an Efficient and Highly Configurable Inference Engine for Large Language Models | Jan 16, 2024 | GPU, Quantization | Code Available | 3 |
| TP-Aware Dequantization | Jan 15, 2024 | GPU, Quantization | Unverified | 0 |
| Efficient approximation of Earth Mover's Distance Based on Nearest Neighbor Search | Jan 14, 2024 | GPU, image-classification | Code Available | 0 |
| Beyond Traditional Approaches: Multi-Task Network for Breast Ultrasound Diagnosis | Jan 14, 2024 | Anomaly Classification, Cancer Classification | Code Available | 0 |