| I/O Lower Bounds for Auto-tuning of Convolutions in CNNs | Dec 31, 2020 | GPU | —Unverified | 0 |
| IRLI: Iterative Re-partitioning for Learning to Index | Mar 17, 2021 | GPU, Information Retrieval | —Unverified | 0 |
| Irrational Complex Rotations Empower Low-bit Optimizers | Jan 22, 2025 | GPU, Quantization | —Unverified | 0 |
| Is Architectural Complexity Overrated? Competitive and Interpretable Knowledge Graph Completion with RelatE | May 25, 2025 | GPU, Knowledge Graph Completion | —Unverified | 0 |
| iServe: An Intent-based Serving System for LLMs | Jan 8, 2025 | GPU | —Unverified | 0 |
| ISO: Overlap of Computation and Communication within Seqenence For LLM Inference | Sep 4, 2024 | GPU, Language Modeling | —Unverified | 0 |
| Isotonic Data Augmentation for Knowledge Distillation | Jul 3, 2021 | Attribute, Data Augmentation | —Unverified | 0 |
| Is the GPU Half-Empty or Half-Full? Practical Scheduling Techniques for LLMs | Oct 23, 2024 | GPU, Scheduling | —Unverified | 0 |
| It's always personal: Using Early Exits for Efficient On-Device CNN Personalisation | Feb 2, 2021 | GPU, Model Compression | —Unverified | 0 |
| JamMa: Ultra-lightweight Local Feature Matching with Joint Mamba | Mar 5, 2025 | GPU, Mamba | —Unverified | 0 |