| QiMeng-Xpiler: Transcompiling Tensor Programs for Deep Learning Systems with a Neural-Symbolic Approach | May 4, 2025 | Code Generation, GPU | —Unverified | 0 |
| QORT-Former: Query-optimized Real-time Transformer for Understanding Two Hands Manipulating Objects | Feb 27, 2025 | 3D Pose Estimation, Action Recognition | —Unverified | 0 |
| QoS-Efficient Serving of Multiple Mixture-of-Expert LLMs Using Partial Runtime Reconfiguration | May 10, 2025 | GPU, Mixture-of-Experts | —Unverified | 0 |
| QuadraNet V2: Efficient and Sustainable Training of High-Order Neural Networks with Quadratic Adaptation | May 6, 2024 | GPU | —Unverified | 0 |
| QuAILoRA: Quantization-Aware Initialization for LoRA | Oct 9, 2024 | Causal Language Modeling, GPU | —Unverified | 0 |
| Qualities, challenges and future of genetic algorithms: a literature review | Nov 5, 2020 | Artificial Life, GPU | —Unverified | 0 |
| QuantEase: Optimization-based Quantization for Language Models | Sep 5, 2023 | GPU, Quantization | —Unverified | 0 |
| Quantization-Aware Imitation-Learning for Resource-Efficient Robotic Control | Dec 2, 2024 | Autonomous Driving, Decision Making | —Unverified | 0 |
| Quantized Neural Network Inference with Precision Batching | Feb 26, 2020 | GPU, Language Modeling | —Unverified | 0 |
| QuantSpec: Self-Speculative Decoding with Hierarchical Quantized KV Cache | Feb 5, 2025 | GPU | —Unverified | 0 |
| 10,000 km Straight-line Transmission using a Real-time Software-defined GPU-Based Receiver | Apr 8, 2021 | GPU, Video Prediction | —Unverified | 0 |
| Quantum-Enhanced Support Vector Machine for Large-Scale Stellar Classification with GPU Acceleration | Nov 21, 2023 | Classification, Computational Efficiency | —Unverified | 0 |
| Quantum-inspired tensor network for Earth science | Jan 15, 2023 | GPU, Quantum Machine Learning | —Unverified | 0 |
| Quantum-Powered Personalized Learning | Aug 25, 2024 | Computational Efficiency, GPU | —Unverified | 0 |
| Quantum Walks-Based Adaptive Distribution Generation with Efficient CUDA-Q Acceleration | Apr 18, 2025 | GPU | —Unverified | 0 |
| Query-focused Sentence Compression in Linear Time | Apr 19, 2019 | GPU, Sentence | —Unverified | 0 |
| Query-focused Sentence Compression in Linear Time | Nov 1, 2019 | GPU, Sentence | —Unverified | 0 |
| Query Processing on Tensor Computation Runtimes | Mar 3, 2022 | CPU, GPU | —Unverified | 0 |
| Queueing Analysis of GPU-Based Inference Servers with Dynamic Batching: A Closed-Form Characterization | Dec 13, 2019 | Computational Efficiency, Form | —Unverified | 0 |
| RADARS: Memory Efficient Reinforcement Learning Aided Differentiable Neural Architecture Search | Sep 13, 2021 | GPU, Neural Architecture Search | —Unverified | 0 |
| RAGCache: Efficient Knowledge Caching for Retrieval-Augmented Generation | Apr 18, 2024 | GPU, RAG | —Unverified | 0 |
| RAIN: Real-time Animation of Infinite Video Stream | Dec 27, 2024 | Denoising, GPU | —Unverified | 0 |
| Ramanujan Bipartite Graph Products for Efficient Block Sparse Neural Networks | Jun 24, 2020 | GPU, image-classification | —Unverified | 0 |
| Random 2.5D U-net for Fully 3D Segmentation | Oct 23, 2019 | GPU, Segmentation | —Unverified | 0 |
| Random Offset Block Embedding Array (ROBE) for CriteoTB Benchmark MLPerf DLRM Model: 1000× Compression and 3.1× Faster Inference | Aug 4, 2021 | GPU, Model Compression | —Unverified | 0 |