| HRDA: Context-Aware High-Resolution Domain-Adaptive Semantic Segmentation | Apr 27, 2022 | Domain Adaptation, GPU | Code Available | 2 | 5 |
| HLSTransform: Energy-Efficient Llama 2 Inference on FPGAs Via High Level Synthesis | Apr 29, 2024 | CPU, Edge-computing | Code Available | 2 | 5 |
| I-BERT: Integer-only BERT Quantization | Jan 5, 2021 | GPU, Natural Language Inference | Code Available | 2 | 5 |
| HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection | Feb 2, 2022 | Audio Classification, Event Detection | Code Available | 2 | 5 |
| ALBERT: A Lite BERT for Self-supervised Learning of Language Representations | Sep 26, 2019 | Common Sense Reasoning, GPU | Code Available | 2 | 5 |
| Atom: Low-bit Quantization for Efficient and Accurate LLM Serving | Oct 29, 2023 | GPU, Quantization | Code Available | 2 | 5 |
| HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis | Oct 12, 2020 | CPU, GPU | Code Available | 2 | 5 |
| InPars Toolkit: A Unified and Reproducible Synthetic Data Generation Pipeline for Neural Information Retrieval | Jul 10, 2023 | GPU, Information Retrieval | Code Available | 2 | 5 |
| INT-FlashAttention: Enabling Flash Attention for INT8 Quantization | Sep 25, 2024 | GPU, Quantization | Code Available | 2 | 5 |
| HeadInfer: Memory-Efficient LLM Inference by Head-wise Offloading | Feb 18, 2025 | Computational Efficiency, CPU | Code Available | 2 | 5 |