| An approach to optimize inference of the DIART speaker diarization pipeline | Aug 5, 2024 | Inference Optimization, Knowledge Distillation | Unverified | 0 |
| LLaSA: Large Language and E-Commerce Shopping Assistant | Aug 4, 2024 | Inference Optimization, Specificity | Code Available | 0 |
| Patched MOA: optimizing inference for diverse software development tasks | Jul 26, 2024 | Inference Optimization | Code Available | 0 |
| Inference Optimization of Foundation Models on AI Accelerators | Jul 12, 2024 | Inference Optimization, Model Compression | Unverified | 0 |
| Inference Performance Optimization for Large Language Models on CPUs | Jul 10, 2024 | CPU, GPU | Code Available | 3 |
| Robust Zero-Shot Text-to-Speech Synthesis with Reverse Inference Optimization | Jul 2, 2024 | Inference Optimization, Speech Synthesis | Unverified | 0 |
| Scaling the Vocabulary of Non-autoregressive Models for Efficient Generative Retrieval | Jun 10, 2024 | Inference Optimization, Information Retrieval | Unverified | 0 |
| Efficiency optimization of large-scale language models based on deep learning in natural language processing tasks | May 20, 2024 | Inference Optimization, Knowledge Distillation | Unverified | 0 |
| Advances and Open Challenges in Federated Foundation Models | Apr 23, 2024 | Computational Efficiency, Federated Learning | Unverified | 0 |
| Federated Learning While Providing Model as a Service: Joint Training and Inference Optimization | Dec 20, 2023 | Federated Learning, Inference Optimization | Unverified | 0 |