| Multi-Task Reinforcement Learning with Mixture of Orthogonal Experts | Nov 19, 2023 | Diversity, Mixture-of-Experts | Code Available | 1 |
| DAMEX: Dataset-aware Mixture-of-Experts for visual understanding of mixture-of-datasets | Nov 8, 2023 | Mixture-of-Experts, object-detection | Code Available | 1 |
| SiDA-MoE: Sparsity-Inspired Data-Aware Serving for Efficient and Scalable Large Mixture-of-Experts Models | Oct 29, 2023 | GPU, Mixture-of-Experts | Code Available | 1 |
| SteloCoder: a Decoder-Only LLM for Multi-Language to Python Code Translation | Oct 24, 2023 | Code Generation, Code Translation | Code Available | 1 |
| Image Super-resolution Via Latent Diffusion: A Sampling-space Mixture Of Experts And Frequency-augmented Decoder Approach | Oct 18, 2023 | Blind Super-Resolution, Decoder | Code Available | 1 |
| Merging Experts into One: Improving Computational Efficiency of Mixture of Experts | Oct 15, 2023 | Computational Efficiency, Mixture-of-Experts | Code Available | 1 |
| Sparse Universal Transformer | Oct 11, 2023 | Mixture-of-Experts | Code Available | 1 |
| Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy | Oct 2, 2023 | Mixture-of-Experts | Code Available | 1 |
| MoCaE: Mixture of Calibrated Experts Significantly Improves Object Detection | Sep 26, 2023 | Instance Segmentation, Mixture-of-Experts | Code Available | 1 |
| LLMCarbon: Modeling the end-to-end Carbon Footprint of Large Language Models | Sep 25, 2023 | GPU, Mixture-of-Experts | Code Available | 1 |
| Exploring Sparse MoE in GANs for Text-conditioned Image Synthesis | Sep 7, 2023 | Image Generation, Mixture-of-Experts | Code Available | 1 |
| Pre-gated MoE: An Algorithm-System Co-Design for Fast and Scalable Mixture-of-Expert Inference | Aug 23, 2023 | CPU, GPU | Code Available | 1 |
| Enhancing NeRF akin to Enhancing LLMs: Generalizable NeRF Transformer with Mixture-of-View-Experts | Aug 22, 2023 | Mixture-of-Experts, NeRF | Code Available | 1 |
| HyperFormer: Enhancing Entity and Relation Interaction for Hyper-Relational Knowledge Graph Completion | Aug 12, 2023 | Attribute, Knowledge Graph Completion | Code Available | 1 |
| MLP Fusion: Towards Efficient Fine-tuning of Dense and Mixture-of-Experts Language Models | Jul 18, 2023 | Language Modelling, Mixture-of-Experts | Code Available | 1 |
| Deep learning techniques for blind image super-resolution: A high-scale multi-domain perspective evaluation | Jun 15, 2023 | Image Quality Assessment, Image Super-Resolution | Code Available | 1 |
| ShiftAddViT: Mixture of Multiplication Primitives Towards Efficient Vision Transformer | Jun 10, 2023 | Efficient ViTs, Mixture-of-Experts | Code Available | 1 |
| Patch-level Routing in Mixture-of-Experts is Provably Sample-efficient for Convolutional Neural Networks | Jun 7, 2023 | Mixture-of-Experts | Code Available | 1 |
| COMET: Learning Cardinality Constrained Mixture of Experts with Trees and Local Search | Jun 5, 2023 | Language Modeling, Language Modelling | Code Available | 1 |
| Edge-MoE: Memory-Efficient Multi-Task Vision Transformer Architecture with Task-level Sparsity via Mixture-of-Experts | May 30, 2023 | CPU, GPU | Code Available | 1 |
| Emergent Modularity in Pre-trained Transformers | May 28, 2023 | Mixture-of-Experts | Code Available | 1 |
| Lifting the Curse of Capacity Gap in Distilling Language Models | May 20, 2023 | Knowledge Distillation, Mixture-of-Experts | Code Available | 1 |
| Unicorn: A Unified Multi-tasking Model for Supporting Matching Tasks in Data Integration | May 1, 2023 | Data Integration, Entity Resolution | Code Available | 1 |
| Long-Tailed Visual Recognition via Self-Heterogeneous Integration with Knowledge Excavation | Apr 3, 2023 | Mixture-of-Experts, Transfer Learning | Code Available | 1 |
| Re-IQA: Unsupervised Learning for Image Quality Assessment in the Wild | Apr 2, 2023 | Image Quality Assessment, Mixture-of-Experts | Code Available | 1 |