| Title | Date | Tasks | Code | # |
| --- | --- | --- | --- | --- |
| k-Winners-Take-All Ensemble Neural Network | Jan 4, 2024 | All, Mixture-of-Experts | Code Available | 0 |
| Fast Inference of Mixture-of-Experts Language Models with Offloading | Dec 28, 2023 | Mixture-of-Experts, Quantization | Code Available | 4 |
| Efficient Deweather Mixture-of-Experts with Uncertainty-aware Feature-wise Linear Modulation | Dec 27, 2023 | Image Restoration, Mixture-of-Experts | Unverified | 0 |
| Agent4Ranking: Semantic Robust Ranking via Personalized Query Rewriting Using Multi-agent LLM | Dec 24, 2023 | Mixture-of-Experts | Unverified | 0 |
| SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling | Dec 23, 2023 | Instruction Following, Language Modeling | Code Available | 3 |
| FineMoGen: Fine-Grained Spatio-Temporal Motion Generation and Editing | Dec 22, 2023 | Mixture-of-Experts, Motion Generation | Code Available | 1 |
| Aurora: Activating Chinese chat capability for Mixtral-8x7B sparse Mixture-of-Experts through Instruction-Tuning | Dec 22, 2023 | Instruction Following, Mixture-of-Experts | Code Available | 2 |
| Generator Assisted Mixture of Experts For Feature Acquisition in Batch | Dec 19, 2023 | Mixture-of-Experts | Unverified | 0 |
| Mixture of Cluster-conditional LoRA Experts for Vision-language Instruction Tuning | Dec 19, 2023 | Diversity, Instruction Following | Unverified | 0 |
| From Google Gemini to OpenAI Q* (Q-Star): A Survey of Reshaping the Generative Artificial Intelligence (AI) Research Landscape | Dec 18, 2023 | Mixture-of-Experts | Unverified | 0 |
| When Parameter-efficient Tuning Meets General-purpose Vision-language Models | Dec 16, 2023 | Mixture-of-Experts | Code Available | 1 |
| LoRAMoE: Alleviate World Knowledge Forgetting in Large Language Models via MoE-Style Plugin | Dec 15, 2023 | Language Modelling, Mixture-of-Experts | Code Available | 2 |
| Online Action Recognition for Human Risk Prediction with Anticipated Haptic Alert via Wearables | Dec 14, 2023 | Action Recognition, Mixture-of-Experts | Code Available | 0 |
| Training of Neural Networks with Uncertain Data: A Mixture of Experts Approach | Dec 13, 2023 | Autonomous Driving, Mixture-of-Experts | Unverified | 0 |
| SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention | Dec 13, 2023 | Language Modeling | Code Available | 1 |
| Parameter Efficient Adaptation for Image Restoration with Heterogeneous Mixture-of-Experts | Dec 12, 2023 | Denoising, Diversity | Code Available | 1 |
| HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts | Dec 12, 2023 | Mixture-of-Experts | Code Available | 1 |
| Mixture-of-Linear-Experts for Long-term Time Series Forecasting | Dec 11, 2023 | Mixture-of-Experts, Time Series | Code Available | 1 |
| GraphMETRO: Mitigating Complex Graph Distribution Shifts via Mixture of Aligned Experts | Dec 7, 2023 | Diversity, Graph Neural Network | Code Available | 1 |
| MoE-AMC: Enhancing Automatic Modulation Classification Performance Using Mixture-of-Experts | Dec 4, 2023 | Classification, Mixture-of-Experts | Unverified | 0 |
| MoEC: Mixture of Experts Implicit Neural Compression | Dec 3, 2023 | Data Compression, Mixture-of-Experts | Unverified | 0 |
| Language-driven All-in-one Adverse Weather Removal | Dec 3, 2023 | All, Diversity | Unverified | 0 |
| Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts | Dec 1, 2023 | Chart Question Answering, Document AI | Unverified | 0 |
| HOMOE: A Memory-Based and Composition-Aware Framework for Zero-Shot Learning with Hopfield Network and Soft Mixture of Experts | Nov 23, 2023 | Compositional Zero-Shot Learning, Mixture-of-Experts | Unverified | 0 |
| Efficient Model Agnostic Approach for Implicit Neural Representation Based Arbitrary-Scale Image Super-Resolution | Nov 20, 2023 | Computational Efficiency, Decoder | Unverified | 0 |
| Multi-Task Reinforcement Learning with Mixture of Orthogonal Experts | Nov 19, 2023 | Diversity, Mixture-of-Experts | Code Available | 1 |
| Memory Augmented Language Models through Mixture of Word Experts | Nov 15, 2023 | Mixture-of-Experts | Unverified | 0 |
| Intentional Biases in LLM Responses | Nov 11, 2023 | Language Modeling | Unverified | 0 |
| DAMEX: Dataset-aware Mixture-of-Experts for visual understanding of mixture-of-datasets | Nov 8, 2023 | Mixture-of-Experts, Object Detection | Code Available | 1 |
| CAME: Competitively Learning a Mixture-of-Experts Model for First-stage Retrieval | Nov 6, 2023 | Mixture-of-Experts, Retrieval | Unverified | 0 |
| Octavius: Mitigating Task Interference in MLLMs via LoRA-MoE | Nov 5, 2023 | Decoder, Mixture-of-Experts | Code Available | 0 |
| Mixture-of-Experts for Open Set Domain Adaptation: A Dual-Space Detection Approach | Nov 1, 2023 | Domain Adaptation, Mixture-of-Experts | Unverified | 0 |
| SiDA-MoE: Sparsity-Inspired Data-Aware Serving for Efficient and Scalable Large Mixture-of-Experts Models | Oct 29, 2023 | GPU, Mixture-of-Experts | Code Available | 1 |
| QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models | Oct 25, 2023 | GPU, Mixture-of-Experts | Code Available | 2 |
| Mixture of Tokens: Continuous MoE through Cross-Example Aggregation | Oct 24, 2023 | Language Modelling, Large Language Model | Code Available | 2 |
| SteloCoder: a Decoder-Only LLM for Multi-Language to Python Code Translation | Oct 24, 2023 | Code Generation, Code Translation | Code Available | 1 |
| A General Theory for Softmax Gating Multinomial Logistic Mixture of Experts | Oct 22, 2023 | Density Estimation, Mixture-of-Experts | Unverified | 0 |
| Manifold-Preserving Transformers are Effective for Short-Long Range Encoding | Oct 22, 2023 | Language Modeling | Code Available | 0 |
| Direct Neural Machine Translation with Task-level Mixture of Experts models | Oct 18, 2023 | Direct NMT, Large Language Model | Unverified | 0 |
| Multi-view Contrastive Learning for Entity Typing over Knowledge Graphs | Oct 18, 2023 | Contrastive Learning, Entity Typing | Code Available | 0 |
| Image Super-resolution Via Latent Diffusion: A Sampling-space Mixture Of Experts And Frequency-augmented Decoder Approach | Oct 18, 2023 | Blind Super-Resolution, Decoder | Code Available | 1 |
| Merging Experts into One: Improving Computational Efficiency of Mixture of Experts | Oct 15, 2023 | Computational Efficiency, Mixture-of-Experts | Code Available | 1 |
| Diversifying the Mixture-of-Experts Representation for Language Models with Orthogonal Optimizer | Oct 15, 2023 | Diversity, Mixture-of-Experts | Unverified | 0 |
| Adaptive Gating in Mixture-of-Experts based Language Models | Oct 11, 2023 | Mixture-of-Experts | Unverified | 0 |
| Sparse Universal Transformer | Oct 11, 2023 | Mixture-of-Experts | Code Available | 1 |
| Beyond the Typical: Modeling Rare Plausible Patterns in Chemical Reactions by Leveraging Sequential Mixture-of-Experts | Oct 7, 2023 | Mixture-of-Experts | Unverified | 0 |
| Exploiting Activation Sparsity with Dense to Dynamic-k Mixture-of-Experts Conversion | Oct 6, 2023 | Mixture-of-Experts | Code Available | 0 |
| Reinforcement Learning-based Mixture of Vision Transformers for Video Violence Recognition | Oct 4, 2023 | Mixture-of-Experts, Reinforcement Learning | Unverified | 0 |
| Mixture of Quantized Experts (MoQE): Complementary Effect of Low-bit Quantization and Robustness | Oct 3, 2023 | GPU, Machine Translation | Unverified | 0 |
| FT-Shield: A Watermark Against Unauthorized Fine-tuning in Text-to-Image Diffusion Models | Oct 3, 2023 | Face Transfer, Mixture-of-Experts | Code Available | 0 |