| Mixture-of-Experts for Open Set Domain Adaptation: A Dual-Space Detection Approach | Nov 1, 2023 | Domain Adaptation, Mixture-of-Experts | Unverified | 0 |
| A General Theory for Softmax Gating Multinomial Logistic Mixture of Experts | Oct 22, 2023 | Density Estimation, Mixture-of-Experts | Unverified | 0 |
| Manifold-Preserving Transformers are Effective for Short-Long Range Encoding | Oct 22, 2023 | Language Modeling, Language Modelling | Code Available | 0 |
| Direct Neural Machine Translation with Task-level Mixture of Experts models | Oct 18, 2023 | Direct NMT, Large Language Model | Unverified | 0 |
| Multi-view Contrastive Learning for Entity Typing over Knowledge Graphs | Oct 18, 2023 | Contrastive Learning, Entity Typing | Code Available | 0 |
| Diversifying the Mixture-of-Experts Representation for Language Models with Orthogonal Optimizer | Oct 15, 2023 | Diversity, Mixture-of-Experts | Unverified | 0 |
| Adaptive Gating in Mixture-of-Experts based Language Models | Oct 11, 2023 | Mixture-of-Experts | Unverified | 0 |
| Beyond the Typical: Modeling Rare Plausible Patterns in Chemical Reactions by Leveraging Sequential Mixture-of-Experts | Oct 7, 2023 | Mixture-of-Experts | Unverified | 0 |
| Exploiting Activation Sparsity with Dense to Dynamic-k Mixture-of-Experts Conversion | Oct 6, 2023 | Mixture-of-Experts | Code Available | 0 |
| Reinforcement Learning-based Mixture of Vision Transformers for Video Violence Recognition | Oct 4, 2023 | Mixture-of-Experts, reinforcement-learning | Unverified | 0 |
| Mixture of Quantized Experts (MoQE): Complementary Effect of Low-bit Quantization and Robustness | Oct 3, 2023 | GPU, Machine Translation | Unverified | 0 |
| FT-Shield: A Watermark Against Unauthorized Fine-tuning in Text-to-Image Diffusion Models | Oct 3, 2023 | Face Transfer, Mixture-of-Experts | Code Available | 0 |
| Statistical Perspective of Top-K Sparse Softmax Gating Mixture of Experts | Sep 25, 2023 | Density Estimation, Mixture-of-Experts | Unverified | 0 |
| Mobile V-MoEs: Scaling Down Vision Transformers via Sparse Mixture-of-Experts | Sep 8, 2023 | Mixture-of-Experts | Unverified | 0 |
| Learning multi-modal generative models with permutation-invariant encoders and tighter variational objectives | Sep 1, 2023 | Mixture-of-Experts | Code Available | 0 |
| Task-Based MoE for Multitask Multilingual Machine Translation | Aug 30, 2023 | Machine Translation, Mixture-of-Experts | Unverified | 0 |
| SwapMoE: Serving Off-the-shelf MoE-based Large Language Models with Tunable Memory Budget | Aug 29, 2023 | Mixture-of-Experts, object-detection | Unverified | 0 |
| EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE | Aug 23, 2023 | Image-text matching, Image-text Retrieval | Unverified | 0 |
| Beyond Sharing: Conflict-Aware Multivariate Time Series Anomaly Detection | Aug 17, 2023 | Anomaly Detection, Mixture-of-Experts | Code Available | 0 |
| FineQuant: Unlocking Efficiency with Fine-Grained Weight-Only Quantization for LLMs | Aug 16, 2023 | GPU, Mixture-of-Experts | Unverified | 0 |
| Experts Weights Averaging: A New General Training Scheme for Vision Transformers | Aug 11, 2023 | Mixture-of-Experts | Unverified | 0 |
| A Novel Temporal Multi-Gate Mixture-of-Experts Approach for Vehicle Trajectory and Driving Intention Prediction | Aug 1, 2023 | Mixture-of-Experts, Position | Unverified | 0 |
| Uncertainty-Encoded Multi-Modal Fusion for Robust Object Detection in Autonomous Driving | Jul 30, 2023 | Autonomous Driving, Mixture-of-Experts | Unverified | 0 |
| Domain-Agnostic Neural Architecture for Class Incremental Continual Learning in Document Processing Platform | Jul 11, 2023 | Continual Learning, Mixture-of-Experts | Code Available | 0 |
| Bidirectional Attention as a Mixture of Continuous Word Experts | Jul 8, 2023 | Language Modelling, Mixture-of-Experts | Code Available | 0 |