| Title | Date | Topics | Code | # |
| --- | --- | --- | --- | --- |
| MoHAVE: Mixture of Hierarchical Audio-Visual Experts for Robust Speech Recognition | Feb 11, 2025 | Audio-Visual Speech Recognition, Computational Efficiency | Unverified | 0 |
| Training Sparse Mixture Of Experts Text Embedding Models | Feb 11, 2025 | Mixture-of-Experts, RAG | Code Available | 4 |
| MoENAS: Mixture-of-Expert based Neural Architecture Search for jointly Accurate, Fair, and Robust Edge Deep Neural Networks | Feb 11, 2025 | Fairness, Image Classification | Unverified | 0 |
| Memory Analysis on the Training Course of DeepSeek Models | Feb 11, 2025 | GPU, Mixture-of-Experts | Unverified | 0 |
| Jakiro: Boosting Speculative Decoding with Decoupled Multi-Head via MoE | Feb 10, 2025 | Diversity, Language Modeling | Code Available | 1 |
| MoETuner: Optimized Mixture of Expert Serving with Balanced Expert Placement and Token Routing | Feb 10, 2025 | GPU, Mixture-of-Experts | Unverified | 0 |
| MoEMba: A Mamba-based Mixture of Experts for High-Density EMG-based Hand Gesture Recognition | Feb 9, 2025 | Gesture Recognition, Hand Gesture Recognition | Unverified | 0 |
| Klotski: Efficient Mixture-of-Expert Inference via Expert-Aware Multi-Batch Pipeline | Feb 9, 2025 | CPU, GPU | Code Available | 0 |
| Mol-MoE: Training Preference-Guided Routers for Molecule Generation | Feb 8, 2025 | Benchmarking, Drug Design | Code Available | 0 |
| Leveraging Pre-Trained Models for Multimodal Class-Incremental Learning under Adaptive Fusion | Feb 7, 2025 | Class-Incremental Learning | Unverified | 0 |
| Towards Foundational Models for Dynamical System Reconstruction: Hierarchical Meta-Learning via Mixture of Experts | Feb 7, 2025 | Meta-Learning, Mixture-of-Experts | Unverified | 0 |
| fMoE: Fine-Grained Expert Offloading for Large Mixture-of-Experts Serving | Feb 7, 2025 | CPU, GPU | Unverified | 0 |
| Joint MoE Scaling Laws: Mixture of Experts Can Be Memory Efficient | Feb 7, 2025 | Computational Efficiency, Mixture-of-Experts | Unverified | 0 |
| Mixture of neural operator experts for learning boundary conditions and model selection | Feb 6, 2025 | Mixture-of-Experts, Model Selection | Unverified | 0 |
| CMoE: Fast Carving of Mixture-of-Experts for Efficient LLM Inference | Feb 6, 2025 | Mixture-of-Experts | Code Available | 1 |
| Optimizing Robustness and Accuracy in Mixture of Experts: A Dual-Model Approach | Feb 5, 2025 | Adversarial Robustness, Mixture-of-Experts | Unverified | 0 |
| ReGNet: Reciprocal Space-Aware Long-Range Modeling for Crystalline Property Prediction | Feb 4, 2025 | Computational Efficiency, Long-Range Modeling | Unverified | 0 |
| M2R2: Mixture of Multi-Rate Residuals for Efficient Transformer Inference | Feb 4, 2025 | Mixture-of-Experts | Unverified | 0 |
| Brief analysis of DeepSeek R1 and its implications for Generative AI | Feb 4, 2025 | GPU, Mixture-of-Experts | Unverified | 0 |
| CLIP-UP: A Simple and Efficient Mixture-of-Experts CLIP Training Recipe with Sparse Upcycling | Feb 3, 2025 | Mixture-of-Experts | Unverified | 0 |
| MJ-VIDEO: Fine-Grained Benchmarking and Rewarding Video Preferences in Video Generation | Feb 3, 2025 | Benchmarking, Fairness | Unverified | 0 |
| MergeME: Model Merging Techniques for Homogeneous and Heterogeneous MoEs | Feb 3, 2025 | Mathematical Reasoning, Mixture-of-Experts | Unverified | 0 |
| UniGraph2: Learning a Unified Embedding Space to Bind Multimodal Graphs | Feb 2, 2025 | Graph Neural Network, Mixture-of-Experts | Code Available | 1 |
| Distribution-aware Fairness Learning in Medical Image Segmentation From A Control-Theoretic Perspective | Feb 2, 2025 | Fairness, Image Segmentation | Code Available | 0 |
| Sigmoid Self-Attention has Lower Sample Complexity than Softmax Self-Attention: A Mixture-of-Experts Perspective | Feb 1, 2025 | Mixture-of-Experts | Unverified | 0 |
| PM-MOE: Mixture of Experts on Private Model Parameters for Personalized Federated Learning | Feb 1, 2025 | Denoising, Federated Learning | Code Available | 1 |
| Pheromone-based Learning of Optimal Reasoning Paths | Jan 31, 2025 | ARC, GSM8K | Unverified | 0 |
| Adaptive Prompt: Unlocking the Power of Visual Prompt Tuning | Jan 31, 2025 | Mixture-of-Experts, Visual Prompt Tuning | Unverified | 0 |
| MolGraph-xLSTM: A graph-based dual-level xLSTM framework with multi-head mixture-of-experts for enhanced molecular representation and interpretability | Jan 30, 2025 | Drug Discovery, Mixture-of-Experts | Unverified | 0 |
| Heuristic-Informed Mixture of Experts for Link Prediction in Multilayer Networks | Jan 29, 2025 | Link Prediction, Mixture-of-Experts | Unverified | 0 |
| Free Agent in Agent-Based Mixture-of-Experts Generative AI Framework | Jan 29, 2025 | Fraud Detection, Mixture-of-Experts | Unverified | 0 |
| 3D-MoE: A Mixture-of-Experts Multi-modal LLM for 3D Vision and Pose Diffusion via Rectified Flow | Jan 28, 2025 | Instruction Following, Mixture-of-Experts | Unverified | 0 |
| Static Batching of Irregular Workloads on GPUs: Framework and Application to Efficient MoE Model Inference | Jan 27, 2025 | GPU, Mixture-of-Experts | Unverified | 0 |
| ToMoE: Converting Dense Large Language Models to Mixture-of-Experts through Dynamic Structural Pruning | Jan 25, 2025 | Mixture-of-Experts | Unverified | 0 |
| FreqMoE: Enhancing Time Series Forecasting through Frequency Decomposition Mixture of Experts | Jan 25, 2025 | Mixture-of-Experts, Prediction | Code Available | 1 |
| Each Rank Could be an Expert: Single-Ranked Mixture of Experts LoRA for Multi-Task Learning | Jan 25, 2025 | Mixture-of-Experts, Multi-Task Learning | Unverified | 0 |
| Mean-field limit from general mixtures of experts to quantum neural networks | Jan 24, 2025 | Mixture-of-Experts | Unverified | 0 |
| Hierarchical Time-Aware Mixture of Experts for Multi-Modal Sequential Recommendation | Jan 24, 2025 | Contrastive Learning, Mixture-of-Experts | Code Available | 1 |
| Sparse Mixture-of-Experts for Non-Uniform Noise Reduction in MRI Images | Jan 24, 2025 | Denoising, Diagnostic | Unverified | 0 |
| CSAOT: Cooperative Multi-Agent System for Active Object Tracking | Jan 23, 2025 | Autonomous Navigation, Deep Reinforcement Learning | Unverified | 0 |
| LLM4WM: Adapting LLM for Wireless Multi-Tasking | Jan 22, 2025 | General Knowledge, Language Modeling | Unverified | 0 |
| UniUIR: Considering Underwater Image Restoration as An All-in-One Learner | Jan 22, 2025 | All, Depth Estimation | Unverified | 0 |
| BLR-MoE: Boosted Language-Routing Mixture of Experts for Domain-Robust Multilingual E2E ASR | Jan 22, 2025 | Mixture-of-Experts | Unverified | 0 |
| Autonomy-of-Experts Models | Jan 22, 2025 | Decision Making, Mixture-of-Experts | Unverified | 0 |
| SCFCRC: Simultaneously Counteract Feature Camouflage and Relation Camouflage for Fraud Detection | Jan 21, 2025 | Contrastive Learning, Fraud Detection | Unverified | 0 |
| Modality Interactive Mixture-of-Experts for Fake News Detection | Jan 21, 2025 | Fake News Detection, Misinformation | Code Available | 1 |
| Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models | Jan 21, 2025 | Mixture-of-Experts | Unverified | 0 |
| MoGERNN: An Inductive Traffic Predictor for Unobserved Locations in Dynamic Sensing Networks | Jan 21, 2025 | iFun, Mixture-of-Experts | Code Available | 1 |
| Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models | Jan 21, 2025 | Mixture-of-Experts | Unverified | 0 |
| FSMoE: A Flexible and Scalable Training System for Sparse Mixture-of-Experts Models | Jan 18, 2025 | GPU, Mixture-of-Experts | Unverified | 0 |