| ECG-EmotionNet: Nested Mixture of Expert (NMoE) Adaptation of ECG-Foundation Model for Driver Emotion Recognition | Mar 3, 2025 | Autonomous Driving, Computational Efficiency | — | Unverified | 0 |
| Unify and Anchor: A Context-Aware Transformer for Cross-Domain Time Series Forecasting | Mar 3, 2025 | Domain Generalization, Mixture-of-Experts | — | Unverified | 0 |
| DeRS: Towards Extremely Efficient Upcycled Mixture-of-Experts Models | Mar 3, 2025 | Mixture-of-Experts, Quantization | — | Unverified | 0 |
| Explainable Classifier for Malignant Lymphoma Subtyping via Cell Graph and Image Fusion | Mar 2, 2025 | Mixture-of-Experts, Whole Slide Images | — | Unverified | 0 |
| CL-MoE: Enhancing Multimodal Large Language Model with Dual Momentum Mixture-of-Experts for Continual Visual Question Answering | Mar 1, 2025 | Continual Learning, Language Modeling | — | Unverified | 0 |
| CoSMoEs: Compact Sparse Mixture of Experts | Feb 28, 2025 | Mixture-of-Experts | — | Unverified | 0 |
| UniCodec: Unified Audio Codec with Single Domain-Adaptive Codebook | Feb 27, 2025 | Language Modeling | — | Unverified | 0 |
| Mixture of Experts for Recognizing Depression from Interview and Reading Tasks | Feb 27, 2025 | Mixture-of-Experts | — | Unverified | 0 |
| Mixture of Experts-augmented Deep Unfolding for Activity Detection in IRS-aided Systems | Feb 27, 2025 | Action Detection, Activity Detection | — | Unverified | 0 |
| Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization | Feb 26, 2025 | Mixture-of-Experts | — | Unverified | 0 |
| OneRec: Unifying Retrieve and Rank with Generative Recommender and Iterative Preference Alignment | Feb 26, 2025 | Mixture-of-Experts, Recommendation Systems | — | Unverified | 0 |
| The Empirical Impact of Reducing Symmetries on the Performance of Deep Ensembles and MoE | Feb 24, 2025 | Linear Mode Connectivity, Mixture-of-Experts | — | Unverified | 0 |
| Evaluating Expert Contributions in a MoE LLM for Quiz-Based Tasks | Feb 24, 2025 | Mixture-of-Experts, MMLU | — | Unverified | 0 |
| ENACT-Heart -- ENsemble-based Assessment Using CNN and Transformer on Heart Sounds | Feb 24, 2025 | Diagnostic, Mixture-of-Experts | — | Unverified | 0 |
| BigMac: A Communication-Efficient Mixture-of-Experts Model Structure for Fast Training and Inference | Feb 24, 2025 | Mixture-of-Experts | — | Unverified | 0 |
| An Autonomous Network Orchestration Framework Integrating Large Language Models with Continual Reinforcement Learning | Feb 22, 2025 | ARC, Continual Learning | — | Unverified | 0 |
| Binary-Integer-Programming Based Algorithm for Expert Load Balancing in Mixture-of-Experts Models | Feb 21, 2025 | Mixture-of-Experts | Code | Code Available | 0 |
| Tight Clusters Make Specialized Experts | Feb 21, 2025 | Clustering, Language Modeling | Code | Code Available | 0 |
| Ray-Tracing for Conditionally Activated Neural Networks | Feb 20, 2025 | Mixture-of-Experts | — | Unverified | 0 |
| Unraveling the Localized Latents: Learning Stratified Manifold Structures in LLM Embedding Space with Sparse Mixture-of-Experts | Feb 19, 2025 | Dictionary Learning, Mixture-of-Experts | — | Unverified | 0 |
| Every Expert Matters: Towards Effective Knowledge Distillation for Mixture-of-Experts Language Models | Feb 18, 2025 | Knowledge Distillation, Mixture-of-Experts | — | Unverified | 0 |
| DSMoE: Matrix-Partitioned Experts with Dynamic Routing for Computation-Efficient Dense LLMs | Feb 18, 2025 | Computational Efficiency, Language Modeling | — | Unverified | 0 |
| How to Upscale Neural Networks with Scaling Law? A Survey and Practical Guidelines | Feb 17, 2025 | Mixture-of-Experts | — | Unverified | 0 |
| Connector-S: A Survey of Connectors in Multi-modal Large Language Models | Feb 17, 2025 | Mixture-of-Experts, Survey | — | Unverified | 0 |
| Fate: Fast Edge Inference of Mixture-of-Experts Models via Cross-Layer Gate | Feb 17, 2025 | GPU, Mixture-of-Experts | Code | Code Available | 0 |
| Mixture of Tunable Experts - Behavior Modification of DeepSeek-R1 at Inference Time | Feb 16, 2025 | Mixture-of-Experts | — | Unverified | 0 |
| ClimateLLM: Efficient Weather Forecasting via Frequency-Aware Large Language Models | Feb 16, 2025 | Energy Management, Mixture-of-Experts | — | Unverified | 0 |
| Probing Semantic Routing in Large Mixture-of-Expert Models | Feb 15, 2025 | Mixture-of-Experts, Sentence | — | Unverified | 0 |
| Eidetic Learning: an Efficient and Provable Solution to Catastrophic Forgetting | Feb 13, 2025 | Mixture-of-Experts | Code | Code Available | 0 |
| Mixture of Decoupled Message Passing Experts with Entropy Constraint for General Node Classification | Feb 12, 2025 | Mixture-of-Experts, Node Classification | — | Unverified | 0 |
| MoHAVE: Mixture of Hierarchical Audio-Visual Experts for Robust Speech Recognition | Feb 11, 2025 | Audio-Visual Speech Recognition, Computational Efficiency | — | Unverified | 0 |
| Memory Analysis on the Training Course of DeepSeek Models | Feb 11, 2025 | GPU, Mixture-of-Experts | — | Unverified | 0 |
| MoENAS: Mixture-of-Expert based Neural Architecture Search for jointly Accurate, Fair, and Robust Edge Deep Neural Networks | Feb 11, 2025 | Fairness, Image Classification | — | Unverified | 0 |
| MoETuner: Optimized Mixture of Expert Serving with Balanced Expert Placement and Token Routing | Feb 10, 2025 | GPU, Mixture-of-Experts | — | Unverified | 0 |
| MoEMba: A Mamba-based Mixture of Experts for High-Density EMG-based Hand Gesture Recognition | Feb 9, 2025 | Gesture Recognition, Hand Gesture Recognition | — | Unverified | 0 |
| Klotski: Efficient Mixture-of-Expert Inference via Expert-Aware Multi-Batch Pipeline | Feb 9, 2025 | CPU, GPU | Code | Code Available | 0 |
| Mol-MoE: Training Preference-Guided Routers for Molecule Generation | Feb 8, 2025 | Benchmarking, Drug Design | Code | Code Available | 0 |
| fMoE: Fine-Grained Expert Offloading for Large Mixture-of-Experts Serving | Feb 7, 2025 | CPU, GPU | — | Unverified | 0 |
| Leveraging Pre-Trained Models for Multimodal Class-Incremental Learning under Adaptive Fusion | Feb 7, 2025 | Class-Incremental Learning | — | Unverified | 0 |
| Towards Foundational Models for Dynamical System Reconstruction: Hierarchical Meta-Learning via Mixture of Experts | Feb 7, 2025 | Meta-Learning, Mixture-of-Experts | — | Unverified | 0 |
| Joint MoE Scaling Laws: Mixture of Experts Can Be Memory Efficient | Feb 7, 2025 | Computational Efficiency, Mixture-of-Experts | — | Unverified | 0 |
| Mixture of neural operator experts for learning boundary conditions and model selection | Feb 6, 2025 | Mixture-of-Experts, Model Selection | — | Unverified | 0 |
| Optimizing Robustness and Accuracy in Mixture of Experts: A Dual-Model Approach | Feb 5, 2025 | Adversarial Robustness, Mixture-of-Experts | — | Unverified | 0 |
| ReGNet: Reciprocal Space-Aware Long-Range Modeling for Crystalline Property Prediction | Feb 4, 2025 | Computational Efficiency, Long-Range Modeling | — | Unverified | 0 |
| Brief analysis of DeepSeek R1 and it's implications for Generative AI | Feb 4, 2025 | GPU, Mixture-of-Experts | — | Unverified | 0 |
| M2R2: Mixture of Multi-Rate Residuals for Efficient Transformer Inference | Feb 4, 2025 | Mixture-of-Experts | — | Unverified | 0 |
| MJ-VIDEO: Fine-Grained Benchmarking and Rewarding Video Preferences in Video Generation | Feb 3, 2025 | Benchmarking, Fairness | — | Unverified | 0 |
| CLIP-UP: A Simple and Efficient Mixture-of-Experts CLIP Training Recipe with Sparse Upcycling | Feb 3, 2025 | Mixture-of-Experts | — | Unverified | 0 |
| MergeME: Model Merging Techniques for Homogeneous and Heterogeneous MoEs | Feb 3, 2025 | Mathematical Reasoning, Mixture-of-Experts | — | Unverified | 0 |
| Distribution-aware Fairness Learning in Medical Image Segmentation From A Control-Theoretic Perspective | Feb 2, 2025 | Fairness, Image Segmentation | Code | Code Available | 0 |