| PICO: Secure Transformers via Robust Prompt Isolation and Cybersecurity Oversight | Apr 26, 2025 | Mixture-of-Experts, PICO | Unverified | 0 |
| NoEsis: Differentially Private Knowledge Transfer in Modular LLM Adaptation | Apr 25, 2025 | Code Completion, Mixture-of-Experts | Unverified | 0 |
| Unveiling the Hidden: Movie Genre and User Bias in Spoiler Detection | Apr 24, 2025 | Graph Attention, Mixture-of-Experts | Code Available | 0 |
| BadMoE: Backdooring Mixture-of-Experts LLMs via Optimizing Routing Triggers and Infecting Dormant Experts | Apr 24, 2025 | Backdoor Attack, Mixture-of-Experts | Unverified | 0 |
| Manifold Induced Biases for Zero-shot and Few-shot Detection of Generated Images | Apr 21, 2025 | Mixture-of-Experts | Code Available | 1 |
| MoE Parallel Folding: Heterogeneous Parallelism Mappings for Efficient Large-Scale MoE Model Training with Megatron Core | Apr 21, 2025 | Mixture-of-Experts | Unverified | 0 |
| Distribution-aware Forgetting Compensation for Exemplar-Free Lifelong Person Re-identification | Apr 21, 2025 | Exemplar-Free, Knowledge Distillation | Code Available | 1 |
| Multi-Type Context-Aware Conversational Recommender Systems via Mixture-of-Experts | Apr 18, 2025 | Mixture-of-Experts, Recommendation Systems | Unverified | 0 |
| HAECcity: Open-Vocabulary Scene Understanding of City-Scale Point Clouds with Superpoint Graph Clustering | Apr 18, 2025 | Clustering, Graph Clustering | Unverified | 0 |
| D^2MoE: Dual Routing and Dynamic Scheduling for Efficient On-Device MoE-based LLM Serving | Apr 17, 2025 | Mixture-of-Experts, Model Compression | Unverified | 0 |
| Trend Filtered Mixture of Experts for Automated Gating of High-Frequency Flow Cytometry Data | Apr 16, 2025 | Mixture-of-Experts | Unverified | 0 |
| Dense Backpropagation Improves Training for Sparse Mixture-of-Experts | Apr 16, 2025 | Mixture-of-Experts | Code Available | 1 |
| Unveiling Hidden Collaboration within Mixture-of-Experts in Large Language Models | Apr 16, 2025 | Dictionary Learning, Mixture-of-Experts | Unverified | 0 |
| Plasticity-Aware Mixture of Experts for Learning Under QoE Shifts in Adaptive Video Streaming | Apr 14, 2025 | Mixture-of-Experts | Unverified | 0 |
| Mixture-of-Shape-Experts (MoSE): End-to-End Shape Dictionary Framework to Prompt SAM for Generalizable Medical Segmentation | Apr 13, 2025 | Dictionary Learning, Domain Generalization | Unverified | 0 |
| MoE-Lens: Towards the Hardware Limit of High-Throughput MoE LLM Serving Under Resource Constraints | Apr 12, 2025 | CPU, GPU | Unverified | 0 |
| RouterKT: Mixture-of-Experts for Knowledge Tracing | Apr 11, 2025 | Knowledge Tracing, Mixture-of-Experts | Code Available | 0 |
| Regularized infill criteria for multi-objective Bayesian optimization with application to aircraft design | Apr 11, 2025 | Bayesian Optimization, Global Optimization | Unverified | 0 |
| Seed1.5-Thinking: Advancing Superb Reasoning Models with Reinforcement Learning | Apr 10, 2025 | Mixture-of-Experts, Reinforcement Learning | Unverified | 0 |
| Kimi-VL Technical Report | Apr 10, 2025 | Long-Context Understanding, Mathematical Reasoning | Code Available | 5 |
| Scaling Laws for Native Multimodal Models | Apr 10, 2025 | Mixture-of-Experts | Unverified | 0 |
| Cluster-Driven Expert Pruning for Mixture-of-Experts Large Language Models | Apr 10, 2025 | Computational Efficiency, Mixture-of-Experts | Code Available | 0 |
| C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing | Apr 10, 2025 | In-Context Learning, Mixture-of-Experts | Code Available | 1 |
| Adaptive Detection of Fast Moving Celestial Objects Using a Mixture of Experts and Physical-Inspired Neural Network | Apr 10, 2025 | Mixture-of-Experts, Object Detection | Unverified | 0 |
| Holistic Capability Preservation: Towards Compact Yet Comprehensive Reasoning Models | Apr 9, 2025 | Instruction Following, Mathematical Problem-Solving | Unverified | 0 |
| FedMerge: Federated Personalization via Model Merging | Apr 9, 2025 | Federated Learning, Mixture-of-Experts | Unverified | 0 |
| MoEDiff-SR: Mixture of Experts-Guided Diffusion Model for Region-Adaptive MRI Super-Resolution | Apr 9, 2025 | Computational Efficiency, Denoising | Code Available | 1 |
| Finding Fantastic Experts in MoEs: A Unified Study for Expert Dropping Strategies and Observations | Apr 8, 2025 | Instruction Following, Mixture-of-Experts | Unverified | 0 |
| HybriMoE: Hybrid CPU-GPU Scheduling and Cache Management for Efficient MoE Inference | Apr 8, 2025 | CPU, GPU | Code Available | 2 |
| RingMoE: Mixture-of-Modality-Experts Multi-Modal Foundation Models for Universal Remote Sensing Image Interpretation | Apr 4, 2025 | Change Detection, Depth Estimation | Unverified | 0 |
| HeterMoE: Efficient Training of Mixture-of-Experts Models on Heterogeneous GPUs | Apr 4, 2025 | GPU, Mixture-of-Experts | Unverified | 0 |
| MiLo: Efficient Quantized MoE Inference with Mixture of Low-Rank Compensators | Apr 3, 2025 | Mixture-of-Experts, Quantization | Code Available | 1 |
| MegaScale-Infer: Serving Mixture-of-Experts at Scale with Disaggregated Expert Parallelism | Apr 3, 2025 | CPU, GPU | Unverified | 0 |
| Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert Parallelism Design | Apr 2, 2025 | Attribute, Mixture-of-Experts | Unverified | 0 |
| A Unified Virtual Mixture-of-Experts Framework: Enhanced Inference and Hallucination Mitigation in Single-Model System | Apr 1, 2025 | Dialogue Generation, Ensemble Learning | Unverified | 0 |
| Detecting Financial Fraud with Hybrid Deep Learning: A Mix-of-Experts Approach to Sequential and Anomalous Patterns | Apr 1, 2025 | Fraud Detection, Mixture-of-Experts | Unverified | 0 |
| DynMoLE: Boosting Mixture of LoRA Experts Fine-Tuning with a Hybrid Routing Mechanism | Apr 1, 2025 | Common Sense Reasoning, Computational Efficiency | Code Available | 0 |
| Unimodal-driven Distillation in Multimodal Emotion Recognition with Dynamic Fusion | Mar 31, 2025 | Emotion Recognition, Knowledge Distillation | Unverified | 0 |
| Mixture of Routers | Mar 30, 2025 | Mixture-of-Experts, Parameter-Efficient Fine-Tuning | Unverified | 0 |
| Sparse Mixture of Experts as Unified Competitive Learning | Mar 29, 2025 | Language Modeling | Unverified | 0 |
| S2MoE: Robust Sparse Mixture of Experts via Stochastic Learning | Mar 29, 2025 | Mixture-of-Experts | Unverified | 0 |
| Beyond Standard MoE: Mixture of Latent Experts for Resource-Efficient Language Models | Mar 29, 2025 | Computational Efficiency, Mixture-of-Experts | Unverified | 0 |
| Exploiting Mixture-of-Experts Redundancy Unlocks Multimodal Generative Abilities | Mar 28, 2025 | Mixture-of-Experts, Text Generation | Unverified | 0 |
| RocketPPA: Code-Level Power, Performance, and Area Prediction via LLM and Mixture of Experts | Mar 27, 2025 | Code Repair, Feature Engineering | Unverified | 0 |
| LLaVA-CMoE: Towards Continual Mixture of Experts for Large Vision-Language Models | Mar 27, 2025 | Mixture-of-Experts | Unverified | 0 |
| iMedImage Technical Report | Mar 27, 2025 | Anomaly Detection, Diagnostic | Unverified | 0 |
| A multi-scale lithium-ion battery capacity prediction using mixture of experts and patch-based MLP | Mar 26, 2025 | Mixture-of-Experts | Code Available | 0 |
| Reasoning Beyond Limits: Advances and Open Problems for LLMs | Mar 26, 2025 | Mixture-of-Experts, RAG | Unverified | 0 |
| Enhancing Multi-modal Models with Heterogeneous MoE Adapters for Fine-tuning | Mar 26, 2025 | Mixture-of-Experts, Parameter-Efficient Fine-Tuning | Unverified | 0 |
| Optimal Scaling Laws for Efficiency Gains in a Theoretical Transformer-Augmented Sectional MoE Framework | Mar 26, 2025 | Computational Efficiency, Mixture-of-Experts | Unverified | 0 |