| Title | Date | Tasks | Code | Stars |
| --- | --- | --- | --- | --- |
| SPACE: Your Genomic Profile Predictor is a Powerful DNA Foundation Model | Jun 2, 2025 | Mixture-of-Experts, Unsupervised Pre-training | Code Available | 1 |
| Enhancing Multimodal Continual Instruction Tuning with BranchLoRA | May 31, 2025 | Mixture-of-Experts | Unverified | 0 |
| Decoding Knowledge Attribution in Mixture-of-Experts: A Framework of Basic-Refinement Collaboration and Efficiency Analysis | May 30, 2025 | Blocking, Mixture-of-Experts | Unverified | 0 |
| Mixture-of-Experts for Personalized and Semantic-Aware Next Location Prediction | May 30, 2025 | Domain Generalization, Mixture-of-Experts | Unverified | 0 |
| Mastering Massive Multi-Task Reinforcement Learning via Mixture-of-Expert Decision Transformer | May 30, 2025 | Mixture-of-Experts | Code Available | 1 |
| GradPower: Powering Gradients for Faster Language Model Pre-Training | May 30, 2025 | Language Modeling | Unverified | 0 |
| On the Expressive Power of Mixture-of-Experts for Structured Complex Tasks | May 30, 2025 | Mixture-of-Experts | Unverified | 0 |
| A Survey of Generative Categories and Techniques in Multimodal Large Language Models | May 29, 2025 | Mixture-of-Experts, Self-Supervised Learning | Unverified | 0 |
| Point-MoE: Towards Cross-Domain Generalization in 3D Semantic Segmentation via Mixture-of-Experts | May 29, 2025 | 3D Semantic Segmentation, Domain Generalization | Unverified | 0 |
| Revisiting Uncertainty Estimation and Calibration of Large Language Models | May 29, 2025 | Mixture-of-Experts, MMLU | Unverified | 0 |
| Noise-Robustness Through Noise: Asymmetric LoRA Adaption with Poisoning Expert | May 29, 2025 | Mixture-of-Experts, Parameter-Efficient Fine-Tuning | Unverified | 0 |
| Two Is Better Than One: Rotations Scale LoRAs | May 29, 2025 | Mixture-of-Experts | Unverified | 0 |
| From Knowledge to Noise: CTIM-Rover and the Pitfalls of Episodic Memory in Software Engineering Agents | May 29, 2025 | AI Agent, Mixture-of-Experts | Code Available | 0 |
| EvoMoE: Expert Evolution in Mixture of Experts for Multimodal Large Language Models | May 28, 2025 | Mixture-of-Experts, MME | Unverified | 0 |
| HiDream-I1: A High-Efficient Image Generative Foundation Model with Sparse Diffusion Transformer | May 28, 2025 | Image Generation, Mixture-of-Experts | Code Available | 7 |
| ForceVLA: Enhancing VLA Models with a Force-aware MoE for Contact-rich Manipulation | May 28, 2025 | Contact-rich Manipulation, Mixture-of-Experts | Unverified | 0 |
| A Human-Centric Approach to Explainable AI for Personalized Education | May 28, 2025 | Autonomous Driving, Mixture-of-Experts | Code Available | 0 |
| Advancing Expert Specialization for Better MoE | May 28, 2025 | Mixture-of-Experts | Unverified | 0 |
| MoE-Gyro: Self-Supervised Over-Range Reconstruction and Denoising for MEMS Gyroscopes | May 27, 2025 | Benchmarking, Denoising | Unverified | 0 |
| Rethinking Gating Mechanism in Sparse MoE: Handling Arbitrary Modality Inputs with Confidence-Guided Gate | May 26, 2025 | Imputation, Mixture-of-Experts | Code Available | 0 |
| WINA: Weight Informed Neuron Activation for Accelerating Large Language Model Inference | May 26, 2025 | Language Modeling | Code Available | 2 |
| MoESD: Unveil Speculative Decoding's Potential for Accelerating Sparse MoE | May 26, 2025 | Mixture-of-Experts | Unverified | 0 |
| NEXT: Multi-Grained Mixture of Experts via Text-Modulation for Multi-Modal Object Re-ID | May 26, 2025 | Attribute, Caption Generation | Unverified | 0 |
| FLAME-MoE: A Transparent End-to-End Research Platform for Mixture-of-Experts Language Models | May 26, 2025 | Mixture-of-Experts | Code Available | 1 |
| Mosaic: Data-Free Knowledge Distillation via Mixture-of-Experts for Heterogeneous Distributed Environments | May 26, 2025 | Data-free Knowledge Distillation, Federated Learning | Code Available | 0 |