| Accelerating MoE Model Inference with Expert Sharding | Mar 11, 2025 | Decoder, GPU | Unverified | 0 |
| GM-MoE: Low-Light Enhancement with Gated-Mechanism Mixture-of-Experts | Mar 10, 2025 | 3D Reconstruction, Autonomous Driving | Unverified | 0 |
| eMoE: Task-aware Memory Efficient Mixture-of-Experts-Based (MoE) Model Inference | Mar 10, 2025 | Mixture-of-Experts, Scheduling | Unverified | 0 |
| ResMoE: Space-efficient Compression of Mixture of Experts LLMs via Residual Restoration | Mar 10, 2025 | Mixture-of-Experts | Code Available | 0 |
| MoFE: Mixture of Frozen Experts Architecture | Mar 9, 2025 | Mixture-of-Experts, parameter-efficient fine-tuning | Unverified | 0 |
| Swift Hydra: Self-Reinforcing Generative Framework for Anomaly Detection with Multiple Mamba Models | Mar 9, 2025 | Anomaly Detection, Mamba | Code Available | 0 |
| MANDARIN: Mixture-of-Experts Framework for Dynamic Delirium and Coma Prediction in ICU Patients: Development and Validation of an Acute Brain Dysfunction Prediction Model | Mar 8, 2025 | Mixture-of-Experts | Unverified | 0 |
| A Novel Trustworthy Video Summarization Algorithm Through a Mixture of LoRA Experts | Mar 8, 2025 | Mixture-of-Experts, Video Summarization | Unverified | 0 |
| MoEMoE: Question Guided Dense and Scalable Sparse Mixture-of-Expert for Multi-source Multi-modal Answering | Mar 8, 2025 | Answer Generation, Mixture-of-Experts | Unverified | 0 |
| FMT: A Multimodal Pneumonia Detection Model Based on Stacking MOE Framework | Mar 7, 2025 | Diagnostic, Medical Image Analysis | Unverified | 0 |
| Every FLOP Counts: Scaling a 300B Mixture-of-Experts LING LLM without Premium GPUs | Mar 7, 2025 | Knowledge Graphs, Mixture-of-Experts | Unverified | 0 |
| Capacity-Aware Inference: Mitigating the Straggler Effect in Mixture of Experts | Mar 7, 2025 | Mixture-of-Experts | Unverified | 0 |
| Symbolic Mixture-of-Experts: Adaptive Skill-based Routing for Heterogeneous Reasoning | Mar 7, 2025 | GPU, Math | Unverified | 0 |
| TS-RAG: Retrieval-Augmented Generation based Time Series Foundation Models are Stronger Zero-Shot Forecaster | Mar 6, 2025 | Domain Adaptation, Mixture-of-Experts | Unverified | 0 |
| Continual Pre-training of MoEs: How robust is your router? | Mar 6, 2025 | Decoder, Mixture-of-Experts | Unverified | 0 |
| A Generalist Cross-Domain Molecular Learning Framework for Structure-Based Drug Discovery | Mar 6, 2025 | Denoising, Drug Discovery | Unverified | 0 |
| Predictable Scale: Part I -- Optimal Hyperparameter Scaling Law in Large Language Model Pretraining | Mar 6, 2025 | GPU, Hyperparameter Optimization | Unverified | 0 |
| Speculative MoE: Communication Efficient Parallel MoE Inference with Speculative Token and Expert Pre-scheduling | Mar 6, 2025 | Mixture-of-Experts, Scheduling | Unverified | 0 |
| Convergence Rates for Softmax Gating Mixture of Experts | Mar 5, 2025 | Mixture-of-Experts, parameter estimation | Unverified | 0 |
| BrainNet-MoE: Brain-Inspired Mixture-of-Experts Learning for Neurological Disease Identification | Mar 5, 2025 | Mixture-of-Experts | Unverified | 0 |
| VoiceGRPO: Modern MoE Transformers with Group Relative Policy Optimization GRPO for AI Voice Health Care Applications on Voice Pathology Detection | Mar 5, 2025 | Diagnostic, Mixture-of-Experts | Code Available | 0 |
| Tabby: Tabular Data Synthesis with Language Models | Mar 4, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Union of Experts: Adapting Hierarchical Routing to Equivalently Decomposed Transformer | Mar 4, 2025 | Computational Efficiency, Mixture-of-Experts | Code Available | 0 |
| How Do Consumers Really Choose: Exposing Hidden Preferences with the Mixture of Experts Model | Mar 3, 2025 | Decision Making, Demand Forecasting | Unverified | 0 |
| PROPER: A Progressive Learning Framework for Personalized Large Language Models with Group-Level Adaptation | Mar 3, 2025 | Mixture-of-Experts, parameter-efficient fine-tuning | Unverified | 0 |