| Predictable Scale: Part I -- Optimal Hyperparameter Scaling Law in Large Language Model Pretraining | Mar 6, 2025 | GPU, Hyperparameter Optimization | Unverified | 0 |
| A Generalist Cross-Domain Molecular Learning Framework for Structure-Based Drug Discovery | Mar 6, 2025 | Denoising, Drug Discovery | Unverified | 0 |
| Speculative MoE: Communication Efficient Parallel MoE Inference with Speculative Token and Expert Pre-scheduling | Mar 6, 2025 | Mixture-of-Experts, Scheduling | Unverified | 0 |
| Question-Aware Gaussian Experts for Audio-Visual Question Answering | Mar 6, 2025 | Audio-visual Question Answering, Audio-Visual Question Answering (AVQA) | Code Available | 1 |
| BrainNet-MoE: Brain-Inspired Mixture-of-Experts Learning for Neurological Disease Identification | Mar 5, 2025 | Mixture-of-Experts | Unverified | 0 |
| VoiceGRPO: Modern MoE Transformers with Group Relative Policy Optimization GRPO for AI Voice Health Care Applications on Voice Pathology Detection | Mar 5, 2025 | Diagnostic, Mixture-of-Experts | Code Available | 0 |
| Convergence Rates for Softmax Gating Mixture of Experts | Mar 5, 2025 | Mixture-of-Experts, parameter estimation | Unverified | 0 |
| Small but Mighty: Enhancing Time Series Forecasting with Lightweight LLMs | Mar 5, 2025 | Computational Efficiency, Descriptive | Code Available | 1 |
| Tabby: Tabular Data Synthesis with Language Models | Mar 4, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| MX-Font++: Mixture of Heterogeneous Aggregation Experts for Few-shot Font Generation | Mar 4, 2025 | Font Generation, Mixture-of-Experts | Code Available | 1 |
| Union of Experts: Adapting Hierarchical Routing to Equivalently Decomposed Transformer | Mar 4, 2025 | Computational Efficiency, Mixture-of-Experts | Code Available | 0 |
| How Do Consumers Really Choose: Exposing Hidden Preferences with the Mixture of Experts Model | Mar 3, 2025 | Decision Making, Demand Forecasting | Unverified | 0 |
| Unify and Anchor: A Context-Aware Transformer for Cross-Domain Time Series Forecasting | Mar 3, 2025 | Domain Generalization, Mixture-of-Experts | Unverified | 0 |
| ECG-EmotionNet: Nested Mixture of Expert (NMoE) Adaptation of ECG-Foundation Model for Driver Emotion Recognition | Mar 3, 2025 | Autonomous Driving, Computational Efficiency | Unverified | 0 |
| DeRS: Towards Extremely Efficient Upcycled Mixture-of-Experts Models | Mar 3, 2025 | Mixture-of-Experts, Quantization | Unverified | 0 |
| PROPER: A Progressive Learning Framework for Personalized Large Language Models with Group-Level Adaptation | Mar 3, 2025 | Mixture-of-Experts, parameter-efficient fine-tuning | Unverified | 0 |
| Explainable Classifier for Malignant Lymphoma Subtyping via Cell Graph and Image Fusion | Mar 2, 2025 | Mixture-of-Experts, whole slide images | Unverified | 0 |
| CL-MoE: Enhancing Multimodal Large Language Model with Dual Momentum Mixture-of-Experts for Continual Visual Question Answering | Mar 1, 2025 | Continual Learning, Language Modeling | Unverified | 0 |
| CoSMoEs: Compact Sparse Mixture of Experts | Feb 28, 2025 | Mixture-of-Experts | Unverified | 0 |
| Mixture of Experts-augmented Deep Unfolding for Activity Detection in IRS-aided Systems | Feb 27, 2025 | Action Detection, Activity Detection | Unverified | 0 |
| UniCodec: Unified Audio Codec with Single Domain-Adaptive Codebook | Feb 27, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts | Feb 27, 2025 | Mixture-of-Experts | Code Available | 1 |
| Comet: Fine-grained Computation-communication Overlapping for Mixture-of-Experts | Feb 27, 2025 | Computational Efficiency, GPU | Code Available | 5 |
| Mixture of Experts for Recognizing Depression from Interview and Reading Tasks | Feb 27, 2025 | Mixture-of-Experts | Unverified | 0 |
| Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization | Feb 26, 2025 | Mixture-of-Experts | Unverified | 0 |