SOTAVerified

Mixture-of-Experts

Papers

Showing 1201–1250 of 1312 papers

Title | Status | Hype
Self-Routing Capsule Networks | Code | 0
ASEM: Enhancing Empathy in Chatbot through Attention-based Sentiment and Emotion Modeling | Code | 0
Binary-Integer-Programming Based Algorithm for Expert Load Balancing in Mixture-of-Experts Models | Code | 0
DAOP: Data-Aware Offloading and Predictive Pre-Calculation for Efficient MoE Inference | Code | 0
DA-MoE: Addressing Depth-Sensitivity in Graph-Level Analysis through Mixture of Experts | Code | 0
Union of Experts: Adapting Hierarchical Routing to Equivalently Decomposed Transformer | Code | 0
Growing Transformers: Modular Composition and Layer-wise Expansion on a Frozen Substrate | Code | 0
Sequential Gaussian Processes for Online Learning of Nonstationary Functions | Code | 0
Self-Supervised Multimodal Domino: in Search of Biomarkers for Alzheimer's Disease | Code | 0
OTCE: Hybrid SSM and Attention with Cross Domain Mixture of Experts to construct Observer-Thinker-Conceiver-Expresser | Code | 0
Mixture of Experts Meets Decoupled Message Passing: Towards General and Adaptive Node Classification | Code | 0
Video Relationship Detection Using Mixture of Experts | Code | 0
Graph Knowledge Distillation to Mixture of Experts | Code | 0
Tensor-variate Mixture of Experts for Proportional Myographic Control of a Robotic Hand | Code | 0
Mixture-of-Experts Graph Transformers for Interpretable Particle Collision Detection | Code | 0
Granger-causal Attentive Mixtures of Experts: Learning Important Features with Neural Networks | Code | 0
Adversarial Mixture Of Experts with Category Hierarchy Soft Constraint | Code | 0
A non-asymptotic approach for model selection via penalization in high-dimensional mixture of experts models | Code | 0
Covariate-guided Bayesian mixture model for multivariate time series | Code | 0
Mixture Content Selection for Diverse Sequence Generation | Code | 0
Countering Mainstream Bias via End-to-End Adaptive Local Learning | Code | 0
Co-Supervised Learning: Improving Weak-to-Strong Generalization with Hierarchical Mixture of Experts | Code | 0
MicarVLMoE: A Modern Gated Cross-Aligned Vision-Language Mixture of Experts Model for Medical Image Captioning and Report Generation | Code | 0
MaskMoE: Boosting Token-Level Learning via Routing Mask in Mixture-of-Experts | Code | 0
Peirce in the Machine: How Mixture of Experts Models Perform Hypothesis Construction | Code | 0
Condensing Multilingual Knowledge with Lightweight Language-Specific Modules | Code | 0
Completed Feature Disentanglement Learning for Multimodal MRIs Analysis | Code | 0
Skeleton-Based Human Action Recognition with Noisy Labels | Code | 0
UniRestorer: Universal Image Restoration via Adaptively Estimating Image Degradation at Proper Granularity | Code | 0
Manifold-Preserving Transformers are Effective for Short-Long Range Encoding | Code | 0
GEMINUS: Dual-aware Global and Scene-Adaptive Mixture-of-Experts for End-to-End Autonomous Driving | Code | 0
FT-Shield: A Watermark Against Unauthorized Fine-tuning in Text-to-Image Diffusion Models | Code | 0
From Knowledge to Noise: CTIM-Rover and the Pitfalls of Episodic Memory in Software Engineering Agents | Code | 0
A Gaussian Process-based Streaming Algorithm for Prediction of Time Series With Regimes and Outliers | Code | 0
Anomaly Detection by Recombining Gated Unsupervised Experts | Code | 0
SMOSE: Sparse Mixture of Shallow Experts for Interpretable Reinforcement Learning in Continuous Control Tasks | Code | 0
Finger Pose Estimation for Under-screen Fingerprint Sensor | Code | 0
pFedMoE: Data-Level Personalization with Mixture of Experts for Model-Heterogeneous Personalized Federated Learning | Code | 0
FEDKIM: Adaptive Federated Knowledge Injection into Medical Foundation Models | Code | 0
m2mKD: Module-to-Module Knowledge Distillation for Modular Transformers | Code | 0
BIG-MoE: Bypass Isolated Gating MoE for Generalized Multimodal Face Anti-Spoofing | Code | 0
Fast filtering of non-Gaussian models using Amortized Optimal Transport Maps | Code | 0
A Gated Residual Kolmogorov-Arnold Networks for Mixtures of Experts | Code | 0
Bidirectional Attention as a Mixture of Continuous Word Experts | Code | 0
Universal Simultaneous Machine Translation with Mixture-of-Experts Wait-k Policy | Code | 0
Tight Clusters Make Specialized Experts | Code | 0
CompeteSMoE -- Statistically Guaranteed Mixture of Experts Training via Competition | Code | 0
Two Heads are Better than One: Nested PoE for Robust Defense Against Multi-Backdoors | Code | 0
LLM-e Guess: Can LLMs Capabilities Advance Without Hardware Progress? | Code | 0
FactorLLM: Factorizing Knowledge via Mixture of Experts for Large Language Models | Code | 0
Page 25 of 27

No leaderboard results yet.