| Title | Date | Tags | Code | Count |
| --- | --- | --- | --- | --- |
| EfficientLLM: Efficiency in Large Language Models | May 20, 2025 | Mixture-of-Experts, Quantization | Unverified | 0 |
| StPR: Spatiotemporal Preservation and Routing for Exemplar-Free Video Class-Incremental Learning | May 20, 2025 | Class-Incremental Learning | Unverified | 0 |
| Towards Rehearsal-Free Continual Relation Extraction: Capturing Within-Task Variance with Adaptive Prompting | May 20, 2025 | Continual Relation Extraction, Mixture-of-Experts | Code Available | 0 |
| U-SAM: An Audio Language Model for Unified Speech, Audio, and Music Understanding | May 20, 2025 | Cross-Modal Alignment, Language Modeling | Code Available | 1 |
| Scaling and Enhancing LLM-based AVSR: A Sparse Mixture of Projectors Approach | May 20, 2025 | Audio-Visual Speech Recognition, Mixture-of-Experts | Unverified | 0 |
| Model Selection for Gaussian-gated Gaussian Mixture of Experts Using Dendrograms of Mixing Measures | May 19, 2025 | Computational Efficiency, Ensemble Learning | Unverified | 0 |
| Occult: Optimizing Collaborative Communication across Experts for Accelerated Parallel MoE Training and Inference | May 19, 2025 | Computational Efficiency, Mixture-of-Experts | Code Available | 1 |
| Seeing the Unseen: How EMoE Unveils Bias in Text-to-Image Diffusion Models | May 19, 2025 | Fairness, Mixture-of-Experts | Unverified | 0 |
| CompeteSMoE -- Statistically Guaranteed Mixture of Experts Training via Competition | May 19, 2025 | Mixture-of-Experts | Code Available | 0 |
| True Zero-Shot Inference of Dynamical Systems Preserving Long-Term Statistics | May 19, 2025 | Mixture-of-Experts, Time Series | Unverified | 0 |
| Model Merging in Pre-training of Large Language Models | May 17, 2025 | Mixture-of-Experts | Unverified | 0 |
| Improving Coverage in Combined Prediction Sets with Weighted p-values | May 17, 2025 | Conformal Prediction, Mixture-of-Experts | Unverified | 0 |
| Multi-modal Collaborative Optimization and Expansion Network for Event-assisted Single-eye Expression Recognition | May 17, 2025 | Deep Attention, Mamba | Code Available | 0 |
| MINGLE: Mixtures of Null-Space Gated Low-Rank Experts for Test-Time Continual Model Merging | May 17, 2025 | Continual Learning, Mixture-of-Experts | Unverified | 0 |
| MegaScale-MoE: Large-Scale Communication-Efficient Training of Mixture-of-Experts Models in Production | May 16, 2025 | Mixture-of-Experts | Unverified | 0 |
| MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems | May 16, 2025 | Benchmarking, Mixture-of-Experts | Unverified | 0 |
| A Fast Kernel-based Conditional Independence Test with Application to Causal Discovery | May 16, 2025 | Causal Discovery, Causal Inference | Unverified | 0 |
| On DeepSeekMoE: Statistical Benefits of Shared Experts and Normalized Sigmoid Gating | May 16, 2025 | Language Modeling | Unverified | 0 |
| Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures | May 14, 2025 | Computational Efficiency, Mixture-of-Experts | Unverified | 0 |
| PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts into Prompt Tuning | May 14, 2025 | Math, Mathematical Problem-Solving | Code Available | 0 |
| PWC-MoE: Privacy-Aware Wireless Collaborative Mixture of Experts | May 13, 2025 | Computational Efficiency, Mixture-of-Experts | Unverified | 0 |
| AM-Thinking-v1: Advancing the Frontier of Reasoning at 32B Scale | May 13, 2025 | Mixture-of-Experts | Unverified | 0 |
| UMoE: Unifying Attention and FFN with Shared Experts | May 12, 2025 | Mixture-of-Experts | Unverified | 0 |
| Seed1.5-VL Technical Report | May 11, 2025 | Mixture-of-Experts, Multimodal Reasoning | Unverified | 0 |
| FreqMoE: Dynamic Frequency Enhancement for Neural PDE Solvers | May 11, 2025 | Computational Efficiency, Mixture-of-Experts | Unverified | 0 |
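Nearly every entry in the table above is tagged Mixture-of-Experts. For readers new to the topic, the sketch below shows the core mechanism these papers build on or refine: top-k gated expert routing, where a learned router sends each token to a small subset of expert networks. This is a toy, self-contained PyTorch illustration under assumed sizes (`d_model=64`, `n_experts=8`, `k=2`); it is not the routing code of any listed paper, and production systems batch tokens by expert rather than looping as done here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Toy sparse MoE layer: each token is routed to its top-k experts.
    Hypothetical sizes for illustration only."""
    def __init__(self, d_model=64, d_ff=128, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(d_model, n_experts)  # router: token -> expert logits
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                              # x: (tokens, d_model)
        logits = self.gate(x)                          # (tokens, n_experts)
        weights, idx = logits.topk(self.k, dim=-1)     # pick k experts per token
        weights = F.softmax(weights, dim=-1)           # renormalize over chosen k
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):      # dense loop; real systems batch by expert
            mask = (idx == e)                          # (tokens, k): where expert e was chosen
            token_ids, slot = mask.nonzero(as_tuple=True)
            if token_ids.numel():                      # run expert e only on its tokens
                out[token_ids] += (weights[token_ids, slot].unsqueeze(-1)
                                   * expert(x[token_ids]))
        return out

x = torch.randn(16, 64)
print(TopKMoE()(x).shape)  # torch.Size([16, 64])
```

Because only `k` of the `n_experts` feed-forward blocks run per token, parameter count scales with the number of experts while per-token compute stays roughly constant; the communication, load-balancing, and gating-statistics questions raised by that design are what papers such as Occult, MegaScale-MoE, and the DeepSeekMoE gating analysis address.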