
Mixture-of-Experts

Papers

Showing 301–325 of 1312 papers

Title | Status | Hype
MoHAVE: Mixture of Hierarchical Audio-Visual Experts for Robust Speech Recognition | - | 0
Training Sparse Mixture Of Experts Text Embedding Models | Code | 4
Memory Analysis on the Training Course of DeepSeek Models | - | 0
MoENAS: Mixture-of-Expert based Neural Architecture Search for jointly Accurate, Fair, and Robust Edge Deep Neural Networks | - | 0
MoETuner: Optimized Mixture of Expert Serving with Balanced Expert Placement and Token Routing | - | 0
Jakiro: Boosting Speculative Decoding with Decoupled Multi-Head via MoE | Code | 1
MoEMba: A Mamba-based Mixture of Experts for High-Density EMG-based Hand Gesture Recognition | - | 0
Klotski: Efficient Mixture-of-Expert Inference via Expert-Aware Multi-Batch Pipeline | Code | 0
Mol-MoE: Training Preference-Guided Routers for Molecule Generation | Code | 0
Leveraging Pre-Trained Models for Multimodal Class-Incremental Learning under Adaptive Fusion | - | 0
Towards Foundational Models for Dynamical System Reconstruction: Hierarchical Meta-Learning via Mixture of Experts | - | 0
fMoE: Fine-Grained Expert Offloading for Large Mixture-of-Experts Serving | - | 0
Joint MoE Scaling Laws: Mixture of Experts Can Be Memory Efficient | - | 0
Mixture of neural operator experts for learning boundary conditions and model selection | - | 0
CMoE: Fast Carving of Mixture-of-Experts for Efficient LLM Inference | Code | 1
Optimizing Robustness and Accuracy in Mixture of Experts: A Dual-Model Approach | - | 0
ReGNet: Reciprocal Space-Aware Long-Range Modeling for Crystalline Property Prediction | - | 0
M2R2: Mixture of Multi-Rate Residuals for Efficient Transformer Inference | - | 0
Brief analysis of DeepSeek R1 and it's implications for Generative AI | - | 0
CLIP-UP: A Simple and Efficient Mixture-of-Experts CLIP Training Recipe with Sparse Upcycling | - | 0
MJ-VIDEO: Fine-Grained Benchmarking and Rewarding Video Preferences in Video Generation | - | 0
MergeME: Model Merging Techniques for Homogeneous and Heterogeneous MoEs | - | 0
UniGraph2: Learning a Unified Embedding Space to Bind Multimodal Graphs | Code | 1
Distribution-aware Fairness Learning in Medical Image Segmentation From A Control-Theoretic Perspective | Code | 0
Sigmoid Self-Attention has Lower Sample Complexity than Softmax Self-Attention: A Mixture-of-Experts Perspective | - | 0
Page 13 of 53
