| Title | Date | Tags | Code | # |
|---|---|---|---|---|
| EfficientLLM: Efficiency in Large Language Models | May 20, 2025 | Mixture-of-Experts, Quantization | Unverified | 0 |
| Towards Rehearsal-Free Continual Relation Extraction: Capturing Within-Task Variance with Adaptive Prompting | May 20, 2025 | Continual Relation Extraction, Mixture-of-Experts | Code Available | 0 |
| THOR-MoE: Hierarchical Task-Guided and Context-Responsive Routing for Neural Machine Translation | May 20, 2025 | Machine Translation, Mixture-of-Experts | Unverified | 0 |
| Scaling and Enhancing LLM-based AVSR: A Sparse Mixture of Projectors Approach | May 20, 2025 | Audio-Visual Speech Recognition, Mixture-of-Experts | Unverified | 0 |
| U-SAM: An audio language Model for Unified Speech, Audio, and Music Understanding | May 20, 2025 | Cross-Modal Alignment, Language Modeling | Code Available | 1 |
| Occult: Optimizing Collaborative Communication across Experts for Accelerated Parallel MoE Training and Inference | May 19, 2025 | Computational Efficiency, Mixture-of-Experts | Code Available | 1 |
| Model Selection for Gaussian-gated Gaussian Mixture of Experts Using Dendrograms of Mixing Measures | May 19, 2025 | Computational Efficiency, Ensemble Learning | Unverified | 0 |
| True Zero-Shot Inference of Dynamical Systems Preserving Long-Term Statistics | May 19, 2025 | Mixture-of-Experts, Time Series | Unverified | 0 |
| Seeing the Unseen: How EMoE Unveils Bias in Text-to-Image Diffusion Models | May 19, 2025 | Fairness, Mixture-of-Experts | Unverified | 0 |
| CompeteSMoE -- Statistically Guaranteed Mixture of Experts Training via Competition | May 19, 2025 | Mixture-of-Experts | Code Available | 0 |
| Model Merging in Pre-training of Large Language Models | May 17, 2025 | Mixture-of-Experts | Unverified | 0 |
| Improving Coverage in Combined Prediction Sets with Weighted p-values | May 17, 2025 | Conformal Prediction, Mixture-of-Experts | Unverified | 0 |
| MINGLE: Mixtures of Null-Space Gated Low-Rank Experts for Test-Time Continual Model Merging | May 17, 2025 | Continual Learning, Mixture-of-Experts | Unverified | 0 |
| Multi-modal Collaborative Optimization and Expansion Network for Event-assisted Single-eye Expression Recognition | May 17, 2025 | Deep Attention, Mamba | Code Available | 0 |
| MegaScale-MoE: Large-Scale Communication-Efficient Training of Mixture-of-Experts Models in Production | May 16, 2025 | Mixture-of-Experts | Unverified | 0 |
| MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems | May 16, 2025 | Benchmarking, Mixture-of-Experts | Unverified | 0 |
| A Fast Kernel-based Conditional Independence test with Application to Causal Discovery | May 16, 2025 | Causal Discovery, Causal Inference | Unverified | 0 |
| On DeepSeekMoE: Statistical Benefits of Shared Experts and Normalized Sigmoid Gating | May 16, 2025 | Language Modeling | Unverified | 0 |
| PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts into Prompt Tuning | May 14, 2025 | Math, Mathematical Problem-Solving | Code Available | 0 |
| Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures | May 14, 2025 | Computational Efficiency, Mixture-of-Experts | Unverified | 0 |
| AM-Thinking-v1: Advancing the Frontier of Reasoning at 32B Scale | May 13, 2025 | Mixture-of-Experts | Unverified | 0 |
| PWC-MoE: Privacy-Aware Wireless Collaborative Mixture of Experts | May 13, 2025 | Computational Efficiency, Mixture-of-Experts | Unverified | 0 |
| UMoE: Unifying Attention and FFN with Shared Experts | May 12, 2025 | Mixture-of-Experts | Unverified | 0 |
| FreqMoE: Dynamic Frequency Enhancement for Neural PDE Solvers | May 11, 2025 | Computational Efficiency, Mixture-of-Experts | Unverified | 0 |
| The power of fine-grained experts: Granularity boosts expressivity in Mixture of Experts | May 11, 2025 | Mixture-of-Experts | Unverified | 0 |
| Seed1.5-VL Technical Report | May 11, 2025 | Mixture-of-Experts, Multimodal Reasoning | Unverified | 0 |
| QoS-Efficient Serving of Multiple Mixture-of-Expert LLMs Using Partial Runtime Reconfiguration | May 10, 2025 | GPU, Mixture-of-Experts | Unverified | 0 |
| Emotion-Qwen: Training Hybrid Experts for Unified Emotion and General Vision-Language Understanding | May 10, 2025 | Descriptive, Emotion Recognition | Code Available | 1 |
| Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free | May 10, 2025 | Attribute, Mixture-of-Experts | Code Available | 4 |
| FloE: On-the-Fly MoE Inference on Memory-constrained GPU | May 9, 2025 | CPU, GPU | Unverified | 0 |
| MxMoE: Mixed-precision Quantization for MoE with Accuracy and Performance Co-Design | May 9, 2025 | Mixture-of-Experts, Quantization | Code Available | 1 |
| Divide-and-Conquer: Cold-Start Bundle Recommendation via Mixture of Diffusion Experts | May 8, 2025 | Mixture-of-Experts | Unverified | 0 |
| Pangu Ultra MoE: How to Train Your Big MoE on Ascend NPUs | May 7, 2025 | Mixture-of-Experts | Unverified | 0 |
| SToLa: Self-Adaptive Touch-Language Framework with Tactile Commonsense Reasoning in Open-Ended Scenarios | May 7, 2025 | Diversity, Mixture-of-Experts | Unverified | 0 |
| LLM-e Guess: Can LLMs Capabilities Advance Without Hardware Progress? | May 7, 2025 | Large Language Model, Mixture-of-Experts | Code Available | 0 |
| STAR-Rec: Making Peace with Length Variance and Pattern Diversity in Sequential Recommendation | May 6, 2025 | Diversity, Mixture-of-Experts | Unverified | 0 |
| Faster MoE LLM Inference for Extremely Large Models | May 6, 2025 | Inference Optimization, Mixture-of-Experts | Unverified | 0 |
| 3D Gaussian Splatting Data Compression with Mixture of Priors | May 6, 2025 | 3DGS, Data Compression | Unverified | 0 |
| Towards Smart Point-and-Shoot Photography | May 6, 2025 | Mixture-of-Experts, Word Embeddings | Unverified | 0 |
| Multimodal Deep Learning-Empowered Beam Prediction in Future THz ISAC Systems | May 5, 2025 | Beam Prediction, Deep Learning | Unverified | 0 |
| Optimizing LLMs for Resource-Constrained Environments: A Survey of Model Compression Techniques | May 5, 2025 | Knowledge Distillation, Mixture-of-Experts | Unverified | 0 |
| Finger Pose Estimation for Under-screen Fingerprint Sensor | May 5, 2025 | Mixture-of-Experts, Pose Estimation | Code Available | 0 |
| Learning Heterogeneous Mixture of Scene Experts for Large-scale Neural Radiance Fields | May 4, 2025 | Mixture-of-Experts, NeRF | Code Available | 3 |
| Perception-Informed Neural Networks: Beyond Physics-Informed Neural Networks | May 2, 2025 | Mixture-of-Experts | Unverified | 0 |
| CoCoAFusE: Beyond Mixtures of Experts via Model Fusion | May 2, 2025 | Mixture-of-Experts, Philosophy | Unverified | 0 |
| CICADA: Cross-Domain Interpretable Coding for Anomaly Detection and Adaptation in Multivariate Time Series | May 1, 2025 | Anomaly Detection, Meta-Learning | Unverified | 0 |
| Mixture of Sparse Attention: Content-Based Learnable Sparse Attention via Expert-Choice Routing | May 1, 2025 | Mixture-of-Experts | Code Available | 1 |
| MoxE: Mixture of xLSTM Experts with Entropy-Aware Routing for Efficient Language Modeling | May 1, 2025 | Language Modeling | Unverified | 0 |
| MicarVLMoE: A Modern Gated Cross-Aligned Vision-Language Mixture of Experts Model for Medical Image Captioning and Report Generation | Apr 29, 2025 | Cross-Modal Alignment, Decoder | Code Available | 0 |
| Accelerating Mixture-of-Experts Training with Adaptive Expert Replication | Apr 28, 2025 | GPU, Mixture-of-Experts | Unverified | 0 |