Mixture-of-Experts

Papers

Title	Date	Tasks	Status
MegaScale-MoE: Large-Scale Communication-Efficient Training of Mixture-of-Experts Models in Production	May 16, 2025	Mixture-of-Experts	Unverified
MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems	May 16, 2025	Benchmarking, Mixture-of-Experts	Unverified
Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures	May 14, 2025	Computational Efficiency, Mixture-of-Experts	Unverified
PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts into Prompt Tuning	May 14, 2025	Math, Mathematical Problem-Solving	Code Available
AM-Thinking-v1: Advancing the Frontier of Reasoning at 32B Scale	May 13, 2025	Mixture-of-Experts	Unverified
PWC-MoE: Privacy-Aware Wireless Collaborative Mixture of Experts	May 13, 2025	Computational Efficiency, Mixture-of-Experts	Unverified
UMoE: Unifying Attention and FFN with Shared Experts	May 12, 2025	Mixture-of-Experts	Unverified
FreqMoE: Dynamic Frequency Enhancement for Neural PDE Solvers	May 11, 2025	Computational Efficiency, Mixture-of-Experts	Unverified
Seed1.5-VL Technical Report	May 11, 2025	Mixture-of-Experts, Multimodal Reasoning	Unverified
The power of fine-grained experts: Granularity boosts expressivity in Mixture of Experts	May 11, 2025	Mixture-of-Experts	Unverified
QoS-Efficient Serving of Multiple Mixture-of-Expert LLMs Using Partial Runtime Reconfiguration	May 10, 2025	GPU, Mixture-of-Experts	Unverified
FloE: On-the-Fly MoE Inference on Memory-constrained GPU	May 9, 2025	CPU, GPU	Unverified
Divide-and-Conquer: Cold-Start Bundle Recommendation via Mixture of Diffusion Experts	May 8, 2025	Mixture-of-Experts	Unverified
SToLa: Self-Adaptive Touch-Language Framework with Tactile Commonsense Reasoning in Open-Ended Scenarios	May 7, 2025	Diversity, Mixture-of-Experts	Unverified
LLM-e Guess: Can LLMs Capabilities Advance Without Hardware Progress?	May 7, 2025	Large Language Model, Mixture-of-Experts	Code Available
Pangu Ultra MoE: How to Train Your Big MoE on Ascend NPUs	May 7, 2025	Mixture-of-Experts	Unverified
Faster MoE LLM Inference for Extremely Large Models	May 6, 2025	Inference Optimization, Mixture-of-Experts	Unverified
3D Gaussian Splatting Data Compression with Mixture of Priors	May 6, 2025	3DGS, Data Compression	Unverified
STAR-Rec: Making Peace with Length Variance and Pattern Diversity in Sequential Recommendation	May 6, 2025	Diversity, Mixture-of-Experts	Unverified
Towards Smart Point-and-Shoot Photography	May 6, 2025	Mixture-of-Experts, Word Embeddings	Unverified
Optimizing LLMs for Resource-Constrained Environments: A Survey of Model Compression Techniques	May 5, 2025	Knowledge Distillation, Mixture-of-Experts	Unverified
Finger Pose Estimation for Under-screen Fingerprint Sensor	May 5, 2025	Mixture-of-Experts, Pose Estimation	Code Available
Multimodal Deep Learning-Empowered Beam Prediction in Future THz ISAC Systems	May 5, 2025	Beam Prediction, Deep Learning	Unverified
Perception-Informed Neural Networks: Beyond Physics-Informed Neural Networks	May 2, 2025	Mixture-of-Experts	Unverified
CoCoAFusE: Beyond Mixtures of Experts via Model Fusion	May 2, 2025	Mixture-of-Experts, Philosophy	Unverified


No leaderboard results yet.