SOTAVerified

Mixture-of-Experts

Papers

Showing 251-300 of 1312 papers

Title | Status | Hype
Go Wider Instead of Deeper | Code | 1
Gradient-free variational learning with conditional mixture networks | Code | 1
Norface: Improving Facial Expression Analysis by Identity Normalization | Code | 1
BiMediX: Bilingual Medical Mixture of Experts LLM | Code | 1
MLP Fusion: Towards Efficient Fine-tuning of Dense and Mixture-of-Experts Language Models | Code | 1
Occult: Optimizing Collaborative Communication across Experts for Accelerated Parallel MoE Training and Inference | Code | 1
MxMoE: Mixed-precision Quantization for MoE with Accuracy and Performance Co-Design | Code | 1
GaVaMoE: Gaussian-Variational Gated Mixture of Experts for Explainable Recommendation | Code | 1
Navigating Spatio-Temporal Heterogeneity: A Graph Transformer Approach for Traffic Forecasting | Code | 1
Multi-Task Reinforcement Learning with Mixture of Orthogonal Experts | Code | 1
Frequency-Adaptive Pan-Sharpening with Mixture of Experts | Code | 1
Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks | Code | 1
Patch-level Routing in Mixture-of-Experts is Provably Sample-efficient for Convolutional Neural Networks | Code | 1
Multi-Head Mixture-of-Experts | Code | 1
Few-Shot and Continual Learning with Attentive Independent Mechanisms | Code | 1
Multilinear Mixture of Experts: Scalable Expert Specialization through Factorization | Code | 1
MomentumSMoE: Integrating Momentum into Sparse Mixture of Experts | Code | 1
DMoERM: Recipes of Mixture-of-Experts for Effective Reward Modeling | Code | 1
Multimodal Clinical Trial Outcome Prediction with Large Language Models | Code | 1
MoGERNN: An Inductive Traffic Predictor for Unobserved Locations in Dynamic Sensing Networks | Code | 1
Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference | Code | 1
MoExtend: Tuning New Experts for Modality and Task Extension | Code | 1
EWMoE: An effective model for global weather forecasting with mixture-of-experts | Code | 1
Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark | Code | 1
MoËT: Mixture of Expert Trees and its Application to Verifiable Reinforcement Learning | Code | 1
Distribution-aware Forgetting Compensation for Exemplar-Free Lifelong Person Re-identification | Code | 1
Multimodal Variational Autoencoders for Semi-Supervised Learning: In Defense of Product-of-Experts | Code | 1
Exploring Sparse MoE in GANs for Text-conditioned Image Synthesis | Code | 1
XMoE: Sparse Models with Fine-grained and Adaptive Expert Selection | Code | 1
Distilling the Knowledge in a Neural Network | Code | 1
DMT-HI: MOE-based Hyperbolic Interpretable Deep Manifold Transformation for Unsupervised Dimensionality Reduction | Code | 1
Specialized federated learning using a mixture of experts | Code | 1
Emergent Modularity in Pre-trained Transformers | Code | 1
FLAME-MoE: A Transparent End-to-End Research Platform for Mixture-of-Experts Language Models | Code | 1
Awaker2.5-VL: Stably Scaling MLLMs with Parameter-Efficient Mixture of Experts | Code | 1
FineMoGen: Fine-Grained Spatio-Temporal Motion Generation and Editing | Code | 1
Emotion-Qwen: Training Hybrid Experts for Unified Emotion and General Vision-Language Understanding | Code | 1
MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation | Code | 1
DirectMultiStep: Direct Route Generation for Multi-Step Retrosynthesis | Code | 1
Enhancing Fast Feed Forward Networks with Load Balancing and a Master Leaf Node | Code | 1
MoEDiff-SR: Mixture of Experts-Guided Diffusion Model for Region-Adaptive MRI Super-Resolution | Code | 1
Gated Multimodal Units for Information Fusion | Code | 1
MX-Font++: Mixture of Heterogeneous Aggregation Experts for Few-shot Font Generation | Code | 1
Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts | Code | 1
Dynamic Language Group-Based MoE: Enhancing Code-Switching Speech Recognition with Hierarchical Routing | Code | 1
Efficient Dictionary Learning with Switch Sparse Autoencoders | Code | 1
AutoMoE: Heterogeneous Mixture-of-Experts with Adaptive Computation for Efficient Neural Machine Translation | Code | 1
MoCaE: Mixture of Calibrated Experts Significantly Improves Object Detection | Code | 1
Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs | Code | 1
Modality Interactive Mixture-of-Experts for Fake News Detection | Code | 1
Page 6 of 27
