SOTAVerified

Mixture-of-Experts

Papers

Showing 301–350 of 1312 papers

| Title | Status | Hype |
|---|---|---|
| Hierarchical Time-Aware Mixture of Experts for Multi-Modal Sequential Recommendation | Code | 1 |
| Merging Multi-Task Models via Weight-Ensembling Mixture of Experts | Code | 1 |
| DirectMultiStep: Direct Route Generation for Multi-Step Retrosynthesis | Code | 1 |
| Heterogeneous Multi-task Learning with Expert Diversity | Code | 1 |
| Image Super-resolution Via Latent Diffusion: A Sampling-space Mixture Of Experts And Frequency-augmented Decoder Approach | Code | 1 |
| GraphMETRO: Mitigating Complex Graph Distribution Shifts via Mixture of Aligned Experts | Code | 1 |
| AutoMoE: Heterogeneous Mixture-of-Experts with Adaptive Computation for Efficient Neural Machine Translation | Code | 1 |
| Efficient and Degradation-Adaptive Network for Real-World Image Super-Resolution | Code | 1 |
| Graph Sparsification via Mixture of Graphs | Code | 1 |
| Go Wider Instead of Deeper | Code | 1 |
| Mixture of Experts Made Personalized: Federated Prompt Learning for Vision-Language Models | Code | 1 |
| Efficient Dictionary Learning with Switch Sparse Autoencoders | Code | 1 |
| Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs | Code | 1 |
| Efficient Fine-tuning of Audio Spectrogram Transformers via Soft Mixture of Adapters | Code | 1 |
| Gradient-free variational learning with conditional mixture networks | Code | 1 |
| Gated Multimodal Units for Information Fusion | Code | 1 |
| GaVaMoE: Gaussian-Variational Gated Mixture of Experts for Explainable Recommendation | Code | 1 |
| EvoMoE: An Evolutional Mixture-of-Experts Training Framework via Dense-To-Sparse Gate | Code | 1 |
| Dense Backpropagation Improves Training for Sparse Mixture-of-Experts | Code | 1 |
| Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark | Code | 1 |
| Heterogeneous Mixture of Experts for Remote Sensing Image Super-Resolution | Code | 1 |
| Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss | Code | 1 |
| LMHaze: Intensity-aware Image Dehazing with a Large-scale Multi-intensity Real Haze Dataset | Code | 1 |
| MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation | Code | 1 |
| Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model | Code | 1 |
| Denoising OCT Images Using Steered Mixture of Experts with Multi-Model Inference | — | 0 |
| Automatic Document Sketching: Generating Drafts from Analogous Texts | — | 0 |
| Demystifying Softmax Gating Function in Gaussian Mixture of Experts | — | 0 |
| Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models | — | 0 |
| Automatically Extracting Information in Medical Dialogue: Expert System And Attention for Labelling | — | 0 |
| A Mixture of Expert Approach for Low-Cost Customization of Deep Neural Networks | — | 0 |
| A Universal Approximation Theorem for Mixture of Experts Models | — | 0 |
| AMEND: A Mixture of Experts Framework for Long-tailed Trajectory Prediction | — | 0 |
| Adaptive Detection of Fast Moving Celestial Objects Using a Mixture of Experts and Physical-Inspired Neural Network | — | 0 |
| FreqMoE: Dynamic Frequency Enhancement for Neural PDE Solvers | — | 0 |
| A Unified Virtual Mixture-of-Experts Framework: Enhanced Inference and Hallucination Mitigation in Single-Model System | — | 0 |
| A Unified Framework for Iris Anti-Spoofing: Introducing IrisGeneral Dataset and Masked-MoE Method | — | 0 |
| A Unified Approach to Universal Prediction: Generalized Upper and Lower Bounds | — | 0 |
| Deep Learning Mixture-of-Experts Approach for Cytotoxic Edema Assessment in Infants and Children | — | 0 |
| Alternating Updates for Efficient Transformers | — | 0 |
| Adaptive Conditional Expert Selection Network for Multi-domain Recommendation | — | 0 |
| Accelerating Mixture-of-Experts Training with Adaptive Expert Replication | — | 0 |
| Deep Gaussian Covariance Network | — | 0 |
| Attention Weighted Mixture of Experts with Contrastive Learning for Personalized Ranking in E-commerce | — | 0 |
| Decoding Knowledge Attribution in Mixture-of-Experts: A Framework of Basic-Refinement Collaboration and Efficiency Analysis | — | 0 |
| Data Expansion using Back Translation and Paraphrasing for Hate Speech Detection | — | 0 |
| A Tree Architecture of LSTM Networks for Sequential Regression with Missing Data | — | 0 |
| Alternating Gradient Descent and Mixture-of-Experts for Integrated Multimodal Perception | — | 0 |
| DA-MoE: Towards Dynamic Expert Allocation for Mixture-of-Experts Models | — | 0 |
| A Fast Kernel-based Conditional Independence test with Application to Causal Discovery | — | 0 |
Page 7 of 27