
Mixture-of-Experts

Papers

Showing 201–225 of 1312 papers

| Title | Status | Hype |
| --- | --- | --- |
| MixPHM: Redundancy-Aware Parameter-Efficient Tuning for Low-Resource Visual Question Answering | Code | 1 |
| Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference | Code | 1 |
| Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark | Code | 1 |
| Addressing Confounding Feature Issue for Causal Recommendation | Code | 1 |
| C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing | Code | 1 |
| EWMoE: An effective model for global weather forecasting with mixture-of-experts | Code | 1 |
| MiLo: Efficient Quantized MoE Inference with Mixture of Low-Rank Compensators | Code | 1 |
| Meta-DMoE: Adapting to Domain Shift by Meta-Distillation from Mixture-of-Experts | Code | 1 |
| Enhancing NeRF akin to Enhancing LLMs: Generalizable NeRF Transformer with Mixture-of-View-Experts | Code | 1 |
| HydraSum: Disentangling Stylistic Features in Text Summarization using Multi-Decoder Models | Code | 1 |
| Mimic Embedding via Adaptive Aggregation: Learning Generalizable Person Re-identification | Code | 1 |
| XMoE: Sparse Models with Fine-grained and Adaptive Expert Selection | Code | 1 |
| Emotion-Qwen: Training Hybrid Experts for Unified Emotion and General Vision-Language Understanding | Code | 1 |
| Contrastive Learning and Mixture of Experts Enables Precise Vector Embeddings | Code | 1 |
| Enhancing Fast Feed Forward Networks with Load Balancing and a Master Leaf Node | Code | 1 |
| MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models | Code | 1 |
| Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy | Code | 1 |
| Jakiro: Boosting Speculative Decoding with Decoupled Multi-Head via MoE | Code | 1 |
| Merging Experts into One: Improving Computational Efficiency of Mixture of Experts | Code | 1 |
| Efficient Fine-tuning of Audio Spectrogram Transformers via Soft Mixture of Adapters | Code | 1 |
| Emergent Modularity in Pre-trained Transformers | Code | 1 |
| Merging Multi-Task Models via Weight-Ensembling Mixture of Experts | Code | 1 |
| MiCE: Mixture of Contrastive Experts for Unsupervised Image Clustering | Code | 1 |
| Mixture of Attention Heads: Selecting Attention Heads Per Token | Code | 1 |
| MoCaE: Mixture of Calibrated Experts Significantly Improves Object Detection | Code | 1 |
Page 9 of 53

No leaderboard results yet.