SOTAVerified

Mixture-of-Experts

Papers

Showing 101–125 of 1312 papers

Title | Status | Hype
Demystifying the Compression of Mixture-of-Experts Through a Unified Framework | Code | 2
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer | Code | 2
Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation | Code | 2
Fast Feedforward Networks | Code | 2
Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models | Code | 2
Decomposing the Neurons: Activation Sparsity via Mixture of Experts for Continual Test Time Adaptation | Code | 2
Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment | Code | 2
MC-MoE: Mixture Compressor for Mixture-of-Experts LLMs Gains More | Code | 2
A Closer Look into Mixture-of-Experts in Large Language Models | Code | 2
Delta Decompression for MoE-based LLMs Compression | Code | 2
MDFEND: Multi-domain Fake News Detection | Code | 2
MiniDrive: More Efficient Vision-Language Models with Multi-Level 2D Features as Text Tokens for Autonomous Driving | Code | 2
Linear-MoE: Linear Sequence Modeling Meets Mixture-of-Experts | Code | 2
LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training | Code | 2
Superposition in Transformers: A Novel Way of Building Mixture of Experts | Code | 2
Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts | Code | 2
Task-Customized Mixture of Adapters for General Image Fusion | Code | 2
I2MoE: Interpretable Multimodal Interaction-aware Mixture-of-Experts | Code | 2
CNMBERT: A Model for Converting Hanyu Pinyin Abbreviations to Chinese Characters | Code | 2
LiMoE: Mixture of LiDAR Representation Learners from Automotive Scenes | Code | 2
DeMo: Decoupled Feature-Based Mixture of Experts for Multi-Modal Object Re-Identification | Code | 2
CLIP-MoE: Towards Building Mixture of Experts for CLIP with Diversified Multiplet Upcycling | Code | 2
CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts | Code | 2
Aurora: Activating Chinese chat capability for Mixtral-8x7B sparse Mixture-of-Experts through Instruction-Tuning | Code | 2
LoRA-IR: Taming Low-Rank Experts for Efficient All-in-One Image Restoration | Code | 2
Page 5 of 53

No leaderboard results yet.