| Title | Date | Tags | Code |
| --- | --- | --- | --- |
| Demystifying the Compression of Mixture-of-Experts Through a Unified Framework | Jun 4, 2024 | Mixture-of-Experts | Code Available |
| Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer | Jan 23, 2017 | Computational Efficiency, GPU | Code Available |
| Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation | Mar 18, 2024 | Mixture-of-Experts, Parameter-Efficient Fine-Tuning | Code Available |
| Fast Feedforward Networks | Aug 28, 2023 | Mixture-of-Experts | Code Available |
| Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models | Feb 22, 2024 | All, Mixture-of-Experts | Code Available |
| Decomposing the Neurons: Activation Sparsity via Mixture of Experts for Continual Test Time Adaptation | May 26, 2024 | Feature Selection, Mixture-of-Experts | Code Available |
| Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment | Feb 24, 2025 | Image Classification | Code Available |
| MC-MoE: Mixture Compressor for Mixture-of-Experts LLMs Gains More | Oct 8, 2024 | Mixture-of-Experts, Quantization | Code Available |
| A Closer Look into Mixture-of-Experts in Large Language Models | Jun 26, 2024 | Computational Efficiency, Diversity | Code Available |
| Delta Decompression for MoE-based LLMs Compression | Feb 24, 2025 | Diversity, Mixture-of-Experts | Code Available |
| MDFEND: Multi-domain Fake News Detection | Jan 4, 2022 | Fake News Detection, Mixture-of-Experts | Code Available |
| MiniDrive: More Efficient Vision-Language Models with Multi-Level 2D Features as Text Tokens for Autonomous Driving | Sep 11, 2024 | Autonomous Driving, Feature Engineering | Code Available |
| Linear-MoE: Linear Sequence Modeling Meets Mixture-of-Experts | Mar 7, 2025 | Mixture-of-Experts, State Space Models | Code Available |
| LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training | Nov 24, 2024 | Math, Mixture-of-Experts | Code Available |
| Superposition in Transformers: A Novel Way of Building Mixture of Experts | Dec 31, 2024 | Mixture-of-Experts | Code Available |
| Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts | Mar 14, 2024 | Denoising, Mixture-of-Experts | Code Available |
| Task-Customized Mixture of Adapters for General Image Fusion | Mar 19, 2024 | Mixture-of-Experts | Code Available |
| I2MoE: Interpretable Multimodal Interaction-aware Mixture-of-Experts | May 25, 2025 | Mixture-of-Experts, Multimodal Interaction | Code Available |
| CNMBERT: A Model for Converting Hanyu Pinyin Abbreviations to Chinese Characters | Nov 18, 2024 | Fill Mask | Code Available |
| LiMoE: Mixture of LiDAR Representation Learners from Automotive Scenes | Jan 7, 2025 | Mixture-of-Experts, Representation Learning | Code Available |
| DeMo: Decoupled Feature-Based Mixture of Experts for Multi-Modal Object Re-Identification | Dec 14, 2024 | Mixture-of-Experts, Object | Code Available |
| CLIP-MoE: Towards Building Mixture of Experts for CLIP with Diversified Multiplet Upcycling | Sep 28, 2024 | Image Classification | Code Available |
| CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts | May 9, 2024 | Image Captioning, Instruction Following | Code Available |
| Aurora: Activating Chinese chat capability for Mixtral-8x7B sparse Mixture-of-Experts through Instruction-Tuning | Dec 22, 2023 | Instruction Following, Mixture-of-Experts | Code Available |
| LoRA-IR: Taming Low-Rank Experts for Efficient All-in-One Image Restoration | Oct 20, 2024 | All, Computational Efficiency | Code Available |
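Most entries above build on the sparsely-gated layer introduced in "Outrageously Large Neural Networks" (2017, second row). For orientation, here is a minimal PyTorch sketch of that top-k routing pattern; the class name `SparseMoE`, the two-layer expert MLPs, and all hyperparameters are illustrative assumptions, not taken from any of the listed repositories.

```python
# Minimal sketch of a sparsely-gated Mixture-of-Experts layer
# (top-k routing). Illustrative only; names and shapes are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, d_model: int, d_hidden: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # Each expert is a small feed-forward network.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(), nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        )
        # The gate scores every expert for every token.
        self.gate = nn.Linear(d_model, num_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, d_model)
        scores = self.gate(x)                                # (tokens, experts)
        top_vals, top_idx = scores.topk(self.top_k, dim=-1)  # keep the k best experts per token
        weights = F.softmax(top_vals, dim=-1)                # renormalize over the chosen k
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            idx = top_idx[:, k]
            for e, expert in enumerate(self.experts):
                mask = idx == e                              # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

# Usage: route 16 tokens of width 64 through 8 experts, 2 active per token.
moe = SparseMoE(d_model=64, d_hidden=256, num_experts=8, top_k=2)
y = moe(torch.randn(16, 64))
```

The `top_k=2` default mirrors the two-of-eight routing used by Mixtral-8x7B (the Aurora entry above); production implementations add load-balancing losses, capacity limits, and batched expert dispatch that this sketch omits.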