SOTAVerified

Mixture-of-Experts

Papers

Showing 651–700 of 1312 papers

Title | Status | Hype
SC-MoE: Switch Conformer Mixture of Experts for Unified Streaming and Non-streaming Code-Switching ASR | - | 0
Mixture of Experts in a Mixture of RL settings | - | 0
MoESD: Mixture of Experts Stable Diffusion to Mitigate Gender Bias | - | 0
Peirce in the Machine: How Mixture of Experts Models Perform Hypothesis Construction | Code | 0
Theory on Mixture-of-Experts in Continual Learning | - | 0
LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training | Code | 5
OTCE: Hybrid SSM and Attention with Cross Domain Mixture of Experts to construct Observer-Thinker-Conceiver-Expresser | Code | 0
SimSMoE: Solving Representational Collapse via Similarity Measure | - | 0
Low-Rank Mixture-of-Experts for Continual Medical Image Segmentation | - | 0
AdaMoE: Token-Adaptive Routing with Null Experts for Mixture-of-Experts Language Models | Code | 1
P-Tailor: Customizing Personality Traits for Language Models via Mixture of Specialized LoRA Experts | - | 0
GW-MoE: Resolving Uncertainty in MoE Router with Global Workspace Theory | Code | 0
Variational Distillation of Diffusion Policies into Mixture of Experts | - | 0
Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts | Code | 5
Not Eliminate but Aggregate: Post-Hoc Control over Mixture-of-Experts to Address Shortcut Shifts in Natural Language Understanding | Code | 0
Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts | Code | 1
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence | Code | 9
Graph Knowledge Distillation to Mixture of Experts | Code | 0
MoE-RBench: Towards Building Reliable Language Models with Sparse Mixture-of-Experts | Code | 1
Interpretable Cascading Mixture-of-Experts for Urban Traffic Congestion Prediction | - | 0
Towards Efficient Pareto Set Approximation via Mixture of Experts Based Model Fusion | Code | 1
DeepUnifiedMom: Unified Time-series Momentum Portfolio Construction via Multi-Task Learning with Multi-Gate Mixture of Experts | Code | 1
Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark | Code | 1
Turbo Sparse: Achieving LLM SOTA Performance with Minimal Activated Parameters | Code | 9
MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter | Code | 1
MoE Jetpack: From Dense Checkpoints to Adaptive Mixture of Experts for Vision Tasks | Code | 2
Style Mixture of Experts for Expressive Text-To-Speech Synthesis | - | 0
Continual Traffic Forecasting via Mixture of Experts | - | 0
Node-wise Filtering in Graph Neural Networks: A Mixture of Experts Approach | - | 0
Filtered not Mixed: Stochastic Filtering-Based Online Gating for Mixture of Large Language Models | - | 0
Parrot: Multilingual Visual Instruction Tuning | Code | 5
Demystifying the Compression of Mixture-of-Experts Through a Unified Framework | Code | 2
Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models | Code | 4
Reservoir History Matching of the Norne field with generative exotic priors and a coupled Mixture of Experts -- Physics Informed Neural Operator Forward Model | Code | 3
Optimizing 6G Integrated Sensing and Communications (ISAC) via Expert Networks | - | 0
A Gaussian Process-based Streaming Algorithm for Prediction of Time Series With Regimes and Outliers | Code | 0
Training-efficient density quantum machine learning | - | 0
MEMoE: Enhancing Model Editing with Mixture of Experts Adaptors | - | 0
Learning Mixture-of-Experts for General-Purpose Black-Box Discrete Optimization | Code | 0
MoNDE: Mixture of Near-Data Experts for Large-Scale Sparse Models | - | 0
LoRA-Switch: Boosting the Efficiency of Dynamic LLM Adapters via System-Algorithm Co-design | - | 0
XTrack: Multimodal Training Boosts RGB-X Video Object Trackers | Code | 2
Yuan 2.0-M32: Mixture of Experts with Attention Router | Code | 2
Enhancing Fast Feed Forward Networks with Load Balancing and a Master Leaf Node | Code | 1
A Provably Effective Method for Pruning Experts in Fine-tuned Sparse Mixture-of-Experts | - | 0
Decomposing the Neurons: Activation Sparsity via Mixture of Experts for Continual Test Time Adaptation | Code | 2
MoEUT: Mixture-of-Experts Universal Transformers | Code | 2
Expert-Token Resonance: Redefining MoE Routing through Affinity-Driven Active Selection | - | 0
Revisiting MoE and Dense Speed-Accuracy Comparisons for LLM Training | Code | 7
Statistical Advantages of Perturbing Cosine Router in Mixture of Experts | - | 0
Page 14 of 27

No leaderboard results yet.