SOTAVerified

Mixture-of-Experts

Papers

Showing 651–675 of 1312 papers

| Title | Status | Hype |
| --- | --- | --- |
| SC-MoE: Switch Conformer Mixture of Experts for Unified Streaming and Non-streaming Code-Switching ASR | | 0 |
| Mixture of Experts in a Mixture of RL settings | | 0 |
| MoESD: Mixture of Experts Stable Diffusion to Mitigate Gender Bias | | 0 |
| Peirce in the Machine: How Mixture of Experts Models Perform Hypothesis Construction | Code | 0 |
| Theory on Mixture-of-Experts in Continual Learning | | 0 |
| LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training | Code | 5 |
| OTCE: Hybrid SSM and Attention with Cross Domain Mixture of Experts to construct Observer-Thinker-Conceiver-Expresser | Code | 0 |
| SimSMoE: Solving Representational Collapse via Similarity Measure | | 0 |
| Low-Rank Mixture-of-Experts for Continual Medical Image Segmentation | | 0 |
| AdaMoE: Token-Adaptive Routing with Null Experts for Mixture-of-Experts Language Models | Code | 1 |
| P-Tailor: Customizing Personality Traits for Language Models via Mixture of Specialized LoRA Experts | | 0 |
| GW-MoE: Resolving Uncertainty in MoE Router with Global Workspace Theory | Code | 0 |
| Variational Distillation of Diffusion Policies into Mixture of Experts | | 0 |
| Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts | Code | 5 |
| Not Eliminate but Aggregate: Post-Hoc Control over Mixture-of-Experts to Address Shortcut Shifts in Natural Language Understanding | Code | 0 |
| Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts | Code | 1 |
| DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence | Code | 9 |
| Graph Knowledge Distillation to Mixture of Experts | Code | 0 |
| MoE-RBench: Towards Building Reliable Language Models with Sparse Mixture-of-Experts | Code | 1 |
| Interpretable Cascading Mixture-of-Experts for Urban Traffic Congestion Prediction | | 0 |
| Towards Efficient Pareto Set Approximation via Mixture of Experts Based Model Fusion | Code | 1 |
| DeepUnifiedMom: Unified Time-series Momentum Portfolio Construction via Multi-Task Learning with Multi-Gate Mixture of Experts | Code | 1 |
| Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark | Code | 1 |
| Turbo Sparse: Achieving LLM SOTA Performance with Minimal Activated Parameters | Code | 9 |
| MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter | Code | 1 |
Page 27 of 53

No leaderboard results yet.