SOTAVerified

Mixture-of-Experts Papers

Showing 251–275 of 1312 papers

Title | Status | Hype
Predictable Scale: Part I -- Optimal Hyperparameter Scaling Law in Large Language Model Pretraining |  | 0
A Generalist Cross-Domain Molecular Learning Framework for Structure-Based Drug Discovery |  | 0
Speculative MoE: Communication Efficient Parallel MoE Inference with Speculative Token and Expert Pre-scheduling |  | 0
Question-Aware Gaussian Experts for Audio-Visual Question Answering | Code | 1
BrainNet-MoE: Brain-Inspired Mixture-of-Experts Learning for Neurological Disease Identification |  | 0
VoiceGRPO: Modern MoE Transformers with Group Relative Policy Optimization GRPO for AI Voice Health Care Applications on Voice Pathology Detection | Code | 0
Convergence Rates for Softmax Gating Mixture of Experts |  | 0
Small but Mighty: Enhancing Time Series Forecasting with Lightweight LLMs | Code | 1
Tabby: Tabular Data Synthesis with Language Models |  | 0
MX-Font++: Mixture of Heterogeneous Aggregation Experts for Few-shot Font Generation | Code | 1
Union of Experts: Adapting Hierarchical Routing to Equivalently Decomposed Transformer | Code | 0
How Do Consumers Really Choose: Exposing Hidden Preferences with the Mixture of Experts Model |  | 0
Unify and Anchor: A Context-Aware Transformer for Cross-Domain Time Series Forecasting |  | 0
ECG-EmotionNet: Nested Mixture of Expert (NMoE) Adaptation of ECG-Foundation Model for Driver Emotion Recognition |  | 0
DeRS: Towards Extremely Efficient Upcycled Mixture-of-Experts Models |  | 0
PROPER: A Progressive Learning Framework for Personalized Large Language Models with Group-Level Adaptation |  | 0
Explainable Classifier for Malignant Lymphoma Subtyping via Cell Graph and Image Fusion |  | 0
CL-MoE: Enhancing Multimodal Large Language Model with Dual Momentum Mixture-of-Experts for Continual Visual Question Answering |  | 0
CoSMoEs: Compact Sparse Mixture of Experts |  | 0
Mixture of Experts-augmented Deep Unfolding for Activity Detection in IRS-aided Systems |  | 0
UniCodec: Unified Audio Codec with Single Domain-Adaptive Codebook |  | 0
R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts | Code | 1
Comet: Fine-grained Computation-communication Overlapping for Mixture-of-Experts | Code | 5
Mixture of Experts for Recognizing Depression from Interview and Reading Tasks |  | 0
Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization |  | 0
Page 11 of 53
