SOTAVerified

Mixture-of-Experts

Papers

Showing 251–300 of 1312 papers

| Title | Status | Hype |
| --- | --- | --- |
| Predictable Scale: Part I -- Optimal Hyperparameter Scaling Law in Large Language Model Pretraining | | 0 |
| A Generalist Cross-Domain Molecular Learning Framework for Structure-Based Drug Discovery | | 0 |
| Question-Aware Gaussian Experts for Audio-Visual Question Answering | Code | 1 |
| Speculative MoE: Communication Efficient Parallel MoE Inference with Speculative Token and Expert Pre-scheduling | | 0 |
| BrainNet-MoE: Brain-Inspired Mixture-of-Experts Learning for Neurological Disease Identification | | 0 |
| VoiceGRPO: Modern MoE Transformers with Group Relative Policy Optimization GRPO for AI Voice Health Care Applications on Voice Pathology Detection | Code | 0 |
| Convergence Rates for Softmax Gating Mixture of Experts | | 0 |
| Small but Mighty: Enhancing Time Series Forecasting with Lightweight LLMs | Code | 1 |
| Tabby: Tabular Data Synthesis with Language Models | | 0 |
| MX-Font++: Mixture of Heterogeneous Aggregation Experts for Few-shot Font Generation | Code | 1 |
| Union of Experts: Adapting Hierarchical Routing to Equivalently Decomposed Transformer | Code | 0 |
| How Do Consumers Really Choose: Exposing Hidden Preferences with the Mixture of Experts Model | | 0 |
| Unify and Anchor: A Context-Aware Transformer for Cross-Domain Time Series Forecasting | | 0 |
| ECG-EmotionNet: Nested Mixture of Expert (NMoE) Adaptation of ECG-Foundation Model for Driver Emotion Recognition | | 0 |
| DeRS: Towards Extremely Efficient Upcycled Mixture-of-Experts Models | | 0 |
| PROPER: A Progressive Learning Framework for Personalized Large Language Models with Group-Level Adaptation | | 0 |
| Explainable Classifier for Malignant Lymphoma Subtyping via Cell Graph and Image Fusion | | 0 |
| CL-MoE: Enhancing Multimodal Large Language Model with Dual Momentum Mixture-of-Experts for Continual Visual Question Answering | | 0 |
| CoSMoEs: Compact Sparse Mixture of Experts | | 0 |
| Mixture of Experts-augmented Deep Unfolding for Activity Detection in IRS-aided Systems | | 0 |
| R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts | Code | 1 |
| UniCodec: Unified Audio Codec with Single Domain-Adaptive Codebook | | 0 |
| Comet: Fine-grained Computation-communication Overlapping for Mixture-of-Experts | Code | 5 |
| Mixture of Experts for Recognizing Depression from Interview and Reading Tasks | | 0 |
| Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization | | 0 |
| OneRec: Unifying Retrieve and Rank with Generative Recommender and Iterative Preference Alignment | | 0 |
| Delta Decompression for MoE-based LLMs Compression | Code | 2 |
| The Empirical Impact of Reducing Symmetries on the Performance of Deep Ensembles and MoE | | 0 |
| ENACT-Heart -- ENsemble-based Assessment Using CNN and Transformer on Heart Sounds | | 0 |
| BigMac: A Communication-Efficient Mixture-of-Experts Model Structure for Fast Training and Inference | | 0 |
| Evaluating Expert Contributions in a MoE LLM for Quiz-Based Tasks | | 0 |
| Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment | Code | 2 |
| An Autonomous Network Orchestration Framework Integrating Large Language Models with Continual Reinforcement Learning | | 0 |
| Binary-Integer-Programming Based Algorithm for Expert Load Balancing in Mixture-of-Experts Models | Code | 0 |
| Tight Clusters Make Specialized Experts | Code | 0 |
| Ray-Tracing for Conditionally Activated Neural Networks | | 0 |
| ChatVLA: Unified Multimodal Understanding and Robot Control with Vision-Language-Action Model | Code | 1 |
| Unraveling the Localized Latents: Learning Stratified Manifold Structures in LLM Embedding Space with Sparse Mixture-of-Experts | | 0 |
| DSMoE: Matrix-Partitioned Experts with Dynamic Routing for Computation-Efficient Dense LLMs | | 0 |
| Every Expert Matters: Towards Effective Knowledge Distillation for Mixture-of-Experts Language Models | | 0 |
| MoBA: Mixture of Block Attention for Long-Context LLMs | Code | 7 |
| Fate: Fast Edge Inference of Mixture-of-Experts Models via Cross-Layer Gate | Code | 0 |
| Connector-S: A Survey of Connectors in Multi-modal Large Language Models | | 0 |
| How to Upscale Neural Networks with Scaling Law? A Survey and Practical Guidelines | | 0 |
| ClimateLLM: Efficient Weather Forecasting via Frequency-Aware Large Language Models | | 0 |
| Mixture of Tunable Experts - Behavior Modification of DeepSeek-R1 at Inference Time | | 0 |
| Probing Semantic Routing in Large Mixture-of-Expert Models | | 0 |
| Eidetic Learning: an Efficient and Provable Solution to Catastrophic Forgetting | Code | 0 |
| Heterogeneous Mixture of Experts for Remote Sensing Image Super-Resolution | Code | 1 |
| Mixture of Decoupled Message Passing Experts with Entropy Constraint for General Node Classification | | 0 |
Page 6 of 27

No leaderboard results yet.