Mixture-of-Experts

Papers

Title	Date	Tasks	Status
Hunyuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought	May 21, 2025	Chatbot, Instruction Following	Unverified
Not All Models Suit Expert Offloading: On Local Routing Consistency of Mixture-of-Expert Models	May 21, 2025	All, CPU	Code Available
CoLA: Collaborative Low-Rank Adaptation	May 21, 2025	CoLA, Mixture-of-Experts	Code Available
MoRE-Brain: Routed Mixture of Experts for Interpretable and Generalizable Cross-Subject fMRI Visual Decoding	May 21, 2025	Mixture-of-Experts	Code Available
Efficient Data Driven Mixture-of-Expert Extraction from Trained Networks	May 21, 2025	Mixture-of-Experts	Unverified
FuxiMT: Sparsifying Large Language Models for Chinese-Centric Multilingual Machine Translation	May 20, 2025	Language Modeling, Language Modelling	Unverified
Towards Rehearsal-Free Continual Relation Extraction: Capturing Within-Task Variance with Adaptive Prompting	May 20, 2025	Continual Relation Extraction, Mixture-of-Experts	Code Available
Scaling and Enhancing LLM-based AVSR: A Sparse Mixture of Projectors Approach	May 20, 2025	Audio-Visual Speech Recognition, Mixture-of-Experts	Unverified
Multimodal Cultural Safety: Evaluation Frameworks and Alignment Strategies	May 20, 2025	Mixture-of-Experts	Code Available
THOR-MoE: Hierarchical Task-Guided and Context-Responsive Routing for Neural Machine Translation	May 20, 2025	Machine Translation, Mixture-of-Experts	Unverified
StPR: Spatiotemporal Preservation and Routing for Exemplar-Free Video Class-Incremental Learning	May 20, 2025	class-incremental learning, Class Incremental Learning	Unverified
Multimodal Mixture of Low-Rank Experts for Sentiment Analysis and Emotion Recognition	May 20, 2025	Emotion Recognition, Mixture-of-Experts	Unverified
Two Experts Are All You Need for Steering Thinking: Reinforcing Cognitive Effort in MoE Reasoning Models Without Additional Training	May 20, 2025	All, Domain Generalization	Unverified
EfficientLLM: Efficiency in Large Language Models	May 20, 2025	Mixture-of-Experts, Quantization	Unverified
Balanced and Elastic End-to-end Training of Dynamic LLMs	May 20, 2025	GPU, Mixture-of-Experts	Unverified
True Zero-Shot Inference of Dynamical Systems Preserving Long-Term Statistics	May 19, 2025	Mixture-of-Experts, Time Series	Unverified
CompeteSMoE -- Statistically Guaranteed Mixture of Experts Training via Competition	May 19, 2025	Mixture-of-Experts	Code Available
Model Selection for Gaussian-gated Gaussian Mixture of Experts Using Dendrograms of Mixing Measures	May 19, 2025	Computational Efficiency, Ensemble Learning	Unverified
Seeing the Unseen: How EMoE Unveils Bias in Text-to-Image Diffusion Models	May 19, 2025	Fairness, Mixture-of-Experts	Unverified
Multi-modal Collaborative Optimization and Expansion Network for Event-assisted Single-eye Expression Recognition	May 17, 2025	Deep Attention, Mamba	Code Available
MINGLE: Mixtures of Null-Space Gated Low-Rank Experts for Test-Time Continual Model Merging	May 17, 2025	Continual Learning, Mixture-of-Experts	Unverified
Model Merging in Pre-training of Large Language Models	May 17, 2025	Mixture-of-Experts	Unverified
Improving Coverage in Combined Prediction Sets with Weighted p-values	May 17, 2025	Conformal Prediction, Mixture-of-Experts	Unverified
A Fast Kernel-based Conditional Independence test with Application to Causal Discovery	May 16, 2025	Causal Discovery, Causal Inference	Unverified
On DeepSeekMoE: Statistical Benefits of Shared Experts and Normalized Sigmoid Gating	May 16, 2025	Language Modeling, Language Modelling	Unverified

