
Mixture-of-Experts

Papers

Showing 401-450 of 1,312 papers

Title | Status | Hype
CoLA: Collaborative Low-Rank Adaptation | Code | 0
MoRE-Brain: Routed Mixture of Experts for Interpretable and Generalizable Cross-Subject fMRI Visual Decoding | Code | 0
Not All Models Suit Expert Offloading: On Local Routing Consistency of Mixture-of-Expert Models | Code | 0
Efficient Data Driven Mixture-of-Expert Extraction from Trained Networks | – | 0
Hunyuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought | – | 0
Towards Rehearsal-Free Continual Relation Extraction: Capturing Within-Task Variance with Adaptive Prompting | Code | 0
Multimodal Cultural Safety: Evaluation Frameworks and Alignment Strategies | Code | 0
FuxiMT: Sparsifying Large Language Models for Chinese-Centric Multilingual Machine Translation | – | 0
StPR: Spatiotemporal Preservation and Routing for Exemplar-Free Video Class-Incremental Learning | – | 0
Multimodal Mixture of Low-Rank Experts for Sentiment Analysis and Emotion Recognition | – | 0
THOR-MoE: Hierarchical Task-Guided and Context-Responsive Routing for Neural Machine Translation | – | 0
EfficientLLM: Efficiency in Large Language Models | – | 0
Two Experts Are All You Need for Steering Thinking: Reinforcing Cognitive Effort in MoE Reasoning Models Without Additional Training | – | 0
Balanced and Elastic End-to-end Training of Dynamic LLMs | – | 0
Scaling and Enhancing LLM-based AVSR: A Sparse Mixture of Projectors Approach | – | 0
CompeteSMoE -- Statistically Guaranteed Mixture of Experts Training via Competition | Code | 0
Model Selection for Gaussian-gated Gaussian Mixture of Experts Using Dendrograms of Mixing Measures | – | 0
Seeing the Unseen: How EMoE Unveils Bias in Text-to-Image Diffusion Models | – | 0
True Zero-Shot Inference of Dynamical Systems Preserving Long-Term Statistics | – | 0
MINGLE: Mixtures of Null-Space Gated Low-Rank Experts for Test-Time Continual Model Merging | – | 0
Improving Coverage in Combined Prediction Sets with Weighted p-values | – | 0
Multi-modal Collaborative Optimization and Expansion Network for Event-assisted Single-eye Expression Recognition | Code | 0
Model Merging in Pre-training of Large Language Models | – | 0
A Fast Kernel-based Conditional Independence test with Application to Causal Discovery | – | 0
On DeepSeekMoE: Statistical Benefits of Shared Experts and Normalized Sigmoid Gating | – | 0
MegaScale-MoE: Large-Scale Communication-Efficient Training of Mixture-of-Experts Models in Production | – | 0
MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems | – | 0
Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures | – | 0
PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts into Prompt Tuning | Code | 0
AM-Thinking-v1: Advancing the Frontier of Reasoning at 32B Scale | – | 0
PWC-MoE: Privacy-Aware Wireless Collaborative Mixture of Experts | – | 0
UMoE: Unifying Attention and FFN with Shared Experts | – | 0
FreqMoE: Dynamic Frequency Enhancement for Neural PDE Solvers | – | 0
Seed1.5-VL Technical Report | – | 0
The power of fine-grained experts: Granularity boosts expressivity in Mixture of Experts | – | 0
QoS-Efficient Serving of Multiple Mixture-of-Expert LLMs Using Partial Runtime Reconfiguration | – | 0
FloE: On-the-Fly MoE Inference on Memory-constrained GPU | – | 0
Divide-and-Conquer: Cold-Start Bundle Recommendation via Mixture of Diffusion Experts | – | 0
SToLa: Self-Adaptive Touch-Language Framework with Tactile Commonsense Reasoning in Open-Ended Scenarios | – | 0
LLM-e Guess: Can LLMs Capabilities Advance Without Hardware Progress? | Code | 0
Pangu Ultra MoE: How to Train Your Big MoE on Ascend NPUs | – | 0
Faster MoE LLM Inference for Extremely Large Models | – | 0
3D Gaussian Splatting Data Compression with Mixture of Priors | – | 0
STAR-Rec: Making Peace with Length Variance and Pattern Diversity in Sequential Recommendation | – | 0
Towards Smart Point-and-Shoot Photography | – | 0
Optimizing LLMs for Resource-Constrained Environments: A Survey of Model Compression Techniques | – | 0
Finger Pose Estimation for Under-screen Fingerprint Sensor | Code | 0
Multimodal Deep Learning-Empowered Beam Prediction in Future THz ISAC Systems | – | 0
Perception-Informed Neural Networks: Beyond Physics-Informed Neural Networks | – | 0
CoCoAFusE: Beyond Mixtures of Experts via Model Fusion | – | 0
Page 9 of 27
