SOTA Verified

Mixture-of-Experts

Papers

Showing 151–200 of 1312 papers

Title | Status | Hype
StableFusion: Continual Video Retrieval via Frame Adaptation | Code | 1
Samoyeds: Accelerating MoE Models with Structured Sparsity Leveraging Sparse Tensor Cores | Code | 1
Question-Aware Gaussian Experts for Audio-Visual Question Answering | Code | 1
Small but Mighty: Enhancing Time Series Forecasting with Lightweight LLMs | Code | 1
MX-Font++: Mixture of Heterogeneous Aggregation Experts for Few-shot Font Generation | Code | 1
R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts | Code | 1
ChatVLA: Unified Multimodal Understanding and Robot Control with Vision-Language-Action Model | Code | 1
Heterogeneous Mixture of Experts for Remote Sensing Image Super-Resolution | Code | 1
Jakiro: Boosting Speculative Decoding with Decoupled Multi-Head via MoE | Code | 1
CMoE: Fast Carving of Mixture-of-Experts for Efficient LLM Inference | Code | 1
UniGraph2: Learning a Unified Embedding Space to Bind Multimodal Graphs | Code | 1
PM-MOE: Mixture of Experts on Private Model Parameters for Personalized Federated Learning | Code | 1
FreqMoE: Enhancing Time Series Forecasting through Frequency Decomposition Mixture of Experts | Code | 1
Hierarchical Time-Aware Mixture of Experts for Multi-Modal Sequential Recommendation | Code | 1
MoGERNN: An Inductive Traffic Predictor for Unobserved Locations in Dynamic Sensing Networks | Code | 1
Modality Interactive Mixture-of-Experts for Fake News Detection | Code | 1
Transforming Vision Transformer: Towards Efficient Multi-Task Asynchronous Learning | Code | 1
BrainMAP: Learning Multiple Activation Pathways in Brain Networks | Code | 1
MedCoT: Medical Chain of Thought via Hierarchical Expert | Code | 1
Wonderful Matrices: Combining for a More Efficient and Effective Foundation Model Architecture | Code | 1
SAME: Learning Generic Language-Guided Visual Navigation with State-Adaptive Mixture of Experts | Code | 1
RSUniVLM: A Unified Vision Language Model for Remote Sensing via Granularity-oriented Mixture of Experts | Code | 1
Condense, Don't Just Prune: Enhancing Efficiency and Performance in MoE Layer Pruning | Code | 1
Awaker2.5-VL: Stably Scaling MLLMs with Parameter-Efficient Mixture of Experts | Code | 1
LIBMoE: A Library for comprehensive benchmarking Mixture of Experts in Large Language Models | Code | 1
DMT-HI: MOE-based Hyperbolic Interpretable Deep Manifold Transformation for Unspervised Dimensionality Reduction | Code | 1
Read-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design | Code | 1
LMHaze: Intensity-aware Image Dehazing with a Large-scale Multi-intensity Real Haze Dataset | Code | 1
ST-MoE-BERT: A Spatial-Temporal Mixture-of-Experts Framework for Long-Term Cross-City Mobility Prediction | Code | 1
MomentumSMoE: Integrating Momentum into Sparse Mixture of Experts | Code | 1
GaVaMoE: Gaussian-Variational Gated Mixture of Experts for Explainable Recommendation | Code | 1
AlphaLoRA: Assigning LoRA Experts Based on Layer Training Quality | Code | 1
Mixture of Experts Made Personalized: Federated Prompt Learning for Vision-Language Models | Code | 1
Retraining-Free Merging of Sparse MoE via Hierarchical Clustering | Code | 1
Efficient Dictionary Learning with Switch Sparse Autoencoders | Code | 1
Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild | Code | 1
Searching for Efficient Linear Layers over a Continuous Space of Structured Matrices | Code | 1
A Time Series is Worth Five Experts: Heterogeneous Mixture of Experts for Traffic Flow Prediction | Code | 1
Uni-Med: A Unified Medical Generalist Foundation Model For Multi-Task Learning Via Connector-MoE | Code | 1
LOLA -- An Open-Source Massively Multilingual Large Language Model | Code | 1
M3-Jepa: Multimodal Alignment via Multi-directional MoE based on the JEPA framework | Code | 1
Gradient-free variational learning with conditional mixture networks | Code | 1
Navigating Spatio-Temporal Heterogeneity: A Graph Transformer Approach for Traffic Forecasting | Code | 1
AdapMoE: Adaptive Sensitivity-based Expert Gating and Management for Efficient MoE Inference | Code | 1
Customizing Language Models with Instance-wise LoRA for Sequential Recommendation | Code | 1
Layerwise Recurrent Router for Mixture-of-Experts | Code | 1
AquilaMoE: Efficient Training for MoE Models with Scale-Up and Scale-Out Strategies | Code | 1
MoExtend: Tuning New Experts for Modality and Task Extension | Code | 1
Dynamic Language Group-Based MoE: Enhancing Code-Switching Speech Recognition with Hierarchical Routing | Code | 1
M4: Multi-Proxy Multi-Gate Mixture of Experts Network for Multiple Instance Learning in Histopathology Image Analysis | Code | 1
Page 4 of 27

No leaderboard results yet.