SOTAVerified

Mixture-of-Experts

Papers

Showing 301–350 of 1312 papers

Title | Status | Hype
DirectMultiStep: Direct Route Generation for Multi-Step Retrosynthesis | Code | 1
Specialized federated learning using a mixture of experts | Code | 1
Parameter-Efficient Mixture-of-Experts Architecture for Pre-trained Language Models | Code | 1
PFL-MoE: Personalized Federated Learning Based on Mixture of Experts | Code | 1
Image Super-resolution Via Latent Diffusion: A Sampling-space Mixture Of Experts And Frequency-augmented Decoder Approach | Code | 1
Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss | Code | 1
Occult: Optimizing Collaborative Communication across Experts for Accelerated Parallel MoE Training and Inference | Code | 1
AutoMoE: Heterogeneous Mixture-of-Experts with Adaptive Computation for Efficient Neural Machine Translation | Code | 1
Exploring Sparse MoE in GANs for Text-conditioned Image Synthesis | Code | 1
Mixture of Experts Made Personalized: Federated Prompt Learning for Vision-Language Models | Code | 1
Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark | Code | 1
Efficient Dictionary Learning with Switch Sparse Autoencoders | Code | 1
Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs | Code | 1
Efficient Fine-tuning of Audio Spectrogram Transformers via Soft Mixture of Adapters | Code | 1
EWMoE: An effective model for global weather forecasting with mixture-of-experts | Code | 1
FineMoGen: Fine-Grained Spatio-Temporal Motion Generation and Editing | Code | 1
EvoMoE: An Evolutional Mixture-of-Experts Training Framework via Dense-To-Sparse Gate | Code | 1
Large Multi-modality Model Assisted AI-Generated Image Quality Assessment | Code | 1
Dense Backpropagation Improves Training for Sparse Mixture-of-Experts | Code | 1
Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference | Code | 1
Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts | Code | 1
PM-MOE: Mixture of Experts on Private Model Parameters for Personalized Federated Learning | Code | 1
Learning Soccer Juggling Skills with Layer-wise Mixture-of-Experts | Code | 1
Spatial Mixture-of-Experts | Code | 1
SPMTrack: Spatio-Temporal Parameter-Efficient Fine-Tuning with Mixture of Experts for Scalable Visual Tracking | Code | 1
Non-Normal Mixtures of Experts | Code | 0
Nesti-Net: Normal Estimation for Unstructured 3D Point Clouds using Convolutional Neural Networks | Code | 0
DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale | Code | 0
Not All Models Suit Expert Offloading: On Local Routing Consistency of Mixture-of-Expert Models | Code | 0
Multi-view Contrastive Learning for Entity Typing over Knowledge Graphs | Code | 0
Multi-Source Domain Adaptation with Mixture of Experts | Code | 0
Not Eliminate but Aggregate: Post-Hoc Control over Mixture-of-Experts to Address Shortcut Shifts in Natural Language Understanding | Code | 0
MoVEInt: Mixture of Variational Experts for Learning Human-Robot Interactions from Demonstrations | Code | 0
Adaptive 3D descattering with a dynamic synthesis network | Code | 0
Multi-modal Collaborative Optimization and Expansion Network for Event-assisted Single-eye Expression Recognition | Code | 0
DAOP: Data-Aware Offloading and Predictive Pre-Calculation for Efficient MoE Inference | Code | 0
Mosaic: Data-Free Knowledge Distillation via Mixture-of-Experts for Heterogeneous Distributed Environments | Code | 0
DA-MoE: Addressing Depth-Sensitivity in Graph-Level Analysis through Mixture of Experts | Code | 0
Multimodal Cultural Safety: Evaluation Frameworks and Alignment Strategies | Code | 0
MoNTA: Accelerating Mixture-of-Experts Training with Network-Traffc-Aware Parallel Optimization | Code | 0
A Bird's-eye View of Reranking: from List Level to Page Level | Code | 0
MOoSE: Multi-Orientation Sharing Experts for Open-set Scene Text Recognition | Code | 0
A Teacher Is Worth A Million Instructions | Code | 0
MoRE-Brain: Routed Mixture of Experts for Interpretable and Generalizable Cross-Subject fMRI Visual Decoding | Code | 0
Mol-MoE: Training Preference-Guided Routers for Molecule Generation | Code | 0
A Survey on Prompt Tuning | Code | 0
Covariate-guided Bayesian mixture model for multivariate time series | Code | 0
More Experts Than Galaxies: Conditionally-overlapping Experts With Biologically-Inspired Fixed Routing | Code | 0
Multimodal Fusion Strategies for Mapping Biophysical Landscape Features | Code | 0
Countering Mainstream Bias via End-to-End Adaptive Local Learning | Code | 0