SOTAVerified

Mixture-of-Experts

Papers

Showing 801–850 of 1312 papers

Title | Status | Hype
A Survey on Model MoErging: Recycling and Routing Among Specialized Experts for Collaborative Learning | | 0
HoME: Hierarchy of Multi-Gate Experts for Multi-Task Learning at Kuaishou | | 0
LaDiMo: Layer-wise Distillation Inspired MoEfier | | 0
Understanding the Performance and Estimating the Cost of LLM Fine-Tuning | Code | 0
MoC-System: Efficient Fault Tolerance for Sparse Mixture-of-Experts Model Training | | 0
Mixture-of-Noises Enhanced Forgery-Aware Predictor for Multi-Face Manipulation Detection and Localization | | 0
HMDN: Hierarchical Multi-Distribution Network for Click-Through Rate Prediction | | 0
Multimodal Fusion and Coherence Modeling for Video Topic Segmentation | | 0
MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts | | 0
PMoE: Progressive Mixture of Experts with Asymmetric Transformer for Continual Learning | | 0
Distribution Learning for Molecular Regression | | 0
Time series forecasting with high stakes: A field study of the air cargo industry | | 0
Mixture of Nested Experts: Adaptive Processing of Visual Tokens | Code | 0
Mixture of Modular Experts: Distilling Knowledge from a Multilingual Teacher into Specialized Modular Language Models | Code | 0
MOoSE: Multi-Orientation Sharing Experts for Open-set Scene Text Recognition | Code | 0
Wolf: Captioning Everything with a World Summarization Framework | | 0
How Lightweight Can A Vision Transformer Be | | 0
Exploring Domain Robust Lightweight Reward Models based on Router Mechanism | | 0
Wonderful Matrices: More Efficient and Effective Architecture for Language Modeling Tasks | | 0
EEGMamba: Bidirectional State Space Model with Mixture of Experts for EEG Multi-task Classification | | 0
EVLM: An Efficient Vision-Language Model for Visual Understanding | | 0
Mixture of Experts with Mixture of Precisions for Tuning Quality of Service | | 0
Mixture of Experts based Multi-task Supervise Learning from Crowds | | 0
Discussion: Effective and Interpretable Outcome Prediction by Training Sparse Mixtures of Linear Experts | | 0
MoE-DiffIR: Task-customized Diffusion Priors for Universal Compressed Image Restoration | | 0
Boost Your NeRF: A Model-Agnostic Mixture of Experts Framework for High Quality and Efficient Rendering | | 0
MaskMoE: Boosting Token-Level Learning via Routing Mask in Mixture-of-Experts | Code | 0
Diversifying the Expert Knowledge for Task-Agnostic Pruning in Sparse Mixture-of-Experts | | 0
An Unsupervised Domain Adaptation Method for Locating Manipulated Region in partially fake Audio | | 0
MoVEInt: Mixture of Variational Experts for Learning Human-Robot Interactions from Demonstrations | Code | 0
A Simple Architecture for Enterprise Large Language Model Applications based on Role based security and Clearance Levels using Retrieval-Augmented Generation or Mixture of Experts | | 0
SAM-Med3D-MoE: Towards a Non-Forgetting Segment Anything Model via Mixture of Experts for 3D Medical Image Segmentation | | 0
Completed Feature Disentanglement Learning for Multimodal MRIs Analysis | Code | 0
MobileFlow: A Multimodal LLM For Mobile GUI Agent | | 0
Lazarus: Resilient and Elastic Training of Mixture-of-Experts Models with Adaptive Expert Placement | | 0
Terminating Differentiable Tree Experts | | 0
Investigating the potential of Sparse Mixtures-of-Experts for multi-domain neural machine translation | | 0
Sparse Diffusion Policy: A Sparse, Reusable, and Flexible Policy for Robot Learning | | 0
LEMoE: Advanced Mixture of Experts Adaptor for Lifelong Model Editing of Large Language Models | | 0
A Teacher Is Worth A Million Instructions | Code | 0
Towards Personalized Federated Multi-Scenario Multi-Task Recommendation | | 0
SC-MoE: Switch Conformer Mixture of Experts for Unified Streaming and Non-streaming Code-Switching ASR | | 0
Mixture of Experts in a Mixture of RL settings | | 0
MoESD: Mixture of Experts Stable Diffusion to Mitigate Gender Bias | | 0
Peirce in the Machine: How Mixture of Experts Models Perform Hypothesis Construction | Code | 0
OTCE: Hybrid SSM and Attention with Cross Domain Mixture of Experts to construct Observer-Thinker-Conceiver-Expresser | Code | 0
Theory on Mixture-of-Experts in Continual Learning | | 0
SimSMoE: Solving Representational Collapse via Similarity Measure | | 0
Low-Rank Mixture-of-Experts for Continual Medical Image Segmentation | | 0
P-Tailor: Customizing Personality Traits for Language Models via Mixture of Specialized LoRA Experts | | 0
Page 17 of 27

No leaderboard results yet.