SOTAVerified

Mixture-of-Experts

Papers

Showing 601–650 of 1312 papers

Title | Status | Hype
MoExtend: Tuning New Experts for Modality and Task Extension | Code | 1
Mixture-of-Noises Enhanced Forgery-Aware Predictor for Multi-Face Manipulation Detection and Localization | | 0
HMDN: Hierarchical Multi-Distribution Network for Click-Through Rate Prediction | | 0
Multimodal Fusion and Coherence Modeling for Video Topic Segmentation | | 0
PMoE: Progressive Mixture of Experts with Asymmetric Transformer for Continual Learning | | 0
MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts | | 0
Distribution Learning for Molecular Regression | | 0
Mixture of Nested Experts: Adaptive Processing of Visual Tokens | Code | 0
Time series forecasting with high stakes: A field study of the air cargo industry | | 0
Mixture of Modular Experts: Distilling Knowledge from a Multilingual Teacher into Specialized Modular Language Models | Code | 0
MOoSE: Multi-Orientation Sharing Experts for Open-set Scene Text Recognition | Code | 0
Wolf: Captioning Everything with a World Summarization Framework | | 0
Dynamic Language Group-Based MoE: Enhancing Code-Switching Speech Recognition with Hierarchical Routing | Code | 1
How Lightweight Can A Vision Transformer Be | | 0
Exploring Domain Robust Lightweight Reward Models based on Router Mechanism | | 0
Wonderful Matrices: More Efficient and Effective Architecture for Language Modeling Tasks | | 0
M4: Multi-Proxy Multi-Gate Mixture of Experts Network for Multiple Instance Learning in Histopathology Image Analysis | Code | 1
Stretching Each Dollar: Diffusion Training from Scratch on a Micro-Budget | Code | 5
Norface: Improving Facial Expression Analysis by Identity Normalization | Code | 1
EEGMamba: Bidirectional State Space Model with Mixture of Experts for EEG Multi-task Classification | | 0
Mixture of Experts with Mixture of Precisions for Tuning Quality of Service | | 0
EVLM: An Efficient Vision-Language Model for Visual Understanding | | 0
Mixture of Experts based Multi-task Supervise Learning from Crowds | | 0
Discussion: Effective and Interpretable Outcome Prediction by Training Sparse Mixtures of Linear Experts | | 0
Qwen2 Technical Report | Code | 13
Boost Your NeRF: A Model-Agnostic Mixture of Experts Framework for High Quality and Efficient Rendering | | 0
MoE-DiffIR: Task-customized Diffusion Priors for Universal Compressed Image Restoration | | 0
MaskMoE: Boosting Token-Level Learning via Routing Mask in Mixture-of-Experts | Code | 0
Diversifying the Expert Knowledge for Task-Agnostic Pruning in Sparse Mixture-of-Experts | | 0
An Unsupervised Domain Adaptation Method for Locating Manipulated Region in partially fake Audio | | 0
Swin SMT: Global Sequential Modeling in 3D Medical Image Segmentation | Code | 1
MoVEInt: Mixture of Variational Experts for Learning Human-Robot Interactions from Demonstrations | Code | 0
A Simple Architecture for Enterprise Large Language Model Applications based on Role based security and Clearance Levels using Retrieval-Augmented Generation or Mixture of Experts | | 0
SAM-Med3D-MoE: Towards a Non-Forgetting Segment Anything Model via Mixture of Experts for 3D Medical Image Segmentation | | 0
Completed Feature Disentanglement Learning for Multimodal MRIs Analysis | Code | 0
YourMT3+: Multi-instrument Music Transcription with Enhanced Transformer Architectures and Cross-dataset Stem Augmentation | Code | 3
MobileFlow: A Multimodal LLM For Mobile GUI Agent | | 0
Lazarus: Resilient and Elastic Training of Mixture-of-Experts Models with Adaptive Expert Placement | | 0
Mixture of A Million Experts | Code | 2
Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models | Code | 4
Terminating Differentiable Tree Experts | | 0
Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs | Code | 1
Investigating the potential of Sparse Mixtures-of-Experts for multi-domain neural machine translation | | 0
Sparse Diffusion Policy: A Sparse, Reusable, and Flexible Policy for Robot Learning | | 0
Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model | Code | 1
LEMoE: Advanced Mixture of Experts Adaptor for Lifelong Model Editing of Large Language Models | | 0
A Teacher Is Worth A Million Instructions | Code | 0
Towards Personalized Federated Multi-Scenario Multi-Task Recommendation | | 0
A Survey on Mixture of Experts | Code | 3
SC-MoE: Switch Conformer Mixture of Experts for Unified Streaming and Non-streaming Code-Switching ASR | | 0
Page 13 of 27