SOTAVerified

Mixture-of-Experts

Papers

Showing 201–250 of 1312 papers

Title | Status | Hype
M3oE: Multi-Domain Multi-Task Mixture-of-Experts Recommendation Framework | Code | 1
Condense, Don't Just Prune: Enhancing Efficiency and Performance in MoE Layer Pruning | Code | 1
ChatVLA: Unified Multimodal Understanding and Robot Control with Vision-Language-Action Model | Code | 1
Distilling the Knowledge in a Neural Network | Code | 1
MoCaE: Mixture of Calibrated Experts Significantly Improves Object Detection | Code | 1
Modality Interactive Mixture-of-Experts for Fake News Detection | Code | 1
M^3ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design | Code | 1
Mastering Massive Multi-Task Reinforcement Learning via Mixture-of-Expert Decision Transformer | Code | 1
MiCE: Mixture of Contrastive Experts for Unsupervised Image Clustering | Code | 1
LLMBind: A Unified Modality-Task Integration Framework | Code | 1
LITE: Modeling Environmental Ecosystems with Multimodal Large Language Models | Code | 1
DirectMultiStep: Direct Route Generation for Multi-Step Retrosynthesis | Code | 1
LLMCarbon: Modeling the end-to-end Carbon Footprint of Large Language Models | Code | 1
Contrastive Learning and Mixture of Experts Enables Precise Vector Embeddings | Code | 1
Lifting the Curse of Capacity Gap in Distilling Language Models | Code | 1
MomentumSMoE: Integrating Momentum into Sparse Mixture of Experts | Code | 1
LIBMoE: A Library for comprehensive benchmarking Mixture of Experts in Large Language Models | Code | 1
Multi-Head Mixture-of-Experts | Code | 1
Addressing Confounding Feature Issue for Causal Recommendation | Code | 1
C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing | Code | 1
LMHaze: Intensity-aware Image Dehazing with a Large-scale Multi-intensity Real Haze Dataset | Code | 1
Multi-Task Reinforcement Learning with Mixture of Orthogonal Experts | Code | 1
Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts | Code | 1
Layerwise Recurrent Router for Mixture-of-Experts | Code | 1
Large Multi-modality Model Assisted AI-Generated Image Quality Assessment | Code | 1
Learning Soccer Juggling Skills with Layer-wise Mixture-of-Experts | Code | 1
RetGen: A Joint framework for Retrieval and Grounded Text Generation Modeling | Code | 1
Jakiro: Boosting Speculative Decoding with Decoupled Multi-Head via MoE | Code | 1
JanusDNA: A Powerful Bi-directional Hybrid DNA Foundation Model | Code | 1
Learning to Skip the Middle Layers of Transformers | Code | 1
LOLA -- An Open-Source Massively Multilingual Large Language Model | Code | 1
A Time Series is Worth Five Experts: Heterogeneous Mixture of Experts for Traffic Flow Prediction | Code | 1
AlphaLoRA: Assigning LoRA Experts Based on Layer Training Quality | Code | 1
Dynamic Language Group-Based MoE: Enhancing Code-Switching Speech Recognition with Hierarchical Routing | Code | 1
MiLo: Efficient Quantized MoE Inference with Mixture of Low-Rank Compensators | Code | 1
Patcher: Patch Transformers with Mixture of Experts for Precise Medical Image Segmentation | Code | 1
HyperMoE: Towards Better Mixture of Experts via Transferring Among Experts | Code | 1
PFL-MoE: Personalized Federated Learning Based on Mixture of Experts | Code | 1
HyperFormer: Enhancing Entity and Relation Interaction for Hyper-Relational Knowledge Graph Completion | Code | 1
HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts | Code | 1
BrainMAP: Learning Multiple Activation Pathways in Brain Networks | Code | 1
HydraSum: Disentangling Stylistic Features in Text Summarization using Multi-Decoder Models | Code | 1
Hierarchical Time-Aware Mixture of Experts for Multi-Modal Sequential Recommendation | Code | 1
Image Super-resolution Via Latent Diffusion: A Sampling-space Mixture Of Experts And Frequency-augmented Decoder Approach | Code | 1
Graph Sparsification via Mixture of Graphs | Code | 1
EvoMoE: An Evolutional Mixture-of-Experts Training Framework via Dense-To-Sparse Gate | Code | 1
BiMediX: Bilingual Medical Mixture of Experts LLM | Code | 1
GraphMETRO: Mitigating Complex Graph Distribution Shifts via Mixture of Aligned Experts | Code | 1
Heterogeneous Mixture of Experts for Remote Sensing Image Super-Resolution | Code | 1
GaVaMoE: Gaussian-Variational Gated Mixture of Experts for Explainable Recommendation | Code | 1
Page 5 of 27

No leaderboard results yet.