SOTAVerified

Mixture-of-Experts

Papers

Showing 601–650 of 1312 papers

Title | Status | Hype
FuxiMT: Sparsifying Large Language Models for Chinese-Centric Multilingual Machine Translation | | 0
FuseMoE: Mixture-of-Experts Transformers for Fleximodal Fusion | | 0
Continual Traffic Forecasting via Mixture of Experts | | 0
Improving Transformer Performance for French Clinical Notes Classification Using Mixture of Experts on a Limited Dataset | | 0
Functional mixture-of-experts for classification | | 0
Functional-level Uncertainty Quantification for Calibrated Fine-tuning on LLMs | | 0
Continual Pre-training of MoEs: How robust is your router? | | 0
Full-Precision Free Binary Graph Neural Networks | | 0
Continual Learning Using Task Conditional Neural Networks | | 0
A General Theory for Softmax Gating Multinomial Logistic Mixture of Experts | | 0
FSMoE: A Flexible and Scalable Training System for Sparse Mixture-of-Experts Models | | 0
ContextWIN: Whittle Index Based Mixture-of-Experts Neural Model For Restless Bandits Via Deep RL | | 0
From Google Gemini to OpenAI Q* (Q-Star): A Survey of Reshaping the Generative Artificial Intelligence (AI) Research Landscape | | 0
Fresh-CL: Feature Realignment through Experts on Hypersphere in Continual Learning | | 0
Contextual Policy Transfer in Reinforcement Learning Domains via Deep Mixtures-of-Experts | | 0
A Simple Architecture for Enterprise Large Language Model Applications based on Role based security and Clearance Levels using Retrieval-Augmented Generation or Mixture of Experts | | 0
Contextual Mixture of Experts: Integrating Knowledge into Predictive Modeling | | 0
FreqMoE: Dynamic Frequency Enhancement for Neural PDE Solvers | | 0
Free Agent in Agent-Based Mixture-of-Experts Generative AI Framework | | 0
ConstitutionalExperts: Training a Mixture of Principle-based Prompts | | 0
A similarity-based Bayesian mixture-of-experts model | | 0
A Generalist Cross-Domain Molecular Learning Framework for Structure-Based Drug Discovery | | 0
Adapted-MoE: Mixture of Experts with Test-Time Adaption for Anomaly Detection | | 0
ForceVLA: Enhancing VLA Models with a Force-aware MoE for Contact-rich Manipulation | | 0
FMT:A Multimodal Pneumonia Detection Model Based on Stacking MOE Framework | | 0
Connector-S: A Survey of Connectors in Multi-modal Large Language Models | | 0
fMoE: Fine-Grained Expert Offloading for Large Mixture-of-Experts Serving | | 0
FloE: On-the-Fly MoE Inference on Memory-constrained GPU | | 0
Configurable Foundation Models: Building LLMs from a Modular Perspective | | 0
FlexMoE: Scaling Large-scale Sparse Pre-trained Model Training via Dynamic Device Placement | | 0
Mixture-of-Experts Meets Instruction Tuning:A Winning Combination for Large Language Models | | 0
Conditional computation in neural networks: principles and research trends | | 0
Fixing MoE Over-Fitting on Low-Resource Languages in Multilingual Machine Translation | | 0
FinTeamExperts: Role Specialized MOEs For Financial Analysis | | 0
On the Adaptation to Concept Drift for CTR Prediction | | 0
A Review of Sparse Expert Models in Deep Learning | | 0
FineQuant: Unlocking Efficiency with Fine-Grained Weight-Only Quantization for LLMs | | 0
Finding Fantastic Experts in MoEs: A Unified Study for Expert Dropping Strategies and Observations | | 0
Filtered not Mixed: Stochastic Filtering-Based Online Gating for Mixture of Large Language Models | | 0
Complexity Experts are Task-Discriminative Learners for Any Image Restoration | | 0
A Review of DeepSeek Models' Key Innovative Techniques | | 0
AdaMV-MoE: Adaptive Multi-Task Vision Mixture-of-Experts | | 0
Learning to Specialize: Joint Gating-Expert Training for Adaptive MoEs in Decentralized Settings | | 0
FedMoE: Personalized Federated Learning via Heterogeneous Mixture of Experts | | 0
FedMoE-DA: Federated Mixture of Experts via Domain Aware Fine-grained Aggregation | | 0
FedMerge: Federated Personalization via Model Merging | | 0
A Provably Effective Method for Pruning Experts in Fine-tuned Sparse Mixture-of-Experts | | 0
Affect in Tweets Using Experts Model | | 0
Federated Mixture of Experts | | 0
Federated learning using mixture of experts | | 0
Page 13 of 27
