SOTAVerified

Mixture-of-Experts

Papers

Showing 901–950 of 1312 papers

Title | Status | Hype
Half-Space Feature Learning in Neural Networks | - | 0
Two Heads are Better than One: Nested PoE for Robust Defense Against Multi-Backdoors | Code | 0
Revolutionizing Disease Diagnosis with simultaneous functional PET/MR and Deeply Integrated Brain Metabolic, Hemodynamic, and Perfusion Networks | - | 0
Psychometry: An Omnifit Model for Image Reconstruction from Human Brain Activity | - | 0
Jamba: A Hybrid Transformer-Mamba Language Model | Code | 0
Generalization Error Analysis for Sparse Mixture-of-Experts: A Preliminary Study | - | 0
DESIRE-ME: Domain-Enhanced Supervised Information REtrieval using Mixture-of-Experts | Code | 0
GeRM: A Generalist Robotic Model with Mixture-of-experts for Quadruped Robot | - | 0
Skeleton-Based Human Action Recognition with Noisy Labels | Code | 0
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training | - | 0
Equipping Computational Pathology Systems with Artifact Processing Pipelines: A Showcase for Computation and Performance Trade-offs | Code | 0
Conditional computation in neural networks: principles and research trends | - | 0
Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM | - | 0
Acquiring Diverse Skills using Curriculum Reinforcement Learning with Mixture of Experts | - | 0
MMoE: Robust Spoiler Detection with Multi-modal Information and Domain-aware Mixture-of-Experts | - | 0
ConstitutionalExperts: Training a Mixture of Principle-based Prompts | - | 0
Mixture-of-LoRAs: An Efficient Multitask Tuning for Large Language Models | - | 0
Video Relationship Detection Using Mixture of Experts | Code | 0
Vanilla Transformers are Transfer Capability Teachers | - | 0
Hypertext Entity Extraction in Webpage | - | 0
How does Architecture Influence the Base Capabilities of Pre-trained Language Models? A Case Study Based on FFN-Wider and MoE Transformers | - | 0
Enhancing the "Immunity" of Mixture-of-Experts Networks for Adversarial Defense | - | 0
An Effective Mixture-Of-Experts Approach For Code-Switching Speech Recognition Leveraging Encoder Disentanglement | - | 0
m2mKD: Module-to-Module Knowledge Distillation for Modular Transformers | Code | 0
ASEM: Enhancing Empathy in Chatbot through Attention-based Sentiment and Emotion Modeling | Code | 0
Co-Supervised Learning: Improving Weak-to-Strong Generalization with Hierarchical Mixture of Experts | Code | 0
PEMT: Multi-Task Correlation Guided Mixture-of-Experts Enables Parameter-Efficient Transfer Learning | - | 0
MoELoRA: Contrastive Learning Guided Mixture of Experts on Parameter-Efficient Fine-Tuning for Large Language Models | - | 0
Denoising OCT Images Using Steered Mixture of Experts with Multi-Model Inference | - | 0
Towards an empirical understanding of MoE design choices | - | 0
Turn Waste into Worth: Rectifying Top-k Router of MoE | - | 0
MoRAL: MoE Augmented LoRA for LLMs' Lifelong Learning | - | 0
Mixture of Link Predictors on Graphs | Code | 0
AMEND: A Mixture of Experts Framework for Long-tailed Trajectory Prediction | - | 0
P-Mamba: Marrying Perona Malik Diffusion with Mamba for Efficient Pediatric Echocardiographic Left Ventricular Segmentation | - | 0
Differentially Private Training of Mixture of Experts Models | - | 0
Buffer Overflow in Mixture of Experts | - | 0
Task-customized Masked AutoEncoder via Mixture of Cluster-conditional Experts | - | 0
On Parameter Estimation in Deviated Gaussian Mixture of Experts | - | 0
Intrinsic User-Centric Interpretability through Global Mixture of Experts | Code | 0
Approximation Rates and VC-Dimension Bounds for (P)ReLU MLP Mixture of Experts | - | 0
On Least Square Estimation in Softmax Gating Mixture of Experts | - | 0
FuseMoE: Mixture-of-Experts Transformers for Fleximodal Fusion | - | 0
CompeteSMoE - Effective Training of Sparse Mixture of Experts via Competition | Code | 0
pFedMoE: Data-Level Personalization with Mixture of Experts for Model-Heterogeneous Personalized Federated Learning | Code | 0
MoDE: A Mixture-of-Experts Model with Mutual Distillation among the Experts | - | 0
Explainable data-driven modeling via mixture of experts: towards effective blending of grey and black-box models | - | 0
Checkmating One, by Using Many: Combining Mixture of Experts with MCTS to Improve in Chess | Code | 0
LLaVA-MoLE: Sparse Mixture of LoRA Experts for Mitigating Data Conflicts in Instruction Finetuning MLLMs | - | 0
Routers in Vision Mixture of Experts: An Empirical Study | - | 0
Page 19 of 27

No leaderboard results yet.