SOTAVerified

Mixture-of-Experts

Papers

Showing 601–625 of 1312 papers

Each row lists a paper's title, status, and hype score. No paper on this page has a status set, and every hype score is 0.

Sigmoid Self-Attention has Lower Sample Complexity than Softmax Self-Attention: A Mixture-of-Experts Perspective
Adaptive Prompt: Unlocking the Power of Visual Prompt Tuning
Pheromone-based Learning of Optimal Reasoning Paths
MolGraph-xLSTM: A graph-based dual-level xLSTM framework with multi-head mixture-of-experts for enhanced molecular representation and interpretability
Heuristic-Informed Mixture of Experts for Link Prediction in Multilayer Networks
Free Agent in Agent-Based Mixture-of-Experts Generative AI Framework
3D-MoE: A Mixture-of-Experts Multi-modal LLM for 3D Vision and Pose Diffusion via Rectified Flow
Static Batching of Irregular Workloads on GPUs: Framework and Application to Efficient MoE Model Inference
ToMoE: Converting Dense Large Language Models to Mixture-of-Experts through Dynamic Structural Pruning
Each Rank Could be an Expert: Single-Ranked Mixture of Experts LoRA for Multi-Task Learning
Mean-field limit from general mixtures of experts to quantum neural networks
Sparse Mixture-of-Experts for Non-Uniform Noise Reduction in MRI Images
CSAOT: Cooperative Multi-Agent System for Active Object Tracking
BLR-MoE: Boosted Language-Routing Mixture of Experts for Domain-Robust Multilingual E2E ASR
LLM4WM: Adapting LLM for Wireless Multi-Tasking
UniUIR: Considering Underwater Image Restoration as An All-in-One Learner
Autonomy-of-Experts Models
Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models
Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models
SCFCRC: Simultaneously Counteract Feature Camouflage and Relation Camouflage for Fraud Detection
FSMoE: A Flexible and Scalable Training System for Sparse Mixture-of-Experts Models
OMoE: Diversifying Mixture of Low-Rank Adaptation by Orthogonal Finetuning
LLM-Based Routing in Mixture of Experts: A Novel Framework for Trading
GRAPHMOE: Amplifying Cognitive Depth of Mixture-of-Experts Network via Introducing Self-Rethinking Mechanism
PSReg: Prior-guided Sparse Mixture of Experts for Point Cloud Registration

Leaderboard

No leaderboard results yet.