SOTAVerified

Mixture-of-Experts

Papers

Showing 601–650 of 1312 papers

Title | Status | Hype
Sigmoid Self-Attention has Lower Sample Complexity than Softmax Self-Attention: A Mixture-of-Experts Perspective | - | 0
Adaptive Prompt: Unlocking the Power of Visual Prompt Tuning | - | 0
Pheromone-based Learning of Optimal Reasoning Paths | - | 0
MolGraph-xLSTM: A graph-based dual-level xLSTM framework with multi-head mixture-of-experts for enhanced molecular representation and interpretability | - | 0
Heuristic-Informed Mixture of Experts for Link Prediction in Multilayer Networks | - | 0
Free Agent in Agent-Based Mixture-of-Experts Generative AI Framework | - | 0
3D-MoE: A Mixture-of-Experts Multi-modal LLM for 3D Vision and Pose Diffusion via Rectified Flow | - | 0
Static Batching of Irregular Workloads on GPUs: Framework and Application to Efficient MoE Model Inference | - | 0
ToMoE: Converting Dense Large Language Models to Mixture-of-Experts through Dynamic Structural Pruning | - | 0
Each Rank Could be an Expert: Single-Ranked Mixture of Experts LoRA for Multi-Task Learning | - | 0
Mean-field limit from general mixtures of experts to quantum neural networks | - | 0
Sparse Mixture-of-Experts for Non-Uniform Noise Reduction in MRI Images | - | 0
CSAOT: Cooperative Multi-Agent System for Active Object Tracking | - | 0
UniUIR: Considering Underwater Image Restoration as An All-in-One Learner | - | 0
BLR-MoE: Boosted Language-Routing Mixture of Experts for Domain-Robust Multilingual E2E ASR | - | 0
LLM4WM: Adapting LLM for Wireless Multi-Tasking | - | 0
Autonomy-of-Experts Models | - | 0
Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models | - | 0
Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models | - | 0
SCFCRC: Simultaneously Counteract Feature Camouflage and Relation Camouflage for Fraud Detection | - | 0
FSMoE: A Flexible and Scalable Training System for Sparse Mixture-of-Experts Models | - | 0
OMoE: Diversifying Mixture of Low-Rank Adaptation by Orthogonal Finetuning | - | 0
LLM-Based Routing in Mixture of Experts: A Novel Framework for Trading | - | 0
GRAPHMOE: Amplifying Cognitive Depth of Mixture-of-Experts Network via Introducing Self-Rethinking Mechanism | - | 0
PSReg: Prior-guided Sparse Mixture of Experts for Point Cloud Registration | - | 0
A Multi-Modal Deep Learning Framework for Pan-Cancer Prognosis | Code | 0
TAMER: A Test-Time Adaptive MoE-Driven Framework for EHR Representation Learning | Code | 0
Optimizing Distributed Deployment of Mixture-of-Experts Model Inference in Serverless Computing | - | 0
mFabric: An Efficient and Scalable Fabric for Mixture-of-Experts Training | - | 0
Mixture-of-Experts Graph Transformers for Interpretable Particle Collision Detection | Code | 0
Fresh-CL: Feature Realignment through Experts on Hypersphere in Continual Learning | - | 0
MoVE-KD: Knowledge Distillation for VLMs with Mixture of Visual Encoders | - | 0
MExD: An Expert-Infused Diffusion Model for Whole-Slide Image Classification | - | 0
Correlative and Discriminative Label Grouping for Multi-Label Visual Prompt Tuning | - | 0
Towards Efficient Foundation Model for Zero-shot Amodal Segmentation | - | 0
REM: A Scalable Reinforced Multi-Expert Framework for Multiplex Influence Maximization | - | 0
UNIALIGN: Scaling Multimodal Alignment within One Unified Model | - | 0
Learning Heterogeneous Tissues with Mixture of Experts for Gigapixel Whole Slide Images | - | 0
CNC: Cross-modal Normality Constraint for Unsupervised Multi-class Anomaly Detection | - | 0
Multimodal Variational Autoencoder: a Barycentric View | - | 0
UniRestorer: Universal Image Restoration via Adaptively Estimating Image Degradation at Proper Granularity | Code | 0
Graph Mixture of Experts and Memory-augmented Routers for Multivariate Time Series Anomaly Detection | - | 0
AskChart: Universal Chart Understanding through Textual Enhancement | Code | 0
BIG-MoE: Bypass Isolated Gating MoE for Generalized Multimodal Face Anti-Spoofing | Code | 0
UME: Upcycling Mixture-of-Experts for Scalable and Efficient Automatic Speech Recognition | - | 0
Part-Of-Speech Sensitivity of Routers in Mixture of Experts Models | - | 0
Theory of Mixture-of-Experts for Mobile Edge Computing | - | 0
SEKE: Specialised Experts for Keyword Extraction | Code | 0
SMOSE: Sparse Mixture of Shallow Experts for Interpretable Reinforcement Learning in Continuous Control Tasks | Code | 0
DAOP: Data-Aware Offloading and Predictive Pre-Calculation for Efficient MoE Inference | Code | 0
Page 13 of 27

No leaderboard results yet.