SOTAVerified

Mixture-of-Experts

Papers

Showing 351–400 of 1312 papers

AT-MoE: Adaptive Task-planning Mixture of Experts via LoRA Approach
HDformer: A Higher Dimensional Transformer for Diabetes Detection Utilizing Long Range Vascular Signals
HeterMoE: Efficient Training of Mixture-of-Experts Models on Heterogeneous GPUs
DADNN: Multi-Scene CTR Prediction via Domain-Aware Deep Neural Network
D^2MoE: Dual Routing and Dynamic Scheduling for Efficient On-Device MoE-based LLM Serving
A Theoretical View on Sparsely Activated Networks
A Large-scale Medical Visual Task Adaptation Benchmark
HMDN: Hierarchical Multi-Distribution Network for Click-Through Rate Prediction
CSAOT: Cooperative Multi-Agent System for Active Object Tracking
Cross-Topic Rumor Detection using Topic-Mixtures
A Survey on Model MoErging: Recycling and Routing Among Specialized Experts for Collaborative Learning
AIREX: Neural Network-based Approach for Air Quality Inference in Unmonitored Cities
PMoE: Progressive Mixture of Experts with Asymmetric Transformer for Continual Learning
Hard Mixtures of Experts for Large Scale Weakly Supervised Vision
Heuristic-Informed Mixture of Experts for Link Prediction in Multilayer Networks
Agent4Ranking: Semantic Robust Ranking via Personalized Query Rewriting Using Multi-agent LLM
Adapted-MoE: Mixture of Experts with Test-Time Adaption for Anomaly Detection
CoSMoEs: Compact Sparse Mixture of Experts
Correlative and Discriminative Label Grouping for Multi-Label Visual Prompt Tuning
GRIN: GRadient-INformed MoE
Core-Periphery Principle Guided State Space Model for Functional Connectome Classification
Coordination with Humans via Strategy Matching
A Survey on Dynamic Neural Networks for Natural Language Processing
Convolutional Neural Networks and Mixture of Experts for Intrusion Detection in 5G Networks and beyond
Convergence Rates for Softmax Gating Mixture of Experts
Astrea: A MOE-based Visual Understanding Model with Progressive Alignment
HAECcity: Open-Vocabulary Scene Understanding of City-Scale Point Clouds with Superpoint Graph Clustering
Continual Traffic Forecasting via Mixture of Experts
Improving Transformer Performance for French Clinical Notes Classification Using Mixture of Experts on a Limited Dataset
Continual Pre-training of MoEs: How robust is your router?
Continual Learning Using Task Conditional Neural Networks
A General Theory for Softmax Gating Multinomial Logistic Mixture of Experts
A Generalist Cross-Domain Molecular Learning Framework for Structure-Based Drug Discovery
ContextWIN: Whittle Index Based Mixture-of-Experts Neural Model For Restless Bandits Via Deep RL
Contextual Policy Transfer in Reinforcement Learning Domains via Deep Mixtures-of-Experts
A Simple Architecture for Enterprise Large Language Model Applications based on Role based security and Clearance Levels using Retrieval-Augmented Generation or Mixture of Experts
Contextual Mixture of Experts: Integrating Knowledge into Predictive Modeling
ConstitutionalExperts: Training a Mixture of Principle-based Prompts
A similarity-based Bayesian mixture-of-experts model
Half-Space Feature Learning in Neural Networks
Connector-S: A Survey of Connectors in Multi-modal Large Language Models
Configurable Foundation Models: Building LLMs from a Modular Perspective
3D-MoE: A Mixture-of-Experts Multi-modal LLM for 3D Vision and Pose Diffusion via Rectified Flow
Conditional computation in neural networks: principles and research trends
On DeepSeekMoE: Statistical Benefits of Shared Experts and Normalized Sigmoid Gating
On the Adaptation to Concept Drift for CTR Prediction
A Review of Sparse Expert Models in Deep Learning
Complexity Experts are Task-Discriminative Learners for Any Image Restoration
Filtered not Mixed: Stochastic Filtering-Based Online Gating for Mixture of Large Language Models
A Review of DeepSeek Models' Key Innovative Techniques
Page 8 of 27

No leaderboard results yet.