SOTAVerified

Mixture-of-Experts

Papers

Showing 501-550 of 1,312 papers

Title | Status | Hype
Learning Deep Mixtures of Gaussian Process Experts Using Sum-Product Networks | Code | 0
Improving Factuality in Large Language Models via Decoding-Time Hallucinatory and Truthful Comparators | Code | 0
Adaptive Expert Models for Personalization in Federated Learning | Code | 0
k-Winners-Take-All Ensemble Neural Network | Code | 0
Latent Prototype Routing: Achieving Near-Perfect Load Balancing in Mixture-of-Experts | Code | 0
Learning Gating ConvNet for Two-Stream based Methods in Action Recognition | Code | 0
Lifelong Mixture of Variational Autoencoders | Code | 0
Mixture Content Selection for Diverse Sequence Generation | Code | 0
RouterKT: Mixture-of-Experts for Knowledge Tracing | Code | 0
Improved Training of Mixture-of-Experts Language GANs | | 0
Imitation Learning from Observations: An Autoregressive Mixture of Experts Approach | | 0
Denoising OCT Images Using Steered Mixture of Experts with Multi-Model Inference | | 0
Imitation Learning from MPC for Quadrupedal Multi-Gait Control | | 0
iMedImage Technical Report | | 0
Automatic Document Sketching: Generating Drafts from Analogous Texts | | 0
Identifying Shopping Intent in Product QA for Proactive Recommendations | | 0
Demystifying Softmax Gating Function in Gaussian Mixture of Experts | | 0
IDEA: An Inverse Domain Expert Adaptation Based Active DNN IP Protection Method | | 0
Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models | | 0
Automatically Extracting Information in Medical Dialogue: Expert System And Attention for Labelling | | 0
A Mixture of Expert Approach for Low-Cost Customization of Deep Neural Networks | | 0
Hypertext Entity Extraction in Webpage | | 0
HydraSum - Disentangling Stylistic Features in Text Summarization using Multi-Decoder Models | | 0
Hunyuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought | | 0
A Universal Approximation Theorem for Mixture of Experts Models | | 0
AMEND: A Mixture of Experts Framework for Long-tailed Trajectory Prediction | | 0
Adaptive Detection of Fast Moving Celestial Objects Using a Mixture of Experts and Physical-Inspired Neural Network | | 0
How to Upscale Neural Networks with Scaling Law? A Survey and Practical Guidelines | | 0
How Lightweight Can A Vision Transformer Be | | 0
How does Architecture Influence the Base Capabilities of Pre-trained Language Models? A Case Study Based on FFN-Wider and MoE Transformers | | 0
A Unified Virtual Mixture-of-Experts Framework: Enhanced Inference and Hallucination Mitigation in Single-Model System | | 0
How Do Consumers Really Choose: Exposing Hidden Preferences with the Mixture of Experts Model | | 0
How Can Cross-lingual Knowledge Contribute Better to Fine-Grained Entity Typing? | | 0
HOMOE: A Memory-Based and Composition-Aware Framework for Zero-Shot Learning with Hopfield Network and Soft Mixture of Experts | | 0
HoME: Hierarchy of Multi-Gate Experts for Multi-Task Learning at Kuaishou | | 0
A Unified Framework for Iris Anti-Spoofing: Introducing IrisGeneral Dataset and Masked-MoE Method | | 0
Holistic Capability Preservation: Towards Compact Yet Comprehensive Reasoning Models | | 0
HOBBIT: A Mixed Precision Expert Offloading System for Fast MoE Inference | | 0
HMOE: Hypernetwork-based Mixture of Experts for Domain Generalization | | 0
HMoE: Heterogeneous Mixture of Experts for Language Modeling | | 0
A Unified Approach to Universal Prediction: Generalized Upper and Lower Bounds | | 0
HiMoE: Heterogeneity-Informed Mixture-of-Experts for Fair Spatial-Temporal Forecasting | | 0
Hierarchical Routing Mixture of Experts | | 0
Deep Learning Mixture-of-Experts Approach for Cytotoxic Edema Assessment in Infants and Children | | 0
A Two-Phase Deep Learning Framework for Adaptive Time-Stepping in High-Speed Flow Modeling | | 0
Alternating Updates for Efficient Transformers | | 0
Adaptive Conditional Expert Selection Network for Multi-domain Recommendation | | 0
Accelerating Mixture-of-Experts Training with Adaptive Expert Replication | | 0
Hierarchical Mixture-of-Experts Model for Large-Scale Gaussian Process Regression | | 0
Deep Gaussian Covariance Network | | 0
Page 11 of 27

No leaderboard results yet.