SOTAVerified

Mixture-of-Experts

Papers

Showing 801–850 of 1312 papers

Title | Status | Hype
Generalizing Multimodal Variational Methods to Sets | – | 0
Generator Assisted Mixture of Experts For Feature Acquisition in Batch | – | 0
GeRM: A Generalist Robotic Model with Mixture-of-experts for Quadruped Robot | – | 0
GETS: Ensemble Temperature Scaling for Calibration in Graph Neural Networks | – | 0
GigaChat Family: Efficient Russian Language Modeling Through Mixture of Experts Architecture | – | 0
GLA in MediaEval 2018 Emotional Impact of Movies Task | – | 0
GLaM: Efficient Scaling of Language Models with Mixture-of-Experts | – | 0
GM-MoE: Low-Light Enhancement with Gated-Mechanism Mixture-of-Experts | – | 0
GradPower: Powering Gradients for Faster Language Model Pre-Training | – | 0
Graph Mixture of Experts and Memory-augmented Routers for Multivariate Time Series Anomaly Detection | – | 0
GRAPHMOE: Amplifying Cognitive Depth of Mixture-of-Experts Network via Introducing Self-Rethinking Mechanism | – | 0
GRIN: GRadient-INformed MoE | – | 0
HAECcity: Open-Vocabulary Scene Understanding of City-Scale Point Clouds with Superpoint Graph Clustering | – | 0
Half-Space Feature Learning in Neural Networks | – | 0
Hard Mixtures of Experts for Large Scale Weakly Supervised Vision | – | 0
HDformer: A Higher Dimensional Transformer for Diabetes Detection Utilizing Long Range Vascular Signals | – | 0
HeterMoE: Efficient Training of Mixture-of-Experts Models on Heterogeneous GPUs | – | 0
Heuristic-Informed Mixture of Experts for Link Prediction in Multilayer Networks | – | 0
Hierarchical mixture of discriminative Generalized Dirichlet classifiers | – | 0
Hierarchical Mixture-of-Experts Model for Large-Scale Gaussian Process Regression | – | 0
Hierarchical Routing Mixture of Experts | – | 0
HiMoE: Heterogeneity-Informed Mixture-of-Experts for Fair Spatial-Temporal Forecasting | – | 0
HMoE: Heterogeneous Mixture of Experts for Language Modeling | – | 0
HMOE: Hypernetwork-based Mixture of Experts for Domain Generalization | – | 0
HOBBIT: A Mixed Precision Expert Offloading System for Fast MoE Inference | – | 0
Holistic Capability Preservation: Towards Compact Yet Comprehensive Reasoning Models | – | 0
HoME: Hierarchy of Multi-Gate Experts for Multi-Task Learning at Kuaishou | – | 0
HOMOE: A Memory-Based and Composition-Aware Framework for Zero-Shot Learning with Hopfield Network and Soft Mixture of Experts | – | 0
How Can Cross-lingual Knowledge Contribute Better to Fine-Grained Entity Typing? | – | 0
How Do Consumers Really Choose: Exposing Hidden Preferences with the Mixture of Experts Model | – | 0
How does Architecture Influence the Base Capabilities of Pre-trained Language Models? A Case Study Based on FFN-Wider and MoE Transformers | – | 0
How Lightweight Can A Vision Transformer Be | – | 0
How to Upscale Neural Networks with Scaling Law? A Survey and Practical Guidelines | – | 0
Hunyuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought | – | 0
HydraSum - Disentangling Stylistic Features in Text Summarization using Multi-Decoder Models | – | 0
Hypertext Entity Extraction in Webpage | – | 0
IDEA: An Inverse Domain Expert Adaptation Based Active DNN IP Protection Method | – | 0
Identifying Shopping Intent in Product QA for Proactive Recommendations | – | 0
iMedImage Technical Report | – | 0
Imitation Learning from MPC for Quadrupedal Multi-Gait Control | – | 0
Imitation Learning from Observations: An Autoregressive Mixture of Experts Approach | – | 0
Improved Training of Mixture-of-Experts Language GANs | – | 0
Improving Coverage in Combined Prediction Sets with Weighted p-values | – | 0
Regularized Maximum Likelihood Estimation and Feature Selection in Mixtures-of-Experts Models | – | 0
Reinforcement Learning-based Mixture of Vision Transformers for Video Violence Recognition | – | 0
REM: A Scalable Reinforced Multi-Expert Framework for Multiplex Influence Maximization | – | 0
Residual Mixture of Experts | – | 0
Resilient Sensor Fusion under Adverse Sensor Failures via Multi-Modal Expert Fusion | – | 0
Revisiting Single-gated Mixtures of Experts | – | 0
Revisiting Uncertainty Estimation and Calibration of Large Language Models | – | 0
Page 17 of 27

No leaderboard results yet.