SOTAVerified

Mixture-of-Experts

Papers

Showing 1101–1150 of 1312 papers

| Title | Status | Hype |
|---|---|---|
| HDformer: A Higher Dimensional Transformer for Diabetes Detection Utilizing Long Range Vascular Signals | | 0 |
| HeterMoE: Efficient Training of Mixture-of-Experts Models on Heterogeneous GPUs | | 0 |
| Heuristic-Informed Mixture of Experts for Link Prediction in Multilayer Networks | | 0 |
| Hierarchical mixture of discriminative Generalized Dirichlet classifiers | | 0 |
| Hierarchical Mixture-of-Experts Model for Large-Scale Gaussian Process Regression | | 0 |
| Hierarchical Routing Mixture of Experts | | 0 |
| HiMoE: Heterogeneity-Informed Mixture-of-Experts for Fair Spatial-Temporal Forecasting | | 0 |
| HMoE: Heterogeneous Mixture of Experts for Language Modeling | | 0 |
| HMOE: Hypernetwork-based Mixture of Experts for Domain Generalization | | 0 |
| HOBBIT: A Mixed Precision Expert Offloading System for Fast MoE Inference | | 0 |
| Holistic Capability Preservation: Towards Compact Yet Comprehensive Reasoning Models | | 0 |
| HoME: Hierarchy of Multi-Gate Experts for Multi-Task Learning at Kuaishou | | 0 |
| HOMOE: A Memory-Based and Composition-Aware Framework for Zero-Shot Learning with Hopfield Network and Soft Mixture of Experts | | 0 |
| How Can Cross-lingual Knowledge Contribute Better to Fine-Grained Entity Typing? | | 0 |
| How Do Consumers Really Choose: Exposing Hidden Preferences with the Mixture of Experts Model | | 0 |
| How does Architecture Influence the Base Capabilities of Pre-trained Language Models? A Case Study Based on FFN-Wider and MoE Transformers | | 0 |
| How Lightweight Can A Vision Transformer Be | | 0 |
| How to Upscale Neural Networks with Scaling Law? A Survey and Practical Guidelines | | 0 |
| Hunyuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought | | 0 |
| HydraSum - Disentangling Stylistic Features in Text Summarization using Multi-Decoder Models | | 0 |
| Hypertext Entity Extraction in Webpage | | 0 |
| IDEA: An Inverse Domain Expert Adaptation Based Active DNN IP Protection Method | | 0 |
| Identifying Shopping Intent in Product QA for Proactive Recommendations | | 0 |
| iMedImage Technical Report | | 0 |
| Imitation Learning from MPC for Quadrupedal Multi-Gait Control | | 0 |
| Imitation Learning from Observations: An Autoregressive Mixture of Experts Approach | | 0 |
| Improved Training of Mixture-of-Experts Language GANs | | 0 |
| Improving Coverage in Combined Prediction Sets with Weighted p-values | | 0 |
| Improving Expert Specialization in Mixture of Experts | | 0 |
| Improving Sepsis Treatment Strategies by Combining Deep and Kernel-Based Reinforcement Learning | | 0 |
| Incorporating Polar Field Data for Improved Solar Flare Prediction | | 0 |
| Incorporating Visual Experts to Resolve the Information Loss in Multimodal Large Language Models | | 0 |
| Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures | | 0 |
| Integrating AI's Carbon Footprint into Risk Management Frameworks: Strategies and Tools for Sustainable Compliance in Banking Sector | | 0 |
| Integrating Dynamical Systems Learning with Foundational Models: A Meta-Evolutionary AI Framework for Clinical Trials | | 0 |
| Integration of Mixture of Experts and Multimodal Generative AI in Internet of Vehicles: A Survey | | 0 |
| Intentional Biases in LLM Responses | | 0 |
| Inter2Former: Dynamic Hybrid Attention for Efficient High-Precision Interactive | | 0 |
| Generative AI Agents with Large Language Model for Satellite Networks via a Mixture of Experts Transmission | | 0 |
| Interpretable Cascading Mixture-of-Experts for Urban Traffic Congestion Prediction | | 0 |
| Interpretable Mixture of Experts | | 0 |
| Interpretable mixture of experts for time series prediction under recurrent and non-recurrent conditions | | 0 |
| Intuition-aware Mixture-of-Rank-1-Experts for Parameter Efficient Finetuning | | 0 |
| Investigating Mixture of Experts in Dense Retrieval | | 0 |
| Investigating the potential of Sparse Mixtures-of-Experts for multi-domain neural machine translation | | 0 |
| Is Temperature Sample Efficient for Softmax Gaussian Mixture of Experts? | | 0 |
| JiuZhang 2.0: A Unified Chinese Pre-trained Language Model for Multi-task Mathematical Problem Solving | | 0 |
| Joint MoE Scaling Laws: Mixture of Experts Can Be Memory Efficient | | 0 |
| KAAE: Numerical Reasoning for Knowledge Graphs via Knowledge-aware Attributes Learning | | 0 |
| KAT-V1: Kwai-AutoThink Technical Report | | 0 |

No leaderboard results yet.