SOTAVerified

Mixture-of-Experts

Papers

Showing 276-300 of 1312 papers

| Title | Status | Hype |
|---|---|---|
| OneRec: Unifying Retrieve and Rank with Generative Recommender and Iterative Preference Alignment | | 0 |
| Delta Decompression for MoE-based LLMs Compression | Code | 2 |
| The Empirical Impact of Reducing Symmetries on the Performance of Deep Ensembles and MoE | | 0 |
| ENACT-Heart -- ENsemble-based Assessment Using CNN and Transformer on Heart Sounds | | 0 |
| BigMac: A Communication-Efficient Mixture-of-Experts Model Structure for Fast Training and Inference | | 0 |
| Evaluating Expert Contributions in a MoE LLM for Quiz-Based Tasks | | 0 |
| Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment | Code | 2 |
| An Autonomous Network Orchestration Framework Integrating Large Language Models with Continual Reinforcement Learning | | 0 |
| Binary-Integer-Programming Based Algorithm for Expert Load Balancing in Mixture-of-Experts Models | Code | 0 |
| Tight Clusters Make Specialized Experts | Code | 0 |
| Ray-Tracing for Conditionally Activated Neural Networks | | 0 |
| ChatVLA: Unified Multimodal Understanding and Robot Control with Vision-Language-Action Model | Code | 1 |
| Unraveling the Localized Latents: Learning Stratified Manifold Structures in LLM Embedding Space with Sparse Mixture-of-Experts | | 0 |
| DSMoE: Matrix-Partitioned Experts with Dynamic Routing for Computation-Efficient Dense LLMs | | 0 |
| Every Expert Matters: Towards Effective Knowledge Distillation for Mixture-of-Experts Language Models | | 0 |
| MoBA: Mixture of Block Attention for Long-Context LLMs | Code | 7 |
| Fate: Fast Edge Inference of Mixture-of-Experts Models via Cross-Layer Gate | Code | 0 |
| Connector-S: A Survey of Connectors in Multi-modal Large Language Models | | 0 |
| How to Upscale Neural Networks with Scaling Law? A Survey and Practical Guidelines | | 0 |
| ClimateLLM: Efficient Weather Forecasting via Frequency-Aware Large Language Models | | 0 |
| Mixture of Tunable Experts - Behavior Modification of DeepSeek-R1 at Inference Time | | 0 |
| Probing Semantic Routing in Large Mixture-of-Expert Models | | 0 |
| Eidetic Learning: an Efficient and Provable Solution to Catastrophic Forgetting | Code | 0 |
| Heterogeneous Mixture of Experts for Remote Sensing Image Super-Resolution | Code | 1 |
| Mixture of Decoupled Message Passing Experts with Entropy Constraint for General Node Classification | | 0 |

No leaderboard results yet.